Tuesday, June 17, 2014

Steps to reactivate Redhat Proxy Server

If you see the following error in your Redhat Proxy Server log /var/log/rhn/rhn_proxy_broker.log, you need to re-activate your Redhat Proxy Server (provided you have enough licenses).

proxy/rhnProxyAuth.login('ERROR', '
Server\\n    
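
A quick way to confirm the error is to search the broker log for the auth failure (a sketch; adjust the pattern to your exact message):

# grep -i 'rhnProxyAuth' /var/log/rhn/rhn_proxy_broker.log | tail -5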

Steps to re-activate Redhat Proxy Server

>> On Redhat Proxy: Get SYSTEM_ID of Redhat Proxy Server

# grep ID /etc/sysconfig/rhn/systemid
ID-1234567890

>> On Redhat Proxy: Remove all child channels, leaving the parent (a loop to remove them all at once is sketched after the example below)

# rhn-channel -l
custom-rhel-x86_64-server-7
custom-rhel-x86_64-7-myorg
custom-rhel-x86_64-7-mydb
custom-rhel-x86_64-7-myapp
# rhn-channel --remove -c custom-rhel-x86_64-7-myapp -u user_name
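
If there are several child channels, a small loop saves typing (a sketch; it assumes custom-rhel-x86_64-server-7 is the parent channel label, and rhn-channel will prompt for the password on each pass unless -p is added):

BASE=custom-rhel-x86_64-server-7
for ch in $(rhn-channel -l | grep -v "^${BASE}$"); do
    rhn-channel --remove -c "$ch" -u user_name
done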

>> On Redhat Satellite: Change the parent channel to rhel-x86_64-server-7 (use the correct SYSTEM_ID)

# spacewalk-api --server=satellite.company.com system.setBaseChannel "%session%" 1234567890  rhel-x86_64-server-7 -u user_name 
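
To confirm the change took effect, the subscribed base channel can be queried back (a sketch; it assumes the system.getSubscribedBaseChannel call is available on your Satellite version):

# spacewalk-api --server=satellite.company.com system.getSubscribedBaseChannel "%session%" 1234567890 -u user_name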

>> On Web UI, subscribe the system to the "RHN Tools" software channel.
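
The same subscription can be done from the command line on the proxy, if preferred (a sketch; the RHN Tools channel label below is an assumption, check the exact label on your Satellite):

# rhn-channel --add -c rhn-tools-rhel-x86_64-server-7 -u user_name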

>> On Redhat Proxy: Ensure spacewalk-proxy-installer is installed and the required SSL certificate is in place

# rpm -qa |grep spacewalk-proxy-installer
# ls -l /usr/share/rhn/RHN-ORG-TRUSTED-SSL-CERT

>> On Redhat Proxy: Activate the proxy against the Satellite

# rhn-proxy-activate --server=satellite.company.com --version=5.4 --ca-cert=/usr/share/rhn/RHN-ORG-TRUSTED-SSL-CERT
--server (RHN parent):  satellite.company.com
--http-proxy:
--http-proxy-username:
--http-proxy-password:
--ca-cert:              /usr/share/rhn/RHN-ORG-TRUSTED-SSL-CERT
--no-ssl:               false
--version:              5.4

>> On any client using the Redhat Proxy, run the command below

# yum check-update
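
It is also worth confirming that the client actually points at the proxy (a quick check; serverURL should reference the proxy hostname):

# grep serverURL /etc/sysconfig/rhn/up2date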

>> On Redhat Proxy: Check the logs to confirm it works

# tailf /var/log/rhn/rhn_proxy_broker.log





Wednesday, June 11, 2014

dmsetup and multipath did not resume multipath device paths after a LUN failure

Problem

> System logs show that all paths of a multipath LUN failed at some point (reason unknown)
> After some time all paths became available and accessible again. The commands below show each device path as running.

#  cat /sys/block/sdyg/device/state
    running
#  cat /sys/block/sdabm/device/state
    running

> The multipath device was used by Oracle ASM. Though all paths were available again, multipath was still showing all paths as failed.

# multipath -l a_multipath_device_01
a_multipath_device_01 (51230050123000b1230001e000212120000) dm-158 HP,HSV210
[size=250G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:1:62  sdabm 134:576 [failed][undef]
 \_ 1:0:3:62  sdadc 8:992   [failed][undef]
 \_ 0:0:1:62  sdyg  129:512 [failed][undef]
 \_ 0:0:2:62  sdzb  130:592 [failed][undef]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:2:62  sdach 135:656 [failed][undef]
 \_ 0:0:3:62  sdzw  131:672 [failed][undef]
 \_ 0:0:4:62  sdaar 132:752 [failed][undef]
 \_ 1:0:4:62  sdadx 66:816  [failed][undef]


Observation

> dmsetup status shows all paths as failed (F next to each path), even though all paths had come back to the running state after the failure.

# dmsetup status a_multipath_device_01
0 524288000 multipath 2 3 0 0 2 1 E 0 4 0 134:576 F 3377 8:992 F 3377 129:512 F 3378 130:592 F 3378 E 0 4 0 135:656 F 3377 131:672 F 3377 132:752 F 3377 66:816 F 3378

>> dmsetup info still looks good!

# dmsetup info a_multipath_device_01
Name:              a_multipath_device_01
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      2039343
Major, minor:      253, 158
Number of targets: 1
UUID: mpath-51230050123000b1230001e000212120000

>> The multipath device a_multipath_device_01 is not accessible by the application, and system logs show errors when a_multipath_device_01 is accessed.


Solution

> The command below fails with the message "device map in use"

# multipath -f a_multipath_device_01
 device map in use

> A dd command on each path is successful, meaning each path is good (a loop over all paths is sketched after the two examples below)

# dd if=/dev/sdabm of=/dev/null  # no error
# dd if=/dev/sdzw of=/dev/null    # no error
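
To check every path of the device in one go, a loop like the one below can be used (a sketch; the device names come from the multipath -l output above, and only a small amount is read from each path):

for dev in sdabm sdadc sdyg sdzb sdach sdzw sdaar sdadx; do
    dd if=/dev/$dev of=/dev/null bs=1M count=10 2>/dev/null && echo "$dev OK" || echo "$dev FAILED"
done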

> Generate a table from a working device

# dmsetup table another_mpath_device_02 > /tmp/table.txt    # dmsetup table (not status) produces a loadable table

> Replace the major:minor numbers in the file with the major:minor numbers of the faulty device's paths (a sed sketch follows the file contents below)

# vi /tmp/table.txt
# cat /tmp/table.txt
0 524288000 multipath 1 queue_if_no_path 0 2 1 round-robin 0 4 1 134:576 1000 8:992 1000 129:512 1000 130:592 1000 round-robin 0 4 1 135:656 1000 131:672 1000 132:752 1000 66:816 1000
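
The substitution can also be scripted with sed (a sketch; the values on the left are hypothetical major:minor numbers from the working device, the values on the right are the faulty device's paths from the multipath -l output above):

sed -i -e 's/8:512/134:576/' -e 's/65:32/8:992/' /tmp/table.txt    # one -e expression per path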

> Load table for device from file

# dmsetup reload  a_multipath_device_01 /tmp/table.txt

>> dmsetup status still shows an error

# dmsetup status a_multipath_device_01
0 524288000 error

> dmsetup info shows the newly loaded table as INACTIVE

# dmsetup info a_multipath_device_01
Name:              a_multipath_device_01
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE & INACTIVE
Open count:        1
Event number:      2039343
Major, minor:      253, 158
Number of targets: 1
UUID: mpath-51230050123000b1230001e000212120000

> Resume the device to make the inactive table live

# dmsetup resume  a_multipath_device_01

> And the device is good to use now!

# dmsetup status a_multipath_device_01
0 524288000 multipath 2 0 0 0 2 1 E 0 4 0 134:576 A 0 8:992 A 0 129:512 A 0 130:592 A 0 E 0 4 0 135:656 A 0 131:672 A 0 132:752 A 0 66:816 A 0

# multipath -l a_multipath_device_01
 [size=250G][features=1 queue_if_no_path][hwhandler=0][rw]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:1:62  sdabm 134:576 [active][undef]
 \_ 1:0:3:62  sdadc 8:992   [active][undef]
 \_ 0:0:1:62  sdyg  129:512 [active][undef]
 \_ 0:0:2:62  sdzb  130:592 [active][undef]
\_ round-robin 0 [prio=0][enabled]
 \_ 1:0:2:62  sdach 135:656 [active][undef]
 \_ 0:0:3:62  sdzw  131:672 [active][undef]
 \_ 0:0:4:62  sdaar 132:752 [active][undef]
 \_ 1:0:4:62  sdadx 66:816  [active][undef]

# dmsetup info a_multipath_device_01
Name:              a_multipath_device_01
State:             ACTIVE
Read Ahead:        256
Tables present:    LIVE
Open count:        1
Event number:      2039343
Major, minor:      253, 158
Number of targets: 1
UUID: mpath-51230050123000b1230001e000212120000


Ideally, once all paths of a device are available again, dmsetup should change the F (failed) status to A (active) and multipath should show all paths as active. Why did this not happen automatically?

Monday, June 9, 2014

Steps to setup Veritas Volume Replication - VVR

Below are the steps to set up volume replication from one system (singapore) to another (india).

Do the following on both systems

> Install SF, latest patches, SF and VVR licenses

> You need at least 2 disks on each node - one for data and another for the SRL. Disk sizes should be the same on both nodes.

> Create the data and SRL volumes

vxdisk list
vxdisksetup -i data_disk
vxdisksetup -i srl_disk
vxdg init app_dg  app_data_1=data_disk
vxdg -g app_dg adddisk app_srl_1=srl_disk
# Create data and SRL volumes of the same size as on the remote system
vxprint |grep ^v  # on remote system
vxassist -g app_dg make app_vol 1g app_data_1
vxassist -g app_dg make srl_vol 10g app_srl_1
vxprint |grep ^v  # on both system - compare size
/opt/VRTS/bin/mkfs -f vxfs /dev/vx/rdsk/app_dg/app_vol
mount -t vxfs /dev/vx/dsk/app_dg/app_vol /mnt
df -hP /mnt ; umount /mnt


> Start VVR and ensure the 3 ports below are up on both systems

/usr/sbin/vxstart_vvr start
ps -ef|grep vradmind
netstat -nlp|grep 4145 # both udp and tcp - the two ports below are tcp only
netstat -nlp|grep 8199
netstat -nlp|grep 8989

> Bring up the VVR IP on the nodes (ideally, a dedicated NIC interface should be used for data replication); a sketch follows
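
A sketch of bringing up a dedicated replication IP (the interface name and addresses below are assumptions; use your own replication network and make the addresses persistent in your network config):

singapore# ip addr add 192.168.100.11/24 dev eth2
india# ip addr add 192.168.100.12/24 dev eth2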

> Do a cross ping and ensure the ports are reachable. Repeat for each port and test in both directions

singapore# ping india
india# ping singapore
singapore# nc -l 4145
india# telnet singapore 4145

Ctrl+]
quit

> Add DG IDs on the remote nodes (each node's .rdg file gets the remote node's DG ID; see the example below)

vxdg list app_dg    # note the dgid
vi /etc/vx/vras/.rdg
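
For example (a sketch; the dgid values below are made up), each node's .rdg gets the remote node's disk group ID:

india# echo "1396089245.30.singapore" >> /etc/vx/vras/.rdg      # singapore's app_dg dgid
singapore# echo "1396089312.31.india" >> /etc/vx/vras/.rdg      # india's app_dg dgid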

On Primary Site

> Create RVG (on singapore)
vradmin -g app_dg createpri app_rvg app_vol srl_vol

> Add secondary
vradmin -g app_dg addsec app_rvg singapore india

> Start replication
vradmin -g app_dg -a startrep app_rvg india  # use -a on the first start for automatic synchronization
vradmin -g app_dg repstatus app_rvg
vxprint -Pl
vxprint -Vl

> Move replication role to secondary
umount /mount/point
vradmin -g app_dg migrate app_rvg india
vradmin -g app_dg repstatus app_rvg

On Secondary 

> Mount filesystem
mount -t vxfs /dev/vx/dsk/app_dg/app_vol /mnt

> Increase the data volume (it will be extended on the remote system as well)
vradmin -g app_dg -f resizevol app_rvg app_vol 999g




Data migration from ext3 filesystem on LVM using Multipath Disk to VxFS filesystem on Veritas DMP LUN

Ideally, a LUN should not be under the control of two multipathing products. In fact, there is no need to run multiple multipathing products on the same system, except while migrating from one to the other.

While migrating from Linux Multipath to Veritas DMP, you may need to copy data from an ext3 filesystem on LVM using a Multipath disk to a VxFS filesystem on a Veritas DMP LUN.

Below are the steps to copy data from ext3 /app_fs to VxFS /app_fs_vxfs

> Assign new LUNs to be used for DMP

> Blacklist the DMP LUNs in multipath.conf. Linux multipath will then ignore these disks.

vi /etc/multipath.conf
    blacklist {
           wwid 7600508b4000bd0070000b00005320000
           wwid 7600508b4000bd0070000b00009820000
    }
service multipathd reload
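
After the reload, the blacklisted WWIDs should no longer appear in the multipath listing (a quick check):

multipath -ll | grep -E '7600508b4000bd0070000b00005320000|7600508b4000bd0070000b00009820000'    # should return nothing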

> Install Veritas SF (it will install DMP)

> Exclude local disks and Linux Multipath LUNs from DMP
vxdisk -p list
vxdmpadm listctlr
vxdmpadm exclude ctlr=c360xxxx
vxdmpadm exclude dmpnodename=disk_XX
cat /etc/vx/vxvm.exclude
vxdisk list

> Create a VxFS filesystem on the DMP LUN
vxdisk list
vxdisksetup -i disk_XX
vxdg init app_dg  app_data_1=disk_XX
vxassist -g app_dg make app_vol 1g app_data_1
/opt/VRTS/bin/mkfs -f vxfs /dev/vx/rdsk/app_dg/app_volume_name
mount -t vxfs /dev/vx/dsk/app_dg/app_volume_name /app_fs_vxfs
df -hP /app_fs_vxfs

> Copy data from ext3 to vxfs filesystem

rsync -av /app_fs/  /app_fs_vxfs/    # trailing slashes copy the contents of /app_fs, not the directory itself
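
A dry run afterwards can confirm nothing was missed (the -n flag makes rsync report differences without copying anything):

rsync -avn --delete /app_fs/ /app_fs_vxfs/    # should report no file changes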

> Rename filesystem
umount /app_fs  /app_fs_vxfs
- comment out the old ext3 entry for /app_fs in /etc/fstab and add the line below
vi /etc/fstab
/dev/vx/dsk/app_dg/app_volume_name   /app_fs    vxfs   defaults 0 1
mount /app_fs

> Flush the multipath LUN and delete all paths of the multipath LUN (where n is the last number of the LUN shown in multipath -l output as x:y:z:n)

lvchange -an vg_name/lv_name # if there is a Logical Volume on the multipath LUN, deactivate it first
multipath -f mpathN   # mpathN == multipath device name
LUN=n
for scsi_id in $(ls /sys/bus/scsi/drivers/sd/ | grep ":$LUN$"); do
    echo 1 > /sys/bus/scsi/drivers/sd/$scsi_id/delete
done
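
A quick check that all paths for that LUN are gone (a sketch, using the same LUN number and device name as above):

ls /sys/bus/scsi/drivers/sd/ | grep ":$LUN$"    # should return nothing
multipath -l | grep mpathN                      # the flushed map should no longer appear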

> Stop multipath and remove the package
service multipathd stop
yum remove device-mapper-multipath