Saturday, January 9, 2016

SCSI driver Error Handling (EH) - how multipath behaves when a few paths of a device fail

Question? 
Even for the most seasoned system admin, it is a challenge to explain how IO behaves when a few paths of a device fail but some paths are still active. What is your answer ...?
.
.
.
.
.
.
.
.
.
.
.
.  
.
.
.
.
.
.
.
.
.
.
Most people answer this – IO will continue via remaining active paths.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
.
And... that is wrong!!


Let me explain it with an extreme case: say one of the remote storage ports (rport) of the storage frame has failed. If the device had 2 active paths, it now has only one good path and one failed path.

As soon as the Linux kernel scsi (sc) driver detects a failed path, it quiesces the HBA (yes, the complete HBA) and waits for all outstanding IO to complete or time out [1]. Then the SCSI layer activates the error handler. NO IO WILL BE SUBMITTED until error recovery completes, even though only one path has failed and one good path is still available. This design is there to avoid any data corruption.
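The behaviour described above can be pictured with a toy model. The sketch below is plain Python, not kernel code, and every name in it (Hba, submit, start_recovery, finish_recovery, path_a, path_b) is invented for illustration. It only tries to show the idea that while error recovery is running on an HBA, IO is held for every path behind that HBA, including the healthy one.

```python
from collections import deque


class Hba:
    """Toy model of one HBA with several paths behind it (hypothetical names)."""

    def __init__(self, paths):
        self.paths = set(paths)
        self.in_recovery = False
        self.held = deque()      # IO queued while recovery is running
        self.dispatched = []     # IO actually sent down a path

    def submit(self, path, io):
        if path not in self.paths:
            raise ValueError(f"unknown path: {path}")
        if self.in_recovery:
            # The whole HBA is quiesced: even IO for a healthy path waits.
            self.held.append((path, io))
            return False
        self.dispatched.append((path, io))
        return True

    def start_recovery(self):
        self.in_recovery = True

    def finish_recovery(self, failed_paths=()):
        """End recovery; release held IO except IO aimed at failed paths."""
        self.in_recovery = False
        failed = set(failed_paths)
        requeue = []
        while self.held:
            path, io = self.held.popleft()
            if path in failed:
                requeue.append((path, io))   # would have to be retried elsewhere
            else:
                self.dispatched.append((path, io))
        return requeue


hba = Hba(["path_a", "path_b"])
hba.start_recovery()                       # path_a failed, error handler is running
sent = hba.submit("path_b", "write-1")     # path_b is healthy, but IO is still held
leftover = hba.finish_recovery(failed_paths=["path_a"])
```

In this model the write to the healthy path is only dispatched after finish_recovery() returns, which mirrors the pause an application would see during error recovery.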

Device Recovery Steps

1-      Abort the command once the scsi device timeout defined in /sys/block//device/timeout expires [1]
2-      Wait several seconds in the hope that the remote port comes back online (only if the device is a Fibre Channel device; not applicable to a plain SCSI device)
3-      Activate the error handler, which attempts the following operations in order (until one executes successfully or all options are exhausted):
a.       Reset the device
b.       Reset the bus
c.       Reset the host
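The "try each step until one works" ordering in the list above can be sketched in a few lines. This is an illustration of the control flow only, not how the kernel implements it; run_error_handler and the lambda outcomes are hypothetical.

```python
def run_error_handler(steps):
    """Try each recovery step in order and stop at the first that succeeds.

    `steps` is a list of (name, callable) pairs; each callable returns True
    on success. Returns the name of the step that worked, or "offline" if
    every step failed.
    """
    for name, attempt in steps:
        if attempt():
            return name
    return "offline"


# Hypothetical outcomes: device and bus resets fail, host reset works.
steps = [
    ("device reset", lambda: False),
    ("bus reset", lambda: False),
    ("host reset", lambda: True),
]
result = run_error_handler(steps)
```

With these made-up outcomes the ladder climbs all the way to the host reset. If that last step also failed, the function would return "offline", which corresponds to the device being taken offline.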

First, the scsi driver tries to reset the device and then the bus. If that is not successful, and the adapter firmware and device driver decide that the adapter has not completed full recovery, the adapter is hard reset. This means all paths of a disk via that adapter are unavailable for a few moments, irrespective of whether they had failed or were healthy. A hard reset happens when the I/O is black-holed with a NOP response in the fabric. Since IO has already been frozen by the scsi driver, there is no chance of a dropped IO request or data corruption.

Case-1: If all of the above fail, the device is set to offline. This means the complete device is unavailable via any path. Manual intervention is needed: look at the system/storage logs, find the problem, fix it, scan the HBA to detect active paths, and bring the device back to the online/running state.
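Once the underlying problem is fixed, the manual "scan and set running" step is usually done through sysfs. The sketch below assumes the standard sysfs files /sys/block/DEV/device/state and /sys/class/scsi_host/HOST/scan. The function names, the `sysfs` parameter (there only so the code can be tried against a scratch directory), and the example names "sdb" and "host1" are assumptions for illustration. Writing to these files needs root.

```python
from pathlib import Path


def device_state(dev, sysfs="/sys"):
    """Return the SCSI device state string, e.g. 'running' or 'offline'."""
    return (Path(sysfs) / "block" / dev / "device" / "state").read_text().strip()


def set_running(dev, sysfs="/sys"):
    """Ask the kernel to put an offline device back into the running state."""
    (Path(sysfs) / "block" / dev / "device" / "state").write_text("running\n")


def rescan_host(host, sysfs="/sys"):
    """Rescan all channels, targets and LUNs on one SCSI host."""
    (Path(sysfs) / "class" / "scsi_host" / host / "scan").write_text("- - -\n")


def bring_online(dev, host, sysfs="/sys"):
    """Rescan the HBA, then set the device running, but only if it is offline."""
    if device_state(dev, sysfs) != "offline":
        return False
    rescan_host(host, sysfs)
    set_running(dev, sysfs)
    return True
```

A call such as bring_online("sdb", "host1") would be run as root after the storage-side fault has been cleared; it returns False and changes nothing if the device is not offline.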

Case-2: If recovery succeeds, the path check heuristics of multipath mark the dead paths as failed. IO to the device then continues via the remaining active path.


Reduce device recovery time

To reduce overall recovery time, upgrade the kernel (to version 2.6.18-371.6.1 or higher) and device_mapper (to version 0.4.7-63 or higher) to take advantage of time-related parameters such as:

1-      scsi driver Error Handling (EH) timeout: eh_timeout (from the default of 10 seconds to 5 seconds) [4]
2-      HBA port reset time: eh_deadline (from disabled/0 to 5 seconds) [5]
3-      Adapter reset time, e.g. Qlogic reset time [2]
Add the following to /etc/modprobe.conf and recreate the initrd
options qla2xxx ql2xextended_error_logging=1 qlport_down_retry=10 ql2xloginretrycount=10
4-      Multipath checker_timeout (reduce to 10 seconds from the default of 60 seconds) [3]
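For eh_timeout and eh_deadline, a minimal sketch of applying the values at runtime is shown below. It assumes the kernel exposes them at /sys/block/DEV/device/eh_timeout and /sys/class/scsi_host/HOST/eh_deadline. The function name tune_recovery, the `sysfs` parameter, and the device/host names you pass in are illustrative assumptions. Values written this way do not survive a reboot, so a udev rule or boot script would still be needed to make them permanent.

```python
from pathlib import Path


def tune_recovery(dev, host, eh_timeout=5, eh_deadline=5, sysfs="/sys"):
    """Write eh_timeout and eh_deadline; return {name: (old, new)} (needs root)."""
    base = Path(sysfs)
    targets = {
        "eh_timeout": (base / "block" / dev / "device" / "eh_timeout", eh_timeout),
        "eh_deadline": (base / "class" / "scsi_host" / host / "eh_deadline", eh_deadline),
    }
    changes = {}
    for name, (path, value) in targets.items():
        old = path.read_text().strip()     # remember the previous setting
        path.write_text(f"{value}\n")
        changes[name] = (old, str(value))
    return changes
```

Calling tune_recovery("sdb", "host1") as root would set both values to 5 seconds and report what they were before, which is handy for rolling back.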

Reference 

[3] /usr/share/doc/device-mapper-multipath-0.4.7/multipath.conf.annotated


Additional Reference

3-      Red Hat KB  this and this

