too early bus timeout resets / data corruption instead of completing block-fail retries

Message ID: <a710e79709ac4c66b51b39af6846e505@vodafonemail.de>

Hello,

are there Alpine developers who would like to ship a solution to this
problem with Alpine?

I've found a lot of discussion [1] around redundant setups (md, zfs, btrfs,
rolling backup disk syncs), but the default 30-second bus timeout also
affects nearly all standard single drives: the bus resets the drive before
it can report success or a proper error for its single-block read retries.

The solution seems to just be a matter of shipping udev rules that make
sure the bus timeout (/sys/block/sdX/device/timeout) is longer than the
drive's retry timeout.
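
To make the comparison concrete, here is a minimal Python sketch that reads
the current bus timeout of each sdX device from sysfs and lists the devices
whose timeout is not longer than the drive's retry timeout. The function
names and the 120-second retry value are assumptions for illustration only;
the real retry time depends on the drive. This only reports values and does
not replace the udev rules.

```python
from pathlib import Path


def read_bus_timeouts(sys_block="/sys/block"):
    """Return {device name: bus timeout in seconds} for sd* devices.

    Devices without a readable, numeric device/timeout file are skipped.
    """
    timeouts = {}
    for dev in sorted(Path(sys_block).glob("sd*")):
        timeout_file = dev / "device" / "timeout"
        try:
            timeouts[dev.name] = int(timeout_file.read_text().strip())
        except (OSError, ValueError):
            continue
    return timeouts


def devices_at_risk(timeouts, drive_retry_seconds):
    """Return devices whose bus timeout is not longer than the drive's retry time."""
    return sorted(
        name for name, seconds in timeouts.items()
        if seconds <= drive_retry_seconds
    )


if __name__ == "__main__":
    # Assumption: placeholder value, not taken from any drive datasheet.
    ASSUMED_DRIVE_RETRY_SECONDS = 120
    current = read_bus_timeouts()
    for name in devices_at_risk(current, ASSUMED_DRIVE_RETRY_SECONDS):
        print(f"{name}: bus timeout {current[name]}s may be too short")
```

With the default bus timeout of 30 seconds, any drive that retries for
longer than that would show up in this list.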

It looks like the latest proposed udev rules have been uploaded as an
attachment here:

https://www.smartmontools.org/ticket/658
"Many (long) HDD default timeouts cause data loss or corruption
(silent controller resets)"

---

[1] For example:
RAID, SMART timeout (STCERC) and drives.
https://forum.openmediavault.org/index.php?thread/25398-raid-smart-timeout-stcerc-and-drives-how-to-set-them-correctly-in-omv/

~alpine/devel