I saw this article today about rebooting a frozen system without having to hit the power button, which looks like it might be useful. We do have this happen once in a while - usually for no readily apparent reason, and since I tend to go by the “once is an anomaly, twice in quick succession is time to investigate” rule, I confess I don’t look too hard.

Also very useful today was a step-by-step guide to installing IRAF on Debian (it literally tells you exactly what to type). IRAF is an astronomy program, so this information is probably of use to a very small subset of people, but still. I have found it a PITA to install, partly because it insists on belonging to its own user rather than root. This install (of the latest beta) went very smoothly, although I haven’t heard back from the requesting user yet.

Finally, a note on RAID5. A fortnight ago two drives on my RAID5 array failed (I was out of the office at the time & can’t find anything in the logs to indicate why). Much beeping ensued, and the array appeared dead (the suspicion was that the hotspare had failed partway through the rebuild, i.e. all data would be lost). The very competent support engineer managed to fix it after several hours (the old drives resurrected themselves; neither of us has any idea why), and I’ve now replaced both faulty drives and done the required rebuilds.

However: it reminded me of concerns I’ve had in the past about the reliability of SATA drives, and the deceptive appearance of redundancy that RAID5 has. RAID5 on its own only tolerates one failed drive at a time; with a hotspare you can in theory lose two drives and still not lose your data, provided the rebuild onto the spare finishes before the second one goes. In practice a) drives may tend to fail around the same time, and b) the thrashing caused by a rebuild may be enough to trigger failure in another drive (in which case you lose data).
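For anyone who wants to see why a second failure during a rebuild is fatal, here is a minimal sketch of the XOR parity idea that RAID5 is built on. It is an illustration only, not how any particular controller is implemented: the 4-drive layout, the block contents and the function names (`xor_blocks`, `rebuild`) are all made up for the example, and real RAID5 rotates the parity block across the drives rather than keeping it on one.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical 4-drive array: three data blocks plus one parity block.
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)
stripe = data + [parity]


def rebuild(stripe, failed):
    """Rebuild the block of a single failed drive from all the surviving blocks."""
    survivors = [block for i, block in enumerate(stripe) if i != failed]
    return xor_blocks(survivors)


# One failed drive: XOR of the survivors gives back the missing block exactly.
recovered = rebuild(stripe, failed=1)

# Two failed drives (0 and 1): the survivors only give the XOR of the two
# missing blocks, which is not enough to recover either of them individually.
leftover = xor_blocks([stripe[2], stripe[3]])
```

Here `recovered` matches the block that was on the failed drive, whereas `leftover` is just the two missing blocks XORed together. That is the situation you are in if a second drive dies before the rebuild has finished.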

I’m not sure whether there are any significantly better alternatives, though; I guess the lesson is ALWAYS BACK UP. (Yes, I did have backups, even if I didn’t need them in the end!)