Date:   2003-11-05 02:16:55
From:   anonymous2
Response to: an important piece is missing.....

There is no way that external software can always find an intermittent memory fault.
You can do some things that are acceptable on a workstation that does graphics, but if you do science or manufacturing work where every result is important, you basically have to calculate everything twice on different nodes. This lowers the peak performance by 50%, but it is the only way to know that the result is the right one.
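
To make the "calculate everything twice" idea concrete, here is a minimal sketch. The function name `run_twice_and_compare` and its `compute` argument are hypothetical, and both runs happen in one process here; on a real cluster the second run would be dispatched to a different node.

```python
def run_twice_and_compare(compute, data):
    """Run compute(data) twice and accept the result only if both runs agree.

    Assumption: compute is deterministic, so any disagreement points to a
    fault (memory, CPU, or otherwise) rather than to the algorithm itself.
    """
    first = compute(data)
    # On a real cluster this second run would go to a different node.
    second = compute(data)
    if first != second:
        raise RuntimeError("results disagree; at least one run is suspect")
    return first
```

A mismatch only tells you that something went wrong, not which run was bad, so the usual next step is to run the job again.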

This is a fantastic system, but organizations where the result has to be correct had better look at a system with ECC.

    Wow, anonymous, it's too bad all of those supercomputing guys didn't ask you! You could have set them straight before they wasted all that time and money. I guess you'd better let the folks at know, quick - they must have missed it before now.

    OR, just MAYBE, you have no idea what you're talking about. It's a relatively simple exercise in software development to identify spurious results via multiple iterations. ECC memory won't protect you from processor faults and other glitches anyway, so the software has to be robust enough to allow for bad results even with expensive memory.
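
    One way to read "multiple iterations" is a majority vote over repeated runs. This is only a sketch under that assumption; `majority_result` and its parameters are made-up names, not anything from the system being discussed.

```python
from collections import Counter


def majority_result(compute, data, runs=3):
    """Run compute(data) several times and return the value most runs agree on.

    Assumption: results are hashable (numbers, strings, tuples) so that
    Counter can tally them.
    """
    results = [compute(data) for _ in range(runs)]
    value, count = Counter(results).most_common(1)[0]
    # Require a strict majority; otherwise no result can be trusted.
    if count <= runs // 2:
        raise RuntimeError("no majority among runs; cannot pick a result")
    return value
```

    With three runs, a single glitched result gets outvoted by the other two.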

    But thanks for playing.