We have had problems with bulging capacitors. After some research, I have found that many different systems have this issue (Dell, Intel, etc. motherboards). Open the BFH (box from hell) and check the caps. They may be bulging on top or even leaking. The failures are widely attributed to a faulty electrolyte formula that was reportedly stolen from Japan.

Good luck!
Walt

On Thu, 2005-10-27 at 11:55 -0400, John Stoffel wrote:
Andy> I think I somehow built the computer from Hell. Recall my dual
Andy> Opteron box with SuSE 9.2 and 1 GB of memory.
Bummers, I've been thinking about going the Opteron route as well with a dual box like this. Thoughts off the top of my head:
1. Is the BIOS updated?
2. Can you tune/tweak the BIOS settings to more conservative values?
3. Try running kernel 2.6.14-rc5, there were various problems found and fixed with AMD stuff.
4. Try booting a UP kernel -or- try booting an SMP kernel with the 'nosmp' boot option.
5. Try booting with the 'noapic' option.
6. Boot with a serial console, log all output to another system. Try to do magic SysRq from that console when the system hangs.
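For items 4 through 6, here is a rough sketch of what the boot line and sysctl setting might look like. The serial port (ttyS0) and speed (115200) are assumptions on my part, so adjust them for your hardware, and magic SysRq only works if the kernel was built with CONFIG_MAGIC_SYSRQ:

```
# Append to the existing kernel line in your boot loader config.
# Test one of these at a time:
nosmp
noapic

# Serial console (ttyS0 and 115200 are assumed values). The last
# console= entry becomes /dev/console, so kernel messages still go
# to the local screen as well as the serial line.
console=tty0 console=ttyS0,115200n8

# /etc/sysctl.conf: enable the magic SysRq key
kernel.sysrq = 1
```

On a serial console, SysRq is sent as a BREAK followed by the command key within a few seconds, for example BREAK then 't' to dump the task list. Capture the serial output on the other machine so the dump survives the hang.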
Andy> I removed every SATA device from my system, and now it doesn't
Andy> hang as often, but it still hangs periodically.
Hmm... this points to the problem not being SATA then. Can you disable the SATA chipset from the BIOS level as a test?
Andy> This time I got 10 days of uptime before it hung - a new world's
Andy> record on this particular box. Strangely enough, I could
Andy> demonstrate more uptime with the "old" SATA code than with
Andy> libata - go figure.
How did the system hang? Do you have SysRq enabled? It would be good to know what type of lockups you're getting here. Does it lock up when the system is idle or in use?
Andy> I am suspicious of the 2.6 kernel. We run dual Opteron servers
Andy> at work with the 2.4 kernel series with no problems at all (on
Andy> RedHat 7.3). I am wondering what would happen if I took SuSE
Andy> 9.2 and replaced the 2.6 kernel with a 2.4 kernel. Am I asking
Andy> for a heap of trouble?
Not really, though you might have issues if you have binary modules for closed source drivers. I'd grab the 2.4.31 kernel and try it out. It should work fairly well on there, though you might have issues if you use udev for your devices.
Someone else suggested a LiveCD, which I think is a good idea. If you can, wipe one of your SATA disks and do a Debian/Ubuntu install using the 2.4 kernel and see what happens.
We run Rackable dual/quad Opterons at work with RHEL 3 and they just work. But we're only stressing them with CPU-bound jobs, not with devices or graphics or audio or anything like that.
Good luck!
John
_______________________________________________
Wlug mailing list
Wlug@mail.wlug.org
http://mail.wlug.org/mailman/listinfo/wlug