Fires of Heaven Guild Message Board  

09-15-2007, 09:52 PM   #1
AladainAF
Registered User
Join Date: Aug 2002
Location: Texas
Posts: 1,966
funky stress test computer issue

I am burning in a system with a stress test in Linux, running mprime on each of the 4 cores (Q6600 G0 processor). If I run mprime on 3 cores (no matter which three), it runs fine, but as soon as I start it on the 4th, one of the 4 instances, at random, will error out. The other 3 will then run forever.

I am using the stock cooler with this CPU and doing absolutely no overclocking. I'm using an Asus P5N-E SLI board (newest BIOS), a 500W power supply, 8GB of RAM and a super-cheap video card (7200GS), running Fedora 7 64-bit (newest kernel).

I don't think it's thermal because the CPU doesn't even reach 55°C. I don't think it's memory because memtest86+ runs all night with no errors. It's not the CPU because if I swap it with another Q6600 the same problem occurs (I have both the B3 and the G0 stepping, and both do this).

Anyone have any ideas? Perhaps an mprime error?

09-15-2007, 09:55 PM   #2
Requiem
Site Administrator
 
Join Date: Jan 2002
Location: Cambridge, MA
Posts: 910
+37 Internets
Sounds like it's probably a software problem. The only base you haven't covered is the motherboard. I'd recommend coming up with your own synthetic tests that can get you up to 400% CPU usage (all four cores pegged). Hey, I know! Join the Uberguilds Rosetta@home team! =P
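
For what it's worth, a home-grown synthetic test along those lines might look like the sketch below. It is only an illustration under a few assumptions: the names `make_buffer`, `worker` and `burn_in` and the 16 MiB buffer size are made up for this example, and repeated SHA-256 hashing is swapped in as the workload, so it does not exercise the same floating-point paths as mprime. Each thread hashes the same buffer over and over and compares the result to a reference hash computed once at startup on the same machine, so a mismatch means the box disagreed with itself, not that it failed an externally known answer.

```python
import hashlib
import time
from concurrent.futures import ThreadPoolExecutor


def make_buffer(size: int) -> bytes:
    """Build a deterministic buffer so every pass hashes identical data."""
    pattern = bytes(range(256))
    return pattern * (size // 256)


def worker(buf: bytes, expected: str, seconds: float) -> tuple[int, int]:
    """Hash buf repeatedly until the deadline; return (passes, mismatches)."""
    deadline = time.monotonic() + seconds
    passes = 0
    mismatches = 0
    while True:
        # hashlib releases the GIL for inputs larger than 2047 bytes,
        # so several threads can hash on separate cores at the same time.
        if hashlib.sha256(buf).hexdigest() != expected:
            mismatches += 1
        passes += 1
        if time.monotonic() >= deadline:
            return passes, mismatches


def burn_in(workers: int, seconds: float,
            size: int = 16 * 1024 * 1024) -> list[tuple[int, int]]:
    """Run `workers` hashing threads for `seconds`; one result per thread."""
    buf = make_buffer(size)
    # Reference value computed on this same machine (an assumption of the
    # sketch): the test detects inconsistency between passes.
    expected = hashlib.sha256(buf).hexdigest()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker, buf, expected, seconds)
                   for _ in range(workers)]
        return [f.result() for f in futures]
```

On the quad-core box, something like `burn_in(4, 3600.0)` should show roughly 400% CPU in top for an hour, and any thread reporting a nonzero mismatch count would point at the same kind of instability mprime is flagging.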
__________________
Requiem
Alloria Mistweave
Uberguilds.org, fohguild.org Site Administrator
requiem@fohguild.org

09-15-2007, 10:00 PM   #3
AladainAF
Heh, well, this machine is going to be multiplied by 40 and put into a cluster doing number crunching on all 4 cores 24/7. That's why I'm somewhat concerned, but I don't have the exact software they will be using. I don't know what else will peg all 4 cores other than mprime.

Also, the failure is not always immediate: it might do 1-2 more tests and then error, but sometimes it has happened immediately.

I'll keep toying with it.

09-15-2007, 10:55 PM   #4
Requiem
...If you're getting 40 of them, might I suggest Xeons and Supermicro rackmount systems (or Dell servers, which really are worth it if you can afford them)? Using some shitty consumer motherboard with non-registered, non-ECC RAM for that sort of application is nuts. WAY more trouble than it's worth in savings.

09-15-2007, 11:14 PM   #5
AladainAF
That's the plan when the business gets bigger. Right now, we're on a limited budget. We can get more processing power for the money out of consumer crap than out of some high-end shit.

Anyway, after a few hours, I figured it out. The memory in the system needs to be set to 2.2V. I left it at its stock voltage (2.1V) since I was doing no overclocking, but apparently it wanted more. It has been running longer now than ever. Oh well.

All times are GMT -7.