Message boards : Number crunching : Sudden Issue with GTX570

K1atOdessa
Joined: 25 Feb 08
Posts: 249
Credit: 392,702,681
RAC: 1,417,376
Message 27285 - Posted: 12 Nov 2012 | 4:49:05 UTC

I've been running this host for a long time, but as of a few hours ago it began hitting error after error. Given that these are all Nathan tasks, which I've never had issues with before, I imagine something went wrong on my system.

After a reboot, I still get the same error:

<core_client_version>7.0.28</core_client_version>
<![CDATA[
<message>
The system cannot find the path specified. (0x3) - exit code 3 (0x3)
</message>
<stderr_txt>
MDIO: cannot open file "restart.coor"
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59

This application has requested the Runtime to terminate it in an unusual way.
Please contact the application's support team for more information.

</stderr_txt>
]]>
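
For anyone sorting through a pile of failed results, here is a minimal Python sketch that pulls the CUDA error out of a stderr dump like the one above. The regular expression is only fitted to the wording of this particular log, so treat it as an assumption that other application versions print the line the same way. For reference, driver error 999 is the CUDA driver API's generic "unknown error" code (CUDA_ERROR_UNKNOWN), so the number alone does not identify the cause.

```python
import re

# Text copied from the <stderr_txt> section of the failed task above.
STDERR = '''MDIO: cannot open file "restart.coor"
SWAN : FATAL : Cuda driver error 999 in file 'swanlibnv2.cpp' in line 1574.
Assertion failed: a, file swanlibnv2.cpp, line 59
'''

# Matches the SWAN fatal line as it appears in this log. Assumption: other
# builds of the application may word the line differently.
SWAN_FATAL = re.compile(
    r"SWAN\s*:\s*FATAL\s*:\s*Cuda driver error (\d+) "
    r"in file '([^']+)' in line (\d+)"
)

def find_cuda_errors(stderr_text):
    """Return (error_code, source_file, line_number) for each SWAN fatal line."""
    return [
        (int(code), src, int(line))
        for code, src, line in SWAN_FATAL.findall(stderr_text)
    ]

print(find_cuda_errors(STDERR))
```

Running the same function over the stderr of every failed task would show whether they all die at the same spot or at different ones.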


Any ideas about what I should try? I switched this host over to some other projects and it has completed WUs successfully, running with the same settings that have worked since I got the card. I would have expected a one-off error to be cleared by the restart, so now I'm not sure. Nothing changed on my system: it completed numerous WUs successfully after the last reboot, right up until the point when everything hit the fan.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 27288 - Posted: 12 Nov 2012 | 17:19:20 UTC - in response to Message 27285.
Last modified: 12 Nov 2012 | 17:40:25 UTC

I haven't experienced anything similar, yet.
I checked a few other systems that are known to be good, and this doesn't seem to be a common error. However, most of the tasks that failed for you also failed for other users, and those users have numerous recent failures, even though many of their systems worked OK until recently. So perhaps there is a malformed batch, or something else that's blitzing a few systems?
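
The cross-check described above can be sketched roughly as follows in Python. The function name, the tuple layout and the sample records are all made up for illustration; nothing here comes from the project database or any BOINC API.

```python
from collections import defaultdict

def shared_failures(results, my_host):
    """Return workunits that failed on my_host and on at least one other host.

    results is an iterable of (workunit, host, succeeded) tuples.
    """
    failed_by = defaultdict(set)
    for workunit, host, succeeded in results:
        if not succeeded:
            failed_by[workunit].add(host)
    return sorted(
        workunit
        for workunit, hosts in failed_by.items()
        if my_host in hosts and len(hosts) > 1
    )

# Hypothetical records, purely for illustration.
sample = [
    ("wu_a", "host_1", False), ("wu_a", "host_2", False),
    ("wu_b", "host_1", False), ("wu_b", "host_3", True),
    ("wu_c", "host_2", False),
]
print(shared_failures(sample, "host_1"))  # ['wu_a']
```

A workunit showing up in this list is not proof of a bad batch by itself: the other hosts may simply be broken too, which is why it is also worth looking at whether those hosts were healthy until recently.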

This system belonging to Tombi, for example, is failing lots of WUs.
It looks like runaway failures on some systems. I would suggest a cold boot: shut down, remove the power cable for a minute, then reconnect and start up.

You might also want to try reinstalling the driver, checking the firewall, running a disk check, checking that the fan-control software is still working, checking the GPU temps and that the fans are still turning, and resetting the project.
My impression is that this problem impacts some systems but not others.
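
For the temperature and fan part of that checklist, a small Python sketch along these lines might help. It assumes the installed NVIDIA driver ships an nvidia-smi that supports the `--query-gpu` option with the `temperature.gpu` and `fan.speed` fields; older drivers or some GeForce cards may not report them, in which case a GUI monitoring tool is the easier route. The helper names are made up for this example.

```python
import subprocess

def parse_gpu_status(csv_text):
    """Parse 'temperature, fan' lines from nvidia-smi CSV output.

    Fields that are not plain integers (for example '[Not Supported]')
    are returned as None.
    """
    readings = []
    for line in csv_text.strip().splitlines():
        temp, fan = [field.strip() for field in line.split(",")]
        readings.append((
            int(temp) if temp.isdigit() else None,
            int(fan) if fan.isdigit() else None,
        ))
    return readings

def read_gpu_status():
    """Ask nvidia-smi for temperature (C) and fan speed (%) of each GPU."""
    out = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=temperature.gpu,fan.speed",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return parse_gpu_status(out)
```

Calling `read_gpu_status()` a few times while a task is running gives a quick idea of whether the card is running hot or the fan has stopped responding.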
