far
Joined: 5 Jan 09 Posts: 32 Credit: 1,412,042,305 RAC: 0
|
Hi
I don't switch to the different machines I have doing grid computing particularly often,
and I just noticed an error message for a task that had failed: it had put up a little window that needed OK to be clicked.
In BOINC Manager I could see the GPUGrid task had been running for 3 days, waiting for someone to click OK ("it's failed, let's move on"). As soon as I clicked OK, the task status changed to computation error and the next task started.
Is there any way this can be avoided? I assume I just lost 3 days of GPU processing, and hopefully not 3 days of electricity too.
Thanks,
Far |
|
|
|
I had the same issue when I was using the NVIDIA 295 drivers. CUDA would crash when the monitor went to sleep and a new task was started. Either downgrade or upgrade the drivers.
____________
XtremeSystems.org - #1 Team in GPUGrid |
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0
|
Far, I had the same issue recently. I expect this happened on one of your XP systems?
Anyway, a restart is in order. Also, change the monitor to never turn itself off, and just turn it off manually.
____________
FAQ's
HOW TO:
- Opt out of Beta Tests
- Ask for Help |
|
|
|
Recently I was generating quite a few errors here (my fault), and this never happened on either of my 2 hosts. So it's definitely some special case on your side. I don't know what causes it, though. Trying a newer driver is a good idea. And maybe you recently installed some GPU programming tools that activated a debug mode, which causes this message?
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
|
There is a way to restart your PC whenever an error message pops up. I wrote a little batch program when I had similar error messages.
Here it is.
You have to modify (or add) the filenames of the GPUGrid applications in the first two lines, like this. |
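The batch program itself is not reproduced in this thread, so the following is a separate, minimal Python sketch of the same general idea, not the original script. It asks the Windows `tasklist` command for processes reported as "not responding" and checks whether any watched application is among them. The application file name is a hypothetical placeholder, and whether a task stuck behind an error dialog actually shows up as "not responding" is an unverified assumption. The sketch only reports; it does not restart anything.

```python
import csv
import io
import subprocess

# Hypothetical placeholder; substitute the GPUGrid application file
# names found in your own BOINC project directory.
WATCHED_APPS = ["gpugrid_app_example.exe"]


def parse_tasklist_csv(output):
    """Return lower-cased image names from `tasklist /FO CSV /NH` output."""
    names = set()
    for row in csv.reader(io.StringIO(output)):
        if row:  # skip blank lines
            names.add(row[0].lower())
    return names


def hung_apps(not_responding_output, watched=WATCHED_APPS):
    """Return the watched apps that appear in the 'not responding' listing."""
    listed = parse_tasklist_csv(not_responding_output)
    return sorted(app.lower() for app in watched if app.lower() in listed)


def query_not_responding():
    """Ask Windows which processes are not responding (Windows only)."""
    return subprocess.check_output(
        ["tasklist", "/FI", "STATUS eq NOT RESPONDING", "/FO", "CSV", "/NH"],
        universal_newlines=True,
    )


if __name__ == "__main__":
    stuck = hung_apps(query_not_responding())
    if stuck:
        print("Possibly stuck:", ", ".join(stuck))
    else:
        print("No watched application is reported as not responding.")
```

Run from the Windows Task Scheduler every few minutes, the final `print` could be swapped for whatever action suits the machine, for example a restart where that is acceptable, or just a log entry where it is not.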
|
|
far
|
Thanks for all the suggestions guys. I'm running the latest (non-beta) drivers 306.81, on XP. There are no GPU programming tools installed.
The monitor is already set to never sleep (as is anything else under the energy profile).
I can't reboot the machine, as it has other processes running which require a pw/uid logon, and I can't store that info in a file anywhere.
It sounds like the simplest option is just for me to try to remember to connect to the machines and check up on them more often. I'm going to have a look at the eFMer BOINC app at some point when I can find time, in case I can spot from my phone that something is taking an unduly long time and go check it out. |
|
|
skgiven (Volunteer moderator, Volunteer tester)
|
Use auto-logon, and if you want some app to run at startup, put it in the Startup folder using a "run as" batch file.
I guess you could also use Zoltan's script and set up an administrator alert by email, or disable and re-enable the card/driver, but exactly how to get the alert going could be tricky and it's a fair bit of work.
|
|
|
Instead of checking each machine manually you could look at your hosts in GPU-Grid, under your account, and see when they last contacted the server.
MrS
|
|
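The "last contacted the server" check suggested above can be roughed out in a few lines of Python. This is only a sketch: the host names, times, and the 24-hour threshold are hypothetical, and the last-contact times are assumed to be copied by hand from the hosts page under your account rather than fetched automatically.

```python
from datetime import datetime, timedelta


def stale_hosts(last_contact, now, max_silence=timedelta(hours=24)):
    """Return host names whose last server contact is older than max_silence."""
    return sorted(
        name for name, seen in last_contact.items()
        if now - seen > max_silence
    )


if __name__ == "__main__":
    # Hypothetical host names and times, entered by hand for illustration.
    last_contact = {
        "xp-box-1": datetime(2012, 11, 1, 8, 30),
        "xp-box-2": datetime(2012, 10, 29, 22, 15),
    }
    print(stale_hosts(last_contact, now=datetime(2012, 11, 1, 12, 0)))
```

A host that has been silent for longer than one of its tasks normally takes is a candidate for a stuck task, so the threshold is best set a little above the usual task run time on that machine.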