Author |
Message |
Andrew
Joined: 9 Dec 08 Posts: 29 Credit: 18,754,468 RAC: 0 Level
Scientific publications
|
Hey, just thought I'd let the developers know,
I was running Folding@home GPU and, while that was still going, resumed my GPUGRID WU. The GPUGRID WU then immediately hit a computation error, after 12 hours of work. Oops.
Perhaps this is a rare error, but perhaps there is some mishandling of CUDA exceptions, or it could just be the drivers.
FYI, I did once start Folding@home GPU while GPUGRID was going, which didn't result in errors (however, I think F@H stole all the GPU rather than sharing!)
I only tried because I was curious, btw. I wouldn't recommend running things in parallel, as I suspect it would lead to lots of cache misses in the GPU's caches. |
|
|
|
Not only cache misses but "CUDA" misses as well.
Since the GPU can only do one task at a time, it will constantly be switching tasks, which is a very slow operation on the GPU.
I didn't even bother testing it here at GPUGRID, but at SETI the tasks took 20x longer while running F@H, and there was also a 50% PPD reduction on F@H.
So unless someone wants things to run over 20x slower... don't run both at the same time.
Bob |
|
|
Andrew
|
Haha, looks like it wasn't just me who was curious :P
Still shouldn't have had the error I think, but you're right, not a good idea! I was wondering since Vista virtualizes the GPU as a shared resource.
In fact, I started off with GPUGRID on BOINC, but back then the timestep was too big and the desktop (Aero) was kind of unusable (btw, switching to the software desktop seemed to be more jittery). So I switched to F@H for a while, before trying BOINC again. I tend to do more GPUGRID now 'cos it uses the GPU less efficiently, so I can run my 8800GT overclocked while the fan is still quiet. |
|
|
|
Was it this WU? It's an "out of memory" error, which happens when some app (e.g. a game) occupies so much GPU memory that there's not enough left for GPUGRID. It's not a nice way to error out in such cases, but it's a known problem. If it's indeed this WU (you also had 2 other errors, which may be caused by the OC), then it looks like F@H reserves quite a lot of GPU memory, as GPUGRID itself doesn't need that much (~70 MB with old WUs).
MrS
____________
Scanning for our furry friends since Jan 2002 |
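To illustrate the out-of-memory explanation above, here is a minimal sketch of the arithmetic involved. It is not GPUGRID's actual code: the function name and the amounts held by other applications are hypothetical, and only the 512 MB card size and the ~70 MB GPUGRID figure come from this thread. A real application would ask the driver for the free amount (for example via the CUDA runtime's `cudaMemGetInfo`) rather than compute it like this.

```python
def fits_in_gpu_memory(total_mb, used_by_others_mb, needed_mb):
    """Return True if `needed_mb` still fits in what other apps left free.

    Hypothetical helper for illustration only; it models the budget,
    it does not query any real GPU.
    """
    free_mb = total_mb - used_by_others_mb
    return needed_mb <= free_mb


TOTAL_MB = 512    # card size mentioned in the thread
GPUGRID_MB = 70   # approximate GPUGRID requirement quoted above (old WUs)

# Assumed amounts held by another app (e.g. F@H); not measured values.
for other_mb in (100, 460):
    ok = fits_in_gpu_memory(TOTAL_MB, other_mb, GPUGRID_MB)
    print(f"other apps hold {other_mb} MB -> GPUGRID fits: {ok}")
```

With these assumed numbers, the second case leaves only 52 MB free, so a 70 MB allocation would fail, which is the kind of situation that would show up as a computation error.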
|
|
Andrew
|
OK, it must have been out of memory, I guess (the card does have 512 MB, but I can't monitor GPU memory usage in Vista).
Are these the other two you think may have been OC errors?
http://www.gpugrid.net/result.php?resultid=572872
http://www.gpugrid.net/result.php?resultid=588056
However, while the second one is overclocked by 20%, the first one is actually underclocked (62% of stock). So I'm not sure about OC error.
Quite high claimed credit for some of the aborted ones; not sure why that would have happened, as I'm sure I wouldn't cancel nearly-finished WUs!
Whatever though, although seeing a line pointing up is nice, it's really for the science, so I'll monitor to check it's working fine in the future. |
|
|
|
Didn't check individually, but you only have 2 other errors. And you're right, 920 MHz is not exactly an excessive speed for your card ;) It was only speculation anyway, so as long as you don't get more errors, never mind these two.
And the credits per WU are fixed, so they don't depend on crunching time; that's why the claims appear high for the errors.
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
Andrew
|
Cheers |
|
|