zioriga
Joined: 30 Oct 08 Posts: 46 Credit: 494,132,425 RAC: 3,861,070
|
I suspended the GPUGRID project for a while and, when I resumed it, the WU crashed with a compute error:
12/20/08 06:59:58|GPUGRID|Restarting task kq21298-SH2_US_1-4-40-SH2_US_1510000_1 using acemd version 655
12/20/08 07:00:01|GPUGRID|Computation for task kq21298-SH2_US_1-4-40-SH2_US_1510000_1 finished
12/20/08 07:00:01|GPUGRID|Output file kq21298-SH2_US_1-4-40-SH2_US_1510000_1_1 for task kq21298-SH2_US_1-4-40-SH2_US_1510000_1 absent
12/20/08 07:00:01|GPUGRID|Output file kq21298-SH2_US_1-4-40-SH2_US_1510000_1_2 for task kq21298-SH2_US_1-4-40-SH2_US_1510000_1 absent
12/20/08 07:00:01|GPUGRID|Output file kq21298-SH2_US_1-4-40-SH2_US_1510000_1_3 for task kq21298-SH2_US_1-4-40-SH2_US_1510000_1 absent
12/20/08 07:00:03|GPUGRID|Started upload of kq21298-SH2_US_1-4-40-SH2_US_1510000_1_0
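The "Output file ... absent" messages above are what the client logs when a task ends without writing its result files. As a rough sketch only, here is one way to pull the affected task names out of a saved message log in Python. It assumes nothing beyond the `timestamp|project|message` layout visible in this thread, and `failed_tasks` is a made-up helper name, not part of BOINC.

```python
import re

# Message text as it appears in the logs posted in this thread:
# "Output file <file> for task <task> absent"
ABSENT = re.compile(r"Output file (\S+) for task (\S+) absent")


def failed_tasks(log_text):
    """Map each task name to the output files the client reported as absent."""
    missing = {}
    for line in log_text.splitlines():
        # Assumed layout: timestamp|project|message (timestamp format varies by locale)
        parts = line.strip().split("|", 2)
        if len(parts) != 3:
            continue
        match = ABSENT.search(parts[2])
        if match:
            filename, task = match.groups()
            missing.setdefault(task, []).append(filename)
    return missing
```

Splitting on the first two pipes means the differing date formats in the logs (US, UK, German) do not matter, since the timestamp is never parsed.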
|
|
|
K1atOdessa
Joined: 25 Feb 08 Posts: 249 Credit: 387,028,788 RAC: 1,197,795
|
I noticed that your video card changed during the most recent failed WU, Task 167292. It switches from being identified as an 8600GT to a 9800GT. Did you switch video cards in the middle of the WU? I can't be sure, but that seems like a good reason the WU might have crapped out on you.
# Device 0: "GeForce 8600 GT"
# Clock rate: 1188000 kilohertz
# Number of multiprocessors: 4
# Number of cores: 32
# Using CUDA device 0
# Device 0: "GeForce 9800 GT"
# Clock rate: 1512000 kilohertz
# Number of multiprocessors: 14
# Number of cores: 112 |
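For anyone who wants to check their own task output for the same thing, here is a minimal Python sketch. It assumes the `# Device N: "name"` line format shown in the output above, and the function names are only illustrative.

```python
import re

# Matches lines such as:  # Device 0: "GeForce 8600 GT"
DEVICE_LINE = re.compile(r'#\s*Device\s+(\d+):\s*"([^"]+)"')


def devices_seen(task_output):
    """Return {device index: [distinct names, in order of first appearance]}."""
    seen = {}
    for index, name in DEVICE_LINE.findall(task_output):
        names = seen.setdefault(int(index), [])
        if name not in names:
            names.append(name)
    return seen


def gpu_changed(task_output):
    """True if any device index was reported under more than one name."""
    return any(len(names) > 1 for names in devices_seen(task_output).values())
```

Run over the output quoted above, `gpu_changed` would return True, because device 0 is reported first as a GeForce 8600 GT and later as a GeForce 9800 GT.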
|
|
zioriga
|
K1atOdessa, you're right.
I switched from the 8600GT to the 9800GT in the middle of a WU.
I promise I'll never do it again!!!! |
|
|
K1atOdessa
|
LOL. It's always good to set BOINC to "No new tasks" and finish everything up before adding or changing a video card. Takes another variable out of the equation. |
|
|
|
Just got an error in this WU: 174101
22/12/2008 11:24:05 PM|GPUGRID|Computation for task WM20233-SH2_US_1-7-40-SH2_US_1210000_0 finished
22/12/2008 11:24:05 PM|GPUGRID|Output file WM20233-SH2_US_1-7-40-SH2_US_1210000_0_1 for task WM20233-SH2_US_1-7-40-SH2_US_1210000_0 absent
22/12/2008 11:24:05 PM|GPUGRID|Output file WM20233-SH2_US_1-7-40-SH2_US_1210000_0_2 for task WM20233-SH2_US_1-7-40-SH2_US_1210000_0 absent
22/12/2008 11:24:05 PM|GPUGRID|Output file WM20233-SH2_US_1-7-40-SH2_US_1210000_0_3 for task WM20233-SH2_US_1-7-40-SH2_US_1210000_0 absent
It caused a BSOD in dxgkrnl. I have the memory dump file, but it's 568 MB.
Pat |
|
|
|
Now I have this problem on Host 9545, a C2Q Q9450 with GeForce 9800GT running Win XP x64. I did not change anything, not even a reboot, but all WUs have crashed since this morning. Reset did not help. :(
26.12.2008 13:41:58|GPUGRID|Starting task CZG9573-SH2_US-7-40-SH2_US200000_1 using acemd version 655
26.12.2008 13:41:59|GPUGRID|Computation for task CZG9573-SH2_US-7-40-SH2_US200000_1 finished
26.12.2008 13:41:59|GPUGRID|Output file CZG9573-SH2_US-7-40-SH2_US200000_1_1 for task CZG9573-SH2_US-7-40-SH2_US200000_1 absent
26.12.2008 13:41:59|GPUGRID|Output file CZG9573-SH2_US-7-40-SH2_US200000_1_2 for task CZG9573-SH2_US-7-40-SH2_US200000_1 absent
26.12.2008 13:41:59|GPUGRID|Output file CZG9573-SH2_US-7-40-SH2_US200000_1_3 for task CZG9573-SH2_US-7-40-SH2_US200000_1 absent
26.12.2008 13:42:01|GPUGRID|Started upload of CZG9573-SH2_US-7-40-SH2_US200000_1_0
26.12.2008 13:42:03|GPUGRID|Finished upload of CZG9573-SH2_US-7-40-SH2_US200000_1_0 |
|
|
|
The task output says you get the error
"Cuda error in file 'deviceQuery.cu' in line 59 : out of memory."
which means some other application has reserved so much GPU memory that there's not enough left for GPUGRID. That's a common error on 64-bit Windows after a certain runtime, but your situation is different (already tried the reboot).
I'd do two things: shut down, unplug the power cord for more than 10 minutes, and try again. And I'd install the 180.84 driver, if you haven't already, which fixes the memory leak.
MrS
____________
Scanning for our furry friends since Jan 2002 |
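To spot this kind of failure without reading each task's output by hand, something like the following Python sketch could help. The pattern is inferred from the single error line quoted above, so treat the exact wording as an assumption; `cuda_errors` and `ran_out_of_gpu_memory` are hypothetical helper names.

```python
import re

# Format inferred from the one message quoted in this thread:
# Cuda error in file 'deviceQuery.cu' in line 59 : out of memory.
CUDA_ERROR = re.compile(
    r"Cuda error in file '(?P<file>[^']+)' in line (?P<line>\d+) : (?P<msg>.*)"
)


def cuda_errors(task_output):
    """Return a list of (source file, line number, message) tuples."""
    errors = []
    for match in CUDA_ERROR.finditer(task_output):
        # Drop a trailing period or closing quote left over from the log line
        message = match.group("msg").strip().rstrip('."').strip()
        errors.append((match.group("file"), int(match.group("line")), message))
    return errors


def ran_out_of_gpu_memory(task_output):
    """True if any reported CUDA error mentions running out of memory."""
    return any("out of memory" in msg.lower() for _, _, msg in cuda_errors(task_output))
```

If `ran_out_of_gpu_memory` is true for several tasks in a row, that matches the pattern described above, and the driver update is the first thing to try.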
|
|
|
Thank you. It's running again after the driver update, no problems so far. :) |
|
|