Zydor
Joined: 8 Feb 09 Posts: 252 Credit: 1,309,451 RAC: 0
|
The final line - unknown error - probably gives a subtle clue :) However, does anything about this one strike anyone? I had some dramas 2 weeks ago as motherboards melted around me, but that's sorted now and the errant PCs are rebuilt. The GPUGrid WUs are remarkably stable as far as I am concerned - plaudits to the devs - so it's unusual for me to see this. Just curious in case there is something obvious I did to cause it.
Regards
Zy
<core_client_version>6.5.0</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 9800 GTX/9800 GTX+"
# Clock rate: 1915000 kilohertz
# Total amount of global memory: 536870912 bytes
# Number of multiprocessors: 16
# Number of cores: 128
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce 9800 GTX/9800 GTX+"
# Clock rate: 1915000 kilohertz
# Total amount of global memory: 536870912 bytes
# Number of multiprocessors: 16
# Number of cores: 128
Cuda error: Kernel [frc_sum_kernel_bond] failed in file 'force.cu' in line 283 : unknown error.
</stderr_txt>
]]>
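For anyone comparing several failed tasks, here is a minimal Python sketch that pulls the kernel name, source file, and line number out of a stderr dump like the one above. The function names `find_cuda_errors` and `count_device_banners` are made up for illustration and are not part of BOINC or the GPUGrid application. The pattern assumes the failure line always has the shape shown in this log. Reading a repeated device banner as a sign of a restart is also only an assumption.

```python
import re

# Pattern for the failure line as it appears in the stderr output quoted above.
# Assumption: other GPUGrid failures use the same wording and punctuation.
CUDA_ERROR_RE = re.compile(
    r"Cuda error: Kernel \[(?P<kernel>[^\]]+)\] failed in file '(?P<file>[^']+)' "
    r"in line (?P<line>\d+) : (?P<reason>.+?)\.?\s*$"
)


def find_cuda_errors(stderr_text):
    """Return one dict per 'Cuda error' line found in the given stderr text."""
    errors = []
    for raw in stderr_text.splitlines():
        match = CUDA_ERROR_RE.search(raw)
        if match:
            info = match.groupdict()
            info["line"] = int(info["line"])  # source line number as an integer
            errors.append(info)
    return errors


def count_device_banners(stderr_text):
    """Count '# Using CUDA device' banners in the text.

    Assumption: more than one banner may mean the application was started
    more than once for the same task; the thread does not confirm this.
    """
    return sum(
        1 for raw in stderr_text.splitlines()
        if raw.startswith("# Using CUDA device")
    )


if __name__ == "__main__":
    sample = (
        "# Using CUDA device 0\n"
        'MDIO ERROR: cannot open file "restart.coor"\n'
        "# Using CUDA device 0\n"
        "Cuda error: Kernel [frc_sum_kernel_bond] failed in file 'force.cu' "
        "in line 283 : unknown error.\n"
    )
    for err in find_cuda_errors(sample):
        print(err["kernel"], err["file"], err["line"], err["reason"])
    print("device banners:", count_device_banners(sample))
```

Running it over the stderr text of a few failed tasks makes it easy to see whether the same kernel and line keep failing or whether the failures are scattered.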
|
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
Do you have another client like S@H on CUDA running as well?
Because when I stopped the simultaneous run with the other CUDA client, it seems to run OK again now.
But time will tell; I am on my second unit. |
|
|
Zydor
Joined: 8 Feb 09 Posts: 252 Credit: 1,309,451 RAC: 0
|
I do run S@H CUDA, but not at the same time. I split the time the card is used between SETI & GPUGrid; at present it's about 50% to each. I will crunch a GPUGrid WU, then when it's done, run SETI CUDA for 12 hours or so. From time to time I might end up suspending the GPUGrid WU for a couple of hours or so (set to unload from memory rather than stay held in memory while suspended), for many reasons, but that does not seem to have caused an issue in the past.
The hassle a couple of weeks ago in the crunching - self-evident from my task list - was due to multiple hardware failures on two PCs, ending up rebuilding both - new motherboards, CPUs, cards etc - just Murphy's Law hitting all at once. The GPUGrid WU is - from where I sit - remarkably stable [thank you to the Devs - nice one :) ], and I can safely say I have never had any problem attributable to the WU itself; issues were caused by me or my equipment. That's why I was curious about this one, as all appears OK at this end, and it's so rare to have a WU issue that I thought there may be something obvious I have missed.
At the end of the day, the world has not ended, just a natural follow up in case there is conventional wisdom I need to be aware of, and have missed.
Regards
Zy |
|
|
|
At the end of the day, the world has not ended, just a natural follow up in case there is conventional wisdom I need to be aware of, and have missed.
There may be some wisdom to be found here, but I don't think it's common ;)
MrS
____________
Scanning for our furry friends since Jan 2002 |
|
|
Bymark
Joined: 23 Feb 09 Posts: 30 Credit: 5,897,921 RAC: 0
|
I have almost the same error and not a clue what it's about:
Regards Thomas
-----------------------------------------------------------
Outcome Client error
Client state Compute error
Exit status 1 (0x1)
Computer ID 28987
Report deadline 16 Mar 2009 10:18:06 UTC
CPU time 492.5469
stderr out <core_client_version>6.4.5</core_client_version>
<![CDATA[
<message>
Incorrect function. (0x1) - exit code 1 (0x1)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce GTX 260"
# Clock rate: 1242000 kilohertz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 27
# Number of cores: 216
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce GTX 260"
# Clock rate: 1242000 kilohertz
# Total amount of global memory: 939196416 bytes
# Number of multiprocessors: 27
# Number of cores: 216
Cuda error: Kernel [frc_sum_nb_forces] failed in file 'force.cu' in line 244 : unknown error.
</stderr_txt>
]]>
Validate state Invalid
Claimed credit 2478.98611111111
Granted credit 0
application version 6.62
____________
"Silakka"
Hello from Turku > Åbo. |
|
|
Zydor
Joined: 8 Feb 09 Posts: 252 Credit: 1,309,451 RAC: 0
|
I've had a couple of BSODs lately; there is a driver clash somewhere, so I came to the (guessing) conclusion it was probably related to the BSODs, and put it down to Life's Sweet Pattern :) If it happens again in a short space of time, I'll "perk up" and dig a little.
Regards
Zy |
|
|
Bymark
Joined: 23 Feb 09 Posts: 30 Credit: 5,897,921 RAC: 0
|
My error was of the remote desktop type, and now everything is working fine.
Never use Windows Remote Desktop with GPU projects!
Regards
Silakka
|
|
|
uBronan
Joined: 1 Feb 09 Posts: 139 Credit: 575,023 RAC: 0
|
lol agreed, I learned that very soon after running SETI:
it crashed several times when doing RDC :D
|
|
|