Message boards : Graphics cards (GPUs) : Why did this result fail?

Author Message
Profile UBT - NaRyan
Joined: 16 Jul 08
Posts: 68
Credit: 1,242,980
RAC: 0
Message 2317 - Posted: 14 Sep 2008 | 9:58:22 UTC

According to BOINC, this result completed and uploaded fine.

Yet when I look at the task page for the workunit, it is reported as failed with a compute error.

BOINC says:


14/09/2008 08:07:03|PS3GRID|Computation for task PUi5887-GPUTEST3-1-10-acemd_0 finished
14/09/2008 08:07:05|PS3GRID|Started upload of PUi5887-GPUTEST3-1-10-acemd_0_1
14/09/2008 08:07:05|PS3GRID|Started upload of PUi5887-GPUTEST3-1-10-acemd_0_2
14/09/2008 08:07:25|PS3GRID|Finished upload of PUi5887-GPUTEST3-1-10-acemd_0_2
14/09/2008 08:07:25|PS3GRID|Started upload of PUi5887-GPUTEST3-1-10-acemd_0_3
14/09/2008 08:07:26|PS3GRID|Finished upload of PUi5887-GPUTEST3-1-10-acemd_0_1
14/09/2008 08:07:28|PS3GRID|Finished upload of PUi5887-GPUTEST3-1-10-acemd_0_3
14/09/2008 10:34:04|PS3GRID|Sending scheduler request: Requested by user. Requesting 0 seconds of work, reporting 1 completed tasks


Yet the task page shows it as failed and displays:




<core_client_version>6.3.10</core_client_version>
<![CDATA[
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1756000 kilohertz
MDIO ERROR: cannot open file "restart.coor"
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1756000 kilohertz
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1756000 kilohertz
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1756000 kilohertz
# Using CUDA
</stderr_txt>
<message>
<file_xfer_error>
<file_name>PUi5887-GPUTEST3-1-10-acemd_0_0</file_name>
<error_code>-108</error_code>
</file_xfer_error>

</message>
]]>


Is there any reason why BOINC says it was OK, yet the server says it was faulty?
Is it due to that file transfer error?

____________

Down with the Kredit Kops!!!

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 2318 - Posted: 14 Sep 2008 | 10:07:00 UTC - in response to Message 2317.


The server reports a file open error.
gdf
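
For anyone comparing the two outputs by hand, here is a minimal Python sketch that pulls the failing file name and error code out of the task output and checks it against the uploads the client log reports as finished. The function names (find_transfer_errors, finished_uploads) are made up for illustration and are not part of BOINC. The input text is abridged from the first post, and the regular expressions assume the layout shown there.

```python
import re

# Task output as shown on the task page (abridged from the first post).
TASK_OUTPUT = """<core_client_version>6.3.10</core_client_version>
<![CDATA[
<stderr_txt>
# Using CUDA device 0
MDIO ERROR: cannot open file "restart.coor"
</stderr_txt>
<message>
<file_xfer_error>
<file_name>PUi5887-GPUTEST3-1-10-acemd_0_0</file_name>
<error_code>-108</error_code>
</file_xfer_error>

</message>
]]>
"""

# Client log lines copied from the first post.
CLIENT_LOG = """\
14/09/2008 08:07:25|PS3GRID|Finished upload of PUi5887-GPUTEST3-1-10-acemd_0_2
14/09/2008 08:07:26|PS3GRID|Finished upload of PUi5887-GPUTEST3-1-10-acemd_0_1
14/09/2008 08:07:28|PS3GRID|Finished upload of PUi5887-GPUTEST3-1-10-acemd_0_3
"""

# The output is not a single well-formed XML document, so a regex is used
# instead of an XML parser.
XFER_ERROR = re.compile(
    r"<file_xfer_error>\s*"
    r"<file_name>(?P<name>[^<]+)</file_name>\s*"
    r"<error_code>(?P<code>-?\d+)</error_code>"
)
FINISHED = re.compile(r"Finished upload of (\S+)")


def find_transfer_errors(text):
    """Return (file_name, error_code) for each <file_xfer_error> block."""
    return [(m.group("name"), int(m.group("code")))
            for m in XFER_ERROR.finditer(text)]


def finished_uploads(log_text):
    """Return the set of file names the client log reports as uploaded."""
    return set(FINISHED.findall(log_text))


if __name__ == "__main__":
    uploaded = finished_uploads(CLIENT_LOG)
    for name, code in find_transfer_errors(TASK_OUTPUT):
        status = "was uploaded" if name in uploaded else "never appears in the upload log"
        print(f"{name}: error {code}, {status}")
```

With the data from this thread, the file named in the error (ending in _0_0) is not among the uploads the client logged (_0_1, _0_2, _0_3), which fits a file that could not be opened on the client side.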

DeleteNull
Joined: 28 Aug 08
Posts: 10
Credit: 142,385,295
RAC: 0
Message 2322 - Posted: 14 Sep 2008 | 11:08:34 UTC - in response to Message 2318.




Hi, I am using BOINC 6.3.10 on openSUSE 11.0.

It seems that:
1. For every WU, BOINC downloads the application (6.44) again. Why?
2. The file acemd_6.44_x86_64-pc-linux-gnu__cuda is not (!!) executable (rw-r--r--), whereas 6.42 was, so the system reports "a file open error".
3. After I set it executable (chmod +x), it is changed back to rw-r--r-- with the next WU (and gets a new timestamp), so that WU fails again (see 1).
4. Changing the ownership of this file to "root" doesn't help! With the next WU the file is changed again. I don't know how this is possible, since BOINC and its sub-processes run as a non-root user. Perhaps this is a bug in openSUSE 11.0, BOINC, or the PS3GRID application.
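
To make point 2 easy to check on other machines, here is a small Python sketch that reports a file's permission string and adds the execute bits if they are missing, roughly what chmod +x does when the umask allows it. The helper names are hypothetical, and the path you pass in would be wherever your BOINC installation keeps the acemd binary, which is not stated here. As point 3 says, this is only a check and a temporary workaround: the bits are lost again when the file is re-downloaded.

```python
import os
import stat


def mode_string(path):
    """Return the ls-style permission string, e.g. '-rw-r--r--'."""
    return stat.filemode(os.stat(path).st_mode)


def is_executable(path):
    """True if the owner execute bit is set on the file."""
    return bool(os.stat(path).st_mode & stat.S_IXUSR)


def ensure_executable(path):
    """Add execute bits for user, group and other if the owner bit is missing.

    Returns True if the mode was changed, False if it was already executable.
    """
    mode = os.stat(path).st_mode
    if mode & stat.S_IXUSR:
        return False
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return True


if __name__ == "__main__":
    import sys

    # Usage: python check_exec.py /path/to/acemd_binary
    for target in sys.argv[1:]:
        before = mode_string(target)
        changed = ensure_executable(target)
        print(f"{target}: {before} -> {mode_string(target)} (changed: {changed})")
```

Running it before and after the next WU arrives would show whether the mode and timestamp really are reset by the download.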

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 2323 - Posted: 14 Sep 2008 | 11:18:51 UTC

A good report, but IMO it belongs in the thread for the 6.44 app.

MrS
____________
Scanning for our furry friends since Jan 2002
