
Message boards : Graphics cards (GPUs) : Linux 64 application 6.38 for cuda2.0

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 1664 - Posted: 23 Aug 2008 | 22:24:13 UTC

I have tested with BOINC 6.3.8, driver 177.67, a GeForce 8800 GT, and Fedora 8.

gdf

GDF
Message 1666 - Posted: 23 Aug 2008 | 23:48:09 UTC - in response to Message 1664.
Last modified: 24 Aug 2008 | 0:02:49 UTC


It is likely that people with older drivers will have to update.

gdf

UBT - NaRyan
Joined: 16 Jul 08
Posts: 68
Credit: 1,242,980
RAC: 0
Message 1670 - Posted: 24 Aug 2008 | 4:50:58 UTC - in response to Message 1666.

It's still constantly crashing on the system with the 8800GTX

All crashed with the same error:

<core_client_version>6.3.8</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GTX"
# Clock rate: 1350000 kilohertz
Cuda error: Kernel [copy_mul] failed in file 'com.cu' in line 43 : invalid device function .

</stderr_txt>
]]>


Using Ubuntu x64 8.04, Boinc v6.3.8 and the Nvidia 177.67 drivers.

____________

Down with the Kredit Kops!!!

GDF
Message 1674 - Posted: 24 Aug 2008 | 10:24:45 UTC - in response to Message 1670.


I am sorry, but your card has compute capability 1.0 instead of 1.1, so it is not supported by the current application. Note that 1.1 is not the version of the CUDA toolkit but the version of your GPU (in practice, a 1.0 GPU does not have atomic operations).

I will add better error handling in the next version. The previous code to handle 1.0 does not work for us under CUDA2.

gdf


Stefan Ledwina
Joined: 16 Jul 07
Posts: 464
Credit: 221,007,857
RAC: 4,333,521
Message 1675 - Posted: 24 Aug 2008 | 10:38:03 UTC
Last modified: 24 Aug 2008 | 10:39:16 UTC

Which means that once I have crunched all 6.37 WUs, I can throw away my 8800 GT, which was working pretty well until now and which I bought for the initial GPU tests, because CUDA 2.0 doesn't support my card?
Great...
____________

pixelicious.at - my little photoblog

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 1676 - Posted: 24 Aug 2008 | 10:59:18 UTC - in response to Message 1674.

The previous code to handle 1.0 does not work for us under CUDA2.


Do you mean code to handle the errors or alternative code to do the calculations? I'd suspect the latter and thus G80 based cards would still be able to contribute after the fix is in place.

MrS
____________
Scanning for our furry friends since Jan 2002

[XTBA>XTC] ZeuZ
Joined: 15 Jul 08
Posts: 60
Credit: 108,384
RAC: 0
Message 1677 - Posted: 24 Aug 2008 | 10:59:37 UTC

The 8800GT and 8800GTS v2 (512 MB) support compute capability 1.1, so they are compatible with CUDA 2.0, as are the GF9 series and the GTX260/280.

However, the 8400/8600, 8800GTS 320/640, and 8800GTX/Ultra are not compatible, I think.
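For readers who want to check a card against the 1.1 requirement GDF describes, here is a minimal Python sketch. The table and the names `COMPUTE_CAPABILITY`, `MIN_REQUIRED` and `is_supported` are illustrative assumptions, not part of the GPUGRID application. The capability values are the ones commonly listed in NVIDIA's CUDA documentation for these cards. The 8400/8600 are left out because the post above is itself unsure about them.

```python
# Compute capability (major, minor) for some cards named in this thread.
# Assumption: values as commonly listed in NVIDIA's CUDA documentation.
COMPUTE_CAPABILITY = {
    "GeForce 8800 GTX": (1, 0),
    "GeForce 8800 Ultra": (1, 0),
    "GeForce 8800 GTS 320": (1, 0),
    "GeForce 8800 GTS 640": (1, 0),
    "GeForce 8800 GT": (1, 1),
    "GeForce 8800 GTS 512": (1, 1),
    "GeForce 9800 GT": (1, 1),
    "GeForce GTX 260": (1, 3),
    "GeForce GTX 280": (1, 3),
}

# Per GDF's post: the 6.38 application needs compute capability 1.1.
MIN_REQUIRED = (1, 1)


def is_supported(card_name):
    """Return True or False for known cards, None if the card is not in the table."""
    capability = COMPUTE_CAPABILITY.get(card_name)
    if capability is None:
        return None
    # Tuples compare element by element, so (1, 0) < (1, 1) < (1, 3).
    return capability >= MIN_REQUIRED


if __name__ == "__main__":
    for name in ("GeForce 8800 GTX", "GeForce 8800 GT", "GeForce GTX 260"):
        print(name, "->", is_supported(name))
```

A card missing from the table returns None rather than a guess, so an unknown model is never reported as supported.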

koschi
Joined: 14 Aug 08
Posts: 124
Credit: 792,979,198
RAC: 17,226
Message 1679 - Posted: 24 Aug 2008 | 11:16:32 UTC

Where can I find such information? I tried on google, but maybe with the wrong key words...

Now with 6.38, using driver 177.68 on a 9800GT, it seems the unit will take 60,400 seconds. Compared to the other recent apps, that is an improvement after all:

6.25 -> 48,000 s
6.34 -> 74,500 s
6.37 -> 64,500 s


But unfortunately not back to the 6.25 level...
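To put numbers on that comparison, here is a small Python sketch that expresses each runtime relative to the 6.25 baseline. It assumes the figures above are whole seconds (the dot read as a thousands separator). The names `runtimes_s` and `slowdown_percent` are made up for this example.

```python
# Runtimes per work unit in seconds, as reported in the post above.
# Assumption: "48.000s" means 48,000 seconds.
runtimes_s = {
    "6.25": 48_000,
    "6.34": 74_500,
    "6.37": 64_500,
    "6.38": 60_400,
}

BASELINE_VERSION = "6.25"


def slowdown_percent(version):
    """Percentage by which `version` is slower than the 6.25 baseline."""
    baseline = runtimes_s[BASELINE_VERSION]
    return (runtimes_s[version] / baseline - 1) * 100


if __name__ == "__main__":
    for version, seconds in runtimes_s.items():
        print(f"{version}: {seconds} s ({slowdown_percent(version):+.1f}% vs 6.25)")
```

On these figures 6.38 is still roughly a quarter slower than 6.25, while being about 6% faster than 6.37.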

Nightlord
Joined: 22 Jul 08
Posts: 61
Credit: 5,461,041
RAC: 0
Message 1681 - Posted: 24 Aug 2008 | 15:43:15 UTC
Last modified: 24 Aug 2008 | 15:45:13 UTC

It seems an upgrade to the Linux drivers is necessary for the 6.38 application.

On this host all WU run with 6.38 app and 173.14 driver failed in the first two seconds with the following error...

<core_client_version>6.3.8</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1458000 kilohertz
Cuda error in file '../force/vdw_texture.cu' in line 127 : feature is not yet implemented.

</stderr_txt>
]]>

After a manual upgrade to the 177.67 driver, the WU runs OK. Note that Envy does not load the 177.xx driver; a manual installation is necessary.

System Spec:
Dual core E4300 overclocked @2.4GHz
8800GT (I am unable to change clocks on the card using nvclock when running under 177.67 driver)
Ubuntu 8.04 x64
Boinc 6.3.8
Application switch 6.37->6.38



/edit
the WU failed after 18minutes with the following....

<core_client_version>6.3.8</core_client_version>
<![CDATA[
<message>
process exited with code 1 (0x1, -255)
</message>
<stderr_txt>
# Using CUDA device 0
# Device 0: "GeForce 8800 GT"
# Clock rate: 1728000 kilohertz
MDIO ERROR: cannot open file "restart.coor"
Cuda error: Kernel [frc_sum_kernel_dihed] failed in file 'force.cu' in line 516 : unspecified launch failure.

</stderr_txt>
]]>

This seems unrelated to the earlier error. Maybe the increase in clocks on the card makes it unstable? Does anyone know how to reduce the clocks under the 177.67 drivers?
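Three different CUDA errors have appeared in the thread so far. The following Python sketch pulls the kernel name, file, line and reason out of such a stderr line and attaches the explanation offered in this thread. The names `ERROR_RE`, `HINTS` and `diagnose` are hypothetical, and the hints summarise the posts here rather than any official NVIDIA diagnosis.

```python
import re

# Matches both forms seen in the thread:
#   Cuda error: Kernel [name] failed in file 'x.cu' in line N : reason.
#   Cuda error in file 'x.cu' in line N : reason.
ERROR_RE = re.compile(
    r"Cuda error(?:: Kernel \[(?P<kernel>\w+)\] failed)?"
    r" in file '(?P<file>[^']+)' in line (?P<line>\d+) : (?P<reason>.*)"
)

# Explanations as given by posters in this thread (not authoritative).
HINTS = {
    "invalid device function":
        "card has compute capability 1.0; the 6.38 app needs 1.1",
    "feature is not yet implemented":
        "driver too old; upgrading from 173.14 to 177.67 cleared it here",
    "unspecified launch failure":
        "cause not established; an unstable GPU clock is suspected",
}


def diagnose(stderr_line):
    """Parse one stderr line; return a dict of fields, or None if it is not a CUDA error."""
    match = ERROR_RE.search(stderr_line)
    if match is None:
        return None
    # Drop surrounding whitespace and the trailing full stop of the message.
    reason = match.group("reason").strip().rstrip(".").strip()
    return {
        "kernel": match.group("kernel"),  # None when no kernel is named
        "file": match.group("file"),
        "line": int(match.group("line")),
        "reason": reason,
        "hint": HINTS.get(reason, "no explanation offered in this thread"),
    }


if __name__ == "__main__":
    sample = ("Cuda error: Kernel [copy_mul] failed in file 'com.cu' "
              "in line 43 : invalid device function .")
    print(diagnose(sample))
```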
____________

Stefan Ledwina
Message 1682 - Posted: 24 Aug 2008 | 15:46:18 UTC - in response to Message 1677.

The 8800GT and 8800GTS v2 (512 MB) support compute capability 1.1, so they are compatible with CUDA 2.0, as are the GF9 series and the GTX260/280.

However, the 8400/8600, 8800GTS 320/640, and 8800GTX/Ultra are not compatible, I think.


Phew, seems I was lucky...
I upgraded to the latest 177.67 drivers on the computer with the 8800GT, and it is crunching the first WU with 6.38 - seems to work fine as far as I can tell now...
____________

pixelicious.at - my little photoblog

ExtraTerrestrial Apes
Message 1683 - Posted: 24 Aug 2008 | 16:39:01 UTC - in response to Message 1681.

# Device 0: "GeForce 8800 GT"
# Clock rate: 1728000 kilohertz


1.73 GHz is way above the stock clock speed of any 8800 GT, so that might cause instability. Did you clock it that high or did it happen automatically after the software update?
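The comparison can be made explicit with a short Python sketch that reads the clock line from the stderr output and expresses it relative to stock. Two assumptions are labelled in the code: the reported value is taken to be the shader clock, and 1500 MHz is used as the NVIDIA reference shader clock for the 8800 GT. The function names are invented for illustration.

```python
import re

# Matches the line the application writes to stderr, e.g.
#   # Clock rate: 1728000 kilohertz
CLOCK_RE = re.compile(r"#\s*Clock rate:\s*(\d+)\s*kilohertz")

# Assumption: NVIDIA reference shader clock for the 8800 GT, in MHz.
STOCK_8800GT_SHADER_MHZ = 1500


def reported_clock_mhz(stderr_text):
    """Return the reported clock in MHz, or None if no clock line is present."""
    match = CLOCK_RE.search(stderr_text)
    if match is None:
        return None
    return int(match.group(1)) / 1000  # kHz -> MHz


def overclock_percent(clock_mhz, stock_mhz=STOCK_8800GT_SHADER_MHZ):
    """Percentage above (positive) or below (negative) the stock clock."""
    return (clock_mhz / stock_mhz - 1) * 100


if __name__ == "__main__":
    for line in ("# Clock rate: 1458000 kilohertz",
                 "# Clock rate: 1728000 kilohertz"):
        mhz = reported_clock_mhz(line)
        print(f"{mhz:.0f} MHz: {overclock_percent(mhz):+.1f}% vs stock")
```

Under those assumptions the 1728 MHz reading is about 15% above reference, while the earlier 1458 MHz reading is slightly below it.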

MrS
____________
Scanning for our furry friends since Jan 2002

UBT - NaRyan
Message 1684 - Posted: 24 Aug 2008 | 18:58:53 UTC - in response to Message 1683.
Last modified: 24 Aug 2008 | 19:08:51 UTC

My 8800GT runs at 1.7 GHz (shader), but it is a ZOTAC AMP! edition, so it's overclocked by default.
The gpugrid app reports it as "GeForce 8800 GT", clock rate 1674000 kilohertz, even though GPU-Z reports it as 1700 MHz shader.

And it runs gpugrid fine, and it's about 90 minutes faster compared to the stock speed 8800GT that I also have.
____________

Down with the Kredit Kops!!!

Nightlord
Message 1685 - Posted: 24 Aug 2008 | 19:32:45 UTC - in response to Message 1683.

# Device 0: "GeForce 8800 GT"
# Clock rate: 1728000 kilohertz


1.73 GHz is way above the stock clock speed of any 8800 GT, so that might cause instability. Did you clock it that high or did it happen automatically after the software update?

MrS


It's the default setting on that card. It's a BFG 8800GT. Previously, with 173.xx drivers I was able to use nvclock to reduce the GPU and memory clocks: see the first set of results: # Clock rate: 1458000 kilohertz. However, with 177.67 nvclock crashes the machine if I try to adjust the GPU clocks. The Nvidia-Settings panel has adjusters for the clocks, but no "apply" button. I think someone had some xorg.conf settings that may allow me to adjust, but I can't recall the details just now.

That box trashed a further 4 WU earlier this evening at reduced CPU clocks, so it may well be the GPU clock.


____________

Venturini Dario[VENETO]
Joined: 26 Jul 08
Posts: 44
Credit: 4,832,360
RAC: 0
Message 1688 - Posted: 24 Aug 2008 | 22:04:51 UTC - in response to Message 1681.

Note that Envy does not load the 177.xx driver; a manual installation is necessary.


Then I will wait until Envy makes them available.

UBT - NaRyan
Message 1689 - Posted: 24 Aug 2008 | 22:24:11 UTC - in response to Message 1688.

You need to edit the config file (xorg.conf). The section that lists the video card should look like this:

Section "Device"
    Identifier "Videocard0"
    Driver "nvidia"

Just above the EndSection line, add:

    Option "Coolbits" "1"

Save, then either restart X or reboot the system.
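The same edit can be scripted. This is a rough Python sketch, assuming the file follows the usual xorg.conf layout of `Section "Device"` ... `EndSection`. The function `add_coolbits` is a hypothetical helper and works on the file contents as a string, so nothing is written to disk until you choose to.

```python
def add_coolbits(xorg_conf, value="1"):
    """Return xorg.conf text with Option "Coolbits" added to each Device section.

    Device sections that already set Coolbits are left unchanged.
    """
    out = []
    in_device = False
    has_coolbits = False
    for line in xorg_conf.splitlines():
        lowered = line.strip().lower()
        if lowered.startswith("section") and '"device"' in lowered:
            # Entering a Device section: start tracking its options.
            in_device, has_coolbits = True, False
        elif in_device and lowered.startswith("option") and '"coolbits"' in lowered:
            has_coolbits = True
        elif in_device and lowered == "endsection":
            # Insert the option just above EndSection, as the post describes.
            if not has_coolbits:
                out.append(f'    Option "Coolbits" "{value}"')
            in_device = False
        out.append(line)
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    sample = (
        'Section "Device"\n'
        '    Identifier "Videocard0"\n'
        '    Driver "nvidia"\n'
        'EndSection\n'
    )
    print(add_coolbits(sample))
```

Running it twice on the same text gives the same result, so it is safe to re-apply. Keep a backup of the original file before writing any changes.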
____________

Down with the Kredit Kops!!!

Nightlord
Message 1690 - Posted: 24 Aug 2008 | 22:27:12 UTC
Last modified: 24 Aug 2008 | 22:31:16 UTC

Is anyone else encountering these errors?

Cuda error: Kernel [frc_sum_kernel_angle] failed in file 'force.cu' in line 487 : unspecified launch failure.

Since the upgrade to the 177.xx drivers, every WU on this box has failed after 10 to 20 minutes with the above error.

It has been suggested that the card overclock is to blame, but I can't reduce the clock under 177.xx drivers.

With 173.xx drivers I had occasional errors of this type, but my card was clocked at 650/950 as opposed to 700/1000 under 177.xx driver.

Another machine is running 6.38 under 177.13 driver without fault at 700/1000.

So, does anyone have this error? How did you fix it? Has anyone got a suggestion to reduce the clock under 177.xx without crashing the machine?

Thanks!

/edit

You need to edit the config file (xorg.conf). The section that lists the video card should look like this:


Sorry I should have explained more. I have the coolbits options set and using either the sliders, manually entering the clock speeds, or using nvclock results in a rapid white flashing screen and unresponsive system :(
____________

UBT - NaRyan
Message 1691 - Posted: 24 Aug 2008 | 23:10:28 UTC - in response to Message 1690.

Sorry I should have explained more. I have the coolbits options set and using either the sliders, manually entering the clock speeds, or using nvclock results in a rapid white flashing screen and unresponsive system :(


I get that also, on both the Zotac AMP! and Stock 8800GT :(
Flashing white screen (not good when yer Epileptic!), then it would reboot...

That's with just trying to move the clock speed 1MHz up or down.....
____________

Down with the Kredit Kops!!!

Nightlord
Message 1692 - Posted: 24 Aug 2008 | 23:32:17 UTC - in response to Message 1691.
Last modified: 24 Aug 2008 | 23:33:35 UTC


I get that also, on both the Zotac AMP! and Stock 8800GT :(
Flashing white screen (not good when yer Epileptic!), then it would reboot...

That's with just trying to move the clock speed 1MHz up or down.....


Well, I'm glad it's not just me then....(you know what I mean!).

I'm not sure that the factory overclock is the cause of the Cuda error: Kernel [frc_sum_kernel_angle] failed in file 'force.cu' in line 487 : unspecified launch failure, but it is a good candidate. When I managed to lower the clocks under 173.xx I had only a few like this, but now it's every WU.

I notice in this thread Mr S reports some WU similar (Windows I think).

It's a bit odd that another box here with a Zotac 8800GT is ok at 700/1000.

....that's another WU just gone, this time it got to over an hour I think. Oh well, I'm not going to fix it tonight and I have work in the morning. Off to get some sleep.
____________

UBT - NaRyan
Message 1693 - Posted: 25 Aug 2008 | 0:42:11 UTC - in response to Message 1692.

Have you tried just increasing the fan speed?
I have the fans running at 100% on both 8800GTs, as I noticed that the cards do tend to warm up a bit.

Mind you that also depends on the noise of the fan, and what you can tolerate noise wise.
I'm one of those peeps that could quite easily fall asleep with a Dyson running :)
____________

Down with the Kredit Kops!!!

GDF
Message 1696 - Posted: 25 Aug 2008 | 10:49:50 UTC - in response to Message 1693.

The application is slower because of a driver issue affecting the latest 177 driver versions:

http://forums.nvidia.com/index.php?showtopic=75466

It will sort itself out with new drivers in the future.

gdf

ExtraTerrestrial Apes
Message 1737 - Posted: 25 Aug 2008 | 20:29:51 UTC - in response to Message 1692.

I notice in this thread Mr S reports some WU similar (Windows I think).


Only the first error message was from my windows box, the others were from my wingmen. There was no one called Nightlord though ;)

MrS
____________
Scanning for our furry friends since Jan 2002

Nightlord
Message 1741 - Posted: 25 Aug 2008 | 21:51:47 UTC - in response to Message 1737.


Only the first error message was from my windows box, the others were from my wingmen. There was no one called Nightlord though ;)

MrS


ROTFLMAO :))

Well, it seems I fixed the problem....or maybe just avoided it.

Thanks to the new Windows app, I switched the card to a box running 32 bit Vista. Using the Windows drivers and Ntune, I can adjust the GPU clocks and the card is happy (for at least 20 minutes now!).

This also means I have switched my GTX260 card to the Linux box, and thanks to the 6.38 application this too is now crunching happily.

For now, everything here is running well...hooray!

____________
