I have:
BOINC 6.10.17
AuthenticAMD AMD Athlon(tm) 64 X2 Dual Core Processor 3800+ [Family 15 Model 35 Stepping 2] (2 processors), NVIDIA GeForce 8800 GT (255MB), Linux 2.6.27-17-generic
All tasks end in error:
http://www.gpugrid.net/results.php?hostid=42512
Any ideas?
Paolo |
|
|
|
That usually signifies a memory error.
Have you rebooted your machine and tried another WU?
____________
Thanks - Steve |
|
|
|
Yes, I have tried rebooting...
All the problems started on 6 March,
but I haven't changed anything!
My card has only 256 MB; maybe ACEMD - GPU molecular dynamics v6.04 (cuda)
needs more RAM? |
|
|
GDF (Volunteer moderator, Project administrator, Project developer, Project tester, Volunteer developer, Volunteer tester, Project scientist)
Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0
|
We have fixed that in the upcoming release, considering special cases where the GPU card does not have enough memory.
gdf |
|
|
skgiven (Volunteer moderator, Volunteer tester)
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0
|
My GT240s use between 290MB and 320MB (according to GPUZ), so if you only have 256MB it might be an issue (your card has 112 shaders; my GT240 has only 96). It would be interesting to know how much RAM GPUZ says your card has, and how much it is using!
Hopefully the fix will allow plenty of redundant cards to work again.
GPUZ: http://www.techpowerup.com/downloads/1761/mirrors.php |
|
|
|
I have Linux, so I can't run GPUZ...
but...
nvclock -i
-- General info --
Card: nVidia Geforce 8800GT
Architecture: G92 A2
PCI id: 0x611
GPU clock: 601.712 MHz
Bustype: PCI-Express
-- Shader info --
Clock: 1512.000 MHz
Stream units: 112 (11111011b)
ROP units: 16 (1111b)
-- Memory info --
Amount: 256 MB
Type: 256 bit DDR3
Clock: 702.000 MHz
-- PCI-Express info --
Current Rate: 16X
Maximum rate: 16X
-- Sensor info --
Sensor: Analog Devices ADT7473
Board temperature: 40C
GPU temperature: 55C
Fanspeed: 82 RPM
Fanspeed mode: manual
PWM duty cycle: 29.8%
-- VideoBios information --
Version: 62.92.16.00.a1
Signon message: GeForce 8800 GT VGA BIOS
Performance level 0: gpu 600MHz/shader 1500MHz/memory 700MHz/0.00V/100%
VID mask: 3
Voltage level 0: 0.95V, VID: 0
Voltage level 1: 1.00V, VID: 1
Voltage level 2: 1.05V, VID: 2
Voltage level 3: 1.10V, VID: 3
01:00.0 VGA compatible controller: nVidia Corporation GeForce 8800 GT (rev a2)
Subsystem: XFX Pine Group Inc. Device 2334
Flags: bus master, fast devsel, latency 0, IRQ 18
Memory at c2000000 (32-bit, non-prefetchable) [size=16M]
Memory at b0000000 (64-bit, prefetchable) [size=256M]
Memory at c0000000 (64-bit, non-prefetchable) [size=32M]
I/O ports at 9000 [size=128]
[virtual] Expansion ROM at c3000000 [disabled] [size=128K]
Capabilities: <access denied>
Kernel driver in use: nvidia
Kernel modules: nvidiafb, nvidia |
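A minimal sketch, in Python, of pulling the memory amount out of `nvclock -i` output like the listing above and comparing it with the 290MB lower bound skgiven reported for a GT240. The function name, the threshold constant and the shortened sample text are all illustrative assumptions; the threshold is one user's observation, not an official project requirement.

```python
import re

# Assumption: ~290 MB is the low end of what skgiven saw a GT240 use in GPUZ.
OBSERVED_MIN_MB = 290


def memory_mb(nvclock_output):
    """Return the 'Amount' value from the '-- Memory info --' section, or None."""
    in_memory_section = False
    for line in nvclock_output.splitlines():
        line = line.strip()
        if line.startswith("--"):
            # Section headers look like '-- Memory info --'.
            in_memory_section = "Memory info" in line
        elif in_memory_section:
            match = re.match(r"Amount:\s*(\d+)\s*MB", line)
            if match:
                return int(match.group(1))
    return None


# Shortened copy of the output posted above.
SAMPLE = """-- Shader info --
Clock: 1512.000 MHz
-- Memory info --
Amount: 256 MB
Type: 256 bit DDR3
"""

amount = memory_mb(SAMPLE)
verdict = "below" if amount < OBSERVED_MIN_MB else "at or above"
print(f"{amount} MB is {verdict} the observed {OBSERVED_MIN_MB} MB usage")
```

On a live system the text could come from running `nvclock -i` and capturing its output; the sketch only parses a string so that it runs anywhere.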
|
|
|
I think that only Full-atom molecular dynamics v6.70 (cuda)
works!
But ACEMD - GPU molecular dynamics v6.04 (cuda) does NOT!
Task | Work unit | Sent | Time reported or deadline | Status | Run time (sec) | CPU time (sec) | Claimed credit | Granted credit | Application
2031809 | 1280039 | 21 Mar 2010 10:52:11 UTC | 26 Mar 2010 10:52:11 UTC | In progress | --- | --- | --- | --- | Full-atom molecular dynamics v6.70 (cuda)
2031228 | 1279667 | 21 Mar 2010 7:43:30 UTC | 21 Mar 2010 10:52:11 UTC | Error while computing | 5.05 | 4.74 | 0.01 | --- | ACEMD - GPU molecular dynamics v6.04 (cuda)
2024218 | 1275454 | 19 Mar 2010 23:08:15 UTC | 21 Mar 2010 12:25:18 UTC | Completed and validated | 111,720.16 | 54,169.80 | 7,645.29 | 9,556.61 | Full-atom molecular dynamics v6.70 (cuda)
2017917 | 1270836 | 18 Mar 2010 23:10:45 UTC | 18 Mar 2010 23:12:48 UTC | Error while computing | 8.17 | 7.98 | 0.01 | --- | ACEMD - GPU molecular dynamics v6.04 (cuda)
2017890 | 1270813 | 18 Mar 2010 23:12:48 UTC | 18 Mar 2010 23:14:40 UTC | Error while computing | 9.19 | 8.80 | 0.02 | --- | ACEMD - GPU molecular dynamics v6.04 (cuda)
2017480 | 1270533 | 18 Mar 2010 21:15:41 UTC | 18 Mar 2010 21:19:22 UTC | Error while computing | 10.69 | 8.22 | 0.01 | --- | ACEMD - GPU molecular dynamics v6.04 (cuda)
2012755 | 1267945 | 17 Mar 2010 23:07:39 UTC | 17 Mar 2010 23:09:46 UTC | Error while computing | 9.19 | 8.51 | 0.01 | --- | ACEMD - GPU molecular dynamics v6.04 (cuda)
2010804 | 1266573 | 17 Mar 2010 23:09:46 UTC | 18 Mar 2010 21:15:41 UTC | Completed and validated | 58,608.77 | 57,965.07 | 5,830.52 | 8,745.78 | ACEMD - GPU molecular dynamics v6.04 (cuda)
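To see the pattern in the rows above at a glance, here is a rough Python sketch that tallies outcomes per application version. The helper name and the regular expression are illustrative assumptions; the rows are copied from the task list above.

```python
import re
from collections import Counter

STATUSES = ("In progress", "Error while computing", "Completed and validated")

ROWS = """
2031809 1280039 21 Mar 2010 10:52:11 UTC 26 Mar 2010 10:52:11 UTC In progress --- --- --- --- Full-atom molecular dynamics v6.70 (cuda)
2031228 1279667 21 Mar 2010 7:43:30 UTC 21 Mar 2010 10:52:11 UTC Error while computing 5.05 4.74 0.01 --- ACEMD - GPU molecular dynamics v6.04 (cuda)
2024218 1275454 19 Mar 2010 23:08:15 UTC 21 Mar 2010 12:25:18 UTC Completed and validated 111,720.16 54,169.80 7,645.29 9,556.61 Full-atom molecular dynamics v6.70 (cuda)
2017917 1270836 18 Mar 2010 23:10:45 UTC 18 Mar 2010 23:12:48 UTC Error while computing 8.17 7.98 0.01 --- ACEMD - GPU molecular dynamics v6.04 (cuda)
2017890 1270813 18 Mar 2010 23:12:48 UTC 18 Mar 2010 23:14:40 UTC Error while computing 9.19 8.80 0.02 --- ACEMD - GPU molecular dynamics v6.04 (cuda)
2017480 1270533 18 Mar 2010 21:15:41 UTC 18 Mar 2010 21:19:22 UTC Error while computing 10.69 8.22 0.01 --- ACEMD - GPU molecular dynamics v6.04 (cuda)
2012755 1267945 17 Mar 2010 23:07:39 UTC 17 Mar 2010 23:09:46 UTC Error while computing 9.19 8.51 0.01 --- ACEMD - GPU molecular dynamics v6.04 (cuda)
2010804 1266573 17 Mar 2010 23:09:46 UTC 18 Mar 2010 21:15:41 UTC Completed and validated 58,608.77 57,965.07 5,830.52 8,745.78 ACEMD - GPU molecular dynamics v6.04 (cuda)
"""


def tally(rows):
    """Count rows per (application version, status)."""
    counts = Counter()
    for row in rows.strip().splitlines():
        # The version is the 'vX.YY' just before the trailing '(cuda)'.
        version = re.search(r"v(\d+\.\d+) \(cuda\)\s*$", row)
        status = next((s for s in STATUSES if s in row), None)
        if version and status:
            counts[(version.group(1), status)] += 1
    return counts


for (version, status), n in sorted(tally(ROWS).items()):
    print(f"v{version}: {status} x{n}")
```

The tally shows that v6.04 did complete once on this host (task 2010804) before the run of errors, which fits the report that the failures started suddenly rather than from day one.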
|
|
|
imcola
Joined: 26 Oct 09 Posts: 7 Credit: 8,428,912 RAC: 0
|
Same problem here. Looking at my more recent validated work, it's all 6.70 and 6.03; however, it's 50/50 whether 6.03 passes or fails. All my errored work is 6.04 and 6.03, and I'm still blowing through WUs at roughly 10 to 1 or more erroring out versus running/validating on a given day/machine, across a variety of cards, CPUs and 64-bit OSes. Another common error I'm seeing is 'SWAN: FATAL : Unable to create context'. I'm not sure what the deal is with that, but the bottom line is ~0-3 sec of CPU time and out for these WUs. Reading the moderator's comment, I'm leaning toward a WU issue rather than a config issue. If absolutely no WUs were crunching, I would be hunting for a tweak here or there, but I'm still looking for a solution and open to suggestions :) |
|
|
skgiven
|
imcola,
Looks like your GeForce 9600GT on Linux, your 9800GT on Linux, and your GTS250 on W7hpx64 can only run ACEMD tasks.
Your 9800 GT on W7hpx64 can run both ACEMD and ACEMD ver 2 tasks.
I would suggest you select to only run ACEMD tasks.
Alternatively, you could create another user account, add your 9800 GT on W7hpx64 to that new account, and set it to pick up only ACEMD ver 2 tasks (faster).
If you do, you could create a team and have 2 accounts for the team.
The perfect solution would be for the techs to allow people to configure individual cards/systems online. Let's face it, new apps are needed, so this may be the way forward.
Paolo Biagini,
Do the same, select to only run ACEMD tasks.
To change account settings:
Go to Your Account, GPUGRID preferences, Edit GPUGRID Preferences,
Alter the Application versions as follows:
Run only the selected applications
ACEMD: YES
ACEMD ver 2.0: no
ACEMD beta: no |
|
|
liveonc
Joined: 1 Jan 10 Posts: 292 Credit: 41,567,650 RAC: 0
|
Why does he need to make two accounts? Why not just use the Default/Home/School/Work GPUGRID preferences? Just use one GPUGRID preference for the 9600GT & another GPUGRID preference for the 9800GT.
____________
|
|
|
skgivenVolunteer moderator Volunteer tester
Send message
Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level
Scientific publications
|
liveonc, you are correct - imcola should just set up 2 different profiles; that's the sane way to do it! |
|
|
imcola
|
Thanks for the input, all of you. I still have another day until I get back home; I will take a look at my machine profiles as you suggest and post back once I see how things go. Happy crunching!! |
|
|
GDF
|
Tomorrow we will upload a new beta, which should fix the problem.
gdf |
|
|
imcola
|
Thanks guys! Back in business. I did end up creating a couple of different profiles as suggested, but once I got home, detached, bounced BOINC and re-attached on the various machines, everybody went back to work using 'home' with just original ACEMD WUs, no ver2 or beta work. All except the GTS250, which kept saying no WUs were available, so I set 'school' to get both original ACEMD and ACEMD ver2 WUs, pointed the 250 to 'school', detached, bounced BOINC and re-attached, and it took right off after the downloads completed. Cool!!
I guess I still can't differentiate between the versions beyond v6.70, v6.71, v6.04 and v6.03, but as long as BOINC/GPUGRID and the profiles/cards can, I'll be happy to just play along. Thanks again for the insight!! Many happy crunched returns!!
|
|
|
skgiven
|
6.03 is for Windows and 6.04 is for Linux - Both are ACEMD ver 2
6.70 & 6.71 are ACEMD
You don't really need to differentiate; just select ACEMD or ACEMD ver 2 for each profile. |
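The version-to-application mapping in this post can be written down as a small lookup table. This is only a sketch of what the post says, not an official project table; the dictionary and function names are made up for illustration, and the platform for 6.70/6.71 is left empty because the thread does not state it.

```python
# Mapping taken from the post above; names here are illustrative only.
APP_BY_VERSION = {
    "6.03": ("ACEMD ver 2", "Windows"),
    "6.04": ("ACEMD ver 2", "Linux"),
    "6.70": ("ACEMD", None),  # platform not stated in the thread
    "6.71": ("ACEMD", None),  # platform not stated in the thread
}


def application_for(version):
    """Return the preference-page application name for a version string."""
    app, _platform = APP_BY_VERSION.get(version, ("unknown", None))
    return app


for version in ("6.03", "6.04", "6.70", "6.71"):
    print(f"v{version} -> {application_for(version)}")
```

So a host that only succeeds on v6.70/v6.71 tasks corresponds to selecting just "ACEMD" in the GPUGRID preferences.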
|
|
|
Paolo Biagini,
Do the same, select to only run ACEMD tasks.
To change account settings:
Goto Your Account, GPUGRID preferences, Edit GPUGRID Preferences,
Alter the Application versions as follows:
Run only the selected applications
ACEMD: YES
ACEMD ver 2.0: no
ACEMD beta: no
OK, I'll try it.
Bye |
|
|
imcola
|
6.04 WUs on my Linux boxes seem to primarily crash and burn after ~0-3 secs of CPU time. 6.70 seems to run OK, but there do not seem to be many of those WUs. Is this the case?
GDF, you mentioned a new beta release that would correct the errors mentioned in this thread. Linux and 6.04 seem buggy for sure on my 2 boxes. I have set up a 'school' profile to pick up ACEMD, ver2 and beta, but I don't seem to get anything but 6.70 (runs) and 6.04 (won't). ('Home' should just grab ACEMD WUs, while 'work' should take either ACEMD or ver2.) This GPUGRID project seems to need some extra help, whereas with BOINC CPU crunching I can pretty much let it run by itself. If you have some guidance or can use a cruncher's feedback, I will play guinea pig for a little while if it will help to debug the Linux side of the project. You might throw me in the deep end from the code perspective, but hardware-wise I'm pretty solid. Offer extended, FWIW. Ubuntu 9.10 on both machines. |
|
|
GDF
|
BOINC is not able to report your driver version.
Could it be that it's not properly installed?
gdf
|
|
|
imcola
|
Good question. I've just updated from the Ubuntu-recommended v185 and installed the 195.36.15 NVIDIA drivers on both boxes. Ubuntu sees the new version and displays the NVIDIA X Server Settings. But after restarting BOINC, the message is that it sees the card but reports the driver version as unknown and the CUDA version as 3000. So I would say yes, they are installed, and correctly, but BOINC still seems clueless about them. |
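One way to confirm the driver independently of BOINC is to read the version the kernel module reports. The Python sketch below assumes the proprietary NVIDIA driver on Linux, which normally exposes `/proc/driver/nvidia/version`; the exact layout of that file is an assumption here, and the sample line is illustrative rather than copied from any machine in this thread.

```python
import re
from pathlib import Path

# Assumption: the proprietary NVIDIA kernel module publishes its version here.
VERSION_FILE = Path("/proc/driver/nvidia/version")


def parse_driver_version(text):
    """Return the first dotted version number on the 'NVRM version' line, or None."""
    for line in text.splitlines():
        if line.startswith("NVRM version:"):
            match = re.search(r"\b(\d+(?:\.\d+)+)\b", line)
            return match.group(1) if match else None
    return None


def installed_driver_version():
    """Read the version file if it exists; return None when it cannot be read."""
    try:
        return parse_driver_version(VERSION_FILE.read_text())
    except OSError:
        return None


# Illustrative line only; the real file may carry extra fields such as a build date.
SAMPLE = "NVRM version: NVIDIA UNIX x86_64 Kernel Module  195.36.15\n"

print("parsed from sample:", parse_driver_version(SAMPLE))
print("this machine:", installed_driver_version())
```

If the function returns a version while BOINC still says "unknown", that points at BOINC's detection rather than at the driver install.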
|
|
|
All of my Linux hosts have always reported the driver as unknown (I think the BOINC devs are aware of this). 6.04 runs OK on my hosts but uses 100% CPU, so I have elected not to run them. The ACEMD tasks seem to be running out, as I haven't got any for a few days now. |
|
|
skgiven
|
"6.04 runs OK on my hosts but uses 100% CPU, so I have elected not to run them."
Are you completely mad? Run them, they are not there to look at. They run 30% faster than the other tasks (that have run out).
Let me put this another way.
My highly overclocked i7 does about 1/5th the work of a GT240 (under windows). A GTX260 does about twice the work of a GT240 and a GTX295 does twice the work of a GTX260. So a GTX295 does 20 times the work of my i7.
Your GTX260s would do 13 times the work of an i7 under Linux!
- Give up one CPU core - you have quads and GTX260s!
I even sacrifice a CPU core under Windows, just to help GPUGrid along, even though it does not need one full core. It's worth it for me on Windows, never mind Linux!
BTW this is Way OT. |
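The ratios in the post above can be checked with a few lines of Python. The relative figures are the ones skgiven quotes; reading the 13x Linux figure as the 10x figure multiplied by the "30% faster" tasks is an assumption, not something the post spells out.

```python
from fractions import Fraction

# Work rates relative to the overclocked i7 (= 1), as quoted in the post.
i7 = Fraction(1)
gt240 = 5 * i7        # "i7 does about 1/5th the work of a GT240"
gtx260 = 2 * gt240    # "GTX260 does about twice the work of a GT240"
gtx295 = 2 * gtx260   # "GTX295 does twice the work of a GTX260"

# Assumption: the Linux figure applies the 30% speed-up to the GTX260.
gtx260_linux = gtx260 * Fraction(13, 10)

print("GTX260 vs i7:", gtx260)
print("GTX295 vs i7:", gtx295)
print("GTX260 on Linux vs i7:", gtx260_linux)
```

Under that reading the numbers are consistent: 20x for a GTX295 and 13x for a GTX260 running the faster tasks.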
|
|
imcola
|
Well, I ended up throttling back CPU utilization to 80% on one machine; the other I didn't touch. But I pointed them both to 'work' to pick up some ver2 WUs (maybe). I now have two 6.04 WUs running and well past the 2 sec CPU mark, so I guess I will kick back and monitor for a few days and see what happens with no twiddling. Be back in a few, I guess. |
|
|
|
Paolo Biagini,
Do the same, select to only run ACEMD tasks.
To change account settings:
Goto Your Account, GPUGRID preferences, Edit GPUGRID Preferences,
Alter the Application versions as follows:
Run only the selected applications
ACEMD: YES
ACEMD ver 2.0: no
ACEMD beta: no
OK, I try it.
bye
It does NOT work in any way :-(
|
|
|
imcola
|
Hi Paolo, it took me a few days to get my Linux boxes dialed in to run 6.04 WUs. I have now got 6 validated versus 1 errored out, and 7 more running or queued. Let me backtrack; maybe you will spot a problem on yours. I had to throttle back CPU utilization on one of mine under BOINC preferences, from 100% CPU down to 80%. Your own earlier post, which shows your single successful 6.04 run, confirms what I found: these 6.04 Linux GPU runs are real CPU hogs compared with a Windows GPU WU. I have no clue why a Linux GPU WU should need nearly one CPU second per second of run time, while a Windows one seems to need 10% or less CPU for its GPU run to complete. End result: try freeing up CPU cycles from other projects and see if that helps.
If you don't have the latest NVIDIA drivers installed (I assume the latest are the greatest), that was another upgrade I needed to get the 6.04 WUs to run clean. You will want to be able to display the NVIDIA X Server Settings from the OS to confirm a good install. Don't worry if BOINC still can't detect the version; the WUs can tell the difference and were not running with back-level drivers. HTH.
To everybody who responded to the thread: I really had to bang all of your observations and suggestions together to get this project back on track and functional. Thanks. Paolo, check back if you are still stuck. |
|
|