Message boards : Graphics cards (GPUs) : Merging of acemd and acemd2

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level: Gly
Message 17124 - Posted: 18 May 2010 | 15:29:09 UTC

We will soon be merging the two queues into a single one, so only these will be left:
acemd2
acemdbeta (for those who accept beta runs).

This is in order to simplify your crunching.

gdf

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level: Tyr
Message 17125 - Posted: 18 May 2010 | 15:49:29 UTC

So will there still be v6.03 and v6.72 WUs? If so, I think the merging is a bad idea. Some machines run v6.72 better and some (especially older cards) run v6.03 better. Merging the two types into one queue will make us do the babysit-and-abort shuffle again. Much more work for us, less output for the project.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level: Gly
Message 17128 - Posted: 18 May 2010 | 16:36:15 UTC - in response to Message 17125.

6.03 will disappear as it is old by now; there is no reason why it should work better than 6.72.
We will use only one application in two versions: CUDA 2.2 (for old drivers) and CUDA 3.0 (for new drivers and Fermi).

gdf

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level: Tyr
Message 17130 - Posted: 18 May 2010 | 16:45:02 UTC - in response to Message 17128.

We will use only one application in two versions: CUDA 2.2 (for old drivers) and CUDA 3.0 (for new drivers and Fermi).

How new do the drivers have to be to use CUDA 3.0 (not 3.1)?

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level: Gly
Message 17131 - Posted: 18 May 2010 | 16:50:13 UTC - in response to Message 17130.

I don't actually know. It is BOINC that selects whether the driver is CUDA 3 compatible.
I guess it is the version suggested on the CUDA development page.
gdf

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,835,466,430
RAC: 19,993,329
Level: Tyr
Message 17133 - Posted: 18 May 2010 | 17:06:36 UTC - in response to Message 17131.

I don't actually know. It is BOINC that selects whether the driver is CUDA 3 compatible.
I guess it is the version suggested on the CUDA development page.
gdf

BOINC doesn't "select", BOINC "reports". Compatibility or otherwise is determined by the driver itself.

I think for CUDA 3.0 you need at least driver 197.xx, and that's the point where the reduced available memory starts to kick in.

Driver 190.xx / CUDA 2.2 seems a nice stable combination for older cards, but have you ruled out CUDA 2.3 (or just decided it doesn't offer any improvement over 2.2 for GPUGrid)?
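
For anyone who wants to check what their own installed driver reports, here is a minimal sketch using the CUDA driver API's cuDriverGetVersion (illustrative only, not part of the GPUGRID application; compile against the CUDA toolkit headers and link with -lcuda):

    /* sketch: ask the installed NVIDIA driver which CUDA version it supports */
    #include <stdio.h>
    #include <cuda.h>   /* CUDA driver API */

    int main(void)
    {
        int v = 0;
        /* the driver encodes the version as 1000*major + 10*minor,
           so a CUDA 3.0-capable driver reports 3000 */
        if (cuDriverGetVersion(&v) != CUDA_SUCCESS) {
            fprintf(stderr, "no CUDA driver found\n");
            return 1;
        }
        printf("driver supports CUDA %d.%d\n", v / 1000, (v % 1000) / 10);
        return 0;
    }

This is essentially the same information BOINC reads off the driver and forwards to the scheduler.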

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level: Tyr
Message 17135 - Posted: 18 May 2010 | 17:33:47 UTC - in response to Message 17133.

I think for CUDA 3.0 you need at least driver 197.xx, and that's the point where the reduced available memory starts to kick in.

Driver 190.xx / CUDA 2.2 seems a nice stable combination for older cards, but have you ruled out CUDA 2.3 (or just decided it doesn't offer any improvement over 2.2 for GPUGrid)?

Thanks for the info. I've been using v195.62 for a long time with great results. If the new plan slows down the v197.45 machines I'll move them back to v195.62 again.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level: Gly
Message 17137 - Posted: 18 May 2010 | 19:02:23 UTC - in response to Message 17133.

I don't actually know. It is BOINC that selects whether the driver is CUDA 3 compatible.
I guess it is the version suggested on the CUDA development page.
gdf

BOINC doesn't "select", BOINC "reports". Compatibility or otherwise is determined by the driver itself.

I think for CUDA 3.0 you need at least driver 197.xx, and that's the point where the reduced available memory starts to kick in.

Driver 190.xx / CUDA 2.2 seems a nice stable combination for older cards, but have you ruled out CUDA 2.3 (or just decided it doesn't offer any improvement over 2.2 for GPUGrid)?


There is no advantage in using CUDA 2.3, and CUDA 2.2 covers a larger installed driver base.

gdf

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level: Tyr
Message 17207 - Posted: 21 May 2010 | 16:47:15 UTC - in response to Message 17128.

6.03 will disappear as it is old by now; there is no reason why it should work better than 6.72.
We will use only one application in two versions: CUDA 2.2 (for old drivers) and CUDA 3.0 (for new drivers and Fermi).

gdf

Is v6.05 replacing both v6.03 and v6.72? Is it simply a renamed v6.72? Looks like it's exactly the same size...

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level: Gly
Message 17209 - Posted: 21 May 2010 | 18:36:20 UTC - in response to Message 17207.

It's renamed.
gdf

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level: Tyr
Message 17275 - Posted: 25 May 2010 | 7:21:25 UTC

Since v6.03 was shut down I've had to move a 9600 GSO and an 8800 GT to Collatz; too many errors with the v6.05 WUs. They ran the v6.03 WUs fine. A third GPU, another 9600 GSO that ran well with v6.03, can't get work at all now, probably since it has 384 MB of RAM, so it also moved to Collatz. That's three decent cards moved from GPUGRID due to problems with the transition. There have also been problems getting work for the faster cards for the last day, and no work at all available for any app for the last few hours. Things aren't looking good :-(

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level: Gly
Message 17277 - Posted: 25 May 2010 | 8:10:46 UTC - in response to Message 17275.

We will solve this problem in the next release. I'll keep you informed.
gdf

cristipurdel
Joined: 31 Mar 10
Posts: 45
Credit: 103,429,292
RAC: 0
Level: Cys
Message 17284 - Posted: 25 May 2010 | 11:35:31 UTC - in response to Message 17275.
Last modified: 25 May 2010 | 11:36:15 UTC

Since v6.03 was shut down I've had to move a 9600 GSO and an 8800 GT to Collatz; too many errors with the v6.05 WUs. They ran the v6.03 WUs fine. A third GPU, another 9600 GSO that ran well with v6.03, can't get work at all now, probably since it has 384 MB of RAM, so it also moved to Collatz. That's three decent cards moved from GPUGRID due to problems with the transition. There have also been problems getting work for the faster cards for the last day, and no work at all available for any app for the last few hours. Things aren't looking good :-(


I'm experiencing something similar when trying to run SETI & Collatz.
Do you have the same thing as I wrote here?
http://lunatics.kwsn.net/gpu-testing/ati-sse3-astropulse-app-openclbrook-beta-testing.msg27299.html#msg27299

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level: Tyr
Message 17294 - Posted: 25 May 2010 | 14:31:28 UTC - in response to Message 17277.

We will solve this problem in the next release. I'll keep you informed.
gdf

Thanks, but which problem: the one where older cards produce many more errors, or the one that doesn't allow 384 MB cards at all? How about bringing back v6.03 in a separate queue until it's fixed?

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level: Tyr
Message 17295 - Posted: 25 May 2010 | 14:34:18 UTC - in response to Message 17284.
Last modified: 25 May 2010 | 14:35:04 UTC

Since v6.03 was shut down I've had to move a 9600 GSO and an 8800 GT to Collatz; too many errors with the v6.05 WUs. They ran the v6.03 WUs fine. A third GPU, another 9600 GSO that ran well with v6.03, can't get work at all now, probably since it has 384 MB of RAM, so it also moved to Collatz. That's three decent cards moved from GPUGRID due to problems with the transition. There have also been problems getting work for the faster cards for the last day, and no work at all available for any app for the last few hours. Things aren't looking good :-(

I'm experiencing something similar when trying to run SETI & Collatz.
Do you have the same thing as I wrote here?
http://lunatics.kwsn.net/gpu-testing/ati-sse3-astropulse-app-openclbrook-beta-testing.msg27299.html#msg27299

Link doesn't work:

"An Error Has Occurred! The topic or board you are looking for appears to be either missing or off limits to you."

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,835,466,430
RAC: 19,993,329
Level: Tyr
Message 17297 - Posted: 25 May 2010 | 15:07:38 UTC - in response to Message 17295.

Link doesn't work:

"An Error Has Occurred! The topic or board you are looking for appears to be either missing or off limits to you."

It's in a Beta testing area - probably accessible to registered users only.

He wrote:

I'm having a small problem with rev 420 and Collatz.
While Collatz 2.09 ati13ati is running, I'm getting [sched_op_debug] ATI GPU work request: 0.00 seconds; 0.00 GPUs
I have to suspend Collatz in order to get [sched_op_debug] ATI GPU work request: 8640.00 seconds; 1.00 GPUs
It seems as if Collatz is bullying the ATI work fetch, despite a 90/10 share for SETI/Collatz.
Can somebody test this on their rig (.56 x64, SETI rev 420 + Collatz 2.09, GPU only)?
I'm afraid it might be something with the app_info.

I don't think it has anything to do with the app_info.xml file: it sounds more like BOINC long-term debt, aka 'ATI work fetch priority'.
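
If you want to see exactly what the client is deciding, the relevant log flags can be enabled in cc_config.xml in the BOINC data directory; a minimal sketch (both flags exist in the 6.10.x clients):

    <!-- cc_config.xml: enable scheduler-request and work-fetch logging (sketch) -->
    <cc_config>
      <log_flags>
        <sched_op_debug>1</sched_op_debug>       <!-- prints the work requests quoted above -->
        <work_fetch_debug>1</work_fetch_debug>   <!-- prints per-project debt and fetch priority -->
      </log_flags>
    </cc_config>

Restart the client (or use Advanced -> Read config file) and the work-fetch reasoning shows up in the event log.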

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level: Tyr
Message 17299 - Posted: 25 May 2010 | 15:48:52 UTC - in response to Message 17297.

He wrote:

I'm having a small problem with rev 420 and Collatz.
While Collatz 2.09 ati13ati is running, I'm getting [sched_op_debug] ATI GPU work request: 0.00 seconds; 0.00 GPUs
I have to suspend Collatz in order to get [sched_op_debug] ATI GPU work request: 8640.00 seconds; 1.00 GPUs
It seems as if Collatz is bullying the ATI work fetch, despite a 90/10 share for SETI/Collatz.
Can somebody test this on their rig (.56 x64, SETI rev 420 + Collatz 2.09, GPU only)?
I'm afraid it might be something with the app_info.

I don't think it has anything to do with the app_info.xml file: it sounds more like BOINC long-term debt, aka 'ATI work fetch priority'.

This is not related to the problem above. What you're describing sounds like the GPU FIFO "feature" in BOINC. The only way around this WAS to use a VERY SMALL queue size. Try BOINC v6.10.56 though, as things seem to have improved (at least in my tests).

cristipurdel
Joined: 31 Mar 10
Posts: 45
Credit: 103,429,292
RAC: 0
Level: Cys
Message 17301 - Posted: 25 May 2010 | 17:47:21 UTC - in response to Message 17299.

I'm using 6.10.56 x64 but it's not working.
How do I use a VERY SMALL queue size?

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level: His
Message 17303 - Posted: 25 May 2010 | 18:25:43 UTC - in response to Message 17301.

Configure BOINC to keep 0.05 days of work units (for example):
Open BOINC in Advanced View,
Advanced,
Preferences,
Network usage,
Additional work buffer: 0.05 days.
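
The same thing can be done by hand with a global_prefs_override.xml file in the BOINC data directory; a minimal sketch (tag names as used by the 6.10.x clients):

    <!-- global_prefs_override.xml: keep only a very small work buffer (sketch) -->
    <global_preferences>
      <work_buf_min_days>0.0</work_buf_min_days>                 <!-- base buffer -->
      <work_buf_additional_days>0.05</work_buf_additional_days>  <!-- the "additional work buffer" above -->
    </global_preferences>

Then use Advanced -> Read local prefs file (or restart the client) to apply it.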

cristipurdel
Joined: 31 Mar 10
Posts: 45
Credit: 103,429,292
RAC: 0
Level: Cys
Message 17323 - Posted: 26 May 2010 | 5:18:31 UTC - in response to Message 17303.

Configure BOINC to keep 0.05 days of work units (for example):
Open BOINC in Advanced View,
Advanced,
Preferences,
Network usage,
Additional work buffer: 0.05 days.

That didn't help. I had to change it back to 0.00 to get some work.

Profile Paul D. Buck
Joined: 9 Jun 08
Posts: 1050
Credit: 37,321,185
RAC: 0
Level: Val
Message 17337 - Posted: 26 May 2010 | 13:16:14 UTC - in response to Message 17297.

I don't think it has anything to do with the app_info.xml file: it sounds more like BOINC long-term debt, aka 'ATI work fetch priority'.

A subject Richard and others have been trying without much success to get UCB to take seriously ...

There are two other ways to "sometimes" get BOINC back in battery... one is an individual project reset to clear that project's debts, and the other is to use the flag in cc_config.xml to clear the debts...

Obviously, if you are going to use the project reset method, wait until you run dry next time and then reset the project; that will reset the debts for that single project... with multi-project interaction, the generalized debt reset is sometimes the only way to get BOINC to behave again...

Note that in my personal experience you can start seeing the artifacts of the "debt crisis" in as little as a week though most of the time it is tolerable for up to a month ...
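
For reference, the debt-clearing flag goes in cc_config.xml; a minimal sketch (the <zero_debts> option exists in BOINC clients of this era and zeroes the debts of all attached projects at startup):

    <!-- cc_config.xml: zero all project debts when the client starts (sketch) -->
    <cc_config>
      <options>
        <zero_debts>1</zero_debts>  <!-- clears accumulated debt for every attached project -->
      </options>
    </cc_config>

Remember to set it back to 0 (or remove it) afterwards, or the debts will be zeroed on every restart.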
