Message boards : Graphics cards (GPUs) : New app acemd 6.72 for Windows

GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Message 16352 - Posted: 17 Apr 2010 | 10:48:25 UTC

We have uploaded the newest application to replace the old application. So now we have:
acemd 6.72 newest app WIN (cuda2.2 and cuda3)
acemd2 6.03 production application WIN/LINUX (cuda2.2)
acemdbeta newest app (cuda2.2 and cuda3) WIN (not used now)

Linux cuda3 coming soon (Monday).

gdf

ExtraTerrestrial Apes
Volunteer moderator
Volunteer tester
Joined: 17 Aug 08
Posts: 2705
Credit: 1,311,122,549
RAC: 0
Message 16354 - Posted: 17 Apr 2010 | 10:51:55 UTC - in response to Message 16352.

Any changes important for us in 6.72?

MrS
____________
Scanning for our furry friends since Jan 2002

ftpd
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Message 16355 - Posted: 17 Apr 2010 | 11:29:56 UTC - in response to Message 16352.

Can these apps also be used for Fermi cards?
____________
Ton (ftpd) Netherlands

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Message 16356 - Posted: 17 Apr 2010 | 11:52:36 UTC

and how do the versions map to our preference options?

ACEMD
ACEMD ver 2.0
ACEMD beta
____________
Thanks - Steve

GDF
Message 16359 - Posted: 17 Apr 2010 | 13:28:22 UTC - in response to Message 16356.


acemd 6.7x = ACEMD
acemd2 6.0x = ACEMD ver 2.0
acemdbeta = ACEMD beta

GDF
Message 16361 - Posted: 17 Apr 2010 | 13:30:13 UTC - in response to Message 16354.

Any changes important for us in 6.72?

MrS


6.72 is 30% faster on compute capability 1.2 cards.
Also, if there is not enough GPU RAM, it should not request the extra memory and will run at the same speed as before.

GDF
Message 16362 - Posted: 17 Apr 2010 | 13:31:37 UTC - in response to Message 16355.

Can these apps also be used for Fermi cards?


All the cuda30 apps should work on Fermi cards, but in practice we saw that they don't.
We will check why as soon as we receive the first Fermi card.


gdf

X-Files 27
Joined: 11 Oct 08
Posts: 95
Credit: 68,023,693
RAC: 0
Message 16363 - Posted: 17 Apr 2010 | 13:55:19 UTC - in response to Message 16362.

I'm getting a message that there is no workunit for this app.
____________

GDF
Message 16368 - Posted: 17 Apr 2010 | 16:12:34 UTC - in response to Message 16363.

It's possible. We have so many on the production application, that for now we can't create more.

gdf

nmeofdst8
Joined: 3 Mar 10
Posts: 9
Credit: 837,920
RAC: 0
Message 16418 - Posted: 19 Apr 2010 | 5:09:10 UTC - in response to Message 16368.

Finally got one of the 6.71 WUs (http://www.gpugrid.net/result.php?resultid=2180296). As feedback: it ran about 22% slower than the 6.03 WUs on this machine. Thanks.

cristipurdel
Joined: 31 Mar 10
Posts: 45
Credit: 103,429,292
RAC: 0
Message 16420 - Posted: 19 Apr 2010 | 6:56:04 UTC - in response to Message 16352.


acemd 6.72 newest app WIN (cuda2.2 and cuda3)
acemd2 6.03 production application WIN/LINUX (cuda2.2)
acemdbeta newest app (cuda2.2 and cuda3) WIN (not used now)
gdf


Is any app based on 6.22 beta?

Which one of the apps is shorter for WIN?

GDF
Message 16422 - Posted: 19 Apr 2010 | 8:14:11 UTC - in response to Message 16418.

Finally got one of the 6.71 WUs (http://www.gpugrid.net/result.php?resultid=2180296). As feedback: it ran about 22% slower than the 6.03 WUs on this machine. Thanks.


Indeed, we are looking at 6.72 here.

gdf

nmeofdst8
Message 16434 - Posted: 19 Apr 2010 | 16:09:06 UTC - in response to Message 16422.

Haha, whoops, I didn't notice the 2 vs 1. Blame it on being tired, I guess.

pwolfe
Joined: 24 Mar 09
Posts: 54
Credit: 16,186,927
RAC: 0
Message 16443 - Posted: 19 Apr 2010 | 21:00:42 UTC - in response to Message 16434.

Did the Linux app come out today as planned?

GDF
Message 16444 - Posted: 19 Apr 2010 | 22:24:10 UTC - in response to Message 16443.

No, we are waiting for the Fermi card at this point.
gdf

HTH
Joined: 1 Nov 07
Posts: 38
Credit: 6,365,573
RAC: 0
Message 16448 - Posted: 20 Apr 2010 | 6:57:54 UTC

How much GPU RAM does it need?
____________

ignasi
Joined: 10 Apr 08
Posts: 254
Credit: 16,836,000
RAC: 0
Message 16450 - Posted: 20 Apr 2010 | 9:19:27 UTC - in response to Message 16448.
Last modified: 20 Apr 2010 | 9:54:11 UTC

I am submitting internal tests and production WUs to the new ACEMD 6.72 app.
They are named *_long_100420*. As already explained in other threads, they belong to the *long* batch, which is twice as long as usual.

Regarding performance, do you see improvements over the previous *long* WUs running on acemd2 6.03?

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 16453 - Posted: 20 Apr 2010 | 9:59:36 UTC - in response to Message 16450.

Can you just clarify which cards should crunch these 6.72 application-dependent tasks?

I am thinking,
CC1.1 cards that fail ACEMD Ver. 2 tasks,
possibly CC1.2 cards but I still think ACEMD Ver. 2 would be faster,
and possibly Fermi cards (better chance of working and perhaps aiding development)

Not CC1.3 cards or CC1.1 cards that always succeed with ACEMD Ver 2. tasks.

Thanks,

GDF
Message 16454 - Posted: 20 Apr 2010 | 10:45:21 UTC - in response to Message 16453.

This application will become the new production application if it passes the test.
So everyone who is running now can run these ones.
For Fermi we are waiting for the delivery of the first cards.

gdf

skgiven
Message 16456 - Posted: 20 Apr 2010 | 11:18:03 UTC - in response to Message 16454.

OK everybody will be able to run 6.72 tasks, but should they?

Is the 6.72 application replacing the ACEMD (Ver 1 app)?
Is it replacing 6.03?

Will 6.72 be as fast as ACEMD Ver 2 for all cards?
Will a GTX275, for example, work as fast on 6.72 as on 6.03?

Thanks,

GDF
Message 16460 - Posted: 20 Apr 2010 | 13:32:40 UTC - in response to Message 16456.

6.72 is the fastest application. It is faster than 6.03 on G200 cards.

gdf

X-Files 27
Message 16468 - Posted: 21 Apr 2010 | 0:23:45 UTC

About 8,500 secs faster than 6.03 on a GTX 260-216:

2189111 1379690 20 Apr 2010 10:46:28 UTC 21 Apr 2010 0:14:01 UTC Completed and validated 29,102.41 7,663.80 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)
2187933 1379207 20 Apr 2010 6:46:22 UTC 20 Apr 2010 22:19:05 UTC Completed and validated 37,654.18 9,277.25 7,645.29 11,467.93 ACEMD - GPU molecular dynamics v6.03 (cuda)
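For anyone wanting to put a percentage on this, here is a minimal sketch that compares the two run times in the rows above. The function and constant names are made up for illustration. Note that these are two different work units with different claimed credit (7,954.42 vs 7,645.29), so this is only a rough comparison, not a controlled benchmark.

```python
# Run times in seconds, copied from the two result rows above.
RUN_TIME_V672 = 29_102.41  # Full-atom molecular dynamics v6.72
RUN_TIME_V603 = 37_654.18  # ACEMD - GPU molecular dynamics v6.03


def compare_run_times(new_s: float, old_s: float) -> dict[str, float]:
    """Return three views of the same improvement (hypothetical helper)."""
    return {
        # Wall-clock seconds saved by the newer app.
        "seconds_saved": old_s - new_s,
        # Fraction of the old run time that was saved.
        "time_reduction": (old_s - new_s) / old_s,
        # Extra work done per unit of time, relative to the old app.
        "throughput_gain": old_s / new_s - 1.0,
    }


if __name__ == "__main__":
    result = compare_run_times(RUN_TIME_V672, RUN_TIME_V603)
    print(f"saved:           {result['seconds_saved']:.2f} s")
    print(f"time reduction:  {result['time_reduction']:.1%}")
    print(f"throughput gain: {result['throughput_gain']:.1%}")
```

The same pair of results gives roughly a 23% shorter run time but roughly a 29% higher throughput, which may be one reason the percentages quoted in this thread do not always agree.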

____________

Snow Crash
Message 16469 - Posted: 21 Apr 2010 | 1:40:11 UTC

While I have only completed 1 WU I hope to see lots more to come.
GPUGrid is good ... seriously, you guys are really good at what you do!!!
My GTX295 gets about 5600 sec improvement.
I'm running shaders at 1656 and the time step was 17.892 ms

I have a 480 arriving at my house Thursday and I am eagerly waiting for the next new version of ACEMD that it can crunch ... we will see just what these monsters can do.

____________
Thanks - Steve

skgiven
Message 16470 - Posted: 21 Apr 2010 | 9:33:35 UTC - in response to Message 16469.

Another 30% improvement!
Again, well done.

ftpd
Message 16477 - Posted: 21 Apr 2010 | 15:37:57 UTC

I have received 1 beta WU 6.72.

Takes 8 min 45 sec on GTX 295.

Tomorrow I will report on the WUs on the GTS 250 and GTX 260.

____________
Ton (ftpd) Netherlands

ftpd
Message 16479 - Posted: 21 Apr 2010 | 15:46:20 UTC - in response to Message 16477.
Last modified: 21 Apr 2010 | 15:46:34 UTC

Sorry, sorry

Time must be 8 hours 45 minutes.
____________
Ton (ftpd) Netherlands

X-Files 27
Message 16480 - Posted: 21 Apr 2010 | 18:58:46 UTC - in response to Message 16470.

Another 30% improvement!
Again, well done.

Improvement at the expense of higher cpu usage?
6.03=1-2%
6.72=3-6%

I'm seeing "reds" in boinctasks for my cpu projects.
____________

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Message 16481 - Posted: 21 Apr 2010 | 19:51:30 UTC - in response to Message 16470.

Another 30% improvement!

I don't think the improvement is anywhere near 30%; it's more like 14% on that type of WU, judging by 2 WUs done on ftpd's 295. Not a scientific sample, but I doubt the improvement is going to be a true 30% on any card, just as the last one, from 6.71 to 6.03, was nowhere near the 60% that I believe was claimed.




____________
Radio Caroline, the world's most famous offshore pirate radio station.
Great music since April 1964. Support Radio Caroline Team -
Radio Caroline

GDF
Message 16482 - Posted: 21 Apr 2010 | 20:46:58 UTC - in response to Message 16481.

On a g200 card the new application is 100% faster than the original ACEMD application which lasted almost one year.
No further improvements are planned.

gdf

ftpd
Message 16488 - Posted: 22 Apr 2010 | 9:27:33 UTC

GTX 260 - WU 6.72 = OK
8 hours 5 minutes

____________
Ton (ftpd) Netherlands

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 16492 - Posted: 22 Apr 2010 | 12:31:47 UTC
Last modified: 22 Apr 2010 | 12:34:48 UTC

While the speed increased for some GPUs, it looks like the credit awarded is going down by 2000 points:

p5-IBUCH_0510_pYEEI_long_100420-1-4-RND1103_0 1382090 21 Apr 2010 2:37:48 UTC 22 Apr 2010 6:13:51 UTC
Completed and validated 49,845.25 5,175.23 7,954.42 9,943.03 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1381327

p43-IBUCH_0511_pYEEI_long_100420-1-4-RND4049_0 1381327 20 Apr 2010 20:47:15 UTC 21 Apr 2010 16:22:57 UTC
Completed and validated 49,555.66 5,158.56 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1382090

GDF
Message 16495 - Posted: 22 Apr 2010 | 13:39:49 UTC - in response to Message 16492.

They are different workunits with different number of steps. Look at the result report for a confirmation.
gdf


While the speed increased for some GPUs, it looks like the credit awarded is going down by 2000 points:

p5-IBUCH_0510_pYEEI_long_100420-1-4-RND1103_0 1382090 21 Apr 2010 2:37:48 UTC 22 Apr 2010 6:13:51 UTC
Completed and validated 49,845.25 5,175.23 7,954.42 9,943.03 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1381327

p43-IBUCH_0511_pYEEI_long_100420-1-4-RND4049_0 1381327 20 Apr 2010 20:47:15 UTC 21 Apr 2010 16:22:57 UTC
Completed and validated 49,555.66 5,158.56 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1382090

skgiven
Message 16497 - Posted: 22 Apr 2010 | 14:19:42 UTC - in response to Message 16495.
Last modified: 22 Apr 2010 | 14:27:15 UTC

Beyond, it looks like the top task picked up a 25% bonus rather than a 50% bonus.
Reason: it missed the 24h deadline by about 3.5h.

Sent 21 Apr 2010 2:37:48 UTC
Returned 22 Apr 2010 6:13:51 UTC

Turn your cache down to 0.05 or similar.

If you are running CPU tasks and want to keep a few days' worth in cache,
stop receiving new GPU tasks, turn up your cache, collect whatever CPU tasks you need (say 5 days), then turn your cache right down to 0.05 days and enable new GPU tasks.

We could probably do with a separate GPU control tab in BOINC!
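The turnaround figure above can be checked from the Sent and Returned timestamps. The sketch below is only an illustration: the helper names are made up, the 24-hour limit is taken from this thread, and the second pair of timestamps comes from the other task in Beyond's post.

```python
from datetime import datetime, timedelta

# Timestamp layout used on the results pages, without the trailing " UTC".
# Note: %b only parses English month abbreviations under an English/C locale.
FMT = "%d %b %Y %H:%M:%S"


def turnaround(sent: str, returned: str) -> timedelta:
    """Time between a task being sent and its result being returned."""
    return datetime.strptime(returned, FMT) - datetime.strptime(sent, FMT)


def hours_past_limit(sent: str, returned: str, limit_hours: float = 24.0) -> float:
    """Positive when the limit was missed, negative when it was met."""
    return turnaround(sent, returned).total_seconds() / 3600 - limit_hours


if __name__ == "__main__":
    late = hours_past_limit("21 Apr 2010 2:37:48", "22 Apr 2010 6:13:51")
    on_time = hours_past_limit("20 Apr 2010 20:47:15", "21 Apr 2010 16:22:57")
    print(f"p5 task:  {late:+.2f} h relative to the 24 h limit")
    print(f"p43 task: {on_time:+.2f} h relative to the 24 h limit")
```

On these numbers the first task comes in a little over three and a half hours past 24 h, and the second comes in well under it.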

Beyond
Message 16503 - Posted: 22 Apr 2010 | 19:47:49 UTC - in response to Message 16495.

They are different workunits with different number of steps. Look at the result report for a confirmation.
gdf


While the speed increased for some GPUs, it looks like the credit awarded is going down by 2000 points:

p5-IBUCH_0510_pYEEI_long_100420-1-4-RND1103_0 1382090 21 Apr 2010 2:37:48 UTC 22 Apr 2010 6:13:51 UTC
Completed and validated 49,845.25 5,175.23 7,954.42 9,943.03 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1381327

p43-IBUCH_0511_pYEEI_long_100420-1-4-RND4049_0 1381327 20 Apr 2010 20:47:15 UTC 21 Apr 2010 16:22:57 UTC
Completed and validated 49,555.66 5,158.56 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1382090


I don't think so. Time/step was nearly identical, as was the total time for the WU. Same computer/GPU.

Beyond
Message 16504 - Posted: 22 Apr 2010 | 19:59:58 UTC - in response to Message 16497.

Beyond, it looks like the top task picked up a 25% bonus rather than a 50% bonus.
Reason: it missed the 24h deadline by about 3.5h.

Sent 21 Apr 2010 2:37:48 UTC
Returned 22 Apr 2010 6:13:51 UTC

Turn your cache down to 0.05 or similar.

If you are running CPU tasks and want to keep a few days' worth in cache,
stop receiving new GPU tasks, turn up your cache, collect whatever CPU tasks you need (say 5 days), then turn your cache right down to 0.05 days and enable new GPU tasks.

We could probably do with a separate GPU control tab in BOINC!

I agree about the separate GPU cache control, as well as project by project cache control. I've asked DA & company for it. You can guess the response, same as always :-(

Didn't know that the bonus went down after 24 hours. With these new long WUs, for now it looks like I'll have to be performing a lot of abortions (of the WU variety)...

skgiven
Message 16505 - Posted: 22 Apr 2010 | 20:13:12 UTC - in response to Message 16504.

Recently the bonus was increased to 50% for tasks completed within 1 day. Prior to this there was a bonus of 25% for tasks completed within 2 days. This 2-day bonus still applies.
This is essentially to encourage faster turnaround of tasks, which in turn speeds up the project, as in compound interest (it accelerates)! Presently the CC1.3 cards can all do the longer WUs well within the 24h time frame, but it is important to have a low cache for CC1.1 and CC1.2 cards. It also helps the project to have a lower cache for CC1.3 cards.
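As a rough illustration of the two bonus tiers described above, here is a small sketch. It is not the project's actual credit code: the function name is invented, and whether the 24 h and 48 h boundaries are inclusive is an assumption. The claimed credit of 7,954.42 is taken from the result rows posted earlier in this thread.

```python
def granted_credit(claimed: float, turnaround_hours: float) -> float:
    """Apply the turnaround bonus as described in this thread (assumed rules)."""
    if turnaround_hours <= 24:      # returned within 1 day: 50% bonus
        bonus = 0.50
    elif turnaround_hours <= 48:    # returned within 2 days: 25% bonus
        bonus = 0.25
    else:                           # later than that: no bonus
        bonus = 0.0
    return claimed * (1.0 + bonus)


if __name__ == "__main__":
    claimed = 7954.42
    for hours in (19.6, 27.6, 60.0):
        print(f"{hours:5.1f} h turnaround: {granted_credit(claimed, hours):,.2f}")
```

With these assumed rules, the 50% and 25% tiers come out within a cent of the 11,931.63 and 9,943.03 granted credits shown for Beyond's two tasks.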

Beyond
Message 16506 - Posted: 22 Apr 2010 | 20:48:38 UTC - in response to Message 16505.

Recently the bonus was increased to 50% for tasks completed within 1 day. Prior to this there was a bonus of 25% for tasks completed within 2 days. This 2-day bonus still applies.
This is essentially to encourage faster turnaround of tasks, which in turn speeds up the project, as in compound interest (it accelerates)! Presently the CC1.3 cards can all do the longer WUs well within the 24h time frame, but it is important to have a low cache for CC1.1 and CC1.2 cards. It also helps the project to have a lower cache for CC1.3 cards.

The cc1.1 cards can't make the 24 hour limit with the long WUs so they're back on v6.03. The GTX 260 is fine with the .33 day cache. The 3 GT 240 cards have a problem with the 24 hour deadline when using a .33 day cache for the CPU projects. Don't really want to go too much less than that.

GDF
Message 16507 - Posted: 22 Apr 2010 | 22:00:13 UTC - in response to Message 16503.
Last modified: 22 Apr 2010 | 22:00:44 UTC

I repeat. There is nothing strange about these two workunits, one is simply longer than the other one (time/step is the same but number of timesteps is different).

gdf



They are different workunits with different number of steps. Look at the result report for a confirmation.
gdf


While the speed increased for some GPUs, it looks like the credit awarded is going down by 2000 points:

p5-IBUCH_0510_pYEEI_long_100420-1-4-RND1103_0 1382090 21 Apr 2010 2:37:48 UTC 22 Apr 2010 6:13:51 UTC
Completed and validated 49,845.25 5,175.23 7,954.42 9,943.03 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1381327

p43-IBUCH_0511_pYEEI_long_100420-1-4-RND4049_0 1381327 20 Apr 2010 20:47:15 UTC 21 Apr 2010 16:22:57 UTC
Completed and validated 49,555.66 5,158.56 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1382090


I don't think so. Time/step was nearly identical as was the total time for the WU Same computer/GPU.

skgiven
Message 16508 - Posted: 23 Apr 2010 | 9:08:41 UTC - in response to Message 16507.
Last modified: 23 Apr 2010 | 9:34:43 UTC

This long 6.72 task ran on one of my GT240s.
It completed and validated in about 18h; within 1 day, for the full 50% bonus.
6.72 tasks are about 21% faster than the 6.03 tasks on a GT240.
All my systems use a 0.05 day cache.


2200081 1379585 22 Apr 2010 10:36:29 UTC 23 Apr 2010 5:34:49 UTC Completed and validated 64,182.75 13,417.75 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)

- I am expecting one of my more overclocked GT240 cards to finish a long 6.72 task in 16h 45min.
- GPU 600MHz, Shaders 1625MHz, GDDR5 1800MHz. The shaders are 21% overclocked, which is what counts the most.
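To see why the cache size matters for a GT240, here is a deliberately crude estimate. It assumes the worst case where a freshly downloaded task sits in the queue for the full cache length before it starts, and then runs uninterrupted. That is a simplification for illustration, not how the BOINC scheduler really behaves. The run time is the 64,182.75 s from the result above; the 0.05-day and 0.33-day cache values are the ones mentioned in this thread, and the function name is made up.

```python
SECONDS_PER_DAY = 86_400


def worst_case_turnaround_hours(run_time_s: float, cache_days: float) -> float:
    """Queue wait (assumed equal to the cache length) plus run time, in hours."""
    return (cache_days * SECONDS_PER_DAY + run_time_s) / 3600


if __name__ == "__main__":
    gt240_run_time_s = 64_182.75
    for cache_days in (0.05, 0.33):
        hours = worst_case_turnaround_hours(gt240_run_time_s, cache_days)
        verdict = "within" if hours <= 24 else "past"
        print(f"cache {cache_days:.2f} days: about {hours:.1f} h, {verdict} the 24 h bonus window")
```

Under this assumption the 0.05-day cache leaves several hours of margin, while the 0.33-day cache pushes the same task past 24 h, which would fit the trouble reported with GT 240 cards earlier in the thread.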

skgiven
Message 16510 - Posted: 23 Apr 2010 | 10:24:11 UTC - in response to Message 16508.

22 Apr 2010 16:31:37 UTC 23 Apr 2010 10:12:32 UTC Completed and validated 60,358.42 12,729.48 7,853.81 11,780.71 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/result.php?resultid=2201187

Beyond
Message 16512 - Posted: 23 Apr 2010 | 13:45:23 UTC - in response to Message 16507.

Actually skgiven explained the 2000 credit difference. The 2nd WU didn't quite make the new 24hr cutoff for receiving the 50% bonus. The 1st one did.

I repeat. There is nothing strange about these two workunits, one is simply longer than the other one (time/step is the same but number of timesteps is different).

gdf

They are different workunits with different number of steps. Look at the result report for a confirmation.
gdf


While the speed increased for some GPUs, it looks like the credit awarded is going down by 2000 points:

p5-IBUCH_0510_pYEEI_long_100420-1-4-RND1103_0 1382090 21 Apr 2010 2:37:48 UTC 22 Apr 2010 6:13:51 UTC
Completed and validated 49,845.25 5,175.23 7,954.42 9,943.03 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1381327

p43-IBUCH_0511_pYEEI_long_100420-1-4-RND4049_0 1381327 20 Apr 2010 20:47:15 UTC 21 Apr 2010 16:22:57 UTC
Completed and validated 49,555.66 5,158.56 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)

http://www.gpugrid.net/workunit.php?wuid=1382090


I don't think so. Time/step was nearly identical as was the total time for the WU Same computer/GPU.


Beyond
Message 16563 - Posted: 26 Apr 2010 | 16:14:33 UTC

It seems the v6.72 WUs have been very hard to get since yesterday. Any chance of generating some more? Thanks!

skgiven
Message 16857 - Posted: 6 May 2010 | 9:25:25 UTC - in response to Message 16563.
Last modified: 6 May 2010 | 9:27:56 UTC

These are the User Account Preference Settings that matter most:

- Run test applications?

Run only the selected applications
- ACEMD:
- ACEMD ver 2.0:
- ACEMD beta:

- If no work for selected applications is available, accept work from other applications?


For Windows systems we presently have the following tasks:

. 6.03
. 6.72
. 6.73
. Betas

It is not clear which settings will select for the Fermi tasks (6.73), which select the 6.72 tasks and which select the 6.03 tasks.

This is what I am guessing:
. ACEMD = 6.73 (Fermi)
. ACEMD ver 2.0 = 6.03 & 6.72
. ACEMD Beta = any beta tasks (the server can determine whether the card is a Fermi or not)

Can someone Confirm or Correct this?

Thanks,

Beyond
Message 16867 - Posted: 6 May 2010 | 14:33:36 UTC - in response to Message 16857.

For Windows systems we presently have the following tasks:

. 6.03
. 6.72
. 6.73
. Betas

It is not clear which settings will select for the Fermi tasks (6.73), which select the 6.72 tasks and which select the 6.03 tasks.

This is what I am guessing:
. ACEMD = 6.73 (Fermi)
. ACEMD ver 2.0 = 6.03 & 6.72
. ACEMD Beta = any beta tasks (server can determine card type is Fermi or Not)

Can someone Confirm or Correct this?

Thanks,

As of yesterday some more v6.72 WUs became available. They were pretty much nonexistent for over a week. I think they started flowing the same time as the v6.73 WUs. Wonder if the WU part is actually the same except the Fermi only gets the v6.73 app, all others get the v6.72 app. If this is correct then maybe:

. ACEMD = 6.73 for Fermi, 6.72 if Fermi not detected
. ACEMD ver 2.0 = 6.03

That's my guess anyway. I have no idea about the beta.

Snow Crash
Message 16904 - Posted: 7 May 2010 | 22:56:57 UTC

GDF - can you please help us out?

Which app version relates to which preference, and are Windows / Linux WUs available for each? I bet you could tell us right off the top of your head and it would only take a minute to post ... please?
____________
Thanks - Steve

GDF
Message 16905 - Posted: 8 May 2010 | 8:12:34 UTC - in response to Message 16904.

Acemdbeta is for the betas.

From now on, Acemd gives out only Fermi applications. It is just for testing, but on a larger number than the beta.

Acemd ver 2 is our production application (6.03/6.04).

Soon we will move the latest application into Acemd ver 2.

Applications termed cuda3 are given only to Fermi cards, as cuda3 is slower on older hardware.

gdf

Beyond
Message 16906 - Posted: 8 May 2010 | 12:06:58 UTC - in response to Message 16905.

From now on, Acemd gives out only Fermi applications. It is just for testing, but on a larger number than the beta.

What, no more v6.72? ARG :-(

skgiven
Message 16908 - Posted: 8 May 2010 | 13:57:50 UTC - in response to Message 16906.
Last modified: 8 May 2010 | 13:58:47 UTC

I think 6.72 is still set to replace 6.03; as 6.72 is faster and works for non-Fermi's.

6.72 looked like the longer version of the 6.22 Betas, in the same way 6.73 is a large scale Fermi test and the 6.23 WU's are the smaller Betas.

Well, I hope so; the 6.72 tasks were faster for my CC1.3 and CC1.2 cards.

skgiven
Message 16949 - Posted: 10 May 2010 | 22:27:48 UTC - in response to Message 16908.

Getting runaway 6.72 TONI_HERGunb Failures:

2301778 1453161 10 May 2010 22:19:27 UTC 10 May 2010 22:21:08 UTC Error while computing 6.37 4.82 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301768 1453151 10 May 2010 22:17:49 UTC 10 May 2010 22:19:27 UTC Error while computing 7.38 4.62 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301761 1453144 10 May 2010 22:12:37 UTC 10 May 2010 22:14:16 UTC Error while computing 7.44 4.79 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301760 1453143 10 May 2010 22:16:11 UTC 10 May 2010 22:17:49 UTC Error while computing 7.38 4.99 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301755 1453138 10 May 2010 22:14:16 UTC 10 May 2010 22:16:11 UTC Error while computing 7.38 4.70 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301750 1453133 10 May 2010 22:10:50 UTC 10 May 2010 22:12:37 UTC Error while computing 6.36 4.70 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301744 1453128 10 May 2010 22:07:31 UTC 10 May 2010 22:09:09 UTC Error while computing 7.38 4.88 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301730 1453114 10 May 2010 22:05:48 UTC 10 May 2010 22:07:31 UTC Error while computing 6.41 4.77 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301712 1453098 10 May 2010 21:55:29 UTC 10 May 2010 22:05:48 UTC Error while computing 7.24 4.90 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301711 1453097 10 May 2010 22:09:09 UTC 10 May 2010 22:10:50 UTC Error while computing 7.41 4.70 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301659 1453046 10 May 2010 21:12:42 UTC 10 May 2010 21:14:38 UTC Error while computing 6.33 4.76 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301635 1453022 10 May 2010 21:02:38 UTC 10 May 2010 21:12:42 UTC Error while computing 6.19 4.79 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301623 1453010 10 May 2010 20:44:56 UTC 10 May 2010 20:46:57 UTC Error while computing 8.31 4.74 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301614 1453001 10 May 2010 20:43:06 UTC 10 May 2010 20:44:56 UTC Error while computing 6.33 4.74 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301607 1452994 10 May 2010 21:30:15 UTC 10 May 2010 21:41:00 UTC Error while computing 7.23 4.93 0.03 --- Full-atom molecular dynamics v6.72 (cuda)
2301548 1452936 10 May 2010 19:46:31 UTC 10 May 2010 20:43:06 UTC Error while computing 6.18 4.68 0.03 --- Full-atom molecular dynamics v6.72 (cuda)

DESELECTED ACEMD and Betas for now.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 16952 - Posted: 11 May 2010 | 2:18:50 UTC - in response to Message 16949.

Getting runaway 6.72 TONI_HERGunb Failures:


Seems v6.72 and v6.73 use the same WUs and the TONI_HERG are failing for both apps. Here's an example:

http://www.gpugrid.net/workunit.php?wuid=1453358

ftpd
Joined: 6 Jun 08
Posts: 152
Credit: 328,250,382
RAC: 0
Level
Asp
Message 16953 - Posted: 11 May 2010 | 8:07:35 UTC - in response to Message 16952.

No problems with 6.72 on Windows XP Pro with a GTX295.

Now running 6.72 on 4 different machines!
____________
Ton (ftpd) Netherlands

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 16963 - Posted: 11 May 2010 | 15:24:54 UTC - in response to Message 16952.

Getting runaway 6.72 TONI_HERGunb Failures:

Seems v6.72 and v6.73 use the same WUs and the TONI_HERG are failing for both apps. Here's an example:

Even though SK and a few others have had some trouble, the v6.72 TONI_HERG WUs are running OK here so far. Maybe it was some bad WUs at the start of the run?

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 16964 - Posted: 11 May 2010 | 17:27:51 UTC - in response to Message 16963.

I have seen runaway errors before; they are not always the result of bad WUs and are more often a system issue. I just wanted to make sure my cards kept picking up work; too many failures = no tasks!
One 6.72 task did manage to complete.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17006 - Posted: 13 May 2010 | 19:21:27 UTC - in response to Message 16964.

Counted a total of 4 successful 6.72 tasks, and 67 failures on the one system with 4 GT240's.
Now just running 6.03 on that system.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 17012 - Posted: 13 May 2010 | 21:52:58 UTC - in response to Message 17006.

Counted a total of 4 successful 6.72 tasks, and 67 failures on the one system with 4 GT240's.
Now just running 6.03 on that system.

You've got a problem, but it's not the project's fault. I've recently had 53 successful v6.72 tasks and 1 failure. The failure was my fault (accidentally unplugged the machine, oops). Most were run on 3 GT240 cards, the rest on a GTX260/216. I have had some bad v6.03 TONI_GAUS2 WUs though.

Maybe a problem supplying power to all those GT240 cards through the PCIe bus?

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 17021 - Posted: 14 May 2010 | 7:57:54 UTC - in response to Message 17006.

Counted a total of 4 successful 6.72 tasks, and 67 failures on the one system with 4 GT240's.
Now just running 6.03 on that system.

Try lowering the shader clocks to 1580MHz.
____________
Radio Caroline, the world's most famous offshore pirate radio station.
Great music since April 1964. Support Radio Caroline Team -
Radio Caroline

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17027 - Posted: 14 May 2010 | 10:24:21 UTC - in response to Message 17012.
Last modified: 14 May 2010 | 10:26:54 UTC

I might have a problem, but the system is very stable (Corsair 550W PSU and native CPU & RAM). The CPU does clock up to 3.3GHz without manually upping the voltage. The cards are OC'd but use stock voltage.
Most of the 6.03 tasks run fine. Picked up a few v6.03 TONI_GAUS failures over the last week or so, and some 6.22 failures, but not too many.
The 6.72 tasks mostly crash after 7 seconds, suggesting an app/WU issue.
Using Vista x64, so the drivers are the same as yours, 197.45.
Boinc 6.10.45

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17029 - Posted: 14 May 2010 | 11:31:21 UTC - in response to Message 17027.
Last modified: 14 May 2010 | 11:46:12 UTC

For now all the 6.03 tasks are running well, and I am averaging over 50K per day on the system, so it's hardly a disaster;

Average, 50,077.21. Total 5,267,924

Boinc 6.10.45
AuthenticAMD AMD Phenom(tm) II X4 940 Processor [Family 16 Model 4 Stepping 2](4 processors)
[4] NVIDIA GeForce GT 240 (474MB) driver: 19745
Microsoft Windows Vista Ultimate x64 Edition, Service Pack 1, (06.00.6001.00) 14 May 2010 10:44:37 UTC

I suppose the shader clocks might be the issue with these faster WUs (the 6.72 units use more GPU so probably draw slightly more power from the board). I might try to drop the clocks as you suggest.
I did notice a similar situation with my GTX260sp216: 6.72 tasks fail after a few seconds but 6.03 tasks run fine. It has raised shaders too.

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 17032 - Posted: 14 May 2010 | 13:01:21 UTC - in response to Message 17027.

I might have a problem, but the system is very stable (Corsair 550W PSU and native CPU & RAM). The CPU does clock up to 3.3GHz without manually upping the voltage. The cards are OC'd but use stock voltage.

If I remember correctly, that PSU has a single rail, which is preferable for your use, with everything running on the rail that powers the MB. I'd suggest pulling one of the cards and then running some v6.72 WUs. If that works, your output from v6.72 with 3 cards would probably be as high as or higher than v6.03 with 4 cards, with lower power consumption too. Also, maybe try cutting back the OC a bit.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17033 - Posted: 14 May 2010 | 13:32:01 UTC - in response to Message 17032.
Last modified: 14 May 2010 | 13:56:53 UTC

I might try that over the weekend.
I did set my GTX260 clocks back to stock and tried a 6.72 task on that card, but it failed immediately. I have another couple of GT240s that I can set to stock and try first - I don't want to mess too much with the quad system as there are always tasks running at different stages.

- Just checked and the +12V rating is 41A (492W) for the Corsair 550.
There is no way I am pushing that with 4 GT240s and a stock CPU.
The CPU has a 125W TDP and the GT240s are 69W TDP.
The one HDD, one DVDRW, and 2 sticks of RAM won't do much damage either.

- Just measured it on a watt meter and the whole system is only drawing 235W when crunching (CPU 90%, 4 GPUs at about 75%).
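
The headroom argument above can be checked with a few lines of arithmetic. This is only a sketch: the rail rating and TDP figures are the ones quoted in the post, while the 30W allowance for the drives and RAM is an assumed placeholder, not a measured value, and the function name is made up for illustration.

```python
def rail_headroom(volts, amps, cpu_tdp_w, gpu_tdp_w, gpu_count, other_w):
    """Return (rail capacity, worst-case draw at TDP, headroom), all in watts."""
    capacity_w = volts * amps
    worst_case_w = cpu_tdp_w + gpu_count * gpu_tdp_w + other_w
    return capacity_w, worst_case_w, capacity_w - worst_case_w


# Figures from the post: 41A on +12V, 125W CPU, four 69W GT240s.
# other_w=30.0 is an assumption for the HDD, DVDRW and RAM.
capacity, worst_case, headroom = rail_headroom(
    volts=12.0, amps=41.0, cpu_tdp_w=125.0, gpu_tdp_w=69.0, gpu_count=4, other_w=30.0
)
print(f"+12V capacity:     {capacity:.0f} W")
print(f"worst case at TDP: {worst_case:.0f} W")
print(f"headroom:          {headroom:.0f} W")
```

Even with every component at its rated TDP the total stays under the 492W rail rating, and the measured 235W at the wall is well below that. Note this only covers the PSU rail as a whole; it says nothing about how much each motherboard slot can deliver.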

Profile Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Message 17049 - Posted: 14 May 2010 | 22:32:49 UTC - in response to Message 17033.

I might try that over the weekend.
I did set my GTX260 clocks back to stock and tried a 6.72 task on that card, but it failed immediately. I have another couple of GT240s that I can set to stock and try first - I don't want to mess too much with the quad system as there are always tasks running at different stages.

- Just checked and the +12V rating is 41A (492W) for the Corsair 550.
There is no way I am pushing that with 4 GT240s and a stock CPU.
The CPU has a 125W TDP and the GT240s are 69W TDP.
The one HDD, one DVDRW, and 2 sticks of RAM won't do much damage either.

- Just measured it on a watt meter and the whole system is only drawing 235W when crunching (CPU 90%, 4 GPUs at about 75%).

No problem with WUs at different stages; it'll just suspend one until a GPU becomes available. As for the power, a big question is whether the MB can keep power properly supplied to all 4 PCIe slots with no additional power coming through PCIe connectors. It's not like you're running HD4770 cards, which draw not much more than the GT240 but take most of it through a PCIe connector on each card.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17057 - Posted: 15 May 2010 | 8:42:27 UTC - in response to Message 17049.
Last modified: 15 May 2010 | 8:43:03 UTC

Suspended all tasks, set clocks to default (system power usage fell to 140W), enabled the one 6.72 task to run (no other CPU or GPU tasks were running).

The 6.72 task crashed in 5 seconds!

I think the problem is not power related:
When these GPUs are not crunching they only use about 10W (actually about 6W, with the board drawing the other 4W).
So, with the other cards not crunching (and the CPU not crunching) the cards only used about 90W between them.
90W can't be an issue: if it were, the cards could not crunch four 6.03 tasks at the same time, using about 240W between them!

It might be the fact that I am using more than one card, some Boinc setting, the WUs, or Vista?

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Message 17058 - Posted: 15 May 2010 | 10:18:39 UTC

The WUs that you are reporting that error out quickly all throw the same error:

ERROR: file ntnbrlist.cpp line 63: Insufficent memory available for pairlists. Set pairlistdist to match the cutoff.
called boinc_finish

This is what the WU reports as the memory on each of these cards:
# Total amount of global memory: 497614848 bytes

I took a look at the machines that passed these same WUs and did find a 240 that reported a different amount of memory and it passed:
# Total amount of global memory: 536870912 bytes

So maybe the WU is terminating early not so much because something went wrong mid-run as because the app was smart enough not to even try calculating the real science.
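
For reference, the two byte counts above convert to binary megabytes as follows. A minimal sketch; the helper name `bytes_to_mib` is made up for illustration and is not part of any GPUGRID or BOINC tool.

```python
MIB = 1024 * 1024  # bytes per binary megabyte


def bytes_to_mib(n_bytes: int) -> float:
    """Convert a byte count to binary megabytes (MiB)."""
    return n_bytes / MIB


failing_card = 497614848   # reported by the cards that error out
passing_card = 536870912   # reported by the GT240 that passed

print(bytes_to_mib(failing_card))                 # 474.5625
print(bytes_to_mib(passing_card))                 # 512.0
print(bytes_to_mib(passing_card - failing_card))  # 37.4375
```

So the failing cards report about 37MB less global memory than the card that passed, even though both are nominally 512MB parts.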

I know that memory requirements for the WUs have been an issue in the past and while we thought we were past that I think it needs to be revisited.


Does anyone know if there is one version of BOINC that is more accurate than others at reporting available GPU memory?

Do the apps rely on BOINC to say how much free memory a GPU has?

Can the app get a more accurate number than BOINC?


Thinking about the insufficient-memory issue, and guessing that there could be substantial performance gains from using more of the available memory on any given card (e.g. a 480 has 1.5 GB, but the maximum usage I have observed is 437 MB), I can think of two approaches that may be worth considering.

Could the feeder be made more "memory aware" of the machines asking for work, so that it parcels out WUs more appropriately? I know there will be machines with multiple GPUs that are not exactly matched, but if you target the card with the least memory in the machine asking for work, at least the WU would not fail for lack of memory. I am also guessing that the people who run multi-card systems are the ones who will actually understand this and optimize their systems accordingly.

My other thought, which may be even more difficult to implement, is to make the apps themselves more dynamic in their ability to determine and allocate available memory.
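
The first idea (target the smallest card in the requesting machine) could look roughly like the sketch below. Everything here is hypothetical: `WorkUnit`, `required_bytes`, and `pick_work` are illustrative names, not BOINC or GPUGRID server code, and the per-WU memory requirements are made-up numbers.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WorkUnit:
    name: str
    required_bytes: int  # hypothetical per-WU GPU memory requirement


def pick_work(gpu_memory_bytes: List[int], queue: List[WorkUnit]) -> Optional[WorkUnit]:
    """Return the first queued WU that fits on the host's smallest GPU."""
    limit = min(gpu_memory_bytes)  # size against the card with the least memory
    for wu in queue:
        if wu.required_bytes <= limit:
            return wu
    return None  # nothing fits; send no work rather than a WU that would fail


MIB = 1024 * 1024
host_gpus = [497614848] * 4  # four GT240s as reported under the 197.45 driver
queue = [
    WorkUnit("large_wu", 500 * MIB),  # made-up requirement, too big for this host
    WorkUnit("small_wu", 400 * MIB),  # made-up requirement, fits
]

chosen = pick_work(host_gpus, queue)
print(chosen.name if chosen else "no suitable work")
```

With these example numbers the host is skipped for the larger WU and handed the smaller one, which is the behaviour described above: mismatched cards cost some throughput, but nothing fails for lack of memory.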
____________
Thanks - Steve

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 8,900,364,783
RAC: 19,817,716
Level
Tyr
Message 17059 - Posted: 15 May 2010 | 10:29:41 UTC - in response to Message 17058.

BOINC developers have been struggling with CUDA memory questions recently. They were trying to grapple with two different measures:

1) How much actual memory does the card have?
2) How much space does the science application have available to run in, after system overheads (notoriously Aero, and likewise OS X) have taken their bite?

The live, running memory detection in particular was causing problems, so they've taken that bit back out again in the latest v6.10.55/.56: but you could probably try with .54 or just before if you want to see what happens.

The other noticeable thing is that the 197.xx drivers tend to give a smaller answer for total memory size than earlier drivers. Did you compare driver versions on those two 240s with different global memory reports?

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17066 - Posted: 15 May 2010 | 13:12:12 UTC - in response to Message 17059.
Last modified: 15 May 2010 | 13:15:46 UTC

The cards I have are all 512MB GT240 DDR3, according to the manufacturer and GPUZ, so I am not sure why they are being reported as having 474.5625MB.

With the recommended NVidia driver (197.45) and Boinc Version (6.10.18):
15/05/2010 13:09:08 NVIDIA GPU 0: GeForce GT 240 (driver version 19745, CUDA version 3000, compute capability 1.2, 475MB, 307 GFLOPS peak)
Boinc 6.10.45 & NVidia 197.45:
15/05/2010 13:11:26 NVIDIA GPU 0: GeForce GT 240 (driver version 19745, CUDA version 3000, compute capability 1.2, 475MB, 307 GFLOPS peak)
Boinc 6.10.56 & NVidia 197.45:
15/05/2010 13:11:26 NVIDIA GPU 0: GeForce GT 240 (driver version 19745, CUDA version 3000, compute capability 1.2, 475MB, 307 GFLOPS peak)
Boinc 6.10.56 & NVidia 197.57 (beta):
15/05/2010 13:47:28 NVIDIA GPU 0: GeForce GT 240 (driver version 19757, CUDA version 3000, compute capability 1.2, 475MB, 312 GFLOPS peak)

The card you found reports exactly 512MB. What Version of Boinc & NVidia driver was it using and what operating system (as there are different drivers for different operating systems)?

I also noticed that my GTX260sp216 (in a different system) reports less than 1GB of RAM: 881MB according to Boinc, and 896MB according to GPUZ.
This card also fails 6.72 tasks when at stock, but crunches the 6.03 tasks (it’s a good card), using recent drivers too.

I will try the NVidia 196.21 drivers.

Snow Crash
Joined: 4 Apr 09
Posts: 450
Credit: 539,316,349
RAC: 0
Level
Lys
Message 17067 - Posted: 15 May 2010 | 13:53:31 UTC - in response to Message 17066.

This is the WU I saw where someone else's 240 completed but yours said insufficient memory:
http://www.gpugrid.net/workunit.php?wuid=1465191

They do have a different driver: 19562 on Microsoft Windows 7 Ultimate x64 Edition, (06.01.7600.00).

Your 260 is getting a different error:
http://www.gpugrid.net/result.php?resultid=2330409
- exit code -40 (0xffffffd8)
SWAN: FATAL : swanMalloc failed

I think it is also a memory issue; I just do not know enough to guess what, specifically, it did not like.

Hopefully looking at these instances will help GDF and crew come up with a solution.
____________
Thanks - Steve

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17070 - Posted: 15 May 2010 | 15:13:25 UTC - in response to Message 17067.
Last modified: 15 May 2010 | 15:58:34 UTC

15/05/2010 16:05:31 NVIDIA GPU 0: GeForce GT 240 (driver version 19621, CUDA version 3000, compute capability 1.2, 512MB, 257 GFLOPS peak)
15/05/2010 16:05:31 NVIDIA GPU 1: GeForce GT 240 (driver version 19621, CUDA version 3000, compute capability 1.2, 512MB, 257 GFLOPS peak)
15/05/2010 16:05:31 NVIDIA GPU 2: GeForce GT 240 (driver version 19621, CUDA version 3000, compute capability 1.2, 512MB, 257 GFLOPS peak)
15/05/2010 16:05:31 NVIDIA GPU 3: GeForce GT 240 (driver version 19621, CUDA version 3000, compute capability 1.2, 512MB, 257 GFLOPS peak)
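
To compare startup lines like these across driver versions, a small parser can pull out the driver and the reported memory. This is a sketch that assumes the exact line layout shown in this thread; the pattern and the function name `parse_gpu_line` are illustrative and not part of BOINC, and other client versions may format the line differently.

```python
import re
from typing import Optional

# Matches the GPU detection line as it appears in the messages quoted in this thread.
GPU_LINE = re.compile(
    r"NVIDIA GPU (?P<index>\d+): (?P<model>.+?) \("
    r"driver version (?P<driver>\d+), "
    r"CUDA version (?P<cuda>\d+), "
    r"compute capability (?P<cc>[\d.]+), "
    r"(?P<mem_mb>\d+)MB, "
    r"(?P<gflops>\d+) GFLOPS peak\)"
)


def parse_gpu_line(line: str) -> Optional[dict]:
    """Extract GPU index, model, driver and reported memory, or None if no match."""
    m = GPU_LINE.search(line)
    if m is None:
        return None
    return {
        "index": int(m.group("index")),
        "model": m.group("model"),
        "driver": m.group("driver"),
        "mem_mb": int(m.group("mem_mb")),
        "gflops": int(m.group("gflops")),
    }


lines = [
    "15/05/2010 13:09:08 NVIDIA GPU 0: GeForce GT 240 (driver version 19745, "
    "CUDA version 3000, compute capability 1.2, 475MB, 307 GFLOPS peak)",
    "15/05/2010 16:05:31 NVIDIA GPU 0: GeForce GT 240 (driver version 19621, "
    "CUDA version 3000, compute capability 1.2, 512MB, 257 GFLOPS peak)",
]
for line in lines:
    info = parse_gpu_line(line)
    print(info["driver"], info["mem_mb"])
```

Run on the two sample lines, it shows the same card reporting 475MB under driver 19745 and 512MB under driver 19621.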

Just need to wait for a 6.72 task to come my way; there are none at the minute.
I also put 196.21 onto the system with the GTX260.

I guess Boinc ver 6.10.45 (and variants) might have worked well right up to the point where people started using the 197.xx drivers!

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17074 - Posted: 15 May 2010 | 16:41:21 UTC - in response to Message 17070.

Will there be any 6.72 tasks available over the weekend?

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17078 - Posted: 15 May 2010 | 18:16:05 UTC - in response to Message 17074.
Last modified: 15 May 2010 | 19:14:15 UTC

Well, I picked up two 6.72 WUs on the GT240 system.
So far (1h 15 min), so good, using Vista 64-bit, 196.21 drivers and Boinc 6.10.56.
- Note they used to fail after about 7 seconds (197.45 drivers).

It might be a Driver/System/Boinc combination problem. I suppose if it was easy to spot it would have been fixed by now.

6.72 tasks seem to work fine for the 19745 driver for XP x86 on Boinc 6.10.51:
2309288 1457946 10 May 2010 17:37:28 UTC 11 May 2010 11:40:02 UTC Completed and validated 60,352.50 5,833.84 7,954.42 11,931.63 Full-atom molecular dynamics v6.72 (cuda)

But there are differences between XP and Vista drivers.

It looks like the problems (here) are just with Vista & W7 drivers when it comes to 6.72 WUs.

Anyone else with 6.72 issues who uses Vista or Win7: use a 196.xx driver (196.21 seems to work).

Snow Crash, Beyond and Richard Haselgrove,
Thanks for your help with this.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 17085 - Posted: 16 May 2010 | 11:58:07 UTC - in response to Message 17078.

So far, three 6.72 tasks completed on the Vista system with the GT240s.
There have been no failures (since changing the driver to 196.21), and 4 tasks are running well (2 should finish within an hour):
2337257 1477078 15 May 2010 21:08:32 UTC 16 May 2010 11:43:47 UTC Completed and validated 32,909.90 3,104.78 4,503.74 6,755.61 Full-atom molecular dynamics v6.72 (cuda)
2336451 1476466 15 May 2010 18:01:24 UTC 16 May 2010 8:33:23 UTC Completed and validated 33,353.63 3,316.39 4,503.74 6,755.61 Full-atom molecular dynamics v6.72 (cuda)
2336422 1476441 15 May 2010 17:56:50 UTC 16 May 2010 3:12:30 UTC Completed and validated 32,999.91 3,178.07 4,503.74 6,755.61 Full-atom molecular dynamics v6.72 (cuda)

So, 196.21 seems to work well for Vista x64 when crunching 6.72 WUs (using Boinc 6.10.56).

Driver 197.45 still seems to be working well on Win XP x86 with Boinc 6.10.51 for 6.72 tasks.

Driver 196.21 might still have issues on Win7 x64:
With Boinc 6.10.50 (GT240):
This one failed with less than an hour to go, but it looks like a different issue (it's not like it failed after 7 seconds):
2335968 1476139 15 May 2010 15:50:19 UTC 16 May 2010 6:40:12 UTC Error while computing 33,138.30 5,189.69 4,503.74 --- Full-atom molecular dynamics v6.72 (cuda)
This one finished (same system),
2301423 1452813 10 May 2010 17:58:29 UTC 11 May 2010 4:08:09 UTC Completed and validated 35,936.96 5,254.89 4,503.74 6,755.61 Full-atom molecular dynamics v6.72 (cuda)
- Changed to native GPU & RAM clocks, with shaders at only 1575 to 1600MHz, and upped the fan speeds to see if that makes them more stable with these 6.72 WUs.

With Boinc 6.10.51 for a GTX260sp216 (native clocks):
All still failing after about 8 seconds, so same problem as before (well, actually it was 5 seconds before):
2337284, 2337028, 2332609
- Have moved up to Boinc 6.10.56. If that does not work I will try a different driver.

Thanks,

Profile Bikermatt
Joined: 8 Apr 10
Posts: 37
Credit: 3,839,902,185
RAC: 0
Level
Arg
Message 17107 - Posted: 18 May 2010 | 3:10:40 UTC

Can anyone give me a hand with how to set my preferences? I've read through the thread and I am trying to figure out if they can be set so that I only receive the 6.72 WUs on my system. These are really running great on my GT240s so if it is possible to only receive them that would be great.

-Matt

Betting Slip
Joined: 5 Jan 09
Posts: 670
Credit: 2,498,095,550
RAC: 0
Level
Phe
Message 17108 - Posted: 18 May 2010 | 6:35:34 UTC - in response to Message 17107.

Can anyone give me a hand with how to set my preferences? I've read through the thread and I am trying to figure out if they can be set so that I only receive the 6.72 WUs on my system. These are really running great on my GT240s so if it is possible to only receive them that would be great.

-Matt


I suspect 6.72 will soon become the production application. For now, go to your account page and select GPUGRID preferences. Click "Edit GPUGRID preferences". Under "Run only the selected applications", check "ACEMD" and uncheck "ACEMD ver 2". Check "If no work for selected applications is available, accept work from other applications?", or you may not get any work. Then click "Update Preferences" and you're done.


____________
Radio Caroline, the world's most famous offshore pirate radio station.
Great music since April 1964. Support Radio Caroline Team -
Radio Caroline

Profile Bikermatt
Joined: 8 Apr 10
Posts: 37
Credit: 3,839,902,185
RAC: 0
Level
Arg
Message 17132 - Posted: 18 May 2010 | 16:57:02 UTC - in response to Message 17108.

Sure enough, GDF just announced the merge after I posted.
Thanks, Matt
