Message boards : News : New acemdshort app 846
Author | Message |
---|---|
I've promoted the CUDA65 app version 846 from beta to short. | |
ID: 38185 | Rating: 0 | |
Looking good. Boinc reporting 0.90 worth of CPU for 6.5, but task manager only at 1-2%. For Beta tasks boinc reported same, and task showed 1-2%. | |
ID: 38187 | Rating: 0 | |
First NOELIA_SH2 WU on GTX980 completed & validated with beta app. | |
ID: 38190 | Rating: 0 | |
Looking good. Boinc reporting 0.90 worth of CPU for 6.5, but task manager only at 1-2%. For Beta tasks boinc reported same, and task showed 1-2%. I saw that too on Windows 8.1, so I added the environment variable swan_sync with a value of 0 and rebooted. Now I see ~100% core usage. I'm not sure if it will make a difference, but it makes me feel better. See this thread for discussion of swan_sync: http://www.gpugrid.net/forum_thread.php?id=2123 | |
ID: 38191 | Rating: 0 | |
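For anyone trying the swan_sync tip above, here is a minimal sketch for checking that the variable is actually visible after the reboot. It only inspects the environment of the process running the script; it says nothing about how the acemd app itself reads the variable. The helper name `swan_sync_value` is hypothetical, and the case-insensitive match is a convenience assumption (Windows treats environment variable names as case-insensitive).

```python
import os
from typing import Mapping, Optional


def swan_sync_value(env: Mapping[str, str]) -> Optional[str]:
    """Return the value of swan_sync from env, or None if it is not set.

    The name is matched case-insensitively (hypothetical helper, not part
    of BOINC or the GPUGrid app).
    """
    for name, value in env.items():
        if name.lower() == "swan_sync":
            return value
    return None


value = swan_sync_value(os.environ)
print("swan_sync is not set" if value is None else f"swan_sync = {value}")
```

Run it from a freshly opened command prompt, since a prompt that was open before the variable was added may still hold the old environment.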
First NOELIA_SH2 WU on GTX980 completed & validated with beta app. But biodoc, do you have run times of these WUs on non-Maxwell cards to compare? That is what I am very interested in. ____________ Greetings from TJ | |
ID: 38192 | Rating: 0 | |
biodoc, thanks for the tip. Your Win8.1 system has a WDDM tax of ~7% compared to XP. Your Win8.1 is blazing fast. Have you tested your GTX780Ti with the new short CUDA 6.5 app? I'm very curious to see how well the GTX 780Ti performs with the new refined code compared to GM204. Also, GM204 shows how Maxwell is able to carry more threads (atoms) per SMM vs. SMX. | |
ID: 38193 | Rating: 0 | |
First NOELIA_SH2 WU on GTX980 completed & validated with beta app. No, my 780Ti is on a Linux box and exclusively runs the long WUs. The beta app is for Windows only, so we need data from a 780Ti using the new app for a fair comparison, I think. | |
ID: 38195 | Rating: 0 | |
The NOELIA_SH2 WU I just finished is ranked #6 in the new Performance section. | |
ID: 38196 | Rating: 0 | |
Excellent code refinement by Matt. Was there any code refinement between 8.44 and 8.46? | |
ID: 38200 | Rating: 0 | |
In the "Maxwell now" thread he mentioned----
I'm assuming there was. | |
ID: 38202 | Rating: 0 | |
Nope, that's just a rebuild, modulo a fix for a compiler regression. | |
ID: 38204 | Rating: 0 | |
Can't wait for recipe to be added, when the grapes are wine. | |
ID: 38205 | Rating: 0 | |
While searching for runtimes/processing rates for the GTX980/970, I noticed an abnormal variance in the 8.46 short app "Average processing rate". The number 653.09405051673 was taken from host113695 (with a GTX980), while my GT650m "average processing rate" is 71.024125852776 for the same CUDA6.5/8.46 app. What's the formula for average processing rate? | |
ID: 38284 | Rating: 0 | |
eXaPower wrote: ACEMD TMU usage is high I don't have insight into the actual code, but TMUs are Texture Mapping Units. They are fixed-function units that map textures to geometry, and I highly doubt they can be exploited for GPU-Grid. The same applies to ROPs: these are Raster Output Units, i.e. they deal with assembling the finalized images ("pushing the pixels"). We're not pixelating anything at GPU-Grid or in other GP-GPU apps. Think of GP-GPU work as endless loops of matrix and vector operations, which are all performed on the shaders. eXaPower wrote: could someone explain how a GTX980 shows 11digits after decimal point, while a GT650m has 12? That seems to be simply caused by the total number of digits being equal to 14. BTW: consider the variance in WU completion times. You can easily round those numbers to 3 significant digits; anything else will be drowned in "experimental noise" anyway: GTX980: 653.09405051673 -> 653; GT650m: 71.024125852776 -> 71.0. This also answers your other question: How does a much more powerful GPU have the smaller number? It doesn't, see the numbers above. BTW2: you also mention a factor of about 8 in performance between these cards, based on other measures. The factor between the processing rates quoted above matches this, approximately. MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 38512 | Rating: 0 | |
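The arithmetic in the post above can be checked with a short sketch. It uses only the two "average processing rate" values quoted in this thread; the helper names `round_sig` and `count_digits` are made up for illustration and have nothing to do with how BOINC computes or prints the rate.

```python
import math


def round_sig(x: float, sig: int = 3) -> float:
    """Round x to `sig` significant digits."""
    if x == 0:
        return 0.0
    return round(x, sig - 1 - math.floor(math.log10(abs(x))))


def count_digits(text: str) -> int:
    """Count the decimal digits in a printed number, ignoring the point."""
    return sum(ch.isdigit() for ch in text)


# Values exactly as quoted in the thread, kept as strings so the
# printed digits can be counted.
rates = {"GTX980": "653.09405051673", "GT650m": "71.024125852776"}

for card, text in rates.items():
    print(card, "total digits:", count_digits(text),
          "rounded:", round_sig(float(text)))

# Ratio between the two quoted processing rates.
ratio = float(rates["GTX980"]) / float(rates["GT650m"])
print("ratio:", round_sig(ratio))
```

Both values carry 14 digits in total, which is why one shows 11 decimals and the other 12, and the ratio of the two rates comes out a little above 9.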
ETA- Thank you for explaining what processing rate numbers mean. | |
ID: 38517 | Rating: 0 | |
Thanks for pointing that out! The paper is from 2009; I suspect the code has been enhanced since then, but not radically changed. | |
ID: 38523 | Rating: 0 | |