MJH (Project administrator, Project developer, Project scientist)
Dear all, please give the new acemdbeta app, ver 845, a workout. This supports all GPUs now.
It's Windows only - if you don't get WUs, you'll need to update your driver.
Matt
eXaPower
Matt, is the 343.98 driver accepted? I've been trying to get beta tasks. 14/09/29 06:12:36 | GPUGRID | No tasks are available for ACEMD beta version
My configuration is correct: I have "Run test applications"/beta apps checked and am not accepting other short or long tasks. I never update to WHQL drivers, since they are limited in certain functional areas, unlike beta or developer drivers.
MJH (Project administrator, Project developer, Project scientist)
Huh, yes. You should be getting something...
According to the logs, your host #159309 was given work at 12:15 CEST.
eXaPower
14/09/29 06:48:02 | GPUGRID | No tasks are available for ACEMD beta version
Strange, I see no beta tasks running in BOINC Manager. I just tried again. If the driver is accepted, I will continue to try. Thanks for the help.
14/09/29 06:50:55 | GPUGRID | No tasks are available for ACEMD beta version
MJH (Project administrator, Project developer, Project scientist)
*Now* you should get something...
eXaPower
I did, indeed.
Update: (unknown error) - exit code -97 (0xffffff9f) after 8 s.
The simulation has become unstable. Terminating to avoid lock-up (1). (This is the first time I've had this during my time at GPUGRID. GPU1 temp was 58C.)
If you don't mind errors, I will try again.
Update 2: same error. GPU usage goes to 90% for a few seconds, then drops to 0% and the task crashes.
On the first test unit, I got an error.
9/29/2014 7:13:41 AM | GPUGRID | Computation for task 21-MJHARVEY_TEST4000-0-10-RND0794_0 finished
9/29/2014 7:13:41 AM | GPUGRID | Output file 21-MJHARVEY_TEST4000-0-10-RND0794_0_1 for task 21-MJHARVEY_TEST4000-0-10-RND0794_0 absent
9/29/2014 7:13:41 AM | GPUGRID | Output file 21-MJHARVEY_TEST4000-0-10-RND0794_0_2 for task 21-MJHARVEY_TEST4000-0-10-RND0794_0 absent
9/29/2014 7:13:41 AM | GPUGRID | Output file 21-MJHARVEY_TEST4000-0-10-RND0794_0_3 for task 21-MJHARVEY_TEST4000-0-10-RND0794_0 absent
Name 21-MJHARVEY_TEST4000-0-10-RND0794_0
Workunit 10123268
Created 29 Sep 2014 | 9:50:11 UTC
Sent 29 Sep 2014 | 11:10:35 UTC
Received 29 Sep 2014 | 11:13:11 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -97 (0xffffffffffffff9f) Unknown error number
Computer ID 127986
Report deadline 4 Oct 2014 | 11:10:35 UTC
Run time 4.10
CPU time 3.48
Validate state Invalid
Credit 0.00
Application version ACEMD beta version v8.45 (cuda65)
Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -97 (0xffffff9f)
</message>
<stderr_txt>
# GPU [GeForce GTX 690] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 2 :
# Name : GeForce GTX 690
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:07:00.0
# Device clock : 1019MHz
# Memory clock : 3004MHz
# Memory width : 256bit
# Driver version : r343_98 : 34411
# GPU 0 : 67C
# GPU 1 : 42C
# GPU 2 : 69C
# GPU 3 : 70C
# The simulation has become unstable. Terminating to avoid lock-up (1)
</stderr_txt>
]]>
eXaPower
Update #3: I've received 5 beta tasks; all have failed, and two caused a system hang (no error files).
FATAL : Cuda driver error 702 in file 'swanlibnv2.cpp' in line 1965
Simulation unstable. Flag 11 value 1
# The simulation has become unstable. Terminating to avoid lock-up
# The simulation has become unstable. Terminating to avoid lock-up (2)
Simulation unstable. Flag 11 value 1
# The simulation has become unstable. Terminating to avoid lock-up
# The simulation has become unstable. Terminating to avoid lock-up (2)
Update #4: Still failing on both cards with the same error:
(unknown error) - exit code -97 (0xffffff9f)
http://www.gpugrid.net/workunit.php?wuid=10099983
This work unit has 3 Linux failures (all with GTX 780) and 2 Win8.1 failures.
Update #5: Received 5 more betas for a total of ten; all failed with the same error number. All tasks started fine (90%+ GPU usage, 14% MCU) with 0.016 progress intervals before failing. A wingman with a Tesla K20c/GTX 780 (CC 3.5), along with a CC 3.0 wingman, also failed.
MJH (Project administrator, Project developer, Project scientist)
Yes, looks like CUDA65 is bad on everything but GM204s. Ho hum.
-97 error here, on my GTX 460
# Simulation unstable. Flag 11 value 1
# The simulation has become unstable. Terminating to avoid lock-up
# The simulation has become unstable. Terminating to avoid lock-up (2)
=========================
http://www.gpugrid.net/result.php?resultid=13149151
Name 43-MJHARVEY_TEST1999-1-10-RND5744_2
Workunit 10123176
Created 29 Sep 2014 | 11:40:28 UTC
Sent 29 Sep 2014 | 12:54:10 UTC
Received 29 Sep 2014 | 13:17:01 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -97 (0xffffffffffffff9f) Unknown error number
Computer ID 153764
Report deadline 4 Oct 2014 | 12:54:10 UTC
Run time 2.56
CPU time 0.00
Validate state Invalid
Credit 0.00
Application version ACEMD beta version v8.45 (cuda65)
Stderr output
<core_client_version>7.4.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -97 (0xffffff9f)
</message>
<stderr_txt>
# GPU [GeForce GTX 460] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 1 :
# Name : GeForce GTX 460
# ECC : Disabled
# Global mem : 1024MB
# Capability : 2.1
# PCI ID : 0000:07:00.0
# Device clock : 1526MHz
# Memory clock : 1900MHz
# Memory width : 256bit
# Driver version : r343_98 : 34411
# Simulation unstable. Flag 11 value 1
# The simulation has become unstable. Terminating to avoid lock-up
# The simulation has become unstable. Terminating to avoid lock-up (2)
</stderr_txt>
]]>
Yikes, I'm seeing these same errors on the Short Run queue -- I guess the CUDA65 app has been deployed there too?
MJH (Project administrator, Project developer, Project scientist)
It was on acemdshort briefly. It is no longer.
Matt
eXaPower
14/09/29 09:48:39 | GPUGRID | No tasks are available for ACEMD beta version
Has the beta app been pulled for non-CC 5.2 cards?
MJH (Project administrator, Project developer, Project scientist)
Has the beta app been pulled for non-CC 5.2 cards?
Yes, it's served its purpose there. The CUDA65 build is broken on non-5.2 cards.
Matt
MJH (Project administrator, Project developer, Project scientist)
8.46 is on acemdbeta now. CUDA65 for SM 3.0 and higher.
Matt
My two GTX 660 Tis and my GTX 460, in my main rig, are now successfully crunching three ACEMD beta 8.46 (cuda65) tasks simultaneously.
Thank you!
eXaPower
So far, so good. 0.004% progress intervals; 1.000% in four minutes. Estimated 24,000 s to complete.
Matt:
I even think the canary behavior works better for me now. I tried the scenario that was failing on the 8.41 app, and it now completed without failure on the 8.46 beta app.
Can you please explain, in detail, how the canary behavior was changed? How exactly does it behave in 8.46?
Thanks,
Jacob
Jim1348
Running fine after 25 minutes on a GTX 650 Ti. It will complete in 3 hours 16 minutes (344.11 driver, Win7 64-bit).
Jim1348
It completed OK on the GTX 650 Ti, but the app seems to be causing problems on some higher-end cards. Their versions of ACEMD probably include more changes than the one I got (8.46), though.
http://www.gpugrid.net/workunit.php?wuid=10123336
I will be trying my GTX 660 Ti next on the same machine to see what happens.
biodoc
All WUs have completed and validated so far on my GTX 980 with beta app versions 8.44, 8.45, and 8.46. I'm running Windows 8.1 and NVIDIA driver 344.16.
http://www.gpugrid.net/results.php?hostid=142719
All the beta units are finishing and validating. The output files are rather large, though: 44 megabytes.
52-MJHARVEY_TEST4000-0-10-RND4601_3
Workunit 10123299
Created 29 Sep 2014 | 12:46:43 UTC
Sent 29 Sep 2014 | 19:41:27 UTC
Received 29 Sep 2014 | 22:41:52 UTC
Server state Over
Outcome Success
Client state Done
Exit status 0 (0x0)
Computer ID 127986
Report deadline 4 Oct 2014 | 19:41:27 UTC
Run time 6,363.92
CPU time 6,033.14
Validate state Valid
Credit 1,500.00
Application version ACEMD beta version v8.46 (cuda65)
Stderr output
<core_client_version>7.2.42</core_client_version>
<![CDATA[
<stderr_txt>
# GPU [GeForce GTX 690] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 1 :
# Name : GeForce GTX 690
# ECC : Disabled
# Global mem : 2048MB
# Capability : 3.0
# PCI ID : 0000:04:00.0
# Device clock : 1019MHz
# Memory clock : 3004MHz
# Memory width : 256bit
# Driver version : r343_98 : 34411
# GPU 0 : 63C
# GPU 1 : 73C
# GPU 2 : 74C
# GPU 3 : 74C
# GPU 0 : 64C
# GPU 0 : 65C
# GPU 0 : 66C
# GPU 0 : 67C
# GPU 0 : 68C
# GPU 0 : 69C
# GPU 0 : 70C
# GPU 0 : 71C
# Time per step (avg over 2500000 steps): 2.549 ms
# Approximate elapsed time for entire WU: 6371.977 s
# PERFORMANCE: 23558 Natoms 2.549 ns/day 0.000 ms/step 0.000 us/step/atom
18:24:40 (5228): called boinc_finish
</stderr_txt>
]]>
eXaPower
What's the meaning of the ns/day performance figure? The number is the same as the time (ms) per step.
23558 Natoms, 4.726 ns/day - GTX 650 Ti
23558 Natoms, 2.549 ns/day - GTX 690
23558 Natoms, 1.633 ns/day - GTX 980
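As a rough sanity check only (this is not the ACEMD reporting code, and the 4 fs timestep below is an assumption, not a value given anywhere in the thread): if those per-card numbers are really the ms/step figures, as the identical "Time per step" line in the stderr above suggests, then the ns/day column in the PERFORMANCE line is simply mislabeled, and the true throughput would follow from the wall time per step like this:

```python
# Illustrative only -- not the ACEMD reporting code.
# Assumption: a 4 fs MD timestep; the actual timestep for these WUs is not stated.

def ns_per_day(ms_per_step: float, timestep_fs: float = 4.0) -> float:
    """Convert average wall time per MD step into simulated nanoseconds per day."""
    steps_per_day = 86_400.0 / (ms_per_step * 1e-3)  # (seconds per day) / (seconds per step)
    return steps_per_day * timestep_fs * 1e-6        # fs -> ns

for card, ms in [("GTX 650 Ti", 4.726), ("GTX 690", 2.549), ("GTX 980", 1.633)]:
    print(f"{card}: {ms} ms/step -> ~{ns_per_day(ms):.0f} ns/day (assuming a 4 fs step)")
```

Under that assumption, the three cards above would be doing roughly 73, 136, and 212 ns/day respectively.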
TJ
It completed OK on the GTX 650 Ti, but the app seems to be causing problems on some higher-end cards. Their versions of ACEMD probably include more changes than the one I got (8.46), though.
http://www.gpugrid.net/workunit.php?wuid=10123336
I will be trying my GTX 660 Ti next on the same machine to see what happens.
I think the errors on the higher-end cards were caused by too-old drivers, Jim. I had a lot of errors on my 780 Tis yesterday, but when I updated to the latest driver, they ran as smoothly as usual again.
The beta did okay on my 660, so your 660 Ti will do great as well.
____________
Greetings from TJ
Jim1348
TJ,
Thanks, that is probably it. My GTX 660 Ti did finish fine; I will be trying a couple of GTX 750 Tis now, just for fun.
Matt
Just enabled Test Apps for my GTX 680 and GTX 780Ti cards. I'll check back in a while to see how they're doing.
Edit: I saw that TJ recommended updating to the latest drivers. Is this the latest Beta or WHQL driver? I'm currently running 344.11. Thanks.
Jim1348
Edit: I saw that TJ recommended updating to the latest drivers. Is this the latest Beta or WHQL driver? I'm currently running 344.11. Thanks.
344.11 works fine on my GTX 650 Ti and 660 Ti on the test apps. I am running it on my GTX 750 Ti also, but haven't picked up the new apps yet.
Matt
Thanks, Jim1348. I'll stick with 344.11 for now, then.
Thanks, Jim1348. I'll stick with 344.11 for now, then.
What other options are there? :)
TJ
Just enabled Test Apps for my GTX 680 and GTX 780Ti cards. I'll check back in a while to see how they're doing.
Edit: I saw that TJ recommended updating to the latest drivers. Is this the latest Beta or WHQL driver? I'm currently running 344.11. Thanks.
Hello Matt, yes, I am running 344.11, the latest WHQL driver. But to be clear, it was recommended by Matt from the project.
The older driver I was using was a bit faster on Win7 (WDDM was introduced with Vista and cannot be switched off), but that is beyond the scope of this thread.
____________
Greetings from TJ
MJH (Project administrator, Project developer, Project scientist)
If I've got things right, the 65 apps shouldn't be sent to any driver older than 343.00. The exception will be the Linux app, when that finally exists: the WU will go out to any client that reports CUDA 6.5 capability, as only our patched client reports the driver version.
Matt
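The dispatch rule Matt describes can be summarised in a short sketch (hypothetical function and field names; this is not the actual GPUGRID scheduler code, which runs server-side and is not shown in this thread):

```python
# Hypothetical sketch of the work-dispatch rule described above;
# the names are illustrative, not GPUGRID server code.

def eligible_for_cuda65(platform: str, driver_version: float | None,
                        reports_cuda65: bool) -> bool:
    """Decide whether a host should be offered the CUDA 6.5 application."""
    if platform == "Windows":
        # Windows hosts report a driver version; require 343.00 or newer.
        return driver_version is not None and driver_version >= 343.00
    if platform == "Linux":
        # Stock Linux clients may not report a driver version at all, so the
        # only usable signal is whether the client reports CUDA 6.5 capability.
        return reports_cuda65
    return False

# Example: a Windows host on 344.11 qualifies; one on an older driver does not.
assert eligible_for_cuda65("Windows", 344.11, True)
assert not eligible_for_cuda65("Windows", 340.52, True)
```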
TJ
If I've got things right, the 65 apps shouldn't be sent to any driver older than 343.00. The exception will be the Linux app, when that finally exists: the WU will go out to any client that reports CUDA 6.5 capability, as only our patched client reports the driver version.
Matt
Well Matt, with driver 331 my 780 Tis on Win7 were a bit faster, but then I got cuda65 tasks and they errored out. With your advice I updated the driver and there have been no more errors (one yesterday, but that was for another reason).
But if you have made changes yesterday or today, then you are probably right.
____________
Greetings from TJ
I think it's safe to promote the CUDA 6.5 application to the long queue.
MJH (Project administrator, Project developer, Project scientist)
Not just yet...
biodoc
If I've got things right, the 65 apps shouldn't be sent to any driver older than 343.00. The exception will be the Linux app, when that finally exists: the WU will go out to any client that reports CUDA 6.5 capability, as only our patched client reports the driver version.
Matt
BOINC 7.4.22 (a development version) now reports the driver version:
Starting BOINC client version 7.4.22 for x86_64-pc-linux-gnu
CUDA: NVIDIA GPU 0: GeForce GTX 780 Ti (driver version 343.22, CUDA version 6.5, compute capability 3.5, 3072MB, 2814MB available, 5345 GFLOPS peak)
Shows up here too:
http://www.gpugrid.net/show_host_detail.php?hostid=183991
MJH:
I've been processing beta tasks, and although nearly all are successful for me on the 8.46 app, I did have a failure last night. This is on a completely stable Windows 8.1 Update 1 x64 machine, on one of my GTX 660 Ti GPUs, using the 344.11 driver.
Any ideas?
http://www.gpugrid.net/result.php?resultid=13154266
Name 79-MJHARVEY_TEST4001-2-10-RND8149_0
Workunit 10126844
Created 30 Sep 2014 | 18:55:59 UTC
Sent 1 Oct 2014 | 3:25:13 UTC
Received 1 Oct 2014 | 4:35:24 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -97 (0xffffffffffffff9f) Unknown error number
Computer ID 153764
Report deadline 6 Oct 2014 | 3:25:13 UTC
Run time 1,431.05
CPU time 384.63
Validate state Invalid
Credit 0.00
Application version ACEMD beta version v8.46 (cuda65)
Stderr output
<core_client_version>7.4.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -97 (0xffffff9f)
</message>
<stderr_txt>
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 2 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.0
# PCI ID : 0000:08:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r343_98 : 34411
# GPU 0 : 69C
# GPU 1 : 64C
# GPU 2 : 69C
# GPU 1 : 65C
# GPU 1 : 66C
# GPU 1 : 67C
# GPU 0 : 70C
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 5505000)
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 2 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.0
# PCI ID : 0000:08:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r343_98 : 34411
# The simulation has become unstable. Terminating to avoid lock-up (1)
</stderr_txt>
]]>
eXaPower
Question: would a CUDA 4.2 long task running on one card slow down CUDA 6.5 short or beta tasks running on another, or vice versa? I just noticed a CUDA 4.2 NOELIA long task running alongside 6.5 beta and short tasks. The runtime for the long task is longer than normal: it usually takes ~40 hr to complete, but at ~40 hr the 4.2 task is only at 80%.
MJH (Project administrator, Project developer, Project scientist)
Maybe, if the processes are competing for CPU.
Matt
Any idea why my task failed, three posts up?
eXaPower
Any idea why my task failed, three posts up?
Have you checked Event Viewer to locate any occurrences at the time the task failed? Any kernel failures? Or database instances? If you have automatic Windows updates or automatic maintenance enabled, this can trigger random failures in other processes (or sometimes fault any heavy-usage process). Also, a security "audit" can trigger background task (GPUGRID) failures.
Thanks, but it was a couple of "simulation became unstable" errors, which I believe to be a problem with the GPUGRID application.
I think it's safe to promote the CUDA 6.5 application to the long queue.
Not just yet...
Are we waiting for your GTX 980 to arrive?
biodoc
I think it's safe to promote the CUDA 6.5 application to the long queue.
Not just yet...
Are we waiting for your GTX 980 to arrive?
It's probably due to me overclocking my GTX 980. I had several tasks fail while I was at work yesterday. Since I clocked it back, I've had 4 short-run tasks complete successfully.
My apologies for messing up the beta test.
I had another failure where the simulation became unstable, on an 8.46 CUDA 6.5 beta task. http://www.gpugrid.net/result.php?resultid=13161365
I am not entirely convinced that the error is the fault of the task or the application. Perhaps the new 344.11 drivers push the GPUs even harder than previous drivers. I will do additional testing with Heaven to attempt to confirm.
Thanks,
Jacob
Name 30-MJHARVEY_TEST1999-5-10-RND7983_0
Workunit 10132352
Created 2 Oct 2014 | 16:49:51 UTC
Sent 2 Oct 2014 | 21:36:23 UTC
Received 2 Oct 2014 | 22:20:50 UTC
Server state Over
Outcome Computation error
Client state Compute error
Exit status -97 (0xffffffffffffff9f) Unknown error number
Computer ID 153764
Report deadline 7 Oct 2014 | 21:36:23 UTC
Run time 386.44
CPU time 100.75
Validate state Invalid
Credit 0.00
Application version ACEMD beta version v8.46 (cuda65)
Stderr output
<core_client_version>7.4.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -97 (0xffffff9f)
</message>
<stderr_txt>
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 2 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.0
# PCI ID : 0000:08:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r343_98 : 34411
# GPU 0 : 69C
# GPU 1 : 64C
# GPU 2 : 70C
# GPU 1 : 65C
# GPU 1 : 66C
# GPU 1 : 67C
# BOINC suspending at user request (exit)
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 2 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.0
# PCI ID : 0000:08:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r343_98 : 34411
# GPU 0 : 62C
# GPU 1 : 58C
# GPU 2 : 54C
# GPU 0 : 63C
# GPU 1 : 59C
# GPU 2 : 55C
# GPU 0 : 64C
# GPU 1 : 60C
# GPU 2 : 56C
# GPU 0 : 65C
# GPU 2 : 57C
# GPU 0 : 66C
# GPU 1 : 61C
# GPU 2 : 58C
# GPU 2 : 59C
# GPU 0 : 67C
# GPU 1 : 62C
# GPU 2 : 60C
# GPU 2 : 61C
# GPU 0 : 68C
# GPU 1 : 63C
# GPU 2 : 62C
# GPU 2 : 63C
# GPU 0 : 69C
# GPU 1 : 64C
# GPU 2 : 64C
# GPU 2 : 65C
# GPU 0 : 70C
# GPU 0 : 71C
# GPU 1 : 65C
# GPU 2 : 66C
# GPU 1 : 66C
# GPU 2 : 67C
# GPU 0 : 72C
# GPU 1 : 67C
# GPU 2 : 68C
# The simulation has become unstable. Terminating to avoid lock-up (1)
# Attempting restart (step 12630000)
# GPU [GeForce GTX 660 Ti] Platform [Windows] Rev [3212] VERSION [65]
# SWAN Device 2 :
# Name : GeForce GTX 660 Ti
# ECC : Disabled
# Global mem : 3072MB
# Capability : 3.0
# PCI ID : 0000:08:00.0
# Device clock : 1045MHz
# Memory clock : 3004MHz
# Memory width : 192bit
# Driver version : r343_98 : 34411
# The simulation has become unstable. Terminating to avoid lock-up (1)
</stderr_txt>
]]>
TJ
I don't think the 344.11 driver is pushing the cards harder; with this driver my 780 Tis are around 700 seconds slower than with the 331 driver I used until 30 September, when I was forced to update because I got errors with the new app. See below in this thread.
____________
Greetings from TJ
Hello - I have noticed that my dual GTX 780 machine has been getting mostly beta tasks lately - only 2 short runs and no long runs over the past few days. I even set my prefs to no beta and no other apps, but it is still pulling only beta tasks.
My other 3 systems on the account (GTX 770 and 660) do not show any beta tasks.
Just curious.
I too seem to only be getting beta, even though I've re-enabled all applications. Is the scheduler prioritizing the beta application somehow?
eXaPower
I don't think the 344.11 driver is pushing the cards harder; with this driver my 780 Tis are around 700 seconds slower than with the 331 driver I used until 30 September, when I was forced to update because I got errors with the new app. See below in this thread.
New technologies were added to the 343-branch drivers for second-generation Maxwell (GM204): Dynamic Super Resolution, third-generation Delta Color Compression, Multi-Pixel Programmable Sampling, NVIDIA VXGI (real-time voxel global illumination), VR Direct, Multi-Projection Acceleration, and Multi-Frame Sampled Anti-Aliasing (MFAA), with support for CSAA removed. HDMI 2.0 support was also added.
I'd say this driver branch is not fully mature yet. A couple more releases should find the drivers' full potential. Considering how support for pre-Fermi cards was dropped, and the differences between SM/SMX/SMM, these first 343-branch drivers have room to be refined.
Matt:
I even think the canary behavior works better for me now. I tried the scenario that was failing on the 8.41 app, and it now completed without failure on the 8.46 beta app.
Can you please explain, in detail, how the canary behavior was changed? How exactly does it behave in 8.46?
Thanks,
Jacob
Any answer on this?
MJH (Project administrator, Project developer, Project scientist)
Can you please explain, in detail, how the canary behavior was changed? How exactly does it behave in 8.46?
It doesn't. I've disabled it altogether. I'm counting on the newer drivers to do a better job of recovering from deadlocks.
Matt
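For context, the "canary" discussed here was the app's own lock-up detector (the source of the "Terminating to avoid lock-up" messages quoted earlier). A generic sketch of that kind of watchdog is below; it is purely illustrative and not the ACEMD implementation, whose details are not shown in this thread.

```python
# Generic watchdog sketch, for illustration only -- not ACEMD's "canary" code.
import os
import sys
import threading
import time

class Watchdog:
    """Kill the process if the compute loop stops reporting progress."""

    def __init__(self, timeout_s: float = 60.0):
        self.timeout_s = timeout_s
        self._last_beat = time.monotonic()
        threading.Thread(target=self._watch, daemon=True).start()

    def heartbeat(self) -> None:
        """Called by the compute loop after each batch of GPU steps."""
        self._last_beat = time.monotonic()

    def _watch(self) -> None:
        while True:
            time.sleep(self.timeout_s / 4)
            if time.monotonic() - self._last_beat > self.timeout_s:
                # A hung GPU call may never return, so exit hard rather than raise.
                print("Simulation appears hung; terminating to avoid lock-up.",
                      file=sys.stderr)
                os._exit(97)
```

The trade-off hinted at above is that such a watchdog turns a recoverable driver hiccup into a failed task, which may be why dropping it and relying on the newer drivers' own deadlock recovery was preferred.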
Sounds good to me. If you ever decide to re-add it, or modify its functionality, please be sure to let us know.
Thanks,
Jacob
Beyond
Can you please explain, in detail, how the canary behavior was changed? How exactly does it behave in 8.46?
It doesn't. I've disabled it altogether. I'm counting on the newer drivers to do a better job of recovering from deadlocks. Matt
Thanks much for disabling that feature; I've lost a lot of WUs to it :-)