Message boards : News : PASCAL App Testing

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44869 - Posted: 27 Oct 2016 | 14:11:15 UTC
Last modified: 27 Oct 2016 | 14:21:16 UTC

Hello Crunchers,

I have just put out application version 900 for owners of Pascal GPUs. To test this you need:

* A GeForce 10-series or Tesla P-series GPU
* 64-bit Windows 7 or later
* NVIDIA Driver 360+
* To have beta work enabled, and to accept WUs from the acemdbeta and acemdshort applications

Please report all problems in this thread.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44871 - Posted: 27 Oct 2016 | 16:14:49 UTC
Last modified: 27 Oct 2016 | 16:17:38 UTC

WOW

EDIT: It's not listed on the http://www.gpugrid.net/apps.php page.

EDIT of EDIT: Now it's listed in short runs too as 9.10 :)

Dave Peachey
Joined: 16 May 09
Posts: 11
Credit: 131,226,034
RAC: 0
Message 44872 - Posted: 27 Oct 2016 | 16:54:53 UTC
Last modified: 27 Oct 2016 | 16:55:21 UTC

MJH,

After a plethora of failed downloads (well, 14) with various PASCALS, SDOERR_CASP10s and the like, I've had several of the new CUDA80 items actually get to the point of running with the following results:


There are more downloading, but I suspect they'll also fail; if I get a successful run, I'll post again.

Note: I'm using BOINC 7.6.22 on Windows 7 Pro SP1 (64bit) with NVIDIA GTX1060 (3GB) running v372.70 drivers.

Cheers
Dave

3de64piB5uZAS6SUNt1GFDU9d...
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Message 44874 - Posted: 27 Oct 2016 | 17:03:58 UTC
Last modified: 27 Oct 2016 | 17:08:46 UTC

Just tried to get some test apps and it took me three tries just to download the main DLL. Now, after finally downloading it successfully, it immediately quit with a crash. Ouch.

Problem signature:
Problem Event Name: APPCRASH
Application Name: acemd.904-80.exe
Application Version: 0.0.0.0
Application Timestamp: 58100aa4
Fault Module Name: ntdll.dll
Fault Module Version: 6.3.9600.18438
Fault Module Timestamp: 57ae642e
Exception Code: c000007b
Exception Offset: 00000000000ecdd0
OS Version: 6.3.9600.2.0.0.256.48
Locale ID: 1031
Additional Information 1: ac05
Additional Information 2: ac0507478d1c5bd693cfc4fe3987e900
Additional Information 3: ac05
Additional Information 4: ac0507478d1c5bd693cfc4fe3987e900


Edit: my GPU is a Palit GTX 1070 Super Jetstream.
Driver is 10.18.13.6839
OS is Win8.1/64 Bit
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Message 44875 - Posted: 27 Oct 2016 | 17:06:05 UTC - in response to Message 44872.

All three with

(unknown error) - exit code -1073741515 (0xc0000135)

- which is very far from being an 'unknown' error: it's the extremely common NTSTATUS error

0xC0000135
STATUS_DLL_NOT_FOUND
{Unable To Locate Component}
This application has failed to start because %hs was not found. Reinstalling the application might fix this problem.

Recommended action: check all application dependencies using Dependency Walker, and compare with the files supplied and listed in the <app_version> segment of client_state.xml
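
For anyone who wants to do that comparison without reading the XML by eye, here is a minimal Python sketch. It assumes that each <app_version> block in client_state.xml carries <app_name>, <version_num> and one <file_ref>/<file_name> pair per file; the SAMPLE fragment is a hypothetical, trimmed-down stand-in (not copied from a real host) and files_per_app_version is just an illustrative helper name.

```python
import xml.etree.ElementTree as ET


def files_per_app_version(client_state_xml):
    """Map 'app_name version_num' to the file names its <app_version> references."""
    root = ET.fromstring(client_state_xml)
    listed = {}
    for app_version in root.iter("app_version"):
        app_name = app_version.findtext("app_name", "?").strip()
        version = app_version.findtext("version_num", "?").strip()
        names = [
            ref.findtext("file_name", "").strip()
            for ref in app_version.iter("file_ref")
        ]
        listed[f"{app_name} {version}"] = names
    return listed


# Hypothetical fragment for illustration only; a real file has many more fields.
SAMPLE = """
<client_state>
  <app_version>
    <app_name>acemdbeta</app_name>
    <version_num>904</version_num>
    <file_ref><file_name>acemd.904-80.exe</file_name><main_program/></file_ref>
    <file_ref><file_name>tcl86.dll</file_name></file_ref>
  </app_version>
</client_state>
"""

print(files_per_app_version(SAMPLE))
```

On a real host the same function could be fed the text of client_state.xml from the BOINC data directory, and the resulting list compared against what Dependency Walker reports as required.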

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44876 - Posted: 27 Oct 2016 | 17:18:44 UTC - in response to Message 44875.

All three with

(unknown error) - exit code -1073741515 (0xc0000135)


I've got the same error.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Message 44878 - Posted: 27 Oct 2016 | 17:26:52 UTC - in response to Message 44876.

All three with

(unknown error) - exit code -1073741515 (0xc0000135)


I've got the same error.

So, we need to find out which (needed) DLL matt has forgotten to define in app_version. Sorry, I don't own a Pascal card (yet): it's down to you early adopters to debug.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44879 - Posted: 27 Oct 2016 | 17:37:27 UTC - in response to Message 44878.

So, we need to find out which (needed) DLL matt has forgotten to define in app_version. Sorry, I don't own a Pascal card (yet): it's down to you early adopters to debug.

CUDART64.DLL - missing underscore at the 1st character in the file name
CUFFT64_80.DLL - missing underscore at the 1st character in the file name
TCL86.DLL - missing underscore at the 1st character in the file name

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44880 - Posted: 27 Oct 2016 | 17:38:10 UTC - in response to Message 44878.


So, we need to find out which (needed) DLL matt has forgotten to define in app_version. Sorry, I don't own a Pascal card (yet): it's down to you early adopters to debug.


Richard -- are you saying you got the app but don't have a Pascal card? That shouldn't have happened! Could you link to the WU/result please?

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44881 - Posted: 27 Oct 2016 | 17:45:26 UTC - in response to Message 44879.
Last modified: 27 Oct 2016 | 17:49:01 UTC


CUDART64.DLL - missing underscore at the 1st character in the file name
CUFFT64_80.DLL - missing underscore at the 1st character in the file name
TCL86.DLL - missing underscore at the 1st character in the file name


There are 3 aux files: cudart64_80.dll, cufft64_80.dll and tcl86.dll. I'm distributing those with underscore prefixes, but when they make it into the BOINC slot directory they should be named without the _.

Could you inspect the slot directory and see what's there, please?
Alternatively, if you can't get to a slot directory, rename the files without the _'s and try Dependency Walker again.
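
A quick way to do that inspection is sketched below in Python. It only looks at file names in a directory: the three expected names come from the post above, while check_slot and the throwaway demo directory are illustrative assumptions, not part of BOINC. On a real host you would pass the path of the slot directory instead.

```python
import tempfile
from pathlib import Path

# Unprefixed names as they should appear in the slot directory.
EXPECTED_AUX = ["cudart64_80.dll", "cufft64_80.dll", "tcl86.dll"]


def check_slot(slot_dir, expected=EXPECTED_AUX):
    """Classify each expected file as 'ok', 'underscore-prefixed only' or 'missing'."""
    present = {p.name.lower() for p in Path(slot_dir).iterdir()}
    report = {}
    for name in expected:
        if name in present:
            report[name] = "ok"
        elif "_" + name in present:
            report[name] = "underscore-prefixed only"
        else:
            report[name] = "missing"
    return report


# Demo against a temporary directory standing in for a real slot directory.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "tcl86.dll").touch()
    (Path(tmp) / "_cufft64_80.dll").touch()
    demo_report = check_slot(tmp)

print(demo_report)
```

Anything reported as "underscore-prefixed only" would point at the rename step not having happened for that file.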

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44882 - Posted: 27 Oct 2016 | 17:48:57 UTC - in response to Message 44879.

So, we need to find out which (needed) DLL matt has forgotten to define in app_version. Sorry, I don't own a Pascal card (yet): it's down to you early adopters to debug.

CUDART64.DLL - missing underscore at the 1st character in the file name
CUFFT64_80.DLL - missing underscore at the 1st character in the file name
TCL86.DLL - missing underscore at the 1st character in the file name

I've copied the files without the leading "_", but Dependency Walker still shows a lot of yellow question marks. There are so many that I think it's caused by a missing compiler option or something similar.
API-MS-WIN-CORE-HANDLE-L1-1-0.DLL
API-MS-WIN-CORE-RTLSUPPORT-L1-2-0.DLL
API-MS-WIN-CORE-SYSINFO-L1-2-1.DLL
(Thousands of similar)
EXT-MS-WIN-NTDSAPI-ACTIVEDIRECTORYCLIENT-L1-1-1.DLL
EXT-MS-WIN-APPXDEPLOYMENTCLIENT-APPXDEPLOYONECORE-L1-1-0.DLL
....
EXT-MS-WIN-GDI-PATH-L1-1-0.DLL
EXT-MS-WIN-NTUSER-SYNCH-L1-1-0.DLL

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44883 - Posted: 27 Oct 2016 | 17:50:05 UTC - in response to Message 44881.

Could you inspect the slot directory and see what's there, please?
Alternatively, if you can't get to a slot directory, rename the files without the _'s and try Dependency Walker again.

It's ok in the slot directory.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44884 - Posted: 27 Oct 2016 | 17:51:04 UTC

BTW the exe I've received is acemd.904-80.exe
Is it okay? (9.04)

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44885 - Posted: 27 Oct 2016 | 17:53:52 UTC - in response to Message 44882.


API-MS-WIN-CORE-HANDLE-L1-1-0.DLL
API-MS-WIN-CORE-RTLSUPPORT-L1-2-0.DLL
API-MS-WIN-CORE-SYSINFO-L1-2-1.DLL
(Thousands of similar)
EXT-MS-WIN-NTDSAPI-ACTIVEDIRECTORYCLIENT-L1-1-1.DLL
EXT-MS-WIN-APPXDEPLOYMENTCLIENT-APPXDEPLOYONECORE-L1-1-0.DLL
....
EXT-MS-WIN-GDI-PATH-L1-1-0.DLL
EXT-MS-WIN-NTUSER-SYNCH-L1-1-0.DLL


And when it ran inside dependency walker, did it complete (text console opens, a few lines are output briefly, then it closes) or did you get an error dialog?

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44886 - Posted: 27 Oct 2016 | 17:56:58 UTC - in response to Message 44884.

BTW the exe I've received is acemd.904-80.exe
Is it okay? (9.04)


Yep, 9.04 is right

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Message 44887 - Posted: 27 Oct 2016 | 18:00:35 UTC - in response to Message 44880.
Last modified: 27 Oct 2016 | 18:12:45 UTC

So, we need to find out which (needed) DLL matt has forgotten to define in app_version. Sorry, I don't own a Pascal card (yet): it's down to you early adopters to debug.

Richard -- are you saying you got the app but don't have a Pascal card? That shouldn't have happened! Could you link to the WU/result please?

No, not me - I looked at the results (failures) reported by others, and tried to analyse what the problems were from there.

My maximum capability is GTX 970 at the moment, and I'm still getting app v8.48 as expected.

Edit - my local trade counter has some GTX 1050 Ti in stock - I could pick one up in the morning if you're still stuck.

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44888 - Posted: 27 Oct 2016 | 18:04:14 UTC - in response to Message 44882.

API-MS-WIN-CORE-HANDLE-L1-1-0.DLL
API-MS-WIN-CORE-RTLSUPPORT-L1-2-0.DLL
API-MS-WIN-CORE-SYSINFO-L1-2-1.DLL
(Thousands of similar)
EXT-MS-WIN-NTDSAPI-ACTIVEDIRECTORYCLIENT-L1-1-1.DLL
EXT-MS-WIN-APPXDEPLOYMENTCLIENT-APPXDEPLOYONECORE-L1-1-0.DLL
....
EXT-MS-WIN-GDI-PATH-L1-1-0.DLL
EXT-MS-WIN-NTUSER-SYNCH-L1-1-0.DLL


Could you try installing the Visual Studio 2010 Redistributables

https://www.microsoft.com/en-US/download/details.aspx?id=14632

eXaPower
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Message 44889 - Posted: 27 Oct 2016 | 18:07:03 UTC - in response to Message 44876.
Last modified: 27 Oct 2016 | 18:37:20 UTC

All three with

(unknown error) - exit code -1073741515 (0xc0000135)


I've got the same error.

All WU are erroring out:
-1073741515 (0xffffffffc0000135) Unknown error number on 5-MJHARVEY_PASCALx4002-0-10-RND1192_4, and
"App acemd 9.04-80 has stopped responding" with a regular CASP short WU on Maxwell [-1073741515 (0xffffffffc0000135) Unknown error number].
Every WU download has a file that stalls.

EDIT - The details for "App acemd 9.04-80 has stopped responding" give the fault module as zlib.dll, version 6.3.9600.17736.

My host is not allowed any more tasks: I've used up the daily quota of WUs due to errors. Maybe a reset, or a change to a higher amount for beta testing, would help?

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Message 44890 - Posted: 27 Oct 2016 | 18:10:59 UTC - in response to Message 44888.

API-MS-WIN-CORE-HANDLE-L1-1-0.DLL
API-MS-WIN-CORE-RTLSUPPORT-L1-2-0.DLL
API-MS-WIN-CORE-SYSINFO-L1-2-1.DLL
(Thousands of similar)
EXT-MS-WIN-NTDSAPI-ACTIVEDIRECTORYCLIENT-L1-1-1.DLL
EXT-MS-WIN-APPXDEPLOYMENTCLIENT-APPXDEPLOYONECORE-L1-1-0.DLL
....
EXT-MS-WIN-GDI-PATH-L1-1-0.DLL
EXT-MS-WIN-NTUSER-SYNCH-L1-1-0.DLL


Could you try installing the Visual Studio 2010 Redistributables

https://www.microsoft.com/en-US/download/details.aspx?id=14632

It might be worth simply double-clicking on the executable file - completely outside the BOINC environment. Last time I had an app fail to run, it popped up a dialog to say which runtime version was missing - and there's a %hs placeholder in that NTSTATUS description, which might get populated and displayed.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44891 - Posted: 27 Oct 2016 | 18:15:35 UTC - in response to Message 44890.

It might be worth simply double-clicking on the executable file - completely outside the BOINC environment. Last time I had an app fail to run, it popped up a dialog to say which runtime version was missing - and there's a %hs placeholder in that NTSTATUS description, which might get populated and displayed.

This method gives the error saying zlib1.dll is missing :)

PappaLitto
Joined: 21 Mar 16
Posts: 511
Credit: 4,672,242,755
RAC: 0
Message 44892 - Posted: 27 Oct 2016 | 18:16:07 UTC

I apparently hit my daily quota of 3 WUs and I cannot help you guys test the new units. Is there a way you can lift this limit so I can help?

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44893 - Posted: 27 Oct 2016 | 18:21:58 UTC

The size of acemd.848-65.exe is 5.126.656 bytes, while the size of acemd.904-80.exe is only 1.849.856 bytes.

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Message 44894 - Posted: 27 Oct 2016 | 18:22:30 UTC - in response to Message 44891.

It might be worth simply double-clicking on the executable file - completely outside the BOINC environment. Last time I had an app fail to run, it popped up a dialog to say which runtime version was missing - and there's a %hs placeholder in that NTSTATUS description, which might get populated and displayed.

This method gives the error saying zlib1.dll is missing :)

From answers.microsoft.com:

zlib is an open source library file that may be used with various programs. It's not a Windows native file.

It was a good decision not to download the dubious "repair tool" your search found.
This link should help http://www.zlib.net/

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44895 - Posted: 27 Oct 2016 | 18:38:37 UTC - in response to Message 44891.


This method gives the error saying zlib1.dll is missing :)


Weird, I don't see that dependency at all. Version 911 now includes this anyway.

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44896 - Posted: 27 Oct 2016 | 18:39:19 UTC - in response to Message 44893.

The size of acemd.848-65.exe is 5.126.656 bytes, while the size of acemd.904-80.exe is only 1.849.856 bytes.


That's because it is Pascal only.

eXaPower
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Message 44897 - Posted: 27 Oct 2016 | 18:58:28 UTC - in response to Message 44892.

I apparently hit my daily quota of 3 WUs and I cannot help you guys test the new units. Is there a way you can lift this limit so I can help?

I also hit daily quota after 12 errors. Is it possible to raise the limit on hosts with Pascal?

BOINC manager message says no short tasks are available even as server status shows 955 unsent.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44898 - Posted: 27 Oct 2016 | 19:10:53 UTC
Last modified: 27 Oct 2016 | 19:11:14 UTC

I've hit the daily quota of my host, so I had to hack the BOINC manager to get a new host ID, but after that I've got a different error :)
http://www.gpugrid.net/result.php?resultid=15434934

(unknown error) - exit code -1073741511 (0xc0000139)

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44899 - Posted: 27 Oct 2016 | 19:13:03 UTC - in response to Message 44898.

I've hit the daily quota of my host, so I had to hack the BOINC manager to get a new host ID, but after that I've got a different error :)
http://www.gpugrid.net/result.php?resultid=15434934
(unknown error) - exit code -1073741511 (0xc0000139)


Would you mind trying to run it again? I've included the zlib.dll you said it was missing, but it would seem there's still something else amiss..

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44900 - Posted: 27 Oct 2016 | 19:16:44 UTC - in response to Message 44899.

Would you mind trying to run it again? I've included the zlib.dll you said it was missing, but it would seem there's still something else amiss..

It says that it could not find the inflateGetHeader procedure entry point in tcl86.dll

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44901 - Posted: 27 Oct 2016 | 19:26:20 UTC - in response to Message 44900.


It says that it could not find the inflateGetHeader procedure entry point in tcl86.dll


Ok. 912.

Greger
Joined: 6 Jan 15
Posts: 76
Credit: 24,371,208,880
RAC: 11,202,738
Message 44902 - Posted: 27 Oct 2016 | 19:29:34 UTC
Last modified: 27 Oct 2016 | 19:31:05 UTC

Exit status -1073741701 (0xffffffffc000007b) Unknown error number

Task Name 1-MJHARVEY_PASCALx4002-0-10-RND3464_7
https://www.gpugrid.net/result.php?resultid=15434689

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44903 - Posted: 27 Oct 2016 | 19:30:45 UTC - in response to Message 44901.


It says that it could not find the inflateGetHeader procedure entry point in tcl86.dll


Ok. 912.

Still NOT OK :)
http://www.gpugrid.net/result.php?resultid=15430762
(unknown error) - exit code -59 (0xffffffc5) #SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 610

3de64piB5uZAS6SUNt1GFDU9d...
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Message 44904 - Posted: 27 Oct 2016 | 19:32:14 UTC

27.10.2016 21:35:26 | GPUGRID | Task e6s8_e5s6p0f1-SDOERR_CASP0X_crystal_contacts_1ns_ntl9_1-0-1-RND0552_0 exited with zero status but no 'finished' file

27.10.2016 21:35:26 | GPUGRID | If this happens repeatedly you may need to reset the project.


what can we derive from this...?

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Message 44905 - Posted: 27 Oct 2016 | 19:33:33 UTC - in response to Message 44902.

Exit status -1073741701 (0xffffffffc000007b) Unknown error number

0xC000007B
STATUS_INVALID_IMAGE_FORMAT
{Bad Image}
%hs is either not designed to run on Windows or it contains an error. Try installing the program again using the original installation media or contact your system administrator or the software vendor for support.
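
Since BOINC keeps labelling these as "unknown error", a small Python sketch for turning the signed exit codes into NTSTATUS values may save some lookups. The table holds only the codes seen in this thread; the name given for 0xC0000139 comes from the standard NTSTATUS list rather than from any post here, and decode_exit_code is just an illustrative helper.

```python
# Only the codes reported in this thread; extend as needed.
NTSTATUS_NAMES = {
    0xC0000135: "STATUS_DLL_NOT_FOUND",
    0xC000007B: "STATUS_INVALID_IMAGE_FORMAT",
    0xC0000139: "STATUS_ENTRYPOINT_NOT_FOUND",
}


def decode_exit_code(code):
    """Reduce a signed (or sign-extended) exit code to 32 bits and name it."""
    unsigned = code & 0xFFFFFFFF
    name = NTSTATUS_NAMES.get(unsigned, "not in this table")
    return f"0x{unsigned:08X} {name}"


for exit_code in (-1073741515, -1073741701, -1073741511, -59):
    print(exit_code, "->", decode_exit_code(exit_code))
```

The mask also handles the 64-bit form that some clients print (0xffffffffc000007b), because only the low 32 bits are kept. An application-defined code such as -59 simply falls through as "not in this table".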

eXaPower
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Message 44906 - Posted: 27 Oct 2016 | 19:33:51 UTC - in response to Message 44903.


It says that it could not find the inflateGetHeader procedure entry point in tcl86.dll


Ok. 912.

Still NOT OK :)
http://www.gpugrid.net/result.php?resultid=15430762
(unknown error) - exit code -59 (0xffffffc5) #SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 610

Getting same error message also.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44907 - Posted: 27 Oct 2016 | 19:34:27 UTC
Last modified: 27 Oct 2016 | 19:37:46 UTC

When I try to start it manually it gives a very different error message than before:

The application was unable to start correctly (0xc000007b)

3de64piB5uZAS6SUNt1GFDU9d...
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Message 44908 - Posted: 27 Oct 2016 | 19:44:49 UTC
Last modified: 27 Oct 2016 | 19:46:37 UTC

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -59 (0xffffffc5)
</message>
<stderr_txt>
# GPU [GeForce GTX 1080] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0 :
# Name : GeForce GTX 1080
# ECC : Disabled
# Global mem : 8192MB
# Capability : 6.1
# PCI ID : 0000:01:00.0
# Device clock : 1784MHz
# Memory clock : 5005MHz
# Memory width : 256bit
# Driver version : r372_53 : 37254
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 610

</stderr_txt>
]]>


+1

NOTE: my 1070 is apparently identified as a 1080 (???)

bcavnaugh
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Message 44909 - Posted: 27 Oct 2016 | 19:44:58 UTC
Last modified: 27 Oct 2016 | 20:03:50 UTC

Testing Now on Driver 373.6 and two GTX 1080 Cards.
Downloads are very Very Slow.
Computer:
http://www.gpugrid.net/show_host_detail.php?hostid=379625

<app_config>
  <app>
    <name>acemdshort</name>
    <max_concurrent>1</max_concurrent>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdbeta</name>
    <max_concurrent>1</max_concurrent>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>



One running, one failed. Take it back: both crashed with:

Problem signature:
Problem Event Name: APPCRASH
Application Name: acemd.904-80.exe
Application Version: 0.0.0.0
Application Timestamp: 58100aa4
Fault Module Name: amData\BOINC\slots\49\tcl86.dll!inflateGetHeader
Fault Module Version: 6.3.9600.18438
Fault Module Timestamp: 57ae642e
Exception Code: c0000139
Exception Offset: 00000000000ecdd0
OS Version: 6.3.9600.2.0.0.272.7
Locale ID: 1033
Additional Information 1: ac05
Additional Information 2: ac0507478d1c5bd693cfc4fe3987e900
Additional Information 3: ac05
Additional Information 4: ac0507478d1c5bd693cfc4fe3987e900



Faulting application name: acemd.904-80.exe, version: 0.0.0.0, time stamp: 0x58100aa4
Faulting module name: amData\BOINC\slots\49\tcl86.dll!inflateGetHeader, version: 6.3.9600.18438, time stamp: 0x57ae642e
Exception code: 0xc0000139
Fault offset: 0x00000000000ecdd0
Faulting process id: 0x1e8
Faulting application start time: 0x01d2308caf2e98fe
Faulting application path: D:\ProgramData\BOINC\projects\www.gpugrid.net\acemd.904-80.exe
Faulting module path: amData\BOINC\slots\49\tcl86.dll
Report Id: ed41a9e4-9c7f-11e6-810e-1866da6f8db3
Faulting package full name:
Faulting package-relative application ID:
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 44910 - Posted: 27 Oct 2016 | 19:48:06 UTC - in response to Message 44909.

Looks like Matt's pulled the Beta apps for now.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Richard Haselgrove
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Message 44911 - Posted: 27 Oct 2016 | 19:52:17 UTC - in response to Message 44910.

Looks like Matt's pulled the Beta apps for now.

That last "#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 610" error definitely sounded like a cue for a beer'n'pizza break - I don't blame him for taking a pause and quite possibly a night's rest.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44912 - Posted: 27 Oct 2016 | 19:54:31 UTC - in response to Message 44911.

Looks like Matt's pulled the Beta apps for now.

That last "#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 610" error definitely sounded like a cue for a beer'n'pizza break - I don't blame him for taking a pause and quite possibly a night's rest.

Yeah, that's where we started :) (=the 8.48 app gives the same error message)

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44913 - Posted: 27 Oct 2016 | 20:17:23 UTC - in response to Message 44912.

913.

eXaPower
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Message 44914 - Posted: 27 Oct 2016 | 20:24:59 UTC - in response to Message 44913.

913.

-59 (0xffffffffffffffc5) Unknown error number
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 520
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 610


3de64piB5uZAS6SUNt1GFDU9d...
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Message 44915 - Posted: 27 Oct 2016 | 20:25:31 UTC

I have an endless loop..............


27.10.2016 22:29:18 | GPUGRID | Task e7s14_e3s3p0f4-SDOERR_CASP0X_crystal_contacts_1ns_ntl9_0-0-1-RND5449_2 exited with zero status but no 'finished' file


Dave Peachey
Joined: 16 May 09
Posts: 11
Credit: 131,226,034
RAC: 0
Message 44916 - Posted: 27 Oct 2016 | 20:30:28 UTC
Last modified: 27 Oct 2016 | 20:31:00 UTC

Just tried another one against v9.13 (same BOINC, O/S, card and driver as before); task http://www.gpugrid.net/result.php?resultid=15435746 errored out again:

Stderr output

<core_client_version>7.6.22</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -59 (0xffffffc5)
</message>
<stderr_txt>
# GPU [GeForce GTX 1060 3GB] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0 :
# Name : GeForce GTX 1060 3GB
# ECC : Disabled
# Global mem : 3072MB
# Capability : 6.1
# PCI ID : 0000:04:00.0
# Device clock : 1784MHz
# Memory clock : 4004MHz
# Memory width : 192bit
# Driver version : r372_69 : 37270
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 610

</stderr_txt>
]]>

Dave
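
For anyone comparing several of these stderr dumps, here is a rough Python sketch that pulls out the reported capability and the device version named in the SWAN error. Treating capability 6.1 as device version 610 (major * 100 + minor * 10) is an inference from the two lines sitting side by side in the log above, not something the app documents; summarise_stderr is an illustrative name.

```python
import re


def summarise_stderr(stderr_text):
    """Return (device version derived from Capability, list of (module, missing version))."""
    cap = re.search(r"Capability\s*:\s*(\d+)\.(\d+)", stderr_text)
    derived = None
    if cap:
        # Assumed mapping: 6.1 -> 610, inferred from the log rather than documented.
        derived = int(cap.group(1)) * 100 + int(cap.group(2)) * 10
    missing = [
        (module, int(version))
        for module, version in re.findall(
            r"cannot find image for module \[(.+?)\] for device version (\d+)",
            stderr_text,
        )
    ]
    return derived, missing


# Two lines copied from the stderr output above.
SAMPLE_STDERR = """\
# Capability : 6.1
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 610
"""

summary = summarise_stderr(SAMPLE_STDERR)
print(summary)
```

If the derived number matches the one in the FATAL line, the app has identified the card correctly and simply has no kernel image built for it.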

MJH
Project administrator
Project developer
Project scientist
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Message 44918 - Posted: 27 Oct 2016 | 20:53:03 UTC

914.

Retvari Zoltan
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Message 44919 - Posted: 27 Oct 2016 | 20:55:54 UTC - in response to Message 44918.

914.

It's working!

Dave Peachey
Joined: 16 May 09
Posts: 11
Credit: 131,226,034
RAC: 0
Message 44920 - Posted: 27 Oct 2016 | 21:07:00 UTC
Last modified: 27 Oct 2016 | 21:08:42 UTC

Task http://www.gpugrid.net/result.php?resultid=15436247 under v9.14 is 5:30mins in and 4% complete.

Agreed, seems to be working; excellent work Matt!

Cheers
Dave

Greger
Joined: 6 Jan 15
Posts: 76
Credit: 24,371,208,880
RAC: 11,202,738
Message 44921 - Posted: 27 Oct 2016 | 21:07:28 UTC

Success here too!

[B@P] Daniel
Joined: 17 Sep 16
Posts: 5
Credit: 382,453,727
RAC: 0
Message 44922 - Posted: 27 Oct 2016 | 21:12:31 UTC

Works for me. 7 mins 10 secs for Short runs WUs on Nvidia 1070, nice :)

cybersleauth
Joined: 30 Sep 11
Posts: 10
Credit: 119,766,724
RAC: 0
Message 44923 - Posted: 27 Oct 2016 | 21:14:10 UTC - in response to Message 44872.

How many g-flops do you get from the 1060?

bcavnaugh
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Message 44924 - Posted: 27 Oct 2016 | 21:14:53 UTC

Great News! Can't wait till I get some!
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44925 - Posted: 27 Oct 2016 | 21:15:22 UTC

sorry to spoil the party but I still have an endless loop here...........................


27.10.2016 23:18:26 | GPUGRID | Task e8s9_e2s4p0f2-SDOERR_CASP0X_crystal_contacts_1ns_a3D_0-0-1-RND5487_2 exited with zero status but no 'finished' file
27.10.2016 23:18:26 | GPUGRID | If this happens repeatedly you may need to reset the project.
27.10.2016 23:18:27 | GPUGRID | Task e8s9_e2s4p0f2-SDOERR_CASP0X_crystal_contacts_1ns_a3D_0-0-1-RND5487_2 exited with zero status but no 'finished' file
27.10.2016 23:18:27 | GPUGRID | If this happens repeatedly you may need to reset the project.
27.10.2016 23:18:28 | GPUGRID | Task e8s9_e2s4p0f2-SDOERR_CASP0X_crystal_contacts_1ns_a3D_0-0-1-RND5487_2 exited with zero status but no 'finished' file
27.10.2016 23:18:28 | GPUGRID | If this happens repeatedly you may need to reset the project.
27.10.2016 23:18:29 | GPUGRID | Task e8s9_e2s4p0f2-SDOERR_CASP0X_crystal_contacts_1ns_a3D_0-0-1-RND5487_2 exited with zero status but no 'finished' file
27.10.2016 23:18:29 | GPUGRID | If this happens repeatedly you may need to reset the project.
27.10.2016 23:18:30 | GPUGRID | Task e8s9_e2s4p0f2-SDOERR_CASP0X_crystal_contacts_1ns_a3D_0-0-1-RND5487_2 exited with zero status but no 'finished' file
27.10.2016 23:18:30 | GPUGRID | If this happens repeatedly you may need to reset the project.
27.10.2016 23:18:31 | GPUGRID | Task e8s9_e2s4p0f2-SDOERR_CASP0X_crystal_contacts_1ns_a3D_0-0-1-RND5487_2 exited with zero status but no 'finished' file
27.10.2016 23:18:31 | GPUGRID | If this happens repeatedly you may need to reset the project.
27.10.2016 23:18:32 | GPUGRID | Task e8s9_e2s4p0f2-SDOERR_CASP0X_crystal_contacts_1ns_a3D_0-0-1-RND5487_2 exited with zero status but no 'finished' file
27.10.2016 23:18:32 | GPUGRID | If this happens repeatedly you may need to reset the project.


____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 44926 - Posted: 27 Oct 2016 | 21:17:41 UTC - in response to Message 44925.

sorry to spoil the party but I still have an endless loop here...........................



Could you unhide your computers so I can see, please?

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44927 - Posted: 27 Oct 2016 | 21:19:06 UTC - in response to Message 44926.

sorry to spoil the party but I still have an endless loop here...........................



Could you unhide your computers so I can see, please?


Done.... :-)
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44928 - Posted: 27 Oct 2016 | 21:25:54 UTC

The app works perfectly on my first machine (Win7/64, i7-3770S, GTX 1080) but causes endless loops on my second (Win 8.1/64, i7-6700K, GTX 1070). The 1070 is identified as a 1080... but I don't know if this has something to do with it.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 44929 - Posted: 27 Oct 2016 | 21:26:21 UTC - in response to Message 44927.

JoergF - your 1080 machine looks fine, only the 1070 might be misbehaving. Try a project reset as indicated and see if that resolves it.

Matt

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 44930 - Posted: 27 Oct 2016 | 21:30:15 UTC

OK gang, looks to be working satisfactorily. I'll check back in tomorrow and follow up on any common failure modes.


Linux app to follow next week.

Matt

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44931 - Posted: 27 Oct 2016 | 21:32:14 UTC

Okay... I will try.

Besides, I hope that you have enough jobs for us to crunch... my 1080 just swallowed a "2-3 hour short run" in about 6 minutes! And that is still not 100% as it shows load oscillations. It simply is too fast for one CPU core feeding it.

Wow. What a great leap forward.

____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 44932 - Posted: 27 Oct 2016 | 21:32:40 UTC - in response to Message 44927.

Might be the older driver (368.39)?

The 372.54 driver works on your W7 system and a GTX1070 works for someone else.
For the record I can see that the 372.70 and 373.06 drivers also work for others.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44933 - Posted: 27 Oct 2016 | 21:37:08 UTC - in response to Message 44932.

Might be the older driver (368.39)?

The 372.54 driver works on your W7 system and a GTX1070 works for someone else.
For the record I can see that the 372.70 and 373.06 drivers also work for others.


Okay... stay tuned... I will update and try again.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

eXaPower
Send message
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Level
His
Message 44934 - Posted: 27 Oct 2016 | 21:37:12 UTC - in response to Message 44919.
Last modified: 27 Oct 2016 | 21:50:39 UTC

914.

It's working!

Multi GPU set-up working well - Looks like the app can handle high OC's. Currently at 2050MHz. (Completed CASP WU clean)
Upped clock to 2076MHz - so far so good with no simulation unstable messages.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44935 - Posted: 27 Oct 2016 | 21:50:24 UTC

Does anyone else have the same problem downloading the cufft64_80.dll? It stops transmission after 3-5% every time.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44936 - Posted: 27 Oct 2016 | 21:52:26 UTC - in response to Message 44935.

Does anyone else have the same problem downloading the cufft64_80.dll? It stops transmission after 3-5% every time.

It took over 30 minutes for mine to complete.
Each task takes 25-35 minutes to download.
I have seen this for over a year if not more on this Project.

____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44937 - Posted: 27 Oct 2016 | 21:52:32 UTC - in response to Message 44935.
Last modified: 27 Oct 2016 | 21:53:19 UTC

This new project's tasks take only 5 minutes to complete!!!!
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44938 - Posted: 27 Oct 2016 | 21:55:56 UTC - in response to Message 44932.

Might be the older driver (368.39)?

The 372.54 driver works on your W7 system and a GTX1070 works for someone else.
For the record I can see that the 372.70 and 373.06 drivers also work for others.


FYI, driver 373.06 is also the last driver that works for Folding@Home GPU tasks.
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44939 - Posted: 27 Oct 2016 | 21:56:55 UTC

yes... it is amazing. My 1080 needs to run 4 jobs in parallel and one CPU core per job feeding it. Otherwise it is not utilized fully. Wow.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44940 - Posted: 27 Oct 2016 | 21:59:15 UTC - in response to Message 44932.
Last modified: 27 Oct 2016 | 22:01:40 UTC

Might be the older driver (368.39)?

The 372.54 driver works on your W7 system and a GTX1070 works for someone else.
For the record I can see that the 372.70 and 373.06 drivers also work for others.


can you possibly allow some more tasks per machine because I want to check my driver update but already reached the daily limit of jobs... or have you already run out of work? Thanks...

28.10.2016 00:03:15 | GPUGRID | This computer has finished a daily quota of 4 tasks
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44941 - Posted: 27 Oct 2016 | 22:02:50 UTC - in response to Message 44939.

yes... it is amazing. My 1080 needs to run 4 jobs in parallel and one CPU core per job feeding it. Otherwise it is not utilized fully. Wow.


Testing this now:
<app_config>
  <app>
    <name>acemdbeta</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdshort</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

But downloads are super slow.
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44943 - Posted: 27 Oct 2016 | 22:07:19 UTC - in response to Message 44941.
Last modified: 27 Oct 2016 | 22:08:35 UTC

yes... it is amazing. My 1080 needs to run 4 jobs in parallel and one CPU core per job feeding it. Otherwise it is not utilized fully. Wow.


Testing this now:
<app_config>
  <app>
    <name>acemdbeta</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdshort</name>
    <gpu_versions>
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>

But downloads are super slow.


I would like to run 4 ... but I don't get more than 2. The log reads "reached a Limit on Tasks in Progress".

Besides, don't forget to specify the "max_concurrent" parameter in your above config.
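
For illustration only, here is roughly where that parameter would sit. This is a sketch, not a tested config: the value 2 is an assumption chosen to match the two tasks I am getting, and the usage values are simply copied from the config quoted above.

```xml
<app_config>
  <app>
    <name>acemdshort</name>
    <!-- Assumed value: let BOINC run at most 2 tasks of this app at once -->
    <max_concurrent>2</max_concurrent>
    <gpu_versions>
      <!-- Copied from the config quoted above -->
      <gpu_usage>0.5</gpu_usage>
      <cpu_usage>0.5</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

The same element would presumably be repeated in the acemdbeta block if the cap should apply there as well.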
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Message 44944 - Posted: 27 Oct 2016 | 22:09:46 UTC - in response to Message 44937.

This new project's tasks take only 5 minutes to complete!!!!

A lot of the tasks which have been referenced in this thread over the last couple of hours have been SDOERR_CASP11_crystal_contacts_1ns - and they only take 5-6 minutes on my GTX 970s with the older app, too. Don't attribute all aspects of runtime to the new app alone.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Message 44945 - Posted: 27 Oct 2016 | 22:12:25 UTC - in response to Message 44943.
Last modified: 27 Oct 2016 | 22:13:42 UTC

I would like to run 4 ... but I dont get more than 2. The log reads "reached a Limit on Tasks in Progress".

You can't have more than 2 (per GPU); this is GPUGrid policy to achieve fast turnarounds.
If you want to maximize GPU usage on a WDDM OS (Windows Vista, 7, 8, 8.1, 10) you should:
- crunch only 1 CPU task
- not crunch on the iGPU
- use the SWAN_SYNC environment variable to make the GPUGrid app use a full CPU thread
- use the app_config.xml posted earlier to run two WUs on a single GPU
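
As a rough sketch of the last point, a variation on the app_config.xml posted earlier might look like this. The cpu_usage of 1.0 is an assumption on my part (the earlier post used 0.5): it tells BOINC to budget a full CPU thread per GPU task. Note that these values only steer BOINC's scheduling; it is SWAN_SYNC that makes the app itself use the full thread.

```xml
<app_config>
  <app>
    <name>acemdshort</name>
    <gpu_versions>
      <!-- 0.5 GPU per task, so BOINC runs two WUs on a single GPU -->
      <gpu_usage>0.5</gpu_usage>
      <!-- Assumed value: budget one full CPU thread per GPU task -->
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```

The same block would be repeated for acemdbeta while the beta test is running.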

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44947 - Posted: 27 Oct 2016 | 22:17:14 UTC - in response to Message 44945.

I would like to run 4 ... but I dont get more than 2. The log reads "reached a Limit on Tasks in Progress".

You can't have more than 2 (per GPU), this is GPUGrid policy to achieve fast turnarounds.
If you want to maximize GPU usage on a WDDM os (Windows Vista, 7, 8, 8.1, 10) you should:
- crunch only 1 CPU task
- not crunch on the iGPU
- use SWAN_SYNC environmental value to make the GPUGrid app use a full CPU thread
- use the app_config.xml posted earlier to run two WU on a single GPU



I already run 2 GPU tasks on my 1080 and don't have any other iGPU or CPU jobs active. The Pascal is simply too fast, so GPUGRID should possibly consider allowing 4 concurrent tasks for the new app. Otherwise we will simply waste computing power through load oscillations.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44948 - Posted: 27 Oct 2016 | 22:17:38 UTC - in response to Message 44944.

This new project's tasks take only 5 minutes to complete!!!!

A lot of the tasks which have been referenced in this thread over the last couple of hours have been SDOERR_CASP11_crystal_contacts_1ns - and they only take 5-6 minutes on my GTX 970s with the older app, too. Don't attribute all aspects of runtime to the new app alone.

OK...


I am only hitting 75% of the GPU but that is ok too.

Going to the following, mostly for temps and steady tasks. This too is based on Retvari Zoltan's post.

<app_config>
  <app>
    <name>acemdbeta</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
  <app>
    <name>acemdshort</name>
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44949 - Posted: 27 Oct 2016 | 22:20:49 UTC - in response to Message 44945.

I would like to run 4 ... but I dont get more than 2. The log reads "reached a Limit on Tasks in Progress".

You can't have more than 2 (per GPU), this is GPUGrid policy to achieve fast turnarounds.
If you want to maximize GPU usage on a WDDM os (Windows Vista, 7, 8, 8.1, 10) you should:
- crunch only 1 CPU task
- not crunch on the iGPU
- use SWAN_SYNC environmental value to make the GPUGrid app use a full CPU thread
- use the app_config.xml posted earlier to run two WU on a single GPU



Then this is better
<gpu_usage>1.0</gpu_usage>
<cpu_usage>1.0</cpu_usage>


Than this?

<gpu_usage>1.0</gpu_usage>
<cpu_usage>0.5</cpu_usage>
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 44950 - Posted: 27 Oct 2016 | 22:22:22 UTC - in response to Message 44944.
Last modified: 27 Oct 2016 | 22:25:12 UTC

Some of Matt's Beta apps will take 5 times as long as others. It's just a Beta test - don't try to fine-tune your setups based on preliminary beta runs. Their purpose is to test, report and facilitate fixing. When it basically works, conclude nothing else.

Thanks for testing - you helped make it work!
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44951 - Posted: 27 Oct 2016 | 22:23:17 UTC - in response to Message 44949.

I would like to run 4 ... but I dont get more than 2. The log reads "reached a Limit on Tasks in Progress".

You can't have more than 2 (per GPU), this is GPUGrid policy to achieve fast turnarounds.
If you want to maximize GPU usage on a WDDM os (Windows Vista, 7, 8, 8.1, 10) you should:
- crunch only 1 CPU task
- not crunch on the iGPU
- use SWAN_SYNC environmental value to make the GPUGrid app use a full CPU thread
- use the app_config.xml posted earlier to run two WU on a single GPU



Then this is better
<gpu_usage>1.0</gpu_usage>
<cpu_usage>1.0</cpu_usage>


Than this?

<gpu_usage>1.0</gpu_usage>
<cpu_usage>0.5</cpu_usage>



I need 4 Tasks to keep the 1080 fully utilized I am afraid.

<gpu_usage>0.25</gpu_usage>
<cpu_usage>0.99</cpu_usage>

I see no alternative.

____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 44952 - Posted: 27 Oct 2016 | 22:26:36 UTC - in response to Message 44951.

It's a BETA test!
Runs with more atoms will task the cards to a greater extent...

I would like to run 4 ... but I dont get more than 2. The log reads "reached a Limit on Tasks in Progress".

You can't have more than 2 (per GPU), this is GPUGrid policy to achieve fast turnarounds.
If you want to maximize GPU usage on a WDDM os (Windows Vista, 7, 8, 8.1, 10) you should:
- crunch only 1 CPU task
- not crunch on the iGPU
- use SWAN_SYNC environmental value to make the GPUGrid app use a full CPU thread
- use the app_config.xml posted earlier to run two WU on a single GPU



Then this is better
<gpu_usage>1.0</gpu_usage>
<cpu_usage>1.0</cpu_usage>


Than this?

<gpu_usage>1.0</gpu_usage>
<cpu_usage>0.5</cpu_usage>



I need 4 Tasks to keep the 1080 fully utilized I am afraid.

<gpu_usage>0.25</gpu_usage>
<cpu_usage>0.99</cpu_usage>

I see no alternative.


____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 44953 - Posted: 27 Oct 2016 | 22:26:59 UTC - in response to Message 44951.

Can we keep this thread for problem reports please.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44954 - Posted: 27 Oct 2016 | 22:27:36 UTC - in response to Message 44951.
Last modified: 27 Oct 2016 | 22:28:38 UTC

I would like to run 4 ... but I dont get more than 2. The log reads "reached a Limit on Tasks in Progress".

You can't have more than 2 (per GPU), this is GPUGrid policy to achieve fast turnarounds.
If you want to maximize GPU usage on a WDDM os (Windows Vista, 7, 8, 8.1, 10) you should:
- crunch only 1 CPU task
- not crunch on the iGPU
- use SWAN_SYNC environmental value to make the GPUGrid app use a full CPU thread
- use the app_config.xml posted earlier to run two WU on a single GPU



Then this is better
<gpu_usage>1.0</gpu_usage>
<cpu_usage>1.0</cpu_usage>


Than this?

<gpu_usage>1.0</gpu_usage>
<cpu_usage>0.5</cpu_usage>



I need 4 Tasks to keep the 1080 fully utilized I am afraid.

<gpu_usage>0.25</gpu_usage>
<cpu_usage>0.99</cpu_usage>

I see no alternative.


Remember this though:
You can't have more than 2 (per GPU), this is GPUGrid policy to achieve fast turnarounds.

You should not run more than 2 tasks per GPU.
<gpu_usage>0.5</gpu_usage>
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44955 - Posted: 27 Oct 2016 | 22:28:54 UTC

OK.. do you know when the app (roughly) will be released officially?
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44956 - Posted: 27 Oct 2016 | 22:29:27 UTC - in response to Message 44953.

Can we keep this thread for problem reports please.


Sorry and Yes.

____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

Dave Peachey
Send message
Joined: 16 May 09
Posts: 11
Credit: 131,226,034
RAC: 0
Level
Cys
Message 44958 - Posted: 27 Oct 2016 | 22:33:25 UTC
Last modified: 27 Oct 2016 | 22:55:38 UTC

Echoing an earlier post, I'm still running one of Matt's test WUs (29-MJHARVEY_PASCALx4002-2-10-RND2572_0) on my GTX1060 (it's currently 1hr50m in for 75% completion) and it seems fine.

As a further test, I've now added a second SDOERR10_CASP10 running in parallel which seems to be OK as well. The test WU is still taking far longer than that conventional, non-test WU (as one might expect) but this seems to prove the stability of the v9.14 application.

Thanks Matt
Dave

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44959 - Posted: 27 Oct 2016 | 22:54:27 UTC - in response to Message 44940.

Might be the older driver (368.39)?

The 372.54 driver works on your W7 system and a GTX1070 works for someone else.
For the record I can see that the 372.70 and 373.06 drivers also work for others.


can you possibly allow some more tasks per machine because I want to check my driver update but already reached the daily limit of jobs... or have you already run out of work? Thanks...

28.10.2016 00:03:15 | GPUGRID | This computer has finished a daily quota of 4 tasks


sorry to ask this one again...

28.10.2016 00:48:51 | GPUGRID | This computer has finished a daily quota of 4 tasks

appears on my gtx1070 Computer only whereas the other one (with my GTX1080) gets one job after the other. Is there anything wrong in my config?
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

eXaPower
Send message
Joined: 25 Sep 13
Posts: 293
Credit: 1,897,601,978
RAC: 0
Level
His
Message 44960 - Posted: 27 Oct 2016 | 22:55:45 UTC

Trying to get BOINC to download a CUDA 6.5 app WU for (C.C 5.2) Maxwell - the server only gives (C.C 6.1) CUDA 8.0 app WUs, which error out on Maxwell. Hosts with multiple cards of different compute capability should keep a watchful eye.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Message 44961 - Posted: 27 Oct 2016 | 22:59:13 UTC - in response to Message 44959.

28.10.2016 00:48:51 | GPUGRID | This computer has finished a daily quota of 4 tasks

appears on my gtx1070 Computer only whereas the other one (with my GTX1080) gets one job after the other. Is there anything wrong in my config?

No, it's just the scheduler's self defense algorithm (to filter failing hosts from ruining entire batches).
You don't have to do anything, it will be lifted from your host in 24 hours automatically, please be patient.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Message 44962 - Posted: 27 Oct 2016 | 23:07:26 UTC - in response to Message 44960.

Trying to get BOINC to download a CUDA 6.5 app WU for (C.C 5.2) Maxwell - the server only gives (C.C 6.1) CUDA 8.0 app WUs, which error out on Maxwell. Hosts with multiple cards of different compute capability should keep a watchful eye.


Yes, if you have a host with a mix of Pascal and non-Pascal GPUs, you are going to have problems. Time to go shopping!

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Message 44963 - Posted: 27 Oct 2016 | 23:10:24 UTC - in response to Message 44961.

28.10.2016 00:48:51 | GPUGRID | This computer has finished a daily quota of 4 tasks

appears on my gtx1070 Computer only whereas the other one (with my GTX1080) gets one job after the other. Is there anything wrong in my config?

No, it's just the scheduler's self defense algorithm (to filter failing hosts from ruining entire batches).
You don't have to do anything, it will be lifted from your host in 24 hours automatically, please be patient.


Okay, thanks...
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile Skivelitis2
Avatar
Send message
Joined: 12 Mar 16
Posts: 1
Credit: 129,575,973
RAC: 0
Level
Cys
Message 44964 - Posted: 27 Oct 2016 | 23:42:25 UTC

Just received the following work unit for my GT 730:

63_MJHARVEY_PASCAL_x4002-1-10-RND9941_0

Host ID: 383850

Should I abort? I have since unchecked the option to run test apps.

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44965 - Posted: 28 Oct 2016 | 0:00:17 UTC

Should it take the Beta tasks 15 to 20 minutes to download a file that is less than 3MB?
It happens on every task that downloads.
I ask because the wait times between downloads are causing my CPU tasks to start up and then go into waiting status.

Or are the servers being overburdened by us members?
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

zioriga
Send message
Joined: 30 Oct 08
Posts: 46
Credit: 542,282,425
RAC: 3,246,377
Level
Lys
Message 44966 - Posted: 28 Oct 2016 | 5:34:11 UTC

And now, we wait for a regular batch of short WUs

When will there be good news about long WUs?

Arif Mert Kapicioglu
Send message
Joined: 26 May 10
Posts: 6
Credit: 597,131,550
RAC: 0
Level
Lys
Message 44967 - Posted: 28 Oct 2016 | 12:16:38 UTC

Hello,

MSI 1080 X 8G, only factory OC/boost, 2 WUs at the same time, GPU load 69%, Win7 x64, validated 80 WUs and no errors so far.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 511
Credit: 4,672,242,755
RAC: 0
Level
Arg
Message 44968 - Posted: 28 Oct 2016 | 12:19:14 UTC

Just had a short WU on my 1080 run for 9 hours and it was only 50% done. The GPU was idling. Other Pascal units had worked flawlessly. I aborted it and gave it a new one and it works again.

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Message 44969 - Posted: 28 Oct 2016 | 12:44:10 UTC - in response to Message 44968.

Just had a short WU on my 1080 run for 9 hours and it was only 50% done. The GPU was idling. Other Pascal units had worked flawlessly. I aborted it and gave it a new one and it works again.


Same problem here. 76% for long time, GPU and CPU idle.

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Message 44970 - Posted: 28 Oct 2016 | 14:23:11 UTC
Last modified: 28 Oct 2016 | 14:40:17 UTC

Strange, but restarting the computer helped. The previously idling WU is now completed and validated.

EDIT:

Effect:

15443859 11849617 28 Oct 2016 | 11:09:56 UTC 28 Oct 2016 | 14:20:55 UTC Completed and validated 4,786.27 942.96 1,500.00 Short runs (2-3 hours on fastest card) v9.14 (cuda80)
15443057 11848896 28 Oct 2016 | 11:09:56 UTC 28 Oct 2016 | 14:28:25 UTC Completed and validated 2,303.71 472.00 6,900.00 Short runs (2-3 hours on fastest card) v9.14 (cuda80)

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Message 44971 - Posted: 28 Oct 2016 | 14:58:30 UTC

From previous experience with other apps and other projects, simply suspending that single task for a few seconds (using BOINC Manager), and then resuming it again, is often enough to get it running properly. A bit like the procedure I'm using on GPUGrid stalled downloads at the moment...

Profile 2DJFcFTcRK5gHhxpndmdMJorY...
Avatar
Send message
Joined: 15 Nov 12
Posts: 10
Credit: 792,812,843
RAC: 0
Level
Glu
Message 44972 - Posted: 28 Oct 2016 | 15:10:10 UTC
Last modified: 28 Oct 2016 | 15:10:20 UTC

Looks like it works on my machine...
https://www.gpugrid.net/result.php?resultid=15446504

Profile bcavnaugh
Send message
Joined: 8 Nov 13
Posts: 56
Credit: 1,002,640,163
RAC: 0
Level
Met
Message 44973 - Posted: 28 Oct 2016 | 15:10:23 UTC - in response to Message 44971.
Last modified: 28 Oct 2016 | 15:13:17 UTC

From previous experience with other apps and other projects, simply suspending that single task for a few seconds (using BOINC Manager), and then resuming it again, is often enough to get it running properly. A bit like the procedure I'm using on GPUGrid stalled downloads at the moment...



The problem is that the download of an 800kb file will stall for up to 10 minutes before it starts downloading. While this happens, newer downloads start up for this project, stall in turn, and stop the first one from downloading.
I sat and watched this for an hour, and while I can have 4 tasks running at the same time I can maybe get 3 to run, so what happens is that the first four completed but no new tasks have completed their downloads.
While this happens a new CPU task will start up, and then the downloads from here complete and put the CPU task in Waiting to Run.
I have 14 Waiting to Run CPU tasks because of this.
It seems that you can only download the number of tasks you can run, and you are no longer allowed to have any tasks ready to run. This is the main issue here.
For the 130Kbs that I see for download speeds, we should be able to have 2 or even 3 times the number of tasks we are able to run.
This way, as a task completes the next one can start and does not have to wait 10 or 15 minutes for the next task to be downloaded before it can even start.

What I do not know is whether this is because it is a beta task, or whether this is now the normal way this project is going to work.
____________

Crunching@EVGA The Number One Team in the BOINC Community. Folding@EVGA The Number One Team in the Folding@Home Community.

Tomas Brada
Send message
Joined: 3 Nov 15
Posts: 38
Credit: 6,768,093
RAC: 0
Level
Ser
Scientific publications
wat
Message 44974 - Posted: 28 Oct 2016 | 15:39:50 UTC - in response to Message 44973.

The problem is that the download of an 800kb file will stall for up to 10 minutes before it starts downloading. While this happens, newer downloads start up for this project, stall in turn, and stop the first one from downloading.

The download speed and stalls are common problems on this project.
https://www.gpugrid.net/forum_thread.php?id=4399
https://www.gpugrid.net/forum_thread.php?id=4373

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 44975 - Posted: 28 Oct 2016 | 15:45:41 UTC - in response to Message 44973.

It seems that you can only download the number of tasks you can run, and you are no longer allowed to have any tasks ready to run. This is the main issue here.

This project limits the number of tasks that may be allocated by the Scheduler, but that's a different issue from the number of files which can be downloaded - which, as Tomas Brada says, is the subject of discussions in other threads.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 44979 - Posted: 28 Oct 2016 | 17:45:49 UTC

Well, I got my GTX 1050 Ti - last Asus 'Expedition' model (designed for 24/7 running) in the shop - and I've just fired it up.

I must say, all your hard work last night paid off - it was an absolute breeze. Installed hardware (tool-free case), installed driver, attached to project (new host) - was assigned two tasks, and both downloaded without a hitch. Even the 139 MB cufft64_80.dll only took 37 seconds:

28/10/2016 18:30:24 | GPUGRID | Started download of _cufft64_80.dll
28/10/2016 18:31:01 | GPUGRID | Finished download of _cufft64_80.dll

OK, it's probably a good idea to attempt that sort of thing after everybody else has switched their machines off for the weekend, but still impressive.
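
For a rough sense of the throughput, the two log lines above can be turned into a transfer rate. This is only a minimal Python sketch: it assumes the quoted 139 MB size is exact and that the log timestamps are in day/month/year order, and the helper name is made up for illustration.

```python
from datetime import datetime

def parse_log_time(line):
    """Return the timestamp at the start of a BOINC event log line."""
    stamp = line.split(" | ")[0].strip()
    # Assumption: the client logs dates as day/month/year.
    return datetime.strptime(stamp, "%d/%m/%Y %H:%M:%S")

started = "28/10/2016 18:30:24 | GPUGRID | Started download of _cufft64_80.dll"
finished = "28/10/2016 18:31:01 | GPUGRID | Finished download of _cufft64_80.dll"

seconds = (parse_log_time(finished) - parse_log_time(started)).total_seconds()
size_mb = 139  # size quoted in the post, assumed exact
rate_mb_per_s = size_mb / seconds
print(f"{seconds:.0f} s, about {rate_mb_per_s:.2f} MB/s")
```

That works out to a little under 4 MB/s, far above the download speeds reported elsewhere in this thread.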

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 44980 - Posted: 28 Oct 2016 | 18:15:34 UTC - in response to Message 44971.

From previous experience with other apps and other projects, simply suspending that single task for a few seconds (using BOINC Manager), and then resuming it again, is often enough to get it running properly.


I know, but this time it didn't help. Even restarting BM didn't help. But restarting the computer did. No idea why.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 44982 - Posted: 28 Oct 2016 | 18:46:37 UTC

Possible small bug-ette to investigate next week: I tried suspending a 'short' (CASP 5ns) task, to let an 'even shorter' (CASP 1ns) task run instead and overtake it in the queue. Although BOINC v7.6.33 issued the 'preempt' instruction, acemd.914-80 just kept on running.

I don't think that's critical, certainly not in the short term.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 44984 - Posted: 28 Oct 2016 | 19:03:12 UTC

Still I have problems with downloading cufft64_80.dll ... it takes several attempts and even hours to get it. But I cannot stop that by hand, as it would probably lock me out again for another day by the self defense algorithm. (sigh)
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

10esseeTony
Send message
Joined: 4 Feb 15
Posts: 8
Credit: 1,206,208,249
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 44986 - Posted: 28 Oct 2016 | 19:30:08 UTC - in response to Message 44984.

Try going to Activity and turning Network to Suspend, then turn back on after a few seconds.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 44987 - Posted: 28 Oct 2016 | 20:14:37 UTC - in response to Message 44986.

Try going to Activity and turning Network to Suspend, then turn back on after a few seconds.


no way ... I have to wait another 2 hours.

28.10.2016 22:16:51 | GPUGRID | Temporarily failed download of _cufft64_80.dll: transient HTTP error
28.10.2016 22:16:51 | GPUGRID | Backing off 02:17:53 on download of _cufft64_80.dll
28.10.2016 22:16:52 | | Project communication failed: attempting access to reference site
28.10.2016 22:16:54 | | Internet access OK - project servers may be temporarily down.
28.10.2016 22:17:24 | | Suspending network activity - user request
28.10.2016 22:17:29 | | Resuming network activity


____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 44990 - Posted: 28 Oct 2016 | 20:43:01 UTC - in response to Message 44987.

28.10.2016 22:17:24 | | Suspending network activity - user request
28.10.2016 22:17:29 | | Resuming network activity

and then you click the 'retry now' button...

Actually, the "suspend/resume" route is for when you happen to notice it while the download is still 'active' - though making no progress. If you've already reached 'backing off', just retry it without the extra work.
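
To see what is at stake when a download is 'backing off', the quoted log line can be used to work out when the client would next try on its own. A minimal Python sketch, assuming the log format shown above (day.month.year timestamps, fields separated by '|'); it only does the date arithmetic and does not talk to the BOINC client:

```python
from datetime import datetime, timedelta

line = "28.10.2016 22:16:51 | GPUGRID | Backing off 02:17:53 on download of _cufft64_80.dll"

# Split into timestamp, project and message fields.
stamp, _project, message = [part.strip() for part in line.split("|")]
logged_at = datetime.strptime(stamp, "%d.%m.%Y %H:%M:%S")

# The third word of the message is the back-off interval as HH:MM:SS.
hours, minutes, secs = (int(x) for x in message.split()[2].split(":"))
backoff = timedelta(hours=hours, minutes=minutes, seconds=secs)

next_attempt = logged_at + backoff
print("Next automatic attempt:", next_attempt)
```

Left alone, that download would not be retried until after midnight, which is why retrying by hand is worth the click.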

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 44993 - Posted: 29 Oct 2016 | 0:00:37 UTC - in response to Message 44990.

28.10.2016 22:17:24 | | Suspending network activity - user request
28.10.2016 22:17:29 | | Resuming network activity

and then you click the 'retry now' button...

Actually, the "suspend/resume" route is for when you happen to notice it while the download is still 'active' - though making no progress. If you've already reached 'backing off', just retry it without the extra work.


Seems to work ... for now. Thanks.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

DigitalParasite
Send message
Joined: 20 Jul 16
Posts: 3
Credit: 479,881,429
RAC: 0
Level
Gln
Scientific publications
watwatwatwatwatwat
Message 44994 - Posted: 29 Oct 2016 | 0:11:11 UTC

Just got home and checked, seems to be working fine with my 1070 card:
Short runs (2-3 hours on fastest card) v9.14 (cuda80)

Good job!

xixou
Send message
Joined: 8 Jun 14
Posts: 18
Credit: 19,804,091
RAC: 0
Level
Pro
Scientific publications
watwatwat
Message 44998 - Posted: 29 Oct 2016 | 8:33:38 UTC - in response to Message 44994.
Last modified: 29 Oct 2016 | 8:54:21 UTC

works great on gtx 1070, 4 min for a short run:


https://youtu.be/VtI24iKLozU

10esseeTony
Send message
Joined: 4 Feb 15
Posts: 8
Credit: 1,206,208,249
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 45003 - Posted: 29 Oct 2016 | 14:26:17 UTC

I began crunching after 914 it seems, and haven't had any trouble, except for one small detail, but I believe this is more of a general issue with ANY computing task:

I asked one task to suspend and it was showing as suspended in the BOINC manager, but continued to crunch for perhaps 45-60 seconds.

mmonnin
Send message
Joined: 2 Jul 16
Posts: 337
Credit: 7,765,428,051
RAC: 10,112,111
Level
Tyr
Scientific publications
watwatwatwatwat
Message 45006 - Posted: 29 Oct 2016 | 15:07:22 UTC - in response to Message 44979.

Well, I got my GTX 1050 Ti - last Asus 'Expedition' model (designed for 24/7 running) in the shop - and I've just fired it up.

I must say, all your hard work last night paid off - it was an absolute breeze. Installed hardware (tool-free case), installed driver, attached to project (new host) - was assigned two tasks, and both downloaded without a hitch. Even the 139 MB cufft64_80.dll only took 37 seconds:

28/10/2016 18:30:24 | GPUGRID | Started download of _cufft64_80.dll
28/10/2016 18:31:01 | GPUGRID | Finished download of _cufft64_80.dll

OK, it's probably a good idea to attempt that sort of thing after everybody else has switched their machines off for the weekend, but still impressive.


Are you using the 375.63 or .70 driver? Some other DC projects and games don't like 375.63 at all.

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45008 - Posted: 29 Oct 2016 | 16:37:56 UTC

When can we expect a Linux app?

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45009 - Posted: 29 Oct 2016 | 17:34:18 UTC - in response to Message 45006.

Well, I got my GTX 1050 Ti - last Asus 'Expedition' model (designed for 24/7 running) in the shop - and I've just fired it up.

I must say, all your hard work last night paid off - it was an absolute breeze. Installed hardware (tool-free case), installed driver, attached to project (new host) - was assigned two tasks, and both downloaded without a hitch. Even the 139 MB cufft64_80.dll only took 37 seconds:

28/10/2016 18:30:24 | GPUGRID | Started download of _cufft64_80.dll
28/10/2016 18:31:01 | GPUGRID | Finished download of _cufft64_80.dll

OK, it's probably a good idea to attempt that sort of thing after everybody else has switched their machines off for the weekend, but still impressive.

Are you using the 375.63 or .70 driver? Some other DC projects and games don't like 375.63 at all.

375.70 - host 388426

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45010 - Posted: 29 Oct 2016 | 17:46:13 UTC

Has anyone received a 20ns task on a Pascal GPU?
I'm asking because even though the 20ns tasks are in the short queue I'm getting only 1ns and 5ns tasks on my GTX 1080.

Greger
Send message
Joined: 6 Jan 15
Posts: 76
Credit: 24,371,208,880
RAC: 11,202,738
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwat
Message 45012 - Posted: 29 Oct 2016 | 18:27:32 UTC - in response to Message 45008.

Linux app to follow next week. /Matt

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45014 - Posted: 29 Oct 2016 | 18:41:59 UTC - in response to Message 45010.

Has anyone received a 20ns task on a Pascal GPU?
I'm asking because even though the 20ns tasks are in the short queue I'm getting only 1ns and 5ns tasks on my GTX 1080.

I'm getting 20ns on the long queue now - I think they were moved.

xixou
Send message
Joined: 8 Jun 14
Posts: 18
Credit: 19,804,091
RAC: 0
Level
Pro
Scientific publications
watwatwat
Message 45016 - Posted: 29 Oct 2016 | 18:59:25 UTC - in response to Message 45010.

Has anyone received a 20ns task on a Pascal GPU?
I'm asking because even though the 20ns tasks are in the short queue I'm getting only 1ns and 5ns tasks on my GTX 1080.


I get 5ns one on gtx 1070.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45019 - Posted: 29 Oct 2016 | 19:48:34 UTC - in response to Message 45010.
Last modified: 29 Oct 2016 | 19:55:35 UTC

I'm asking because even though the 20ns tasks are in the short queue I'm getting only 1ns and 5ns tasks on my GTX 1080.

They've been in the long queue for about the last 4 days, and a nice thumbs up for that. :-)

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45021 - Posted: 29 Oct 2016 | 20:24:43 UTC - in response to Message 45019.

I'm asking because even though the 20ns tasks are in the short queue I'm getting only 1ns and 5ns tasks on my GTX 1080.

They've been in the long queue for about the last 4 days, and a nice thumbs up for that. :-)

Then the server status page / detailed computing status table lists these 20ns units in the wrong queue:
Short runs (2-3 hours on fastest card)
Application | unsent | in progress | success | error rate
MJHARVEY_PASCALx400 | 0 | 56 | 715 | 26.52%
SDOERR_CASP0X_crystal_contacts_1ns_a3D_ | 0 | 37 | 781 | 8.44%
SDOERR_CASP0X_crystal_contacts_1ns_ntl9_ | 0 | 33 | 759 | 10.5%
SDOERR_CASP0X_crystal_contacts_5ns_a3D_ | 0 | 29 | 620 | 3.88%
SDOERR_CASP0X_crystal_contacts_5ns_ntl9_ | 0 | 35 | 683 | 3.8%
SDOERR_CASP0X_crystal_ss_1ns_a3D_ | 0 | 34 | 808 | 6.37%
SDOERR_CASP0X_crystal_ss_1ns_ntl9_ | 0 | 42 | 666 | 7.11%
SDOERR_CASP0X_crystal_ss_5ns_a3D_ | 4 | 34 | 541 | 5.42%
SDOERR_CASP0X_crystal_ss_5ns_ntl9_ | 5 | 46 | 712 | 4.56%
SDOERR_CASP0X_crystal_ss_contacts_1ns_a3D_ | 0 | 39 | 872 | 3.54%
SDOERR_CASP0X_crystal_ss_contacts_1ns_ntl9_ | 0 | 34 | 914 | 5.58%
SDOERR_CASP0X_crystal_ss_contacts_5ns_a3D_ | 0 | 34 | 542 | 5.41%
SDOERR_CASP0X_crystal_ss_contacts_5ns_ntl9_ | 0 | 35 | 736 | 4.17%
SDOERR_CASP10_crystal_contacts_1ns_a3D_ | 0 | 33 | 913 | 6.17%
SDOERR_CASP10_crystal_contacts_1ns_ntl9_ | 0 | 30 | 1761 | 11.28%
SDOERR_CASP10_crystal_contacts_20ns_ntl9_ | 0 | 38 | 686 | 9.97%
SDOERR_CASP10_crystal_contacts_5ns_a3D_ | 0 | 39 | 632 | 6.09%
SDOERR_CASP10_crystal_contacts_5ns_ntl9_ | 0 | 34 | 1211 | 8.47%
SDOERR_CASP10_crystal_ss_1ns_a3D_ | 0 | 33 | 1290 | 7.73%
SDOERR_CASP10_crystal_ss_1ns_ntl9_ | 0 | 35 | 1441 | 10.94%
SDOERR_CASP10_crystal_ss_20ns_ntl9_ | 0 | 46 | 628 | 13.38%
SDOERR_CASP10_crystal_ss_5ns_a3D_ | 0 | 41 | 687 | 10.2%
SDOERR_CASP10_crystal_ss_5ns_ntl9_ | 0 | 42 | 1057 | 13.29%
SDOERR_CASP11_crystal_contacts_1ns_a3D_ | 0 | 24 | 1172 | 9.71%
SDOERR_CASP11_crystal_contacts_1ns_ntl9_ | 0 | 27 | 1781 | 11.13%
SDOERR_CASP11_crystal_contacts_20ns_ntl9_ | 0 | 43 | 688 | 12.13%
SDOERR_CASP11_crystal_contacts_5ns_a3D_ | 0 | 28 | 685 | 9.99%
SDOERR_CASP11_crystal_contacts_5ns_ntl9_ | 0 | 28 | 1310 | 13.07%
SDOERR_CASP11_crystal_ss_1ns_a3D_ | 0 | 36 | 1206 | 10.53%
SDOERR_CASP11_crystal_ss_1ns_ntl9_ | 0 | 30 | 1616 | 13.02%
SDOERR_CASP11_crystal_ss_20ns_ntl9_ | 2 | 43 | 718 | 10.36%
SDOERR_CASP11_crystal_ss_5ns_a3D_ | 4 | 41 | 700 | 7.89%
SDOERR_CASP11_crystal_ss_5ns_ntl9_ | 0 | 50 | 1190 | 11.66%
SDOERR_CASP11_crystal_ss_contacts_1ns_a3D_ | 0 | 28 | 1197 | 8.77%
SDOERR_CASP11_crystal_ss_contacts_1ns_ntl9_ | 0 | 34 | 1274 | 9.71%
SDOERR_CASP11_crystal_ss_contacts_20ns_ntl9_ | 3 | 53 | 585 | 15.71%
SDOERR_CASP11_crystal_ss_contacts_5ns_a3D_ | 0 | 37 | 722 | 13.33%
SDOERR_CASP11_crystal_ss_contacts_5ns_ntl9_ | 0 | 41 | 1018 | 13.21%
SDOERR_CASP1XX_crystal_ss_contacts_1ns_a3D_ | 0 | 29 | 1213 | 7.33%
SDOERR_CASP1XX_crystal_ss_contacts_1ns_ntl9_ | 0 | 33 | 1387 | 10.46%
SDOERR_CASP1XX_crystal_ss_contacts_20ns_ntl9_ | 0 | 47 | 715 | 12.16%
SDOERR_CASP1XX_crystal_ss_contacts_5ns_a3D_ | 0 | 50 | 653 | 8.03%
SDOERR_CASP1XX_crystal_ss_contacts_5ns_ntl9_ | 0 | 48 | 1212 | 11.4%
SDOERR_CASP20M_crystal_contacts_5ns_a3D_ | 0 | 13 | 321 | 5.59%
SDOERR_CASP20M_crystal_ss_1ns_ntl9_ | 0 | 14 | 544 | 7.01%
SDOERR_CASP20M_crystal_ss_5ns_a3D_ | 0 | 12 | 323 | 14.55%
SDOERR_CASP20M_crystal_ss_5ns_ntl9_ | 0 | 11 | 410 | 6.39%
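
The 20ns batches are easy to pick out of a listing like this by name. A small Python sketch over a handful of rows copied from the listing above; the tuple layout (batch name followed by the first two numeric columns) is just an illustrative choice, and nothing here depends on what those columns mean:

```python
# A few rows copied from the short-queue listing:
# (batch name, first numeric column, second numeric column)
rows = [
    ("SDOERR_CASP10_crystal_contacts_1ns_ntl9_", 0, 30),
    ("SDOERR_CASP10_crystal_contacts_20ns_ntl9_", 0, 38),
    ("SDOERR_CASP10_crystal_ss_20ns_ntl9_", 0, 46),
    ("SDOERR_CASP11_crystal_ss_20ns_ntl9_", 2, 43),
    ("SDOERR_CASP11_crystal_ss_5ns_a3D_", 4, 41),
]

# Keep only the batches whose name carries the 20ns tag.
twenty_ns = [name for name, _first, _second in rows if "_20ns_" in name]

for name in twenty_ns:
    print(name)
```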

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 45025 - Posted: 29 Oct 2016 | 21:23:21 UTC

one question .. the short runs have the app name 'acemdshort' in the client config. But which name do the long ones have?

____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45026 - Posted: 29 Oct 2016 | 22:19:30 UTC - in response to Message 45025.

one question .. the short runs have the app name 'acemdshort' in the client config. But which name do the long ones have?

acemdlong

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 45027 - Posted: 29 Oct 2016 | 22:48:56 UTC - in response to Message 45026.

one question .. the short runs have the app name 'acemdshort' in the client config. But which name do the long ones have?

acemdlong


Well....... Boinc says

GPUGRID: Notice from BOINC
Your app_config.xml file refers to an unknown application 'acemdlong'. Known applications: 'acemdshort'
29.10.2016 23:22:58
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45028 - Posted: 29 Oct 2016 | 23:02:19 UTC - in response to Message 45027.

If that's the GTX 1070 machine you attached to the project for the first time on 24 Sep 2016, it won't have been assigned any long-queue tasks yet, until this testing is complete and the application is released on the long queue as well.

Your BOINC client only learns about acemdlong when it is assigned the first long task and downloads the application. The notice will go away at that point, and can be ignored in the meantime.
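
For anyone setting this up from scratch, here is a hedged Python sketch that writes out a two-application app_config.xml. The application names are the ones given in this thread; the element names follow the usual BOINC app_config.xml layout, and the usage values are placeholders, not recommendations.

```python
import xml.etree.ElementTree as ET

def build_app_config(app_names, gpu_usage=1.0, cpu_usage=1.0):
    """Build app_config.xml text with one <app> entry per application name.

    The usage values are placeholders chosen for illustration.
    """
    root = ET.Element("app_config")
    for app_name in app_names:
        app = ET.SubElement(root, "app")
        ET.SubElement(app, "name").text = app_name
        versions = ET.SubElement(app, "gpu_versions")
        ET.SubElement(versions, "gpu_usage").text = str(gpu_usage)
        ET.SubElement(versions, "cpu_usage").text = str(cpu_usage)
    return ET.tostring(root, encoding="unicode")

xml_text = build_app_config(["acemdshort", "acemdlong"])
print(xml_text)
```

An entry for acemdlong will keep producing the 'unknown application' notice until the client has actually been sent a long task.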

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 45029 - Posted: 29 Oct 2016 | 23:11:57 UTC

Okay, thanks...
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45033 - Posted: 30 Oct 2016 | 1:49:55 UTC - in response to Message 44974.


The download speed and stalls are common problems on this project.
https://www.gpugrid.net/forum_thread.php?id=4399
https://www.gpugrid.net/forum_thread.php?id=4373


Yes they are and why they haven't been fixed is beyond ridiculous and reflects very badly on this project IMHO.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45036 - Posted: 30 Oct 2016 | 10:09:32 UTC - in response to Message 45033.

20ns tasks are in the Long queue, but not for Pascals.
The Pascals need the 64bit Windows 9.14 (cuda80) application which hasn't been uploaded into the Long queue yet. See the Apps page.

As the Pascal app is working as well as the apps for Maxwell and previous generations (at processing the short queue tasks), I don't see any reason not to upload the cuda8.0 app into the long queue too.
While you could make overlap/parallel run arguments for analysing the performances of the Pascals against the Maxwells for scientific acceptance that shouldn't preclude the use of the Pascal app in the Long queue now. Qualifying the apps would be retrospective anyway.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45038 - Posted: 30 Oct 2016 | 11:11:54 UTC - in response to Message 44969.
Last modified: 30 Oct 2016 | 11:13:32 UTC

Just had a short WU on my 1080 go for 9 hours and it was at 50% done. the GPU was idling. Other Pascal units had worked flawlessly. I aborted it and gave it a new one and it works again.


Same problem here. 76% for long time, GPU and CPU idle.


I still have this problem with maybe 10% of WUs. Only a restart helps - they start from 0% and crunch without problems. But I can't manually control it 24/7...

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45039 - Posted: 30 Oct 2016 | 11:46:34 UTC - in response to Message 45038.

Just had a short WU on my 1080 go for 9 hours and it was at 50% done. the GPU was idling. Other Pascal units had worked flawlessly. I aborted it and gave it a new one and it works again.


Same problem here. 76% for long time, GPU and CPU idle.


I still have this problem with maybe 10% of WUs. Only a restart helps - they start from 0% and crunch without problems. But I can't manually control it 24/7...

Have you tried suspending the stalled task and then resuming it? POEM used to do the same thing and suspending/resuming usually got the task moving again.

ericbe
Send message
Joined: 29 Sep 16
Posts: 6
Credit: 27,120,051
RAC: 0
Level
Val
Scientific publications
watwatwat
Message 45040 - Posted: 30 Oct 2016 | 11:58:53 UTC - in response to Message 45038.

Just had a short WU on my 1080 go for 9 hours and it was at 50% done. the GPU was idling. Other Pascal units had worked flawlessly. I aborted it and gave it a new one and it works again.


Same problem here. 76% for long time, GPU and CPU idle.


I still have this problem with maybe 10% of WUs. Only a restart helps - they start from 0% and crunch without problems. But I can't manually control it 24/7...


I had this same problem at least twice now. I tried suspending and resuming (task and/or project) and that didn't seem to work. Only a restart gets the task moving again.

ericbe
Send message
Joined: 29 Sep 16
Posts: 6
Credit: 27,120,051
RAC: 0
Level
Val
Scientific publications
watwatwat
Message 45041 - Posted: 30 Oct 2016 | 12:03:00 UTC - in response to Message 45010.

Has anyone received a 20ns task on a Pascal GPU?
I'm asking because even though the 20ns tasks are in the short queue I'm getting only 1ns and 5ns tasks on my GTX 1080.


I'm crunching a 20ns short right now, and completed another one a couple of hours ago on my GTX1060.

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45042 - Posted: 30 Oct 2016 | 12:47:43 UTC - in response to Message 45041.

Has anyone received a 20ns task on a Pascal GPU?
I'm asking because even though the 20ns tasks are in the short queue I'm getting only 1ns and 5ns tasks on my GTX 1080.

I'm crunching a 20ns short right now, and completed another one a couple of hours ago on my GTX1060.

That's one which was created and placed on the short queue five days ago, but the original volunteer never returned it: WU 11817373
But it'll help the scientists (and watchers) by showing them how the new application behaves on the longer tasks.

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45043 - Posted: 30 Oct 2016 | 12:50:47 UTC - in response to Message 45039.

Just had a short WU on my 1080 go for 9 hours and it was at 50% done. the GPU was idling. Other Pascal units had worked flawlessly. I aborted it and gave it a new one and it works again.


Same problem here. 76% for long time, GPU and CPU idle.


I still have this problem with maybe 10% of WUs. Only a restart helps - they start from 0% and crunch without problems. But I can't manually control it 24/7...

Have you tried suspending the stalled task and then resuming it? POEM used to do the same thing and suspending/resuming usually got the task moving again.


I did, it didn't help. Only restart does the trick.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45044 - Posted: 30 Oct 2016 | 14:18:49 UTC - in response to Message 45043.
Last modified: 30 Oct 2016 | 14:20:46 UTC

Have you tried suspending the stalled task and then resuming it? POEM used to do the same thing and suspending/resuming usually got the task moving again.

I did, it didn't help. Only restart does the trick.

It sounds like the card downclocked itself. It happens when you (or the factory) overclock your card too much.

[AF>P4G] anthony
Send message
Joined: 14 Mar 10
Posts: 14
Credit: 501,938,373
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45045 - Posted: 30 Oct 2016 | 16:15:27 UTC

Hello,

These apps work fine on my 1080 FE with a light OC (+135 MHz).

But I have a problem with downloading:
The download freezes, and I need to suspend network activity and then set it back to always enabled to finish the download.


Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45046 - Posted: 30 Oct 2016 | 17:24:30 UTC - in response to Message 45036.

As the Pascal app is working as well as the apps for Maxwell and previous generations (at processing the short queue tasks), I don't see any reason not to upload the cuda8.0 app into the long queue too.

+1000

While you could make overlap/parallel run arguments for analysing the performances of the Pascals against the Maxwells for scientific acceptance that shouldn't preclude the use of the Pascal app in the Long queue now. Qualifying the apps would be retrospective anyway.

I would like to add that the short queue will run out of work very soon if all the GTX 10x0 cards are receiving work only from this queue.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 45050 - Posted: 30 Oct 2016 | 20:04:22 UTC

Hello all,

Can some of you please confirm that suspend/resume is working OK?

Matt

Greger
Send message
Joined: 6 Jan 15
Posts: 76
Credit: 24,371,208,880
RAC: 11,202,738
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwat
Message 45051 - Posted: 30 Oct 2016 | 20:23:02 UTC - in response to Message 45050.
Last modified: 30 Oct 2016 | 20:29:20 UTC

Task suspend/resume works
GPU suspend/resume works

Casp0X_crystal_ss_1ns takes around 10 sec until it suspends, but the manager waits until it has stopped. It might be a checkpoint it wants to reach before stopping, due to its size.

Got only one 'unstable' error; at that time I was doing other things with the GPU while it was running, so the task may not be at fault for that error.
All others went fine.

(Since release of the beta: 984 valid and 1 error)

ericbe
Send message
Joined: 29 Sep 16
Posts: 6
Credit: 27,120,051
RAC: 0
Level
Val
Scientific publications
watwatwat
Message 45053 - Posted: 30 Oct 2016 | 20:51:19 UTC - in response to Message 45050.

I've kept an eye on it during the day, and task suspend/resume works. I had to do this at least four times.
My GTX1060 is always between 60 and 65° C and when the task is running it usually produces a 60-70% load on the GPU.

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 45054 - Posted: 30 Oct 2016 | 21:03:48 UTC

Ok, Pascal app is live on acemdlong now.

Greger
Send message
Joined: 6 Jan 15
Posts: 76
Credit: 24,371,208,880
RAC: 11,202,738
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwat
Message 45055 - Posted: 30 Oct 2016 | 21:31:49 UTC - in response to Message 45054.

Thanks Matt great work with the application.

Time to do some work with Pascal.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45056 - Posted: 30 Oct 2016 | 22:27:46 UTC - in response to Message 45054.
Last modified: 30 Oct 2016 | 22:28:34 UTC

Ok, Pascal app is live on acemdlong now.

Great news!
I've got one PABLO_SH2TRIPEP_T_TRI running.
GPU: GTX 1080@2000MHz, GDDR5X@4763MHz
GPU usage: 75~78%, GPU bus usage: 23~27%
GPU power: 56%
Estimated completion time: 13,684 sec
A similar workunit on a GTX980Ti under Windows XP took 12,052 sec (that is 13.5% faster)
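
The 13.5% figure can be checked in a couple of lines of Python, reading the two times as 13,684 and 12,052 seconds (an assumption about the decimal separator; the variable names are only for illustration):

```python
pascal_estimate_s = 13684  # GTX 1080 estimate, read as 13,684 seconds
maxwell_time_s = 12052     # GTX 980 Ti time, read as 12,052 seconds

# How much longer the GTX 1080 estimate is, relative to the GTX 980 Ti time.
extra_percent = (pascal_estimate_s / maxwell_time_s - 1) * 100
print(f"{extra_percent:.1f}% longer on the GTX 1080")
```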

Greger
Send message
Joined: 6 Jan 15
Posts: 76
Credit: 24,371,208,880
RAC: 11,202,738
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwat
Message 45057 - Posted: 30 Oct 2016 | 22:34:54 UTC

Suspending long workunits is no problem either.

Next target will be to test on Linux.

Revrnd
Send message
Joined: 21 Mar 16
Posts: 6
Credit: 76,105,375
RAC: 0
Level
Thr
Scientific publications
watwat
Message 45059 - Posted: 31 Oct 2016 | 8:08:28 UTC - in response to Message 45057.
Last modified: 31 Oct 2016 | 8:09:49 UTC

I just caught outta the corner of my eye something Trend-Micro picked up when starting a Pascal WU.

I tried to see what pinged Trend and where it copied it to but I can't seem to find in all the settings what it was and where it went. Hopefully Trend isn't coming up with false positives and it doesn't affect results.

Anyone else getting this on Antivirus programs? I may need to dig deeper and see if I can tell Trend to ignore anything coming / going outta my BOINC directories.

xixou
Send message
Joined: 8 Jun 14
Posts: 18
Credit: 19,804,091
RAC: 0
Level
Pro
Scientific publications
watwatwat
Message 45069 - Posted: 31 Oct 2016 | 16:34:10 UTC - in response to Message 45059.

I just caught outta the corner of my eye something Trend-Micro picked up when starting a Pascal WU.

I tried to see what pinged Trend and where it copied it to but I can't seem to find in all the settings what it was and where it went. Hopefully Trend isn't coming up with false positives and it doesn't affect results.

Anyone else getting this on Antivirus programs? I may need to dig deeper and see if I can tell Trend to ignore anything coming / going outta my BOINC directories.


No issue with windows defender.

Bacon
Send message
Joined: 17 Jun 16
Posts: 1
Credit: 54,496,198
RAC: 0
Level
Thr
Scientific publications
watwatwatwat
Message 45081 - Posted: 31 Oct 2016 | 20:53:30 UTC - in response to Message 44869.

Looking at my GPU with GPU Tweak from Asus, it looks like my GPU is only working at 60% of its capacity. I have a GTX1070 Strix. Feel free to contact me if needed.

Jonathan Bacon

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 45091 - Posted: 1 Nov 2016 | 10:25:39 UTC

Linux app is live now.

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 45092 - Posted: 1 Nov 2016 | 10:46:07 UTC - in response to Message 45069.

I just caught outta the corner of my eye something Trend-Micro picked up when starting a Pascal WU.

I tried to see what pinged Trend and where it copied it to but I can't seem to find in all the settings what it was and where it went. Hopefully Trend isn't coming up with false positives and it doesn't affect results.

Anyone else getting this on Antivirus programs? I may need to dig deeper and see if I can tell Trend to ignore anything coming / going outta my BOINC directories.


No issue with windows defender.


No issue with Kaspersky...
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45098 - Posted: 1 Nov 2016 | 20:03:33 UTC - in response to Message 45091.

Linux app is live now.



866999 GPUGRID 11/1/2016 2:58:03 PM Sending scheduler request: To fetch work.
867000 GPUGRID 11/1/2016 2:58:03 PM Requesting new tasks for CPU and Miner ASIC and NVIDIA GPU
867001 GPUGRID 11/1/2016 2:58:06 PM Scheduler request completed: got 0 new tasks
867002 GPUGRID 11/1/2016 2:58:06 PM No tasks sent
867003 GPUGRID 11/1/2016 2:58:06 PM No tasks are available for Short runs (2-3 hours on fastest card)
867004 GPUGRID 11/1/2016 2:58:06 PM No tasks are available for Long runs (8-12 hours on fastest card)

GTX 1060. Am I missing something?

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 45100 - Posted: 1 Nov 2016 | 21:36:33 UTC - in response to Message 45098.



GTX 1060. Am I missing something?


Working now.

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45102 - Posted: 1 Nov 2016 | 22:09:49 UTC - in response to Message 45100.



GTX 1060. Am I missing something?


Working now.

Thanks

mixolyd
Send message
Joined: 11 Oct 16
Posts: 3
Credit: 50,127,323
RAC: 0
Level
Thr
Scientific publications
watwat
Message 45106 - Posted: 2 Nov 2016 | 4:37:00 UTC - in response to Message 45102.

Seems to be working for me (GTX 1060) but GPU usage is only at 70%? I don't know if this is related but in BOINC it says Running 0.975 CPUs and 1 Nvidia GPU. I have a 4 core CPU so it should be using 1 full core no? I tried changing in GPUGRID "Maximum CPU for graphics" to 100 but no difference.

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45107 - Posted: 2 Nov 2016 | 11:05:20 UTC - in response to Message 45044.

Have you tried suspending the stalled task and then resuming it? POEM used to do the same thing and suspending/resuming usually got the task moving again.

I did, it didn't help. Only restart does the trick.

It sounds like the card downclocked itself. It happens when you (or the factory) overclock your card too much.


It is not overclocked. And the behaviour is really strange. Yesterday morning my computer started a long WU. After 4h it was 60% done, with a remaining time of 3h and some minutes. Everything great to this point, right? But two hours later it was still at 60% and the remaining time was just "-- --", with the GPU at 250MHz and idle. So I restarted BM; WCG started, but GPUGRID stayed at "waiting to start" for about an hour. So I restarted the computer, and it resumed crunching from 60%, this time without a problem. Until today, when I saw that the same WU was at 90% processed, remaining time "-- --", GPU idle... I restarted the computer again and the task finished and uploaded.

https://www.gpugrid.net/result.php?resultid=15494663

It was reported with a run time of 14h; in reality it was almost 25. I will update the driver to the newest version because I don't have any other ideas.

Matt
Avatar
Send message
Joined: 11 Jan 13
Posts: 216
Credit: 846,538,252
RAC: 0
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45108 - Posted: 2 Nov 2016 | 11:21:38 UTC

I've received a task for my 1080 on a Win7 x64 host. Currently at ~57%GPU usage and ~35% power level. About 70% complete after a little over two hours. The card has not boosted and is running at base clock of 1708MHz.

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45110 - Posted: 2 Nov 2016 | 11:45:38 UTC
Last modified: 2 Nov 2016 | 11:50:54 UTC

Manually managed the hanging download issues and ran a few tasks. 4 short tasks completed in 464-537 seconds but another listed as a short task took 13,764 seconds. What's up with that?
GTX 1060 at stock settings, Ubuntu 16.04 LTS, driver 367.35.

Upon further inspection it seems that the longer running task had 5 million steps where the other 4 had only 250k.
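
A rough way to compare those runs is to normalise by step count. The sketch below uses only the figures quoted above; it assumes that steps are comparable between the two kinds of task, which may not hold if the simulated systems differ in size, so treat the result as an indication only.

```python
# Figures as reported above (GTX 1060, Ubuntu 16.04, driver 367.35).
SHORT_STEPS = 250_000
SHORT_RUNTIMES_S = (464, 537)   # fastest and slowest of the four short tasks
LONG_STEPS = 5_000_000
LONG_RUNTIME_S = 13_764


def steps_per_second(steps, runtime_s):
    """Average simulation steps completed per second of run time."""
    return steps / runtime_s


def scaled_runtime(runtime_s, from_steps, to_steps):
    """Run time expected if the cost per step stayed the same."""
    return runtime_s * to_steps / from_steps


short_rates = [steps_per_second(SHORT_STEPS, t) for t in SHORT_RUNTIMES_S]
long_rate = steps_per_second(LONG_STEPS, LONG_RUNTIME_S)
expected_long = [scaled_runtime(t, SHORT_STEPS, LONG_STEPS) for t in SHORT_RUNTIMES_S]

print("short tasks, steps/s:", [round(r, 1) for r in short_rates])
print("5M-step task, steps/s:", round(long_rate, 1))
print("5M-step run time expected at short-task speed (s):", expected_long)
```

With 20 times as many steps, most of the extra run time is explained by the step count alone; the remainder would come from a higher cost per step in that particular batch.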

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45112 - Posted: 2 Nov 2016 | 13:12:58 UTC - in response to Message 45107.
Last modified: 2 Nov 2016 | 13:13:46 UTC

It is not overclocked. And the behaviour is really strange. Yesterday in the morning my computer started long WU. After 4h it was 60% done, remaining time 3h with some minutes. Everything great to this point, right? But two hours later it was still 60% and remaining time was just "-- --", GPU at 250Mhz and IDLE. So I restarted BM, WCG started, GPUGRID "waiting to start" for about an hour. So I restarted computer, it was crunching again from 60% but this time no problem. Until today when I saw that the same WU has 90% processed, remaining time "-- --", GPU idle... Restarted computer again and it was ended and uploaded.

https://www.gpugrid.net/result.php?resultid=15494663

Reported with run time of 14h, in reality it was almost 25. I will update the driver to the newest version cause I don't have any other ideas.

Looking at your task's stderr.txt, there are no "exit" messages, so it seems that either the ACEMD app stops (crashes), or it gets disconnected from the GPU. Do you have multiple user accounts on that PC? Switching users could cause such behavior (as the GPU is switched over to another user, while the BOINC manager still runs as the original user).
Is there anything suspicious in the BOINC manager's event log?

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45127 - Posted: 2 Nov 2016 | 23:21:37 UTC - in response to Message 45091.

Linux app is live now.

What driver version is needed to use the CUDA8.5 (Linux) app?

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45135 - Posted: 3 Nov 2016 | 11:35:12 UTC - in response to Message 45112.

It is not overclocked. And the behaviour is really strange. Yesterday in the morning my computer started long WU. After 4h it was 60% done, remaining time 3h with some minutes. Everything great to this point, right? But two hours later it was still 60% and remaining time was just "-- --", GPU at 250Mhz and IDLE. So I restarted BM, WCG started, GPUGRID "waiting to start" for about an hour. So I restarted computer, it was crunching again from 60% but this time no problem. Until today when I saw that the same WU has 90% processed, remaining time "-- --", GPU idle... Restarted computer again and it was ended and uploaded.

https://www.gpugrid.net/result.php?resultid=15494663

Reported with run time of 14h, in reality it was almost 25. I will update the driver to the newest version cause I don't have any other ideas.

Looking at your task's stderr.txt there's no "exit" messages, so it seems that either the ACEMD app stops (crashes), or it gets disconnected from the GPU. Do you have multiple user accounts on that PC? Switching users could cause such behavior (as the GPU is switched over to the another user, while the BOINC manager still runs as the original user).
Is there anything suspicious in the BOINC manager's event log?


I am the only user of this PC. And it happens even when I am working on it. Driver updated - no change, GPU clock lowered by 200MHz - no change. Folding@home works flawlessly. I have no idea what else I could do.

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45141 - Posted: 3 Nov 2016 | 17:32:14 UTC - in response to Message 45127.

Linux app is live now.

What driver version is needed to use the CUDA8.5 (Linux) app?

The newest I could find for Linux is 367.44. I'm currently using 367.35 and the completed tasks I have show v9.14 (CUDA 80) as the app. Don't know if the newest driver would make use of 8.5 but since I've had too many gnarly experiences with video driver updates I'm not going to play guinea pig on this one. Maybe someone who has already updated to 367.44 can chime in.

nanoprobe
Send message
Joined: 26 Feb 12
Posts: 184
Credit: 222,376,233
RAC: 0
Level
Leu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45142 - Posted: 3 Nov 2016 | 18:10:15 UTC - in response to Message 45135.


I am the only user of this PC. And it happens even when I am working on it. Driver updated - no change, GPU clock lowered by 200Mhz - no change. Folding@home works flawless. I have no idea what else I could do.

Windows at times has video driver timeout issues with XP, Vista, 7, 8 and 8.1 but that was with ATI cards. If you're using Win10 I have no experience with that OS. Never heard of it happening with Nvidia but that doesn't mean it couldn't. Try this and see if it helps. If it doesn't work it should not affect your computer in any way but to be super safe and eliminate "murphy's law" you could image your drive before you start if it would make you more comfortable. I've shown this fix many times to others without a single problem reported.

Copy and paste ALL of the text below as is into Notepad. Save the file as timeoutfix.reg (or another name if you like, as long as it ends in .reg). Now double click on it. You'll be prompted about adding entries to the registry. Just click yes on any prompts that appear. Now restart your computer. If your strange behavior stops then it was a timeout issue. If not then it's something else. If it doesn't work you can always go back and delete the reg entries if you're not comfortable leaving them there.

Windows Registry Editor Version 5.00

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Watchdog]
"DisableBugCheck"="1"

[HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Watchdog\Display]
"EaRecovery"="0"

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45143 - Posted: 3 Nov 2016 | 18:11:09 UTC - in response to Message 45127.
Last modified: 3 Nov 2016 | 20:06:34 UTC

Linux app is live now.

What driver version is needed to use the CUDA8.5 (Linux) app?

8.5 hasn't been released yet so it must be a typo, otherwise the app version would be different too.

nanoprobe reported using the 367.35 driver with his GTX 1060 on Ubuntu 16.04 LTS.
I've just installed 370.28 in the hope it will work with a 1060-3GB tomorrow...
OT I'm seeing more GPU utilization variation with the 370 driver than with my previous 361.42 driver on my 970 (85 to 92% GPU utilization).

NB: 375.10 Beta (Linux) brings support for the GeForce GTX 1050 & GeForce GTX 1050 Ti. On Windows 7/8.1/10 (x64) you will need the 375.63 or newer driver if you have a GTX1050 or 1050Ti (according to NV; though I wouldn't be surprised if slightly older drivers worked fine for the GPUGrid apps).
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Profile MJH
Project administrator
Project developer
Project scientist
Send message
Joined: 12 Nov 07
Posts: 696
Credit: 27,266,655
RAC: 0
Level
Val
Scientific publications
watwat
Message 45144 - Posted: 3 Nov 2016 | 19:28:08 UTC - in response to Message 45143.

8.5 was a typo. For Linux, cc>=6.0 and driver 367+ will get the 8.0 app

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45150 - Posted: 3 Nov 2016 | 23:39:01 UTC - in response to Message 45144.

Matt, What about mixed generation setups? Say a GTX1060 and a GTX980 in the same rig for example? How are apps/tasks allocated in such a setup?
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45151 - Posted: 4 Nov 2016 | 0:12:35 UTC - in response to Message 45150.

Matt, What about mixed generation setups? Say a GTX1060 and a GTX980 in the same rig for example? How are apps/tasks allocated in such a setup?

With extreme difficulty. If you look at the coproc data transmitted to the server in the sched_request_www.gpugrid.net.xml file, you won't find any mention of the GTX980 card. And if the server doesn't know it exists, then it can't make any tailored allocations to suit the lower spec card.

I was present in Budapest in September 2014 to hear David Anderson's keynote talk A Brief History of BOINC. As slide 52 of workshop_14.pdf indicates, the 'Coprocessor model' is one of the "things we need to change" - this is what they mean. But they haven't done it yet.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45191 - Posted: 5 Nov 2016 | 10:25:03 UTC - in response to Message 45151.

The GPUGrid apps read the GPU's directly, so in theory they can check if they should be running on said GPU (Pascal or earlier) or if a different app should be used (cuda 6.5). Just asking if that (or something similar) was done for the cuda 8.0app.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45205 - Posted: 5 Nov 2016 | 15:53:23 UTC - in response to Message 45191.

The GPUGrid apps read the GPU's directly, so in theory they can check if they should be running on said GPU (Pascal or earlier) or if a different app should be used (cuda 6.5). Just asking if that (or something similar) was done for the cuda 8.0app.

Unless the app can be supplied as some sort of 'fat binary' combining Cuda 8.0 and Cuda 6.5 paths, or as a separate 'launcher plus two alternative child apps', that doesn't get us any further forward.

Even given that the 8.0 app knows it is commencing to run on a card which "should" have been given a 6.5 app instead (which in effect it is doing now, by crashing), the BOINC framework provides no mechanism for transferring the task, either to run on a different card with the same app, or on the same card with a different app.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45214 - Posted: 5 Nov 2016 | 17:46:34 UTC - in response to Message 45205.
Last modified: 5 Nov 2016 | 17:51:37 UTC

As the app's 'Pascal only'; Matt, I'll assume mixed NV generation GPU setups will Not work!
An app_config based GPU/'queue' exclusion won't work because the cuda 8.0 app's populated all queues & app names are <name>Long runs (8-12 hours on fastest card)</name> and so on, they don't specify the CUDA version.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

Richard Haselgrove
Send message
Joined: 11 Jul 09
Posts: 1620
Credit: 9,096,883,853
RAC: 17,989,612
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45217 - Posted: 5 Nov 2016 | 19:04:11 UTC - in response to Message 45214.

As the app's 'Pascal only'; Matt, I'll asume mixed NV generation GPU setups will Not work!
An app_config based GPU/'queue' exclusion won't work because the cuda 8.0 app's populated all queues & app names are <name>Long runs (8-12 hours on fastest card)</name> and so on, they don't specify the CUDA version.

I don't have any mixed Pascal/Maxwell hosts, but I do have three GTX 970/750Ti combos. The 750Ti cards are a bit weak for the normal long queue cards, so I exclude GPUGrid from them completely, and let another project make use of them.

PappaLitto
Send message
Joined: 21 Mar 16
Posts: 511
Credit: 4,672,242,755
RAC: 0
Level
Arg
Scientific publications
watwatwatwatwatwatwatwat
Message 45218 - Posted: 5 Nov 2016 | 19:24:12 UTC

I'm having a big problem with Pascal work units: every few hours I come back to find my 1080 idling. It happens with long and short work units, and not with any particular doctorate's work either. I have to suspend and then resume the task for it to start again.

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45220 - Posted: 5 Nov 2016 | 20:01:17 UTC - in response to Message 45218.

I'm having a big problem on Pascal work units, every few hours I come back to find my 1080 idling. Long and short work units, no particular doctorate's work either. I have to suspend then resume for it to start again


So I am not the only one with this problem. Good to know.

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45233 - Posted: 7 Nov 2016 | 16:38:49 UTC - in response to Message 45220.

Haven't seen that on Linux and I don't know what else you're doing on the systems or what your Boinc & System settings are so I can't add much, other than point out that you're both using 375.70.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45234 - Posted: 7 Nov 2016 | 18:43:16 UTC - in response to Message 45233.

Haven't seen that on Linux and I don't know what else you're doing on the systems or what your Boinc & System settings are so I can't add much, other than point out that you're both using 375.70.


I updated my drivers to 375.70 because of this problem. There is no difference.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45238 - Posted: 7 Nov 2016 | 23:45:22 UTC - in response to Message 45091.
Last modified: 7 Nov 2016 | 23:46:24 UTC

Linux app is live now.

Please add the SWAN_SYNC=1 option to the new Linux app, as there is a significant performance loss without it on the Pascal cards. The proof is that no GTX 1080 has yet beaten my GTX 980Ti's on the performance chart, not even my own GTX 1080 (which runs under Linux now).

Profile skgiven
Volunteer moderator
Volunteer tester
Avatar
Send message
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45240 - Posted: 8 Nov 2016 | 22:42:04 UTC - in response to Message 45234.

Haven't seen that on Linux and I don't know what else you're doing on the systems or what your Boinc & System settings are so I can't add much, other than point out that you're both using 375.70.


I updated my drivers to 375.7 because of this problem. There is no difference.

This could be an app issue but I can't help you with that. I'm suggesting things that you can check/try. In the past there have been power issues (downclocking) that were corrected with new drivers and sometimes worked around using tweaks.

I suggest you check your system's power settings (sleep/suspend, hybrid sleep, suspend PCIe, screen saver). Also, what are your resource settings (Device Profile settings) at WCG? I know you can set them to the following:
Power Saving: Set my profile so that my computer's power settings can take effect
Minimum Impact: Set my profile so that it will have a minimal amount of impact on my computer.
If you change profile settings there, they apply here too (assuming your device is on the same profile), but if you have any local Boinc settings these will override your project profile settings.
____________
FAQ's

HOW TO:
- Opt out of Beta Tests
- Ask for Help

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 45244 - Posted: 10 Nov 2016 | 11:35:26 UTC

I have changed GPU to 560 Ti. Same driver, OS, BM and everything. The problem is gone. So I think that there is a problem with pascal app...

Profile [AF>Libristes] hermes
Send message
Joined: 11 Nov 16
Posts: 26
Credit: 710,087,297
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwat
Message 45251 - Posted: 12 Nov 2016 | 10:36:16 UTC
Last modified: 12 Nov 2016 | 10:50:05 UTC

Hi,

My conf:
Ubuntu 16.04
kernel: Linux 4.4.0-47-generic
CPU: Intel(R) Core(TM) i3-4170 CPU @ 3.70GHz [Family 6 Model 60 Stepping 3]
GPU: NVIDIA GeForce GTX 1060 3GB (3012MB)
NVIDIA Driver Version: 370.28
BOINC version: 7.6.31
Memory: 15996.81 MB

Well, it's working well: lots of GPU usage, little CPU. Maybe it's working too well.
GPU (nvidia-smi outputs):
# gpu pid type sm mem enc dec command
# Idx # C/G % % % % name
0 1028 C 93 31 0 0 acemd.914-80.bi

# gpu pwr temp sm mem enc dec mclk pclk
# Idx W C % % % % MHz MHz
0 94 60 93 31 0 0 3802 1949

CPU (4 cores):
%Cpu(s): 1,0 ut, 0,5 sy, 5,1 ni, 92,7 id, 0,6 wa, 0,0 hi, 0,1 si, 0,0 st

PID UTIL. PR NI VIRT RES SHR S %CPU %MEM TEMPS+ COM.
2730 boinc 30 10 12,873g 340636 149216 S 24,9 2,1 1:34.18 acemd.914-80.bin


Is it possible to limit the GPU usage a little? (Sorry, I'm a newbie on BOINC.)
I ask because my Firefox, which uses GPU acceleration, is having some difficulties...

But nice job !

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45252 - Posted: 12 Nov 2016 | 16:45:03 UTC - in response to Message 45238.
Last modified: 12 Nov 2016 | 16:48:57 UTC

Linux app is live now.

Please add the SWAN_SYNC=1 option to the new Linux app, as there is a significant performance loss without it on the Pascal cards. The proof of it that there is no GTX 1080 has beaten my GTX 980Ti's yet on the performance chart, not even my own GTX 1080 (which runs under Linux now).

After a couple of workunits processed, I've observed that my GTX 1080 (without SWAN_SYNC) can be up to 7% faster than my GTX 980Ti depending on the batch, but it's usually slower.
I still think there's a significant performance loss without SWAN_SYNC under Linux, and I can demonstrate this on a PABLO_SH2TRIPEP_L_TRI workunit:
Task 15543899 12.016 sec GTX 980Ti @1367MHz, WinXP, SWAN_SYNC on
Task 15569769 10.528 sec TITAN X (Pascal) @1531MHz (?), Linux, SWAN_SYNC off (87.6% of the GTX 980Ti's runtime, i.e. 14.1% faster)
However, it should be much faster.
Let's compare the theoretical computing indices of the two GPUs:
GTX 980Ti: 2816 CUDA cores * 1367MHz = 3.849.472
TITAN X (Pascal): 3584 CUDA cores * 1531MHz = 5.487.104
5.487.104 / 3.849.472 = 1.425
-> 42.5% faster, i.e. the runtime should be only 70.16% of the GTX 980Ti's (8.430 sec)
If I assume that the Pascal is boosting to its max (~2000MHz), then it's even worse:
GTX 980Ti: 2816 CUDA cores * 1367MHz = 3.849.472
TITAN X (Pascal): 3584 CUDA cores * 2000MHz = 7.168.000
7.168.000 / 3.849.472 = 1.862
-> 86.2% faster, i.e. the runtime should be only 53.7% of the GTX 980Ti's (6.453 sec)
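
For anyone who wants to repeat the comparison with other cards, the arithmetic above can be sketched as below. It assumes, as the post does, that throughput scales linearly with CUDA cores times clock, which ignores memory bandwidth and architectural differences; the function names are made up for illustration. Note that the run times in this thread use a dot as the thousands separator, so 12.016 sec is written as 12016 seconds in the code.

```python
def compute_index(cuda_cores, clock_mhz):
    """Theoretical computing index: CUDA cores multiplied by clock in MHz."""
    return cuda_cores * clock_mhz


def expected_runtime(ref_runtime_s, ref_index, new_index):
    """Run time of the new card if speed scaled linearly with the index."""
    return ref_runtime_s * ref_index / new_index


REF_RUNTIME_S = 12016        # GTX 980Ti, task 15543899
MEASURED_TITAN_S = 10528     # TITAN X (Pascal), task 15569769

gtx980ti = compute_index(2816, 1367)
titanx_1531 = compute_index(3584, 1531)
titanx_2000 = compute_index(3584, 2000)

for label, index in (("1531MHz", titanx_1531), ("2000MHz", titanx_2000)):
    ratio = index / gtx980ti
    runtime = expected_runtime(REF_RUNTIME_S, gtx980ti, index)
    print(f"TITAN X @{label}: index ratio {ratio:.3f}, "
          f"expected {runtime:.0f} s, measured {MEASURED_TITAN_S} s")
```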

xixou
Send message
Joined: 8 Jun 14
Posts: 18
Credit: 19,804,091
RAC: 0
Level
Pro
Scientific publications
watwatwat
Message 45753 - Posted: 15 Dec 2016 | 21:21:45 UTC

_cufft64_80.dll is still only 50% downloaded after 5 hours!
What is happening?
Can someone post me a link to the file?

xixou
Send message
Joined: 8 Jun 14
Posts: 18
Credit: 19,804,091
RAC: 0
Level
Pro
Scientific publications
watwatwat
Message 45754 - Posted: 16 Dec 2016 | 5:15:24 UTC - in response to Message 45753.

_cufft64_80.dll is downloading at 50% after 5h00 !
What is happening ?
Can somwone post me a link to the file ?


Still downloading after an extra night. Basically I cannot crunch GPUGRID tasks at the moment.

Profile caffeineyellow5
Avatar
Send message
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwat
Message 45756 - Posted: 16 Dec 2016 | 10:03:36 UTC - in response to Message 45753.

_cufft64_80.dll is downloading at 50% after 5h00 !
What is happening ?
Can somwone post me a link to the file ?

http://www.gpugrid.net/download/cufft64_80.dll
____________
1 Corinthians 9:16 "For though I preach the gospel, I have nothing to glory of: for necessity is laid upon me; yea, woe is unto me, if I preach not the gospel!"
Ephesians 6:18-20, please ;-)
http://tbc-pa.org

xixou
Send message
Joined: 8 Jun 14
Posts: 18
Credit: 19,804,091
RAC: 0
Level
Pro
Scientific publications
watwatwat
Message 45758 - Posted: 16 Dec 2016 | 17:45:34 UTC - in response to Message 45756.
Last modified: 16 Dec 2016 | 17:47:33 UTC

_cufft64_80.dll is downloading at 50% after 5h00 !
What is happening ?
Can somwone post me a link to the file ?

http://www.gpugrid.net/download/cufft64_80.dll


Thanks, but it does not help: Chrome stops downloading at 2.3 of 139 MB.
I have no issues with my internet connection.
Maybe Norton blocks it?
Actually not: with Norton off, I still have the same issue.

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45759 - Posted: 16 Dec 2016 | 18:06:36 UTC - in response to Message 45758.

Thanks but does not help, Chrome stops downloading at 2.3/139MBytes.
I have 0 issue with internet.
Maybe that Norton blocks it ?
Actually not, Norton off, still same issue.

You should try a download manager, or I can send you the file in 5-10MB pieces if you give me your email address.

Profile [VENETO] sabayonino
Send message
Joined: 4 Apr 10
Posts: 50
Credit: 645,641,596
RAC: 0
Level
Lys
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45760 - Posted: 16 Dec 2016 | 18:26:01 UTC

same here.
I got few Mb ... restart/resume download after a while...

Profile caffeineyellow5
Avatar
Send message
Joined: 30 Jul 14
Posts: 225
Credit: 2,658,976,345
RAC: 0
Level
Phe
Scientific publications
watwatwatwatwatwatwatwat
Message 45766 - Posted: 17 Dec 2016 | 0:17:23 UTC - in response to Message 45758.

_cufft64_80.dll is downloading at 50% after 5h00 !
What is happening ?
Can somwone post me a link to the file ?

http://www.gpugrid.net/download/cufft64_80.dll


Thanks but does not help, Chrome stops downloading at 2.3/139MBytes.
I have 0 issue with internet.
Maybe that Norton blocks it ?
Actually not, Norton off, still same issue.

That is the issue everyone is seeing with that file. I was able to get it over 7 minutes with BitComet, an HTTP/torrent downloader: it made 20 connections to the file and handled the timeouts better by fetching it in 20 pieces and putting them back together. That was with mirrors off too, so the file might also exist on mirrors of some kind, in addition to the one location I pointed out. It's worth a try with a download manager for sure.
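
What a download manager does here can be sketched as splitting the file into byte ranges and requesting each one separately. The snippet below only computes the ranges and the matching HTTP `Range` header; it does not download anything. It assumes the server honours range requests (not confirmed for this file), and the 139 MB size is the approximate figure mentioned earlier in the thread. The function names are hypothetical.

```python
def split_ranges(total_bytes, parts):
    """Return inclusive (start, end) byte offsets that cover the whole file."""
    if total_bytes <= 0 or parts <= 0:
        raise ValueError("total_bytes and parts must be positive")
    parts = min(parts, total_bytes)           # never create empty pieces
    base, extra = divmod(total_bytes, parts)  # first `extra` pieces get 1 more byte
    ranges = []
    start = 0
    for i in range(parts):
        size = base + (1 if i < extra else 0)
        ranges.append((start, start + size - 1))
        start += size
    return ranges


def range_header(start, end):
    """HTTP header asking for bytes start..end (both inclusive)."""
    return {"Range": f"bytes={start}-{end}"}


APPROX_SIZE = 139 * 1024 * 1024   # assumption: roughly 139 MB, as reported above
pieces = split_ranges(APPROX_SIZE, 20)
print(len(pieces), "pieces; first:", range_header(*pieces[0]),
      "last:", range_header(*pieces[-1]))
```

Each piece is then small enough that a stalled connection only costs a retry of that piece rather than of the whole file.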

Profile Retvari Zoltan
Avatar
Send message
Joined: 20 Jan 09
Posts: 2353
Credit: 16,304,090,074
RAC: 3,402,267
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45768 - Posted: 17 Dec 2016 | 1:16:56 UTC

I've put the file on OneDrive. You can access it here. You should unzip it to C:\ProgramData\BOINC\projects\www.gpugrid.net .

xixou
Send message
Joined: 8 Jun 14
Posts: 18
Credit: 19,804,091
RAC: 0
Level
Pro
Scientific publications
watwatwat
Message 45771 - Posted: 17 Dec 2016 | 8:10:01 UTC
Last modified: 17 Dec 2016 | 8:59:23 UTC

Maybe the new BOINC 7.6.33 is creating problems with the GPUGRID server?

BOINC still tries to download the file. There are several files where it keeps its download status, which is probably why it continues to download it...
Deleting the _ version does not help.

Edit: I selected RESET project for GPUGRID in BOINC and it does not try to download anymore, that should be ok now.

xixou
Send message
Joined: 8 Jun 14
Posts: 18
Credit: 19,804,091
RAC: 0
Level
Pro
Scientific publications
watwatwat
Message 45773 - Posted: 17 Dec 2016 | 9:35:36 UTC

mmm there is no short or long task available anymore.

Profile Beyond
Avatar
Send message
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Level
Tyr
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwatwat
Message 45777 - Posted: 17 Dec 2016 | 15:23:22 UTC - in response to Message 45771.

Maybe that the new boinc 7.6.33 is creating problems with the GPUGRID server ?

There are a number of threads concerning the slow downloads here. It has nothing to do with BOINC versions. There are also several threads on the lack of work available. Take a look at them. There's just not enough work and the monthly crunchathon or whatever it's called just makes things a lot worse. All it does is take the work from the people who regularly crunch here.

mixolyd
Send message
Joined: 11 Oct 16
Posts: 3
Credit: 50,127,323
RAC: 0
Level
Thr
Scientific publications
watwat
Message 46454 - Posted: 7 Feb 2017 | 8:10:07 UTC - in response to Message 45773.

mmm there is no short or long task available anymore.


Same here.
Any update? When will Pascal cards be officially supported?


____________

3de64piB5uZAS6SUNt1GFDU9d...
Avatar
Send message
Joined: 20 Apr 15
Posts: 285
Credit: 1,102,216,607
RAC: 0
Level
Met
Scientific publications
watwatwatwatwatwat
Message 46455 - Posted: 7 Feb 2017 | 8:20:59 UTC - in response to Message 46454.

mmm there is no short or long task available anymore.


Same here.
Any update? When will Pascal cards be officially supported?



Well, that is already the case. We just have the problem of being undersupplied with tasks, but that is a very general one.
____________
I would love to see HCF1 protein folding and interaction simulations to help my little boy... someday.

kain
Send message
Joined: 3 Sep 14
Posts: 152
Credit: 852,192,014
RAC: 2,323,751
Level
Glu
Scientific publications
watwatwatwatwatwatwatwatwatwatwatwat
Message 46627 - Posted: 10 Mar 2017 | 14:48:38 UTC



Still the same issue...
Everything works fine, then GPU utilization drops to 0-1%; the crunching time in BM keeps going up but the % does not. I have to manually restart BM. The time gets shortened, and the GPU is utilized properly until the next "lock". I'm really getting pissed off by this because I can't check every hour if everything is going fine. Luckily not every WU is affected.

I have updated the driver and tested other ideas from this and the other topic. No change. In the meantime I was using my GPU at Folding@home, without even a single problem. So I don't think that there is something wrong with my card.

Greger
Send message
Joined: 6 Jan 15
Posts: 76
Credit: 24,371,208,880
RAC: 11,202,738
Level
Trp
Scientific publications
watwatwatwatwatwatwatwatwatwatwat
Message 46629 - Posted: 10 Mar 2017 | 15:58:49 UTC