New CPU work units: 3HHM-1/ZINC


Author	Message
MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36115 - Posted: 6 Apr 2014 \| 10:38:57 UTC
	Hi gang, The testing for the CPU application is over and these are the first production WUs. In this project we are studying the PI3Kalpha, a mutation of which is implicated in tumor formation. You can read more about it, and see the structure, here: http://www.rcsb.org/pdb/explore.do?structureId=3HHM Over the course of this project we will be testing some 22 million commercially-available drug-like molecules, drawn from the ZINC database http://zinc.docking.org/, to find compounds which are predicted to bind strongly to the protein in a way which will inhibit its function. Once we have screened the whole database, we will take the best hits and test them for efficacy in a series of in vitro experiments. Hopefully we will find inhibitory compounds which can then serve as the basis for future drug development. Matt

Simba123 Send message Joined: 5 Dec 11 Posts: 147 Credit: 69,970,684 RAC: 0 Level Scientific publications	Message 36116 - Posted: 6 Apr 2014 \| 11:35:59 UTC - in response to Message 36115.
	Thanks for the information, looks exciting!

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36117 - Posted: 6 Apr 2014 \| 11:44:08 UTC - in response to Message 36115. Last modified: 6 Apr 2014 \| 11:48:46 UTC
	Can you please have a look at this host's tasks (my wife's laptop) -- the tasks seem to be erroring out. http://www.gpugrid.net/results.php?hostid=85944 Exit status -226 (0xffffffffffffff1e) ERR_TOO_MANY_EXITS ... near step: # (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt] 17:11:39 (1732): BOINC client no longer exists - exiting 17:11:39 (1732): timer handler: client dead, exiting
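
The exit status in the report above appears both as a signed number and as a hexadecimal value. A minimal sketch of how the two relate, assuming the hex string is the two's-complement encoding of the signed status at whatever width its digits imply (the helper name is made up for illustration):

```python
def signed_from_hex(hex_text: str) -> int:
    """Interpret a hex string as a two's-complement signed integer.

    The bit width is taken from the number of hex digits, so the
    result does not depend on how many leading 'f' digits were printed.
    """
    digits = hex_text.lower().removeprefix("0x")
    bits = 4 * len(digits)
    value = int(digits, 16)
    # If the top bit is set, the value represents a negative number.
    if value >= 1 << (bits - 1):
        value -= 1 << bits
    return value

status = signed_from_hex("0xffffffffffffff1e")
print(status)  # -226
```

So -226 and the hex value in the error report are the same exit status written two ways.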

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 36136 - Posted: 7 Apr 2014 \| 11:04:41 UTC
	I have set my preferences (all of which were lost when I saved one change) to do these CPU tasks as well. However, as these run quite fast, the remaining-time estimate for the GPU tasks is wrong. BOINC Manager now thinks my 770 and 780Ti can do a LR in 38 minutes, which would be amazingly awesome though :) ____________ Greetings from TJ

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 36138 - Posted: 7 Apr 2014 \| 11:45:14 UTC
	Watching more closely I see the following as well. Only about 6% of the CPU WU is done (in 15 minutes) and 5 minutes remain. Then, at about 8.5% done, the estimated time remaining is zero, and a few seconds later the WU jumps to 100% and is uploaded and reported home, without error. Strange behavior. ____________ Greetings from TJ

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 36158 - Posted: 8 Apr 2014 \| 13:20:07 UTC - in response to Message 36138.
	On Linux the progress just jumps from 0% to 100%, runtime varies from around 9min to 13min (for me). The estimated runtime for these CPU workunits has dropped from several hours to just over 2h. The GPUGrid Long GPU tasks estimates have also dropped to just over 2h. The estimated computation size is the same as the Long GPU tasks! ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 36161 - Posted: 8 Apr 2014 \| 14:21:57 UTC - in response to Message 36158.
	Thanks for the confirmation, skgiven. Proves my eyes are still okay :) ____________ Greetings from TJ

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36173 - Posted: 8 Apr 2014 \| 22:37:42 UTC - in response to Message 36117. Last modified: 8 Apr 2014 \| 23:33:18 UTC
	MJH: All of the GPUGrid CPU tasks are erroring out on these 2 hosts I manage: http://www.gpugrid.net/results.php?hostid=137361 http://www.gpugrid.net/results.php?hostid=85944 ... Do you have any idea what is happening with the repeated exits near the step mentioned above? I'll try unattaching and reattaching the project, but it would be nice if you could mention what might be causing this problem on 2 separate machines. Edit: Project unattach/reattach and resets did not help. Please help! I'm super excited about the possibility of having these non-GPU computers do work for your project, for the first time ever, but they need your help!

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36174 - Posted: 8 Apr 2014 \| 23:20:05 UTC - in response to Message 36173.
	The ones I spot-checked all ran for 11 seconds, failed with that 'client dead' message, and then seem to have re-tried to process with the same input file. You'll remember my recent email exchange with David A on the subject of 'ERR_TOO_MANY_EXITS', of course? What does the BOINC Message (event) log have to say about all those client deaths?

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36175 - Posted: 8 Apr 2014 \| 23:23:33 UTC - in response to Message 36174. Last modified: 8 Apr 2014 \| 23:29:09 UTC
	I doubt the Event Log will say much, but I'll try to monitor it sometime. From what I can see, this is an application error that GPUGrid (MJH) needs to fix. Each "attempt" is a waste of 10 CPU seconds, and you can see below just how many times each task is attempted before failure. It is literally wasting several minutes of CPU, for each task, only to fail. MJH: I hope you can solve this one, please! Looking at the timestamps below, I see: 01:22:53 ... 01:38:25 That's 16 minutes of wasted CPU, and the log file wasn't even captured entirely, as the introduction was truncated. So.... we're looking at 20+ minutes of wasted CPU per task. Please fix!
	Stderr output
	<core_client_version>7.2.42</core_client_version>
	<![CDATA[
	<message>
	too many exit(0)s
	</message>
	<stderr_txt>
	7236): BOINC client no longer exists - exiting
	01:22:53 (7236): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:23:03 (216): BOINC client no longer exists - exiting
	01:23:03 (216): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:23:14 (7032): BOINC client no longer exists - exiting
	01:23:14 (7032): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:23:25 (7480): BOINC client no longer exists - exiting
	01:23:25 (7480): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..]
	.....
	[.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:38:15 (6340): BOINC client no longer exists - exiting
	01:38:15 (6340): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:38:25 (7780): BOINC client no longer exists - exiting
	01:38:25 (7780): timer handler: client dead, exiting
	</stderr_txt>
	]]>
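
The "16 minutes" figure in the post above comes from comparing the first and last timestamps in the stderr output. A rough sketch of that check in Python, using only the six "client dead" lines that are actually visible in the posted log (the rest were elided in the original); the function name and the regular expression are illustrative assumptions, and the calculation assumes the log does not cross midnight:

```python
import re
from datetime import datetime

# Only the exit lines visible in the posted stderr; the elided middle is not reconstructed.
STDERR_EXCERPT = """\
01:22:53 (7236): timer handler: client dead, exiting
01:23:03 (216): timer handler: client dead, exiting
01:23:14 (7032): timer handler: client dead, exiting
01:23:25 (7480): timer handler: client dead, exiting
01:38:15 (6340): timer handler: client dead, exiting
01:38:25 (7780): timer handler: client dead, exiting
"""

EXIT_LINE = re.compile(r"(\d{2}:\d{2}:\d{2}) \((\d+)\): timer handler: client dead, exiting")

def summarize_exits(text):
    """Return (number of exit lines, seconds between first and last exit)."""
    stamps = [datetime.strptime(m.group(1), "%H:%M:%S") for m in EXIT_LINE.finditer(text)]
    if not stamps:
        return 0, 0
    # Assumes all timestamps fall on the same day.
    span = int((stamps[-1] - stamps[0]).total_seconds())
    return len(stamps), span

count, span = summarize_exits(STDERR_EXCERPT)
print(f"{count} exits visible, spanning {span // 60} min {span % 60} s")
```

The visible lines span a little over 15 and a half minutes, which matches the rounded figure quoted in the post.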

(retired account) Send message Joined: 22 Dec 11 Posts: 38 Credit: 28,606,255 RAC: 0 Level Scientific publications	Message 36176 - Posted: 9 Apr 2014 \| 3:57:11 UTC Last modified: 9 Apr 2014 \| 4:09:26 UTC
	Since we seem to have plenty of work available, would you please consider increasing the limit per host for CPU work? I get max. 16 workunits for an i7 with 8 threads. Since they need less than 30 min. on average, that's hardly one hour's worth of work. A cache of at least half a day would be nice (that's an approx. limit of 24 per thread in my case). I'm not always on with a fast connection. ;-)
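
The cache arithmetic in the request above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the poster's own figures (16 workunits, 8 threads, 30 minutes per workunit as an upper bound, half a day taken as 12 hours); the function names are made up for illustration:

```python
def cache_hours(tasks: int, threads: int, minutes_per_task: float) -> float:
    """Hours of work in the cache when all threads run tasks in parallel."""
    return tasks / threads * minutes_per_task / 60

def tasks_per_thread(hours: float, minutes_per_task: float) -> float:
    """Tasks each thread needs queued to stay busy for the given hours."""
    return hours * 60 / minutes_per_task

print(cache_hours(16, 8, 30))    # 1.0
print(tasks_per_thread(12, 30))  # 24.0
```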

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36177 - Posted: 9 Apr 2014 \| 8:44:37 UTC - in response to Message 36175.
	It doesn't seem to be universal. Some of the ones which fail go on to crash on every machine which tries them, but others run successfully. Look at WU 5573123: the first failure is from Jacob's list, but the task ran successfully on a machine with comparable specifications (i7, Windows 8.1 64-bit, BOINC v7.2.42). I'm trying to reproduce on host 93580, but so far the application is working properly (45% in 25 minutes, moving on from file to file as each segment is completed).

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36178 - Posted: 9 Apr 2014 \| 10:03:39 UTC - in response to Message 36176.
	"Since we seem to have plenty of work available, would you please consider increasing the limit per host for cpu work?" Looking into that now. Matt

localizer Send message Joined: 17 Apr 08 Posts: 113 Credit: 1,656,514,857 RAC: 0 Level Scientific publications	Message 36182 - Posted: 9 Apr 2014 \| 14:22:39 UTC - in response to Message 36176.
	+1

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 36193 - Posted: 10 Apr 2014 \| 9:16:09 UTC - in response to Message 36176.
	I should have increased it now. gianni

localizer Send message Joined: 17 Apr 08 Posts: 113 Credit: 1,656,514,857 RAC: 0 Level Scientific publications	Message 36194 - Posted: 10 Apr 2014 \| 9:28:23 UTC - in response to Message 36193. Last modified: 10 Apr 2014 \| 9:34:39 UTC
	Great - currently I have 36 CPU & 2 GPU tasks in my queue. Thanks

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36195 - Posted: 10 Apr 2014 \| 9:54:35 UTC - in response to Message 36194.
	Excellent. We ought to see a big improvement in throughput now! Matt

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36196 - Posted: 10 Apr 2014 \| 10:44:23 UTC Last modified: 10 Apr 2014 \| 10:45:08 UTC
	Any progress on solving the issue I'm seeing on multiple machines?

Vagelis Giannadakis Send message Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0 Level Scientific publications	Message 36197 - Posted: 10 Apr 2014 \| 11:10:43 UTC
	These WUs have some weird behavior. While they run perfectly fine, their Progress jumps abruptly from ~5% to 100%. This not only means that their estimated processing work is grossly miscalculated, it also messes up BOINC's estimations. I discovered yesterday evening that I had downloaded a GERARD while the currently processing NATHAN had several more hours of work ahead. Looking at the BOINC manager, the cause for this was evident: while the GPU WU needed several more hours on my GTX 650Ti, BOINC thought it just needed 30 more minutes! Just like that, I would have lost the credit bonus, because BOINC messed up its estimations! I assert that this is due to these CPU WUs and their weird Progress "jump". I've been crunching GPUGRID together with WCG on the same host for a year now and this had never happened. It started happening as soon as I suspended WCG and enabled CPU apps on GPUGRID. To give you a concrete example, this WU took just over 10.5 minutes, but BOINC estimated it to need more than 4.5 hours! Doesn't this affect estimations of future WUs? ____________

Jozef J Send message Joined: 7 Jun 12 Posts: 112 Credit: 1,140,895,172 RAC: 259,041 Level Scientific publications	Message 36203 - Posted: 10 Apr 2014 \| 14:22:37 UTC - in response to Message 36197.
	That's exactly what I found yesterday, and it is still happening. I have also had problems since I enabled the GPUGRID CPU tasks. Now I have just a few NATHAN tasks, and they are estimated to require 16 hours on a GTX 680. Yesterday I even had a few extra-long tasks; you can see them in my task list: I2010R1-SDOERR_BARNA-3-4-RND4946_0 6406280 62,142.59 GPU time 21,988.47 CPU time. Wow. We will be happy if they catch and solve these emerging issues :)

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36207 - Posted: 10 Apr 2014 \| 16:38:44 UTC - in response to Message 36196.
	"Any progress on solving the issue I'm seeing on multiple machines?" No, not yet, but it's on the list. Matt

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36208 - Posted: 10 Apr 2014 \| 16:40:35 UTC - in response to Message 36203.
	Jozef, Are you saying that the estimates for the CPU application are somehow interfering with the estimates for the other applications? If so, I would be inclined to chalk that up as a client bug, because it doesn't make any sense for there to be coupling between applications. Matt

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36209 - Posted: 10 Apr 2014 \| 16:47:47 UTC - in response to Message 36208. Last modified: 10 Apr 2014 \| 16:48:08 UTC
	GPUGrid Admins: Regarding estimates: GPUGrid uses "Duration Correction Factor", which I believe makes client-side estimation corrections globally for the entire project. It is not a "per-application" concept. World Community Grid stopped using it, I believe when they introduced GPU tasks, because the estimates varied so much. If you do not intend to provide very close runtime estimates for each application type, then you might need to turn off the global project "Use Duration Correction Factor" switch, which I believe is under your control. Richard H. knows more about this than I do.

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36212 - Posted: 10 Apr 2014 \| 17:35:45 UTC - in response to Message 36209.
	"Duration Correction Factor" Oh yes, I remember how much fun we had with that last year. Matt

Vagelis Giannadakis Send message Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0 Level Scientific publications	Message 36228 - Posted: 11 Apr 2014 \| 12:51:44 UTC
	It is not only the "Duration Correction Factor" that applies for the entire project, but also the miscalculated (on the part of the WU implementors) total processing work required for these WUs. The abrupt jump from ~5% to 100% is proof of the miscalculation. I believe that it is this "jump" that drives BOINC's estimations berserk. ____________

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36234 - Posted: 11 Apr 2014 \| 15:11:22 UTC - in response to Message 36228.
	Well, it's the global DCF which does the real damage to other applications within the project, and DCF is only adjusted on the basis of the total elapsed time for the task, and only on task completion at that. But I agree, the erratic jumps in progress %age would make it even harder to get an accurate <rsc_fpops_est> value for the tasks, and that's the fundamental driver for all these estimate problems. We are already using the 'APR' runtime estimation component of CreditNew here, so in theory the estimates should be normalised towards DCF=1.0000 by the server: but that's a very slow process, and DCF - which is a much faster-response mechanism - has already started fighting against it. We *could* disable DCF, as Jacob suggests, but that could apply a sudden shock to the system if applied while DCF is skewed. I'd suggest asking the few people who are running both CPU and NV to bear with the problems for a short time while the runtime estimates are fixed at source (not least, because that way we can watch and confirm that the solution has worked properly when finished). Then, set <dont_use_dcf>, to prevent it escaping again.
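
To illustrate why a single project-wide correction factor lets fast CPU tasks distort GPU estimates, here is a deliberately simplified sketch. It assumes the displayed estimate is the raw server-side estimate multiplied by DCF, and it updates DCF on task completion with plain exponential smoothing; the smoothing weight, the function names and the 8-hour GPU figure are hypothetical, and the real BOINC client's update rule is not reproduced here. The CPU figures (estimated 4.5 hours, finished in 10.5 minutes) are the ones reported earlier in the thread.

```python
def update_dcf(dcf: float, raw_estimate_s: float, elapsed_s: float, weight: float = 0.1) -> float:
    """Blend the observed elapsed/estimate ratio into the shared factor.

    Called once per completed task. The smoothing weight is an
    illustrative assumption, not the client's actual rule.
    """
    observed_ratio = elapsed_s / raw_estimate_s
    return (1 - weight) * dcf + weight * observed_ratio

def displayed_estimate_s(raw_estimate_s: float, dcf: float) -> float:
    """Estimate shown to the user: raw estimate scaled by the project-wide DCF."""
    return raw_estimate_s * dcf

CPU_RAW_ESTIMATE_S = 4.5 * 3600   # estimate reported in the thread
CPU_ELAPSED_S = 10.5 * 60         # actual runtime reported in the thread
GPU_RAW_ESTIMATE_S = 8 * 3600     # hypothetical GPU task with an accurate raw estimate

dcf = 1.0
for _ in range(20):               # twenty CPU tasks complete in a row
    dcf = update_dcf(dcf, CPU_RAW_ESTIMATE_S, CPU_ELAPSED_S)

gpu_estimate_h = displayed_estimate_s(GPU_RAW_ESTIMATE_S, dcf) / 3600
print(f"project DCF after 20 CPU tasks: {dcf:.3f}")
print(f"GPU task shown as {gpu_estimate_h:.1f} h instead of 8.0 h")
```

Because the factor is shared, every overestimated CPU task that completes drags the GPU estimate down with it, even though the GPU task's own raw estimate was correct in this toy setup.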

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36349 - Posted: 17 Apr 2014 \| 11:08:21 UTC - in response to Message 36207. Last modified: 17 Apr 2014 \| 11:11:49 UTC
	It's been another week, and this infuriating bug is still continually wasting tons of CPU time on the 2 laptops that I previously mentioned. Is there anything I can do to help or expedite the solution? I was really excited that they might get their first ever GPUGrid units processed, but so far, I've only been met with disappointment.

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36352 - Posted: 17 Apr 2014 \| 13:23:21 UTC - in response to Message 36349.
	Jacob, What's the problem here? The only laptop visible on your account - host 167515 - completed and validated a load of CPU tasks as recently as yesterday. The only errors visible date back to 27 March. Agreed, the two hosts 137361 and 85944 on the KSMooney account are still throwing errors, but according to the message log you posted, the problem on both those machines is: "BOINC client no longer exists - exiting" and "timer handler: client dead, exiting". It would seem to me that the first step in curing this needs to happen at your end: why does the client fail to run continuously? What messages, if any, can you retrieve from the Event Log or stdoutdae.txt? Do the machines run tasks from other projects without BOINC dying? Only if you can establish a causal link (and better yet, suggest a plausible mechanism) by which the GPUGrid CPU app is *responsible* for the failure of the client does it become the responsibility of the project to fix something. But I can't imagine what that 'something' could be: the app works for you, as it worked for me.

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36355 - Posted: 17 Apr 2014 \| 16:40:47 UTC - in response to Message 36352. Last modified: 17 Apr 2014 \| 16:49:34 UTC
	Sigh. All other projects work fine on those 2 laptops (KSMooney is my wife, and those 2 laptops are hers, and I manage them). The stderr is as you saw: on those 2 machines, the tasks get stuck in some sort of loop that wastes at least 20 minutes of CPU time per failed GPUGrid task. I believe the problem is something in the app, making it restart continuously. "Why does the client fail to run continuously?" is a bit troubling... because the client itself is not crashing/failing, from what I can tell. I'd love to say "the problem is on my end", or even to say "I have more info that further identifies the problem", but I can't. All I have is the stderr.txt you see. It might be helpful if the admins looked to see if any other hosts were having similar problems. I wish I knew how I could do that. Does this help? (It shows I was freshly attached, and the job repeatedly started until eventually giving up after wasting 20 minutes of CPU.) A quick scan for the text "exited with zero status but no 'finished' file" ... shows that it "failed" exactly 100 times. So, maybe knowing that loop limit somehow narrows down the problem?
	08-Apr-2014 18:43:41 [---] Attaching to http://www.gpugrid.net/
	08-Apr-2014 18:43:43 [http://www.gpugrid.net/] Master file download succeeded
	08-Apr-2014 18:43:49 [http://www.gpugrid.net/] Sending scheduler request: Project initialization.
	08-Apr-2014 18:43:49 [http://www.gpugrid.net/] Requesting new tasks for CPU
	08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
	08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
	08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/'
	08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
	08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/
	08-Apr-2014 18:43:52 [GPUGRID] Scheduler request completed: got 1 new tasks
	08-Apr-2014 18:43:54 [GPUGRID] Started download of gpugrid-vina-windows_intelx86.105
	08-Apr-2014 18:43:54 [GPUGRID] Started download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-input
	08-Apr-2014 18:43:54 [GPUGRID] Started download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-receptor
	08-Apr-2014 18:43:54 [GPUGRID] Started download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-ligand
	08-Apr-2014 18:43:58 [GPUGRID] Finished download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-input
	08-Apr-2014 18:43:58 [GPUGRID] Started download of logogpugrid.png
	08-Apr-2014 18:43:59 [GPUGRID] Finished download of logogpugrid.png
	08-Apr-2014 18:43:59 [GPUGRID] Started download of project_1.png
	08-Apr-2014 18:44:01 [GPUGRID] Finished download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-ligand
	08-Apr-2014 18:44:01 [GPUGRID] Started download of project_2.png
	08-Apr-2014 18:44:02 [GPUGRID] Finished download of project_1.png
	08-Apr-2014 18:44:02 [GPUGRID] Started download of project_3.png
	08-Apr-2014 18:44:03 [GPUGRID] Finished download of project_2.png
	08-Apr-2014 18:44:04 [GPUGRID] Finished download of project_3.png
	08-Apr-2014 18:44:07 [GPUGRID] Finished download of gpugrid-vina-windows_intelx86.105
	08-Apr-2014 18:44:07 [GPUGRID] Finished download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-receptor
	08-Apr-2014 18:44:07 [GPUGRID] Starting task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0
	08-Apr-2014 18:44:19 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file
	08-Apr-2014 18:44:19 [GPUGRID] If this happens repeatedly you may need to reset the project.
	08-Apr-2014 18:44:32 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file
	08-Apr-2014 18:44:32 [GPUGRID] If this happens repeatedly you may need to reset the project.
	08-Apr-2014 18:44:44 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file
	08-Apr-2014 18:44:44 [GPUGRID] If this happens repeatedly you may need to reset the project.
	08-Apr-2014 18:44:55 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file
	08-Apr-2014 18:44:55 [GPUGRID] If this happens repeatedly you may need to reset the project.
	08-Apr-2014 18:45:06 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file
	08-Apr-2014 18:45:06 [GPUGRID] If this happens repeatedly you may need to reset the project.
	08-Apr-2014 18:45:18 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file
	08-Apr-2014 18:45:18 [GPUGRID] If this happens repeatedly you may need to reset the project.
	08-Apr-2014 18:45:30 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file
	08-Apr-2014 18:45:30 [GPUGRID] If this happens repeatedly you may need to reset the project.
	08-Apr-2014 18:45:41 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file
	08-Apr-2014 18:45:41 [GPUGRID] If this happens repeatedly you may need to reset the project.
08-Apr-2014 18:45:53 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:45:53 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:05 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:05 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:17 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:17 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:24 [---] Suspending network activity - time of day 08-Apr-2014 18:46:28 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:28 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:39 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:39 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:51 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:51 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:02 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:02 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:13 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:13 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:47:24 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:24 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:35 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:35 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:48 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:48 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:59 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:59 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:48:11 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:11 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:48:23 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:23 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:48:36 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:36 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:48:48 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:48 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:48:59 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:59 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:49:12 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:49:12 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:49:24 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:49:24 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:49:36 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:49:36 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:49:47 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:49:47 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:50:00 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:00 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:50:12 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:12 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:50:24 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:24 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:50:35 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:35 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:50:48 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:48 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:01 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:01 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:15 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:15 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:29 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:29 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:43 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:43 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:55 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:55 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:52:07 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:07 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:52:22 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:22 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:52:34 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:34 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:52:45 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:45 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:52:58 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:58 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:53:10 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:53:10 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:53:22 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:53:22 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:53:37 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:53:37 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:53:52 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:53:52 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:54:04 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:04 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:54:15 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:15 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:54:27 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:27 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:54:38 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:38 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:54:49 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:49 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:00 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:00 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:12 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:12 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:23 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:23 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:55:34 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:34 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:46 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:46 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:57 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:57 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:09 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:09 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:20 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:20 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:31 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:31 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:43 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:43 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:54 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:54 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:57:05 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:05 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:57:16 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:16 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:57:28 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:28 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:57:39 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:39 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:57:50 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:50 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:01 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:01 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:13 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:13 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:24 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:24 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:58:35 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:35 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:47 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:47 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:58 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:58 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:09 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:09 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:20 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:20 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:31 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:31 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:42 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:42 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:54 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:54 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 19:00:06 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:06 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:00:17 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:17 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:00:29 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:29 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:00:40 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:40 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:00:52 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:52 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:03 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:03 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:14 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:14 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:25 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:25 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 19:01:37 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:37 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:48 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:48 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:59 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:59 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:10 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:10 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:21 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:21 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:33 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:33 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:44 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:44 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:56 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:56 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 19:03:08 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:03:08 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:03:22 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:03:22 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:03:37 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:03:37 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:03:52 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:03:52 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:04:07 [GPUGRID] Computation for task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 finished 08-Apr-2014 19:04:07 [GPUGRID] Output file 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0_0 for task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 absent
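For anyone who wants to repeat that scan on their own event log, here is a minimal Python sketch. It assumes the log has one entry per line (the paste above has lost its line breaks) and that each entry starts with a timestamp in the form shown above. The function names `restart_times` and `restart_gaps` are invented for this example and are not part of BOINC. It counts the failed restarts and measures the gap between them; in the log above the restarts come roughly 11 to 15 seconds apart.

```python
from datetime import datetime

# Text that the BOINC client logs each time the task exits prematurely.
MARKER = "exited with zero status but no 'finished' file"
# Timestamp layout seen in the log, e.g. "08-Apr-2014 18:44:19".
# Parsing the month abbreviation assumes an English locale.
STAMP_FORMAT = "%d-%b-%Y %H:%M:%S"


def restart_times(lines):
    """Return the timestamp of every log line that contains MARKER."""
    times = []
    for line in lines:
        if MARKER in line:
            # The first two whitespace-separated fields are date and time.
            stamp = " ".join(line.split()[:2])
            times.append(datetime.strptime(stamp, STAMP_FORMAT))
    return times


def restart_gaps(times):
    """Return the number of seconds between consecutive restarts."""
    return [(b - a).total_seconds() for a, b in zip(times, times[1:])]


# Four entries copied from the log above, one per line.
sample = [
    "08-Apr-2014 18:44:19 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file",
    "08-Apr-2014 18:44:19 [GPUGRID] If this happens repeatedly you may need to reset the project.",
    "08-Apr-2014 18:44:32 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file",
    "08-Apr-2014 18:44:44 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file",
]

times = restart_times(sample)
print(len(times))           # number of failed restarts in the sample
print(restart_gaps(times))  # seconds between consecutive restarts
```

To run it against a real log, replace `sample` with the lines of the file, for example the stdoutdae.txt mentioned earlier in the thread.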
	ID: 36355 \| Rating: 0

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36358 - Posted: 17 Apr 2014 \| 17:08:43 UTC Last modified: 17 Apr 2014 \| 17:11:24 UTC
	Both of those machines are service-installs. I'm starting to think that "being a service install" might be relevant to the problem. Didn't we recently have some sort of "service-install heartbeat bug" in the BOINC API - is it possible this new CPU App is using a bugged version of the API?
	ID: 36358 \| Rating: 0

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36360 - Posted: 17 Apr 2014 \| 20:28:12 UTC - in response to Message 36358. Last modified: 17 Apr 2014 \| 20:30:53 UTC
	I seriously feel like I recall a recently-identified-(by-me)-and-fixed-(by-ROM) BOINC API Bug that caused service-installs to not work at all, for apps that were compiled against the bugged API. I think Rom even sent out an email to the projects. GPUGrid Admins: Is it possible that the CPU app was compiled against the bugged API? Both of those machines are service-installs. I'm starting to think that "being a service install" might be relevant to the problem. Didn't we recently have some sort of "service-install heartbeat bug" in the BOINC API - is it possible this new CPU App is using a bugged version of the API?
	ID: 36360 \| Rating: 0

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36361 - Posted: 17 Apr 2014 \| 20:32:47 UTC Last modified: 17 Apr 2014 \| 20:34:09 UTC
	Here's the email (it pretty much describes, exactly, the behavior I am seeing on these 2 service-install laptops) > Date: Wed, 6 Nov 2013 10:40:17 -0500 > From: r**@rom.org > To: boinc_proj**@ssl.berkeley.edu > Subject: [boinc_projects] BOINC API Change > > Yesterday we discovered a problem with the BOINC API where it would > shutdown applications after 10 seconds on Windows when the BOINC client > was installed as a service. > > > > The bug will affect all applications built with since July of this year. > If you have built your application between July and now with a recent > BOINC API you should probably deploy a new version of the application. > > > > The commit that fixed the issue is: > 3aaeadaf99669c6460a183e7e9f2063f39152031 > > > > Thanks in advance. > > > > ----- Rom
	ID: 36361 \| Rating: 0

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36362 - Posted: 17 Apr 2014 \| 20:33:43 UTC Last modified: 17 Apr 2014 \| 20:36:22 UTC
	So... Looking at the email pasted above.... Are we using a bugged BOINC API? And, if so, was it aliens? And, if not, can we fix it yet please?
	ID: 36362 \| Rating: 0

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36364 - Posted: 17 Apr 2014 \| 20:45:00 UTC - in response to Message 36360.
	I seriously feel like I recall a recently-identified-(by-me)-and-fixed-(by-ROM) BOINC API Bug that caused service-installs to not work at all, for apps that were compiled against the bugged API. I think Rom even sent out an email to the projects. GPUGrid Admins: Is it possible that the CPU app was compiled against the bugged API? Both of those machines are service-installs. I'm starting to think that "being a service install" might be relevant to the problem. Didn't we recently have some sort of "service-install heartbeat bug" in the BOINC API - is it possible this new CPU App is using a bugged version of the API? Ah-ha. Now we're cooking. You're absolutely right. Rom Walton to BOINC Projects, 6 November 2013, subject line "BOINC API Change": Yesterday we discovered a problem with the BOINC API where it would shutdown applications after 10 seconds on Windows when the BOINC client was installed as a service. The bug will affect all applications built with since July of this year. If you have built your application between July and now with a recent BOINC API you should probably deploy a new version of the application. The commit that fixed the issue is: 3aaeadaf99669c6460a183e7e9f2063f39152031 Thanks in advance. ----- Rom http://lists.ssl.berkeley.edu/mailman/private/boinc_projects/2013-November/010493.html That should give Matt enough info to track it down ;)
	ID: 36364 \| Rating: 0

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36366 - Posted: 17 Apr 2014 \| 22:07:00 UTC - in response to Message 36364.
	Looks plausible, Jacob. I think my fork of the client code dates from that window. I'll check and see. But, this is the same library as I have in the GPU application, and that's not affected, is it? Matt
	ID: 36366 \| Rating: 0

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36367 - Posted: 17 Apr 2014 \| 22:48:51 UTC - in response to Message 36366. Last modified: 17 Apr 2014 \| 22:49:56 UTC
	It likely is affected (in the sense that a service-install Windows PC would suffer the heartbeat problem when running a GPU app compiled against the bad BOINC API), but not noticeably so (since, I think, Windows GPU apps cannot run on service installs on OSes newer than Windows XP). So, your GPU app is affected, but only on Windows XP or prior... maybe. I look forward to you fixing this. These poor laptops are REALLY WANTING to do GPUGrid work :) :)
	ID: 36367 \| Rating: 0

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36368 - Posted: 17 Apr 2014 \| 22:55:55 UTC - in response to Message 36366.
	Looks plausible, Jacob. I think my fork of the client code dates from that window. I'll check and see. But, this is the same library as I have in the GPU application, and that's not affected, is it? Matt I am running the GPU apps on two machines with service installs (Windows XP, BOINC client 6.12.34, which is the last combination for which this is allowed): Host 43404 - as yet cuda55 max, I haven't updated the driver recently; Host 45218 - recently upgraded to GTX 750Ti, so getting cuda60 only. Both machines are running from the 'short task' queue only. The bug doesn't bite.
	ID: 36368 \| Rating: 0

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36370 - Posted: 18 Apr 2014 \| 1:25:58 UTC - in response to Message 36368. Last modified: 18 Apr 2014 \| 1:27:17 UTC
	I'm not positive, but I believe it's possible that the bug also only affects newer clients. Matt, any chance we could compile against a newer, non-bugged BOINC API, to see if it fixes the problem, please? PS: I'm sorry I didn't figure this out earlier, while the app was still in Testing. I just want it fixed :/
	ID: 36370 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36371 - Posted: 18 Apr 2014 \| 1:55:13 UTC Last modified: 18 Apr 2014 \| 1:56:04 UTC
	PS: I got confirmation from Rom that it's possible XP might not be susceptible to the "service-install BOINC API heartbeat bug", since it might return different error codes for OpenProcess() and have different default process ACLs than newer operating systems. All of my machines (and my wife's 2 service-install laptops) are running Windows 8.1 Update 1. Long story short: I'm still quite confident that compiling against a newer non-bugged BOINC API, will fix the issue. Hope you guys can do it soon - Rom said it was a relatively easy thing to fix. Thanks in advance, Jacob
	ID: 36371 \| Rating: 0 \| rate: / Reply Quote
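
The thread does not spell out the exact mechanism behind the heartbeat bug, but the mention of OpenProcess() error codes and process ACLs suggests one plausible failure mode: a liveness check that treats "I am not allowed to inspect that process" as "that process is gone". The sketch below is a POSIX analogue in Python, not the BOINC API code; the function name `client_alive` is hypothetical. It only illustrates why "access denied" and "no such process" should be handled differently.

```python
import os

def client_alive(pid: int) -> bool:
    """Return True if a process with this pid appears to exist (POSIX only)."""
    if os.name != "posix":
        # On Windows, os.kill(pid, 0) would terminate the target process,
        # so this sketch deliberately refuses to run there.
        raise NotImplementedError("POSIX-only illustration")
    try:
        os.kill(pid, 0)  # signal 0: existence and permission check, nothing is sent
    except ProcessLookupError:
        return False     # no such process: the client really is gone
    except PermissionError:
        return True      # the process exists, we just lack rights over it
    return True

print(client_alive(os.getpid()))  # True: the current process is always alive
```

A check that mapped the permission error to "dead" would make a worker exit every time it ran under a differently privileged parent, which is the kind of symptom described above for service installs.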

GPUGRID Role account Send message Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 Level Scientific publications	Message 36380 - Posted: 18 Apr 2014 \| 9:55:44 UTC - in response to Message 36371.
	Jacob - app updated. Matt
	ID: 36380 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36381 - Posted: 18 Apr 2014 \| 11:09:59 UTC Last modified: 18 Apr 2014 \| 11:30:00 UTC
	Thank you. For reference, the Windows CPU app version 1.05 and prior were bugged, and the Windows 1.06 version is the version that hopefully fixes it. I'll let you know once these 2 hosts get some 1.06 tasks :) They may be blacklisted at the moment lol. Thanks again, Jacob
	ID: 36381 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36496 - Posted: 21 Apr 2014 \| 17:26:53 UTC - in response to Message 36381.
	One of the 2 laptop clients was finally able to get some CPU work, and... IT WORKS! The 1.06 CPU app (which is compiled against the non-bugged BOINC API) appears to have solved the problems I was having. Thanks!
	ID: 36496 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36497 - Posted: 21 Apr 2014 \| 17:28:40 UTC - in response to Message 36496. Last modified: 21 Apr 2014 \| 17:28:50 UTC
	One of the 2 laptop clients was finally able to get some CPU work, and... IT WORKS! The 1.06 CPU app (which is compiled against the non-bugged BOINC API) appears to have solved the problems I was having. Thanks! Neat. Thanks for ferreting out the source of the problem Jacob. Matt
	ID: 36497 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36499 - Posted: 21 Apr 2014 \| 17:31:04 UTC - in response to Message 36497.
	You're welcome. Finding/fixing problems is the main reason I am attached to 26 projects across 4 devices of various Windows installations, running alpha versions of BOINC :)
	ID: 36499 \| Rating: 0 \| rate: / Reply Quote

Speedy Send message Joined: 19 Aug 07 Posts: 43 Credit: 40,991,082 RAC: 809,640 Level Scientific publications	Message 36661 - Posted: 26 Apr 2014 \| 3:22:43 UTC
	I am aware that we are partway through the weekend, or leading into it. Will there be more CPU work in the coming days, or does the returned work need to be looked at first to see where the focus is going next? It currently looks like there are only resends.
	ID: 36661 \| Rating: 0 \| rate: / Reply Quote

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 36136 - Posted: 7 Apr 2014 \| 11:04:41 UTC
	I have set my preferences (all gone when saving changed one) to do these CPU tasks as well. However, as these run quite fast, the remaining-time estimate for the GPU tasks is wrong. BOINC Manager now thinks my 770 and 780Ti can do a LR in 38 minutes, which would be amazingly awesome though :) ____________ Greetings from TJ
	ID: 36136 \| Rating: 0 \| rate: / Reply Quote

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 36138 - Posted: 7 Apr 2014 \| 11:45:14 UTC
	Watching more closely, I see the following as well. Only about 6% of the CPU WU is done (in 15 minutes) and 5 minutes remain. Then, at about 8.5% done, the estimated time is zero, and a few seconds later the WU jumps to 100% and is uploaded and reported home, without error. Strange behavior. ____________ Greetings from TJ
	ID: 36138 \| Rating: 0 \| rate: / Reply Quote

skgiven Volunteer moderator Volunteer tester Send message Joined: 23 Apr 09 Posts: 3968 Credit: 1,995,359,260 RAC: 0 Level Scientific publications	Message 36158 - Posted: 8 Apr 2014 \| 13:20:07 UTC - in response to Message 36138.
	On Linux the progress just jumps from 0% to 100%, runtime varies from around 9min to 13min (for me). The estimated runtime for these CPU workunits has dropped from several hours to just over 2h. The GPUGrid Long GPU tasks estimates have also dropped to just over 2h. The estimated computation size is the same as the Long GPU tasks! ____________ FAQ's HOW TO: - Opt out of Beta Tests - Ask for Help
	ID: 36158 \| Rating: 0 \| rate: / Reply Quote

TJ Send message Joined: 26 Jun 09 Posts: 815 Credit: 1,470,385,294 RAC: 0 Level Scientific publications	Message 36161 - Posted: 8 Apr 2014 \| 14:21:57 UTC - in response to Message 36158.
	On Linux the progress just jumps from 0% to 100%, runtime varies from around 9min to 13min (for me). The estimated runtime for these CPU workunits has dropped from several hours to just over 2h. The GPUGrid Long GPU tasks estimates have also dropped to just over 2h. The estimated computation size is the same as the Long GPU tasks! Thanks for the confirmation, skgiven. Proves my eyes are still okay :) ____________ Greetings from TJ
	ID: 36161 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36173 - Posted: 8 Apr 2014 \| 22:37:42 UTC - in response to Message 36117. Last modified: 8 Apr 2014 \| 23:33:18 UTC
	Can you please have a look at this host's tasks (my wife's laptop) -- the tasks seem to be erroring out. http://www.gpugrid.net/results.php?hostid=85944 Exit status -226 (0xffffffffffffff1e) ERR_TOO_MANY_EXITS ... near step: # (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt] 17:11:39 (1732): BOINC client no longer exists - exiting 17:11:39 (1732): timer handler: client dead, exiting MJH: All of the GPUGrid CPU tasks are erroring out on these 2 hosts I manage: http://www.gpugrid.net/results.php?hostid=137361 http://www.gpugrid.net/results.php?hostid=85944 ... Do you have any idea what is happening with the repeated exits near the step mentioned above? I'll try unattaching and reattaching the project, but... it'd be nice if you could mention whether you know what might be causing this problem on 2 separate machines. Edit: Project unattach/reattach and resets.. did not help. Please help! I'm super excited about the possibility of having these non-GPU computers do work for your project, for the first time ever, but... They need your help!
	ID: 36173 \| Rating: 0 \| rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36174 - Posted: 8 Apr 2014 \| 23:20:05 UTC - in response to Message 36173.
	The ones I spot-checked all ran for 11 seconds, failed with that 'client dead' message, and then seem to have re-tried to process with the same input file. You'll remember my recent email exchange with David A on the subject of 'ERR_TOO_MANY_EXITS', of course? What does the BOINC Message (event) log have to say about all those client deaths?
	ID: 36174 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36175 - Posted: 8 Apr 2014 \| 23:23:33 UTC - in response to Message 36174. Last modified: 8 Apr 2014 \| 23:29:09 UTC
	I doubt the Event Log will say much, but I'll try to monitor it sometime. From what I can see, this is an application error that GPUGrid (MJH) needs to fix. Each "attempt" is a waste of 10 CPU seconds, and you can see below just how many times each task is attempted before failure. It is literally wasting several minutes of CPU, for each task, only to fail. MJH: I hope you can solve this one, please! Looking at the timestamps below, I see: 01:22:53 ... 01:38:25 That's 16 minutes of wasted CPU, and the log file wasn't even captured entirely, as the introduction was truncated. So.... we're looking at 20+ minutes of wasted CPU per task. Please fix!
	Stderr output
	<core_client_version>7.2.42</core_client_version>
	<![CDATA[
	<message>
	too many exit(0)s
	</message>
	<stderr_txt>
	7236): BOINC client no longer exists - exiting
	01:22:53 (7236): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:23:03 (216): BOINC client no longer exists - exiting
	01:23:03 (216): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:23:14 (7032): BOINC client no longer exists - exiting
	01:23:14 (7032): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:23:25 (7480): BOINC client no longer exists - exiting
	01:23:25 (7480): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..]
	.....
	[.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:38:15 (6340): BOINC client no longer exists - exiting
	01:38:15 (6340): timer handler: client dead, exiting
	# (BOINC) Mapping [ligand.tar]->[ligand.tar]
	# (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt]
	nextfile: reading [in] [.] [..] [ligand-9243.pdbqt]
	NEXTFILE : [out] [ligand-9243.pdbqt]
	copy [in/ligand-9243.pdbqt]->[ligand.pdbqt]
	# (BOINC) Mapping [input.dat]->[input.dat]
	# (BOINC) Mapping [progress.log]->[progress.log]
	# (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt]
	# (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt]
	01:38:25 (7780): BOINC client no longer exists - exiting
	01:38:25 (7780): timer handler: client dead, exiting
	</stderr_txt>
	]]>
	ID: 36175 \| Rating: 0 \| rate: / Reply Quote
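
The "16 minutes" figure in the post above comes from subtracting the first and last timestamps in the stderr output. As a rough sketch, the same tally can be pulled from a saved copy of the log; the function name `summarize_restarts` and the three-line sample are illustrative, and the pattern assumes the exact "timer handler: client dead, exiting" wording shown in the log.

```python
import re
from datetime import datetime

# Matches lines such as "01:23:03 (216): timer handler: client dead, exiting"
EXIT_LINE = re.compile(
    r"(\d{2}:\d{2}:\d{2}) \((\d+)\): timer handler: client dead, exiting"
)

def summarize_restarts(stderr_text: str):
    """Return (number of client-dead exits, seconds from first to last exit).

    Assumes all timestamps fall on the same day (no wrap past midnight).
    """
    stamps = [
        datetime.strptime(match.group(1), "%H:%M:%S")
        for match in EXIT_LINE.finditer(stderr_text)
    ]
    if not stamps:
        return 0, 0.0
    return len(stamps), (stamps[-1] - stamps[0]).total_seconds()

# Abbreviated sample using the first, second and last exits from the log above.
sample = (
    "01:22:53 (7236): timer handler: client dead, exiting\n"
    "01:23:03 (216): timer handler: client dead, exiting\n"
    "01:38:25 (7780): timer handler: client dead, exiting\n"
)
count, seconds = summarize_restarts(sample)
print(count, "exits over", seconds, "seconds")
```

For the timestamps quoted in the post, the span is 932 seconds, i.e. about 15.5 minutes, which is consistent with the "16 minutes" estimate.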

(retired account) Send message Joined: 22 Dec 11 Posts: 38 Credit: 28,606,255 RAC: 0 Level Scientific publications	Message 36176 - Posted: 9 Apr 2014 \| 3:57:11 UTC Last modified: 9 Apr 2014 \| 4:09:26 UTC
	Since we seem to have plenty of work available, would you please consider increasing the limit per host for CPU work? I get max. 16 workunits for an i7 with 8 threads. Since they need less than 30 min. on average, that's hardly one hour's worth of work. A cache of at least half a day would be nice (that's an approx. limit of 24 per thread in my case). I'm not always on with a fast connection. ;-)
	ID: 36176 \| Rating: 0 \| rate: / Reply Quote
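
The arithmetic behind the request above can be checked with a short sketch. The helper names are hypothetical, and the 30-minute figure is the poster's upper bound rather than a measured average.

```python
def cache_hours(tasks: int, threads: int, minutes_per_task: float) -> float:
    """Hours of work a host holds if tasks are spread evenly over its threads."""
    return tasks / threads * minutes_per_task / 60

def tasks_per_thread(target_hours: float, minutes_per_task: float) -> int:
    """Tasks each thread needs queued to stay busy for target_hours."""
    return round(target_hours * 60 / minutes_per_task)

# 16 workunits on 8 threads at up to 30 minutes each: at most one hour of work.
print(cache_hours(16, 8, 30))

# Half a day (12 h) at 30 minutes per task: 24 tasks per thread.
print(tasks_per_thread(12, 30))
```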

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36177 - Posted: 9 Apr 2014 \| 8:44:37 UTC - in response to Message 36175.
	It doesn't seem to be universal. Some of the ones which fail go on to crash on every machine which tries them, but others run successfully. Look at WU 5573123: the first failure is from Jacob's list, but the task ran successfully on a machine with comparable specifications (i7, Windows 8.1 64-bit, BOINC v7.2.42). I'm trying to reproduce on host 93580, but so far the application is working properly (45% in 25 minutes, moving on from file to file as each segment is completed).
	ID: 36177 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36178 - Posted: 9 Apr 2014 \| 10:03:39 UTC - in response to Message 36176.
	Since we seem to have plenty of work available, would you please consider increasing the limit per host for cpu work? Looking into that now.. Matt
	ID: 36178 \| Rating: 0 \| rate: / Reply Quote

localizer Send message Joined: 17 Apr 08 Posts: 113 Credit: 1,656,514,857 RAC: 0 Level Scientific publications	Message 36182 - Posted: 9 Apr 2014 \| 14:22:39 UTC - in response to Message 36176.
	Since we seem to have plenty of work available, would you please consider increasing the limit per host for CPU work? I get max. 16 workunits for an i7 with 8 threads. Since they need less than 30 min. on average, that's hardly one hour's worth of work. A cache of at least half a day would be nice (that's an approx. limit of 24 per thread in my case). I'm not always on with a fast connection. ;-) +1.....
	ID: 36182 \| Rating: 0 \| rate: / Reply Quote

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Send message Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0 Level Scientific publications	Message 36193 - Posted: 10 Apr 2014 \| 9:16:09 UTC - in response to Message 36176.
	Since we seem to have plenty of work available, would you please consider increasing the limit per host for CPU work? I get max. 16 workunits for an i7 with 8 threads. Since they need less than 30 min. on average, that's hardly one hour's worth of work. A cache of at least half a day would be nice (that's an approx. limit of 24 per thread in my case). I'm not always on with a fast connection. ;-) I should have increased it now. gianni
	ID: 36193 \| Rating: 0 \| rate: / Reply Quote

localizer Send message Joined: 17 Apr 08 Posts: 113 Credit: 1,656,514,857 RAC: 0 Level Scientific publications	Message 36194 - Posted: 10 Apr 2014 \| 9:28:23 UTC - in response to Message 36193. Last modified: 10 Apr 2014 \| 9:34:39 UTC
	Since we seem to have plenty of work available, would you please consider increasing the limit per host for CPU work? I get max. 16 workunits for an i7 with 8 threads. Since they need less than 30 min. on average, that's hardly one hour's worth of work. A cache of at least half a day would be nice (that's an approx. limit of 24 per thread in my case). I'm not always on with a fast connection. ;-) I should have increased it now. gianni ............... great - currently I have 36 CPU & 2 GPU in my queue. Thanks
	ID: 36194 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36195 - Posted: 10 Apr 2014 \| 9:54:35 UTC - in response to Message 36194.
	Excellent. We ought to see a big improvement in throughput now! Matt
	ID: 36195 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36196 - Posted: 10 Apr 2014 \| 10:44:23 UTC Last modified: 10 Apr 2014 \| 10:45:08 UTC
	Any progress on solving the issue I'm seeing on multiple machines?
	ID: 36196 \| Rating: 0 \| rate: / Reply Quote

Vagelis Giannadakis Send message Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0 Level Scientific publications	Message 36197 - Posted: 10 Apr 2014 \| 11:10:43 UTC
	These WUs have some weird behavior. While they run perfectly fine, their Progress jumps abruptly from ~5% to 100%. This not only means that their estimated processing work is grossly miscalculated, it also messes up BOINC's estimations. I discovered yesterday evening that I had downloaded a GERARD while the currently processing NATHAN had several more hours of work ahead. Looking at the BOINC manager, the cause for this was evident: while the GPU WU needed several more hours on my GTX 650Ti, BOINC thought it just needed 30 more minutes! Just like that, I would have lost the credit bonus, because BOINC messed up its estimations! I assert that this is due to these CPU WUs and their weird Progress "jump". I've been crunching GPUGRID together with WCG on the same host for a year now and this had never happened. It started happening as soon as I suspended WCG and enabled CPU apps on GPUGRID. To give you a concrete example, this WU took just over 10.5 minutes, but BOINC estimated it to need more than 4.5 hours! Doesn't this affect estimations of future WUs? ____________
	ID: 36197 \| Rating: 0 \| rate: / Reply Quote

Jozef J Send message Joined: 7 Jun 12 Posts: 112 Credit: 1,140,895,172 RAC: 259,041 Level Scientific publications	Message 36203 - Posted: 10 Apr 2014 \| 14:22:37 UTC - in response to Message 36197.
	These WUs have some weird behavior. While they run perfectly fine, their Progress jumps abruptly from ~5% to 100%. This not only means that their estimated processing work is grossly miscalculated, it also messes up BOINC's estimations. I discovered yesterday evening that I had downloaded a GERARD while the currently processing NATHAN had several more hours of work ahead. Looking at the BOINC manager, the cause for this was evident: while the GPU WU needed several more hours on my GTX 650Ti, BOINC thought it just needed 30 more minutes! Just like that, I would have lost the credit bonus, because BOINC messed up its estimations! I assert that this is due to these CPU WUs and their weird Progress "jump". I've been crunching GPUGRID together with WCG on the same host for a year now and this had never happened. It started happening as soon as I suspended WCG and enabled CPU apps on GPUGRID. To give you a concrete example, this WU took just over 10.5 minutes, but BOINC estimated it to need more than 4.5 hours! Doesn't this affect estimations of future WUs? That's exactly what I also found yesterday, and it continues. I also had problems for a while when I enabled CPU GPUGRID tasks. Now I have just a few NATHAN tasks, estimated to require 16 hours on a GTX 680. Yesterday I even had a few extra-long tasks; you can see them in my tasks: I2010R1-SDOERR_BARNA-3-4-RND4946_0 6406280 62,142.59 GPU time 21,988.47 CPU time.. WOW. We will just be happy if these emerging issues get solved..)))
	ID: 36203 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36207 - Posted: 10 Apr 2014 \| 16:38:44 UTC - in response to Message 36196.
	Any progress on solving the issue I'm seeing on multiple machines? No, not yet, but it's on the list. Matt
	ID: 36207 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36208 - Posted: 10 Apr 2014 \| 16:40:35 UTC - in response to Message 36203.
	Jozef, Are you saying that the estimates for the CPU application are somehow interfering with the estimates for the other applications? If so, I would be inclined to chalk that up as a client bug, because it doesn't make any sense for there to be coupling between applications. Matt
	ID: 36208 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36209 - Posted: 10 Apr 2014 \| 16:47:47 UTC - in response to Message 36208. Last modified: 10 Apr 2014 \| 16:48:08 UTC
	GPUGrid Admins: Regarding estimates: GPUGrid uses "Duration Correction Factor", which I believe makes client-side estimation corrections globally for the entire project. It is not a "per-application" concept. World Community Grid stopped using it, I believe when they introduced GPU tasks, because the estimates varied so much. If you do not intend to create very close estimates for certain application types, then you might need to turn off the global project "Use Duration Correction Factor" switch, which I believe is under your control. Richard H. knows more about this than I do.
	ID: 36209 \| Rating: 0 \| rate: / Reply Quote

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36212 - Posted: 10 Apr 2014 \| 17:35:45 UTC - in response to Message 36209.
	"Duration Correction Factor" Oh yes, I remember how much fun we had with that last year. Matt
	ID: 36212 \| Rating: 0 \| rate: / Reply Quote

Vagelis Giannadakis Send message Joined: 5 May 13 Posts: 187 Credit: 349,254,454 RAC: 0 Level Scientific publications	Message 36228 - Posted: 11 Apr 2014 \| 12:51:44 UTC
	It is not only the "Duration Correction Factor" that applies for the entire project, but also the miscalculated (on the part of the WU implementors) total processing work required for these WUs. The abrupt jump from ~5% to 100% is proof of the miscalculation. I believe that it is this "jump" that drives BOINC's estimations berserk. ____________
	ID: 36228 \| Rating: 0 \| rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36234 - Posted: 11 Apr 2014 \| 15:11:22 UTC - in response to Message 36228.
	It is not only the "Duration Correction Factor" that applies for the entire project, but also the miscalculated (on the part of the WU implementors) total processing work required for these WUs. The abrupt jump from ~5% to 100% is proof of the miscalculation. I believe that it is this "jump" that drives BOINC's estimations berserk. Well, it's the global DCF which does the real damage to other applications within the project, and DCF is only adjusted on the basis of the total elapsed time for the task, and only on task completion at that. But I agree, the erratic jumps in progress %age would make it even harder to get an accurate <rsc_fpops_est> value for the tasks, and that's the fundamental driver for all these estimate problems. We are already using the 'APR' runtime estimation component of CreditNew here, so in theory the estimates should be normalised towards DCF=1.0000 by the server: but that's a very slow process, and DCF - which is a much faster-response mechanism - has already started fighting against it. We *could* disable DCF, as Jacob suggests, but that could apply a sudden shock to the system if applied while DCF is skewed. I'd suggest asking the few people who are running both CPU and NV to bear with the problems for a short time while the runtime estimates are fixed at source (not least, because that way we can watch and confirm that the solution has worked properly when finished). Then, set <dont_use_dcf>, to prevent it escaping again.
	ID: 36234 \| Rating: 0 \| rate: / Reply Quote
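
To make the coupling described above concrete, here is a deliberately simplified model: one project-wide correction factor multiplies every task's raw estimate (`rsc_fpops_est` divided by device speed). It does not reproduce the client's actual DCF update rule, and the GPU task size and speed below are made-up values; only the 4.5 h and 10.5 min figures come from the thread.

```python
def estimated_runtime(rsc_fpops_est: float, flops: float, dcf: float = 1.0) -> float:
    """Simplified estimate in seconds: work size / device speed, scaled by DCF."""
    return rsc_fpops_est / flops * dcf

# Reported in the thread: a CPU task estimated at 4.5 h finished in 10.5 min.
# If the shared DCF drifts toward actual/estimated for such tasks, it ends up near:
cpu_ratio = (10.5 * 60) / (4.5 * 3600)

# Hypothetical GPU task whose unscaled estimate is 8 h (both numbers are assumptions).
gpu_fpops_est = 2.88e16
gpu_flops = 1.0e12

unscaled = estimated_runtime(gpu_fpops_est, gpu_flops)
skewed = estimated_runtime(gpu_fpops_est, gpu_flops, dcf=cpu_ratio)
print(f"{unscaled / 3600:.1f} h -> {skewed / 60:.1f} min")
```

In this model an 8-hour GPU task would be shown as needing roughly 19 minutes, the same order of error as the 30 to 38 minute GPU estimates reported earlier in the thread. A per-application estimate would not have this cross-talk.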

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36349 - Posted: 17 Apr 2014 \| 11:08:21 UTC - in response to Message 36207. Last modified: 17 Apr 2014 \| 11:11:49 UTC
	It's been another week, and this infuriating bug is still continually wasting tons of CPU time on the 2 laptops that I previously mentioned. Is there anything I can do to help or expedite the solution? I was really excited that they might get their first ever GPUGrid units processed, but so far, I've only been met with disappointment. Any progress on solving the issue I'm seeing on multiple machines? No, not yet, but it's on the list. Matt I doubt the Event Log will say much, but I'll try to monitor it sometime. From what I can see, this is an application error that GPUGrid (MJH) needs to fix. Each "attempt" is a waste of 10 CPU seconds, and you can see below just how many times each task is attempted before failure. It is literally wasting several minutes of CPU, for each task, only to fail. MJH: I hope you can solve this one, please! Looking at the timestamps below, I see: 01:22:53 ... 01:38:25 That's 16 minutes of wasted CPU, and the log file wasn't even captured entirely, as the introduction was truncated. So.... we're looking at 20+ minutes of wasted CPU per task. Please fix! Stderr output <core_client_version>7.2.42</core_client_version> <![CDATA[ <message> too many exit(0)s </message> <stderr_txt> 7236): BOINC client no longer exists - exiting 01:22:53 (7236): timer handler: client dead, exiting # (BOINC) Mapping [ligand.tar]->[ligand.tar] # (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt] nextfile: reading [in] [.] [..] 
[ligand-9243.pdbqt] NEXTFILE : [out] [ligand-9243.pdbqt] copy [in/ligand-9243.pdbqt]->[ligand.pdbqt] # (BOINC) Mapping [input.dat]->[input.dat] # (BOINC) Mapping [progress.log]->[progress.log] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt] 01:23:03 (216): BOINC client no longer exists - exiting 01:23:03 (216): timer handler: client dead, exiting # (BOINC) Mapping [ligand.tar]->[ligand.tar] # (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt] nextfile: reading [in] [.] [..] [ligand-9243.pdbqt] NEXTFILE : [out] [ligand-9243.pdbqt] copy [in/ligand-9243.pdbqt]->[ligand.pdbqt] # (BOINC) Mapping [input.dat]->[input.dat] # (BOINC) Mapping [progress.log]->[progress.log] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt] 01:23:14 (7032): BOINC client no longer exists - exiting 01:23:14 (7032): timer handler: client dead, exiting # (BOINC) Mapping [ligand.tar]->[ligand.tar] # (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt] nextfile: reading [in] [.] [..] 
[ligand-9243.pdbqt] NEXTFILE : [out] [ligand-9243.pdbqt] copy [in/ligand-9243.pdbqt]->[ligand.pdbqt] # (BOINC) Mapping [input.dat]->[input.dat] # (BOINC) Mapping [progress.log]->[progress.log] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt] 01:23:25 (7480): BOINC client no longer exists - exiting 01:23:25 (7480): timer handler: client dead, exiting # (BOINC) Mapping [ligand.tar]->[ligand.tar] # (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt] nextfile: reading [in] [.] [..] ..... [.] [..] [ligand-9243.pdbqt] NEXTFILE : [out] [ligand-9243.pdbqt] copy [in/ligand-9243.pdbqt]->[ligand.pdbqt] # (BOINC) Mapping [input.dat]->[input.dat] # (BOINC) Mapping [progress.log]->[progress.log] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt] 01:38:15 (6340): BOINC client no longer exists - exiting 01:38:15 (6340): timer handler: client dead, exiting # (BOINC) Mapping [ligand.tar]->[ligand.tar] # (BOINC) Mapping [results.tar]->[../../projects/www.gpugrid.net/2784-DOCK_LAB_MJHARVEY_18211_3HHM2_ZINC9-0-1-RND7172_0_0] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [flexible.pdbqt]->[flexible.pdbqt] nextfile: reading [in] [.] [..] [ligand-9243.pdbqt] NEXTFILE : [out] [ligand-9243.pdbqt] copy [in/ligand-9243.pdbqt]->[ligand.pdbqt] # (BOINC) Mapping [input.dat]->[input.dat] # (BOINC) Mapping [progress.log]->[progress.log] # (BOINC) Mapping [protein.pdbqt]->[protein.pdbqt] # (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt] 01:38:25 (7780): BOINC client no longer exists - exiting 01:38:25 (7780): timer handler: client dead, exiting </stderr_txt> ]]> Can you please have a look at this host's tasks (my wife's laptop) -- the tasks seem to be erroring out. 
http://www.gpugrid.net/results.php?hostid=85944 Exit status -226 (0xffffffffffffff1e) ERR_TOO_MANY_EXITS ... near step: # (BOINC) Mapping [ligand.pdbqt]->[ligand.pdbqt] 17:11:39 (1732): BOINC client no longer exists - exiting 17:11:39 (1732): timer handler: client dead, exiting MJH: All of the GPUGrid CPU tasks are erroring out on these 2 hosts I manage: http://www.gpugrid.net/results.php?hostid=137361 http://www.gpugrid.net/results.php?hostid=85944 ... Do you have any idea what is happening with the repeated exits near the step mentioned above? I'll try detaching and reattaching the project, but... it would be helpful if you could say whether you know what might be causing this problem on 2 separate machines. Edit: Project detach/reattach and resets did not help. Please help! I'm super excited about the possibility of having these non-GPU computers do work for your project, for the first time ever, but... they need your help!
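For anyone comparing the two forms of the exit status quoted above: 0xffffffffffffff1e is simply -226 written as a 64-bit two's-complement bit pattern, so both refer to the same ERR_TOO_MANY_EXITS code. A minimal Python check follows; the 64-bit width is an assumption inferred from the 16 hex digits in the message, and the helper names are made up for illustration.

```python
def to_unsigned_hex(value: int, bits: int = 64) -> str:
    """Render a signed integer as the hex of its two's-complement bit pattern."""
    mask = (1 << bits) - 1  # assumed width: 64 bits, matching the 16 hex digits
    return hex(value & mask)


def from_unsigned_hex(text: str, bits: int = 64) -> int:
    """Interpret a hex bit pattern as a signed two's-complement integer."""
    raw = int(text, 16)
    if raw >= 1 << (bits - 1):  # top bit set means the value is negative
        raw -= 1 << bits
    return raw


print(to_unsigned_hex(-226))
print(from_unsigned_hex("0xffffffffffffff1e"))
```

This only confirms that the two numbers match; it says nothing about why the client gave up on the task.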
	ID: 36349 \| Rating: 0 \| rate: / Reply Quote

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36352 - Posted: 17 Apr 2014 \| 13:23:21 UTC - in response to Message 36349.
	It's been another week, and this infuriating bug is still continually wasting tons of CPU time on the 2 laptops that I previously mentioned. Is there anything I can do to help or expedite the solution? I was really excited that they might get their first ever GPUGrid units processed, but so far, I've only been met with disappointment. Jacob, what's the problem here? The only laptop visible on your account - host 167515 - completed and validated a load of CPU tasks as recently as yesterday. The only errors visible date back to 27 March. Agreed, the two hosts 137361 and 85944 on the KSMooney account are still throwing errors, but according to the message log you posted, the problem on both those machines is "BOINC client no longer exists - exiting" followed by "timer handler: client dead, exiting". It would seem to me that the first step in curing this needs to happen at your end: why does the client fail to run continuously? What messages, if any, can you retrieve from the Event Log or stdoutdae.txt? Do the machines run tasks from other projects without BOINC dying? Only if you can establish a causal link (and better yet, suggest a plausible mechanism) by which the GPUGrid CPU app is *responsible* for the failure of the client does it become the responsibility of the project to fix something. But I can't imagine what that 'something' could be: the app works for you, as it worked for me.
	ID: 36352 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36355 - Posted: 17 Apr 2014 \| 16:40:47 UTC - in response to Message 36352. Last modified: 17 Apr 2014 \| 16:49:34 UTC
	Sigh. All other projects work fine on those 2 laptops (KSMooney is my wife, and those 2 laptops are hers, and I manage them). The stderr is as you saw. Namely: on those 2 machines, the tasks get stuck in some sort of loop that wastes at least 20 minutes of CPU time per failed GPUGrid task. I believe the problem is something in the app, making it restart continuously. "Why does the client fail to run continuously?" is a bit troubling... because the client itself is not crashing/failing, from what I can tell. I'd love to say "the problem is on my end", or even to say "I have more info that further identifies the problem", but I can't. All I have is the stderr.txt you see. It might be helpful if the admins looked to see if any other hosts were having similar problems. I wish I knew how I could do that. Does this help? (It shows a fresh attach, where the job repeatedly started until eventually giving up after wasting 20 minutes of CPU.) A quick scan for the text "exited with zero status but no 'finished' file" ... shows that it "failed" exactly 100 times. So, maybe knowing that loop-limit somehow narrows down the problem? 08-Apr-2014 18:43:41 [---] Attaching to http://www.gpugrid.net/ 08-Apr-2014 18:43:43 [http://www.gpugrid.net/] Master file download succeeded 08-Apr-2014 18:43:49 [http://www.gpugrid.net/] Sending scheduler request: Project initialization. 
08-Apr-2014 18:43:49 [http://www.gpugrid.net/] Requesting new tasks for CPU 08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/' 08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/' 08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_REF::parse(): unrecognized: 'rboinc/' 08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/ 08-Apr-2014 18:43:52 [---] [unparsed_xml] FILE_INFO::parse(): unrecognized: rboinc/ 08-Apr-2014 18:43:52 [GPUGRID] Scheduler request completed: got 1 new tasks 08-Apr-2014 18:43:54 [GPUGRID] Started download of gpugrid-vina-windows_intelx86.105 08-Apr-2014 18:43:54 [GPUGRID] Started download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-input 08-Apr-2014 18:43:54 [GPUGRID] Started download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-receptor 08-Apr-2014 18:43:54 [GPUGRID] Started download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-ligand 08-Apr-2014 18:43:58 [GPUGRID] Finished download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-input 08-Apr-2014 18:43:58 [GPUGRID] Started download of logogpugrid.png 08-Apr-2014 18:43:59 [GPUGRID] Finished download of logogpugrid.png 08-Apr-2014 18:43:59 [GPUGRID] Started download of project_1.png 08-Apr-2014 18:44:01 [GPUGRID] Finished download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-ligand 08-Apr-2014 18:44:01 [GPUGRID] Started download of project_2.png 08-Apr-2014 18:44:02 [GPUGRID] Finished download of project_1.png 08-Apr-2014 18:44:02 [GPUGRID] Started download of project_3.png 08-Apr-2014 18:44:03 [GPUGRID] Finished download of project_2.png 08-Apr-2014 18:44:04 [GPUGRID] Finished download of project_3.png 08-Apr-2014 18:44:07 [GPUGRID] Finished download of gpugrid-vina-windows_intelx86.105 08-Apr-2014 18:44:07 [GPUGRID] Finished download of 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-receptor 08-Apr-2014 18:44:07 [GPUGRID] Starting task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 08-Apr-2014 
18:44:19 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:44:19 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:44:32 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:44:32 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:44:44 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:44:44 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:44:55 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:44:55 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:45:06 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:45:06 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:45:18 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:45:18 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:45:30 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:45:30 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:45:41 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:45:41 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:45:53 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:45:53 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:05 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:05 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:17 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:17 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:24 [---] Suspending network activity - time of day 08-Apr-2014 18:46:28 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:28 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:39 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:39 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:46:51 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:46:51 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:02 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:02 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:13 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:13 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:47:24 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:24 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:35 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:35 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:48 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:48 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:47:59 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:47:59 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:48:11 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:11 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:48:23 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:23 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:48:36 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:36 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:48:48 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:48 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:48:59 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:48:59 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:49:12 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:49:12 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:49:24 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:49:24 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:49:36 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:49:36 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:49:47 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:49:47 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:50:00 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:00 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:50:12 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:12 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:50:24 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:24 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:50:35 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:35 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:50:48 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:50:48 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:01 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:01 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:15 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:15 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:29 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:29 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:43 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:43 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:51:55 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:51:55 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:52:07 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:07 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:52:22 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:22 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:52:34 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:34 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:52:45 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:45 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:52:58 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:52:58 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:53:10 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:53:10 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:53:22 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:53:22 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:53:37 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:53:37 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:53:52 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:53:52 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:54:04 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:04 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:54:15 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:15 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:54:27 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:27 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:54:38 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:38 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:54:49 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:54:49 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:00 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:00 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:12 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:12 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:23 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:23 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:55:34 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:34 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:46 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:46 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:55:57 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:55:57 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:09 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:09 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:20 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:20 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:31 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:31 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:43 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:43 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:56:54 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:56:54 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:57:05 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:05 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:57:16 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:16 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:57:28 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:28 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:57:39 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:39 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:57:50 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:57:50 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:01 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:01 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:13 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:13 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:24 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:24 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 18:58:35 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:35 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:47 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:47 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:58:58 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:58:58 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:09 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:09 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:20 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:20 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:31 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:31 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:42 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:42 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 18:59:54 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 18:59:54 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 19:00:06 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:06 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:00:17 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:17 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:00:29 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:29 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:00:40 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:40 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:00:52 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:00:52 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:03 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:03 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:14 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:14 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:25 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:25 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 19:01:37 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:37 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:48 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:48 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:01:59 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:01:59 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:10 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:10 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:21 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:21 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:33 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:33 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:44 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:44 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:02:56 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:02:56 [GPUGRID] If this happens repeatedly you may need to reset the project. 
08-Apr-2014 19:03:08 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:03:08 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:03:22 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:03:22 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:03:37 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:03:37 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:03:52 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 exited with zero status but no 'finished' file 08-Apr-2014 19:03:52 [GPUGRID] If this happens repeatedly you may need to reset the project. 08-Apr-2014 19:04:07 [GPUGRID] Computation for task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 finished 08-Apr-2014 19:04:07 [GPUGRID] Output file 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0_0 for task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 absent
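The "quick scan" mentioned at the top of this post can be reproduced with a few lines of Python. This is only a sketch: the function name is hypothetical, the sample text is three entries copied from the log above, and in practice you would read the saved Event Log or stdoutdae.txt contents instead. The pattern does not rely on line breaks, so it also works on a log that has been pasted as one long line.

```python
import re
from collections import Counter

# Message the BOINC client logs each time the task exits prematurely
MARKER = "exited with zero status but no 'finished' file"


def count_premature_exits(log_text: str) -> Counter:
    """Count, per task name, how often the marker message appears."""
    pattern = re.compile(r"Task (\S+) " + re.escape(MARKER))
    return Counter(pattern.findall(log_text))


# Three entries copied from the log above, pasted as one run of text
sample = (
    "08-Apr-2014 18:44:19 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 "
    "exited with zero status but no 'finished' file "
    "08-Apr-2014 18:44:32 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 "
    "exited with zero status but no 'finished' file "
    "08-Apr-2014 18:44:44 [GPUGRID] Task 4900-DOCK_LAB_MJHARVEY_18206_3HHM2_ZINC4-0-1-RND3680_0 "
    "exited with zero status but no 'finished' file "
)

for task, exits in count_premature_exits(sample).items():
    print(task, exits)
```

Counting per task rather than per file keeps the numbers separate if several failing tasks share one log.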
	ID: 36355 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36358 - Posted: 17 Apr 2014 \| 17:08:43 UTC Last modified: 17 Apr 2014 \| 17:11:24 UTC
	Both of those machines are service-installs. I'm starting to think that "being a service install" might be relevant to the problem. Didn't we recently have some sort of "service-install heartbeat bug" in the BOINC API - is it possible this new CPU App is using a bugged version of the API?
	ID: 36358 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36360 - Posted: 17 Apr 2014 \| 20:28:12 UTC - in response to Message 36358. Last modified: 17 Apr 2014 \| 20:30:53 UTC
	I seriously feel like I recall a recently-identified-(by-me)-and-fixed-(by-ROM) BOINC API Bug that caused service-installs to not work at all, for apps that were compiled against the bugged API. I think Rom even sent out an email to the projects. GPUGrid Admins: Is it possible that the CPU app was compiled against the bugged API? Both of those machines are service-installs. I'm starting to think that "being a service install" might be relevant to the problem. Didn't we recently have some sort of "service-install heartbeat bug" in the BOINC API - is it possible this new CPU App is using a bugged version of the API?
	ID: 36360 \| Rating: 0 \| rate: / Reply Quote

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36361 - Posted: 17 Apr 2014 \| 20:32:47 UTC Last modified: 17 Apr 2014 \| 20:34:09 UTC
	Here's the email (it pretty much describes, exactly, the behavior I am seeing on these 2 service-install laptops):
> Date: Wed, 6 Nov 2013 10:40:17 -0500
> From: r**@rom.org
> To: boinc_proj**@ssl.berkeley.edu
> Subject: [boinc_projects] BOINC API Change
>
> Yesterday we discovered a problem with the BOINC API where it would shutdown applications after 10 seconds on Windows when the BOINC client was installed as a service.
>
> The bug will affect all applications built with since July of this year. If you have built your application between July and now with a recent BOINC API you should probably deploy a new version of the application.
>
> The commit that fixed the issue is: 3aaeadaf99669c6460a183e7e9f2063f39152031
>
> Thanks in advance.
>
> ----- Rom
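One way to test whether the email's "10 seconds" matches these laptops is to measure the gaps between consecutive premature-exit entries in the event log posted earlier. The sketch below uses the first five exit timestamps from that log; the function name is hypothetical, and `%b` assumes an English locale for the month abbreviation.

```python
from datetime import datetime

# Timestamp layout used by the BOINC event log lines quoted in this thread
LOG_TIME_FORMAT = "%d-%b-%Y %H:%M:%S"


def restart_gaps(timestamps):
    """Return the number of seconds between each pair of consecutive timestamps."""
    times = [datetime.strptime(stamp, LOG_TIME_FORMAT) for stamp in timestamps]
    return [int((later - earlier).total_seconds())
            for earlier, later in zip(times, times[1:])]


# First five "exited with zero status" entries from the posted log
stamps = [
    "08-Apr-2014 18:44:19",
    "08-Apr-2014 18:44:32",
    "08-Apr-2014 18:44:44",
    "08-Apr-2014 18:44:55",
    "08-Apr-2014 18:45:06",
]

print(restart_gaps(stamps))
```

Gaps of 11 to 13 seconds would be consistent with an app that is shut down after about 10 seconds and then restarted, but that reading is an inference from the timing, not something the email states.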

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36362 - Posted: 17 Apr 2014 \| 20:33:43 UTC Last modified: 17 Apr 2014 \| 20:36:22 UTC
	So... Looking at the email pasted above.... Are we using a bugged BOINC API? And, if so, was it aliens? And, if not, can we fix it yet please?

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36364 - Posted: 17 Apr 2014 \| 20:45:00 UTC - in response to Message 36360.
	I seriously feel like I recall a recently-identified-(by-me)-and-fixed-(by-ROM) BOINC API Bug that caused service-installs to not work at all, for apps that were compiled against the bugged API. I think Rom even sent out an email to the projects. GPUGrid Admins: Is it possible that the CPU app was compiled against the bugged API? Both of those machines are service-installs. I'm starting to think that "being a service install" might be relevant to the problem. Didn't we recently have some sort of "service-install heartbeat bug" in the BOINC API - is it possible this new CPU App is using a bugged version of the API? Ah-ha. Now we're cooking. You're absolutely right. Rom Walton to BOINC Projects, 6 November 2013, subject line "BOINC API Change": Yesterday we discovered a problem with the BOINC API where it would shutdown applications after 10 seconds on Windows when the BOINC client was installed as a service. The bug will affect all applications built with since July of this year. If you have built your application between July and now with a recent BOINC API you should probably deploy a new version of the application. The commit that fixed the issue is: 3aaeadaf99669c6460a183e7e9f2063f39152031 Thanks in advance. ----- Rom http://lists.ssl.berkeley.edu/mailman/private/boinc_projects/2013-November/010493.html That should give Matt enough info to track it down ;)

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36366 - Posted: 17 Apr 2014 \| 22:07:00 UTC - in response to Message 36364.
	Looks plausible, Jacob. I think my fork of the client code dates from that window. I'll check and see. But, this is the same library as I have in the GPU application, and that's not affected, is it? Matt
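
For anyone wanting to run the same check on their own build, here is a minimal sketch of the "does my API snapshot date fall in that window" test. The boundary dates are assumptions read off Rom's email ("since July of this year", sent 6 November 2013), and the function and constant names are hypothetical. What matters is the date of the BOINC API source the app was built from, not the date the app was compiled, since an old fork can be compiled much later.

```python
from datetime import date

# Assumed window, inferred from the 6 Nov 2013 email; exact days are guesses.
AFFECTED_FROM = date(2013, 7, 1)    # "since July of this year"
AFFECTED_UNTIL = date(2013, 11, 6)  # date the fix was announced


def snapshot_in_affected_window(api_snapshot_date: date) -> bool:
    """Return True if the BOINC API source snapshot falls in the assumed window."""
    return AFFECTED_FROM <= api_snapshot_date <= AFFECTED_UNTIL


if __name__ == "__main__":
    # A fork taken in September 2013 would be worth rebuilding against newer API code.
    print(snapshot_in_affected_window(date(2013, 9, 15)))
```

A True result only says the snapshot is a candidate; confirming it still means checking whether the fix commit named in the email is present in the fork.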

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36367 - Posted: 17 Apr 2014 \| 22:48:51 UTC - in response to Message 36366. Last modified: 17 Apr 2014 \| 22:49:56 UTC
	It likely is affected (in the sense that a service-install Windows PC would suffer the heartbeat problem running the GPU app that was compiled against the bad BOINC API), but not highly noticeable (since Windows GPU apps cannot run on service installs on OSes newer than Windows XP, I think). So, your GPU app is affected, but only on Windows XP or prior... maybe. I look forward to you fixing this. These poor laptops are REALLY WANTING to do GPUGrid work :) :)

Richard Haselgrove Send message Joined: 11 Jul 09 Posts: 1620 Credit: 8,923,377,372 RAC: 18,742,667 Level Scientific publications	Message 36368 - Posted: 17 Apr 2014 \| 22:55:55 UTC - in response to Message 36366.
	Looks plausible, Jacob. I think my fork of the client code dates from that window. I'll check and see. But, this is the same library as I have in the GPU application, and that's not affected, is it? Matt I am running the GPU apps on two machines with service installs (Windows XP, BOINC client 6.12.34, which is the last combination for which this is allowed) Host 43404 - as yet cuda55 max, I haven't updated the driver recently Host 45218 - recently upgraded to GTX 750Ti, so getting cuda60 only. Both machines are running from the 'short task' queue only. The bug doesn't bite.

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36370 - Posted: 18 Apr 2014 \| 1:25:58 UTC - in response to Message 36368. Last modified: 18 Apr 2014 \| 1:27:17 UTC
	I'm not positive, but I believe it's possible that the bug also only affects newer clients. Matt, any chance we could compile against a newer non-bugged BOINC API, to see if it fixes the problem, please? PS: I'm sorry I didn't figure this out earlier, while the app was still in Testing. I just want it fixed :/

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36371 - Posted: 18 Apr 2014 \| 1:55:13 UTC Last modified: 18 Apr 2014 \| 1:56:04 UTC
	PS: I got confirmation from Rom that it's possible XP might not be susceptible to the "service-install BOINC API heartbeat bug", since it might return different error codes for OpenProcess() and have different default process ACLs than newer operating systems. All of my machines (and my wife's 2 service-install laptops) are running Windows 8.1 Update 1. Long story short: I'm still quite confident that compiling against a newer non-bugged BOINC API will fix the issue. Hope you guys can do it soon - Rom said it was a relatively easy thing to fix. Thanks in advance, Jacob
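
To make the OpenProcess() point concrete, here is a small simulation of how a "is the client still there?" check could go wrong. This is not the BOINC API code. It is a sketch under assumptions: the three probe outcomes, both function names, and the 10-second check interval (picked only to echo the figure in Rom's email) are all hypothetical. The idea is that an app running under a restricted account may be refused a handle to the service's client process, and a check that treats every refusal as "process gone" would shut the app down even though the client is fine.

```python
from enum import Enum


class ProbeResult(Enum):
    """Hypothetical outcomes of asking the OS for a handle to the client process."""
    OPENED = "opened"                # handle obtained: the process exists
    ACCESS_DENIED = "access_denied"  # process exists, but the caller lacks rights
    NOT_FOUND = "not_found"          # no such process


def client_alive_naive(probe: ProbeResult) -> bool:
    # Treats every failure to open the process as "client is gone".
    return probe is ProbeResult.OPENED


def client_alive_tolerant(probe: ProbeResult) -> bool:
    # Only a definite "no such process" counts as the client being gone.
    return probe is not ProbeResult.NOT_FOUND


def seconds_until_exit(probe, alive_check, check_interval=10, max_checks=6):
    """Simulate a periodic timer; return when the app would exit, or None if it keeps running."""
    for tick in range(1, max_checks + 1):
        if not alive_check(probe):
            return tick * check_interval
    return None


if __name__ == "__main__":
    # Service install on a newer OS, modelled here as an access-denied probe.
    print(seconds_until_exit(ProbeResult.ACCESS_DENIED, client_alive_naive))
    print(seconds_until_exit(ProbeResult.ACCESS_DENIED, client_alive_tolerant))
```

Under this model the naive check exits at the first timer tick on a service install, while the tolerant check only exits when the client process is really gone. If XP hands back a usable handle or a different error code, as Rom suggested it might, the naive check would never trip there.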

GPUGRID Role account Send message Joined: 15 Feb 07 Posts: 134 Credit: 1,349,535,983 RAC: 0 Level Scientific publications	Message 36380 - Posted: 18 Apr 2014 \| 9:55:44 UTC - in response to Message 36371.
	Jacob - app updated. Matt

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36381 - Posted: 18 Apr 2014 \| 11:09:59 UTC Last modified: 18 Apr 2014 \| 11:30:00 UTC
	Thank you. For reference, the Windows CPU app version 1.05 and prior were bugged, and the Windows 1.06 version is the version that hopefully fixes it. I'll let you know once these 2 hosts get some 1.06 tasks :) They may be blacklisted at the moment lol. Thanks again, Jacob

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36496 - Posted: 21 Apr 2014 \| 17:26:53 UTC - in response to Message 36381.
	One of the 2 laptop clients was finally able to get some CPU work, and... IT WORKS! The 1.06 CPU app (which is compiled against the non-bugged BOINC API) appears to have solved the problems I was having. Thanks!

MJH Project administrator Project developer Project scientist Send message Joined: 12 Nov 07 Posts: 696 Credit: 27,266,655 RAC: 0 Level Scientific publications	Message 36497 - Posted: 21 Apr 2014 \| 17:28:40 UTC - in response to Message 36496. Last modified: 21 Apr 2014 \| 17:28:50 UTC
	One of the 2 laptop clients was finally able to get some CPU work, and... IT WORKS! The 1.06 CPU app (which is compiled against the non-bugged BOINC API) appears to have solved the problems I was having. Thanks! Neat. Thanks for ferreting out the source of the problem Jacob. Matt

Jacob Klein Send message Joined: 11 Oct 08 Posts: 1127 Credit: 1,901,927,545 RAC: 0 Level Scientific publications	Message 36499 - Posted: 21 Apr 2014 \| 17:31:04 UTC - in response to Message 36497.
	You're welcome. Finding/fixing problems is the main reason I am attached to 26 projects across 4 devices of various Windows installations, running alpha versions of BOINC :)

Speedy Send message Joined: 19 Aug 07 Posts: 43 Credit: 40,991,082 RAC: 809,640 Level Scientific publications	Message 36661 - Posted: 26 Apr 2014 \| 3:22:43 UTC
	I am aware that we are partway through the weekend or leading into it. Will there be more CPU work in the coming days, or does the returned work need to be looked at to see where the focus is going next? It currently looks like there are only resends.