Message boards : Number crunching : Lost 8 hours of work on a long WU why?
Author | Message |
---|---|
While looking at BoincTasks, I noticed that BOINC was not running on one of my systems. I logged into that Win10 x64 system and checked the Event Viewer, but did not see any problem in any of its logs. <global_preferences> <source_project>https://grcpool.com/</source_project> <mod_time>0.000000</mod_time> <battery_charge_min_pct>90.000000</battery_charge_min_pct>
<source_project>http://www.worldcommunitygrid.org/</source_project> <source_scheduler>https://scheduler.worldcommunitygrid.org/boinc/wcg_cgi/fcgi</source_scheduler> <mod_time>1503442910</mod_time> <run_on_batteries>0</run_on_batteries>
| |
ID: 48801 | Rating: 0 | |
"1. Who actually does the checkpoint?" The GPUGrid app does the checkpoint. When you have a very fast GPU, it checkpoints so frequently that the (Windows) disk cache never writes the actual data to the drive (this could be resolved by disabling write caching on the BOINC drive). This behavior results in an error only if the processing of a GPUGrid task is interrupted unexpectedly (by a power failure or a system hang). "Here are 3 lines of the service request XML from GPUGrid. Note that grcpool is identified as a project. It is actually a manager." No, it says that your most recent computing preferences are on WCG. | |
ID: 48802 | Rating: 0 | |
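
To illustrate the point about the preferences: as far as I know, the BOINC client keeps whichever copy of the global preferences carries the newest `mod_time`, so the `source_project` printed next to it is simply where the latest edit was made. Here is a minimal sketch of that comparison, using only the values quoted in the first post. Splitting them into two separate `<global_preferences>` fragments is my assumption, and `newest_source` is a made-up helper, not part of BOINC.

```python
import xml.etree.ElementTree as ET

# Assumption: the quoted lines come from two separate preference blocks.
# Only the fields shown in the post are included; real files contain many more.
FRAGMENTS = [
    "<global_preferences>"
    "<source_project>https://grcpool.com/</source_project>"
    "<mod_time>0.000000</mod_time>"
    "</global_preferences>",
    "<global_preferences>"
    "<source_project>http://www.worldcommunitygrid.org/</source_project>"
    "<mod_time>1503442910</mod_time>"
    "</global_preferences>",
]


def newest_source(fragments):
    """Return (source_project, mod_time) of the fragment with the largest mod_time."""
    best = None
    for text in fragments:
        root = ET.fromstring(text)
        source = root.findtext("source_project")
        mod_time = float(root.findtext("mod_time"))
        if best is None or mod_time > best[1]:
            best = (source, mod_time)
    return best


source, mod_time = newest_source(FRAGMENTS)
print(source, mod_time)
```

A `mod_time` of 0 means that copy was never edited, so the WCG entry wins the comparison even if WCG is no longer attached on the host.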
"1. Who actually does the checkpoint? The GPUGrid app does the checkpoint. When you have a very fast GPU, it checkpoints so frequently that the (Windows) disk cache never writes the actual data to the drive (this could be resolved by disabling write caching on the BOINC drive). This behavior results in an error only if the processing of a GPUGrid task is interrupted unexpectedly (by a power failure or a system hang)." WCG is not on any of my systems, nor has it ever been since I switched to grcpool. Possibly grcpool uses WCG for all its settings, which would not surprise me, as I have no control over any project parameters, unlike BAM!, where I can log into the project and change things. That cannot be done on grcpool. [EDIT] Assuming there is no hardware problem, why did GPUGrid not pick up at a recent checkpoint? I clearly saw it start at 0% progress and slowly start up, even with over 8 hours of existing "progress". OTOH, it is conceivable that a hardware problem, such as the GTX 1070 clock dropping to 300 MHz, could show very little progress even after 8 hours. I have not seen a problem like that on this system. Surely in 8 hours at least one checkpoint was flushed out of the cache, I would think. Also, I went back over several days of the event log and did not see any restart. About the time that I noticed that BOINC was not running, there was a minor Windows update. The update did not require a reboot. | |
ID: 48803 | Rating: 0 | |
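
On the question of whether a checkpoint ever reaches the drive: the sketch below shows one common way an application can make a checkpoint durable, by writing to a temporary file, forcing it out of the OS cache with `fsync`, and then renaming it over the old checkpoint. This is only an illustration of the general technique. I do not know how the GPUGrid app actually writes its checkpoints, and `write_checkpoint` and the file names are hypothetical.

```python
import os
import tempfile


def write_checkpoint(path, data):
    """Write bytes to path so a crash leaves either the old or the new checkpoint."""
    directory = os.path.dirname(os.path.abspath(path))
    # Temporary file in the same directory, so the final rename stays on one volume.
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ckpt-")
    try:
        with os.fdopen(fd, "wb") as tmp:
            tmp.write(data)
            tmp.flush()              # push Python's buffer to the operating system
            os.fsync(tmp.fileno())   # ask the OS to push its cache to the drive
        os.replace(tmp_path, path)   # swap the new file in place of the old one
    except BaseException:
        # Clean up the partial temporary file and report the failure.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


if __name__ == "__main__":
    write_checkpoint("state.ckpt", b"step=12345")
```

Without the `fsync` step the data can sit in the write cache for a while, which is the situation described earlier in the thread. Even with it, a drive that has its own volatile cache may still lose the last write on a power failure.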