Message boards : Number crunching : Disk usage limit exceeded
Author | Message |
---|---|
So I'm starting to see these errors on <core_client_version>7.6.22</core_client_version>. Anyone else? | |
ID: 48021 | Rating: 0 | |
Name: e2s4_e1s43p0f362-ADRIA_FOLDUBQ80_crystal_ss_contacts_50_ubiquitin_0-0-1-RND8662_3 <core_client_version>7.6.33</core_client_version> <![CDATA[ <message> Disk usage limit exceeded </message> <stderr_txt> Exit status 196 (0xc4) EXIT_DISK_LIMIT_EXCEEDED. Entry from the BOINC Manager log: 23-Oct-17 1:43:58 PM | GPUGRID | Aborting task e2s4_e1s43p0f362-ADRIA_FOLDUBQ80_crystal_ss_contacts_50_ubiquitin_0-0-1-RND8662_3: exceeded disk limit: 286.70MB > 286.10MB. Run time 52,759.76, CPU time 9,484.81 | |
ID: 48026 | Rating: 0 | |
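For anyone who wants to scan their own logs for this failure, here is a minimal sketch that pulls the task name and the two sizes out of the abort message quoted above. The regular expression and the helper name `parse_disk_limit_abort` are illustrative assumptions based only on that one log line; other client versions may word the message differently.

```python
import re

# Assumption: the message always has the shape seen in the log line quoted above,
# with both sizes reported in MB.
DISK_LIMIT_RE = re.compile(
    r"Aborting task (?P<task>\S+): exceeded disk limit: "
    r"(?P<used>[\d.]+)MB > (?P<limit>[\d.]+)MB"
)


def parse_disk_limit_abort(line):
    """Return (task, used_mb, limit_mb), or None if the line is not a disk-limit abort."""
    match = DISK_LIMIT_RE.search(line)
    if match is None:
        return None
    return match.group("task"), float(match.group("used")), float(match.group("limit"))


# The log line from the post above, copied verbatim.
line = (
    "23-Oct-17 1:43:58 PM | GPUGRID | Aborting task "
    "e2s4_e1s43p0f362-ADRIA_FOLDUBQ80_crystal_ss_contacts_50_ubiquitin_0-0-1-RND8662_3: "
    "exceeded disk limit: 286.70MB > 286.10MB"
)

task, used_mb, limit_mb = parse_disk_limit_abort(line)
print(f"{task}: over the limit by {used_mb - limit_mb:.2f} MB")
```

In this case the task was aborted when it was only about 0.6 MB over its limit.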
These seem to be the same sequence of workunits as we've been discussing in the "Bad batch of tasks?" thread. | |
ID: 48027 | Rating: 0 | |
Most of the tasks from this batch are faulty. | |
ID: 48028 | Rating: 0 | |
Thanks to both you and Richard. I see now that they are part of that bad batch. OK, nothing we can do then.... | |
ID: 48031 | Rating: 0 | |
"Thanks to both you and Richard. I see now that they are part of that bad batch. OK, nothing we can do then...." They are part of that bad batch, but they fail for different reasons. The other thread ("Bad batch of tasks?") is about the tasks which fail right after the start with a "the simulation became unstable" error. Perhaps the algorithm that checks the simulation's stability is set to be overly sensitive for this part of the batch. This thread is about the tasks which run for hours, until they exceed the disk usage limit set for the task on the server, and then error out. This is much more annoying than the 'original' failure, as it wastes electricity and time, and it can easily be fixed by raising the disk usage limit of the task (if that is really necessary, that is, if the high disk usage is not the result of another error). | |
ID: 48034 | Rating: 0 | |
Dears, | |
ID: 48045 | Rating: 0 | |
Thank you Toni for the response and update. | |
ID: 48047 | Rating: 0 | |
Thanks... please consider that it's a consequence of the fact that we are interested in a variety of systems and conditions, which means that we cannot make all workunits exactly the same. | |
ID: 48048 | Rating: 0 | |
I've just received a brand-new ADRIA short run task - e46s1_e44s2p0f280-ADRIA_FOLDPG80_crystal_ss_contacts_50_proteinG_2-0-1-RND2909_0. Hoping to catch any disk usage errors before they happen, I had a look at the file sizes. | |
ID: 48074 | Rating: 0 | |
Reached 12.3 MB and 37.5% progress after 2 hours 15 minutes - I think this one is going to make it. | |
ID: 48075 | Rating: 0 | |
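The "going to make it" estimate can be checked with a back-of-the-envelope projection. This is only a sketch: it assumes that the output file and the run time both grow linearly with reported progress, which may not hold for every workunit, and the function name `project_to_completion` is made up for illustration.

```python
def project_to_completion(current_value, progress_fraction):
    """Linearly extrapolate a value measured at progress_fraction (0..1] to 100%."""
    if not 0 < progress_fraction <= 1:
        raise ValueError("progress_fraction must be in (0, 1]")
    return current_value / progress_fraction


# Figures from the post above: 12.3 MB and 37.5% progress after 2 hours 15 minutes.
progress = 0.375
projected_size_mb = project_to_completion(12.3, progress)
projected_hours = project_to_completion(2.25, progress)

print(f"Projected final file size: {projected_size_mb:.1f} MB")
print(f"Projected total run time: {projected_hours:.1f} hours")
```

Under that linear assumption the projection comes out at roughly 33 MB and 6 hours. The disk limit for this particular workunit is not stated in the thread, so the projection alone cannot confirm that the task will stay under it.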
Yes, it's inconsistent, but the real problem is that the _9 file should not become that large. I thought the workunits were cancelled... are they still around? | |
ID: 48076 | Rating: 0 | |
Also, am I mistaken, or should they have been long WUs? | |
ID: 48077 | Rating: 0 | |
"Also, am I mistaken, or should they have been long WUs?" Speedy's workunit, from the previous bad batch, ran for between 32,429 seconds (GTX 1080) and 115,002 seconds (GTX 960). Yes, I think those should have been 'long queue' values. My current task from today's new batch is on course for a 6-hour run (GTX 970) and a 33 MB final file size. I think we're going to make it :-) | |
ID: 48078 | Rating: 0 | |
"Also, am I mistaken, or should they have been long WUs?" The task that ran for 150,002 seconds was on a 970. I also agree these should have been in the long queue. | |
ID: 48080 | Rating: 0 | |
"The task that ran for 150,002 seconds was on a 970." I did look at the stderr_txt, and the first three starts all say 960. Only now that I look more carefully do I see that the final part of the run was done on a 970. | |
ID: 48081 | Rating: 0 | |