Message boards : Number crunching : Where did all my task history go?

lohphat
Joined: 21 Jan 10
Posts: 44
Credit: 1,314,207,889
RAC: 5,423,904
Message 46594 - Posted: 6 Mar 2017 | 2:50:53 UTC

It still shows all my totals, but the task detail is mostly missing: only a handful of completed tasks appear, despite my having run the app for years.

Jacob Klein
Joined: 11 Oct 08
Posts: 1127
Credit: 1,901,927,545
RAC: 0
Message 46595 - Posted: 6 Mar 2017 | 3:11:16 UTC - in response to Message 46594.
Last modified: 6 Mar 2017 | 3:11:45 UTC

It is normal for the "db_purge" job to clean/purge/remove tasks whenever the project decides to run it. They must have run it recently.

Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 46596 - Posted: 6 Mar 2017 | 3:53:22 UTC

It sure would be nice if they purged all the outdated error tasks from 2014-2016.

skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Message 46597 - Posted: 6 Mar 2017 | 12:40:39 UTC - in response to Message 46596.
Last modified: 6 Mar 2017 | 12:42:25 UTC

It sure would be nice if they purged all the outdated error tasks from 2014-2016.


Let's not forget the CUDA 5.5 errors from 2013:

http://www.gpugrid.net/results.php?hostid=139265&offset=0&show_names=0&state=5&appid=

Task ID / Work unit / Sent / Reported / Status / Run time (s) / CPU time (s) / Credit / Application
7418062 4883298 31 Oct 2013 | 13:17:36 UTC 31 Oct 2013 | 15:14:52 UTC Error while computing 2.21 0.20 --- Short runs (2-3 hours on fastest card) v8.14 (cuda55)
7417225 4883425 31 Oct 2013 | 8:59:24 UTC 31 Oct 2013 | 9:43:26 UTC Error while computing 2.21 0.20 --- Short runs (2-3 hours on fastest card) v8.14 (cuda55)
7414821 4883195 30 Oct 2013 | 23:57:35 UTC 31 Oct 2013 | 0:06:05 UTC Error while computing 2.17 0.17 --- Short runs (2-3 hours on fastest card) v8.14 (cuda55)
7413026 4883662 31 Oct 2013 | 12:54:58 UTC 31 Oct 2013 | 12:57:13 UTC Error while computing 2.32 0.20 --- Short runs (2-3 hours on fastest card) v8.14 (cuda55)
7413016 4883652 31 Oct 2013 | 9:43:26 UTC 31 Oct 2013 | 9:44:51 UTC Error while computing 2.40 0.19 --- Short runs (2-3 hours on fastest card) v8.14 (cuda55)
7412905 4883541 31 Oct 2013 | 1:22:10 UTC 31 Oct 2013 | 3:18:05 UTC Error while computing 2.21 0.17 --- Short runs (2-3 hours on fastest card) v8.14 (cuda55)
7412626 4883263 30 Oct 2013 | 21:15:39 UTC 30 Oct 2013 | 21:23:58 UTC Error while computing 2.25 0.19 --- Short runs (2-3 hours on fastest card) v8.14 (cuda55)
7247115 4751509 4 Sep 2013 | 10:12:01 UTC 4 Sep 2013 | 13:53:56 UTC Error while computing 6,407.73 1,130.65 --- ACEMD beta version v8.11 (cuda55)
7247113 4751587 4 Sep 2013 | 10:12:01 UTC 4 Sep 2013 | 12:07:05 UTC Error while computing 6,408.04 1,214.25 --- ACEMD beta version v8.11 (cuda55)
7202422 4715242 24 Aug 2013 | 12:51:21 UTC 24 Aug 2013 | 13:39:31 UTC Error while computing 2,467.68 2,360.33 --- ACEMD beta version v8.00 (cuda55)

____________
FAQs

HOW TO:
- Opt out of Beta Tests
- Ask for Help

lohphat
Joined: 21 Jan 10
Posts: 44
Credit: 1,314,207,889
RAC: 5,423,904
Message 46598 - Posted: 6 Mar 2017 | 19:16:50 UTC - in response to Message 46595.

Ah. But it also shows old failed tasks from before my host even existed, and they are associated with my active host.

Crosslink error?

Bedrich Hajek
Joined: 28 Mar 09
Posts: 485
Credit: 11,159,863,370
RAC: 15,044,174
Message 46599 - Posted: 6 Mar 2017 | 23:41:53 UTC

There are several other threads on this. Here is a link to one of them:


http://www.gpugrid.net/forum_thread.php?id=4485


Hopefully, someone can fix this.



Beyond
Joined: 23 Nov 08
Posts: 1112
Credit: 6,162,416,256
RAC: 0
Message 46600 - Posted: 7 Mar 2017 | 1:32:35 UTC

Hopefully these standard BOINC server-side utilities will work:

https://boinc.berkeley.edu/trac/wiki/DbPurge

Database purging utility

As a BOINC project operates, its workunit and result tables grow. To limit this growth, BOINC provides a utility, db_purge, that writes result and WU records to XML-format archive files, then deletes them from the database. Workunits are purged only when their input files have been deleted. Because of BOINC's file-deletion policy, this implies that all their results are completed. So when a workunit is purged, all its results are purged too.

db_purge creates an archive/ directory and stores archive files there.

db_purge is normally run as a daemon, specified in the config.xml file. It has the following command-line options:

- --app appname: purge only workunits of the given app
- -d N: set logging verbosity to N (1,2,3,4)
- --daily_dir: write archives in a new directory (YYYY_MM_DD) each day
- --dont_delete: don't delete from the DB, for testing only
- --gzip: compress archive files using gzip
- --max N: purge at most N WUs, then exit
- --max_wu_per_file N: write at most N WUs to each archive file. Recommended value: 10,000 or so.
- --min_age_days X: purge only WUs with mod_time at least X (can be < 1) days in the past. Recommended value: 7 or so. This lets users examine their recent results.
- --mod N R: process only WUs with ID mod N == R. This lets you run multiple instances for increased throughput. Must be used with --no_archive.
- --no_archive: don't archive workunits or results
- --one_pass: do one pass, then quit
- --zip: compress archive files using zip


https://boinc.berkeley.edu/trac/wiki/FileDeleter

Server-side file deletion
Files are deleted from the data server's upload and download directories by two programs:

- The file_deleter daemon deletes input and output files as jobs are completed.
- The antique file deleter deletes files that were missed by the file_deleter and "fell through the cracks".

The File Deleter

Typically you don't need to customize this. The default file deletion policy is:

- A workunit's input files are deleted when all its results are 'over' (reported or timed out) and the workunit is assimilated.
- A result's output files are deleted after the workunit is assimilated. The canonical result is handled differently, since its output files may be needed to validate results that are reported after assimilation; its files are therefore deleted only when all results are over and all successful results have been validated.
- If <delete_delay_hours> is specified in config.xml, file deletion is delayed by that interval (see the sketch below).
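
For illustration, a minimal config.xml fragment that delays deletion by 24 hours might look like the following sketch; the 24-hour value is an arbitrary example, not a recommendation:

    <boinc>
      <config>
        <!-- Example only: wait 24 hours after a file becomes
             deletable before actually removing it. -->
        <delete_delay_hours>24</delete_delay_hours>
      </config>
    </boinc>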
Command-line options:

- -d N: set debug output level (1/2/3/4)
- --mod M R: handle only WUs with ID mod M == R
- --one_pass: exit after one pass through the DB
- --dry_run: don't update the DB (for debugging only)
- --download_dir D: override download_dir from the project config with D
- --sleep_interval N: sleep for N seconds between scans (default 5)
- --appid N: only process workunits with appid=N
- --app S: only process workunits of the app with name S
- --dont_retry_errors: don't retry file deletions that failed previously
- --preserve_wu_files: update the DB, but don't delete input files
- --preserve_result_files: update the DB, but don't delete output files
- --dont_delete_batches: don't delete anything with a positive batch number
- --input_files_only: don't delete output files
- --output_files_only: don't delete input files
- --xml_doc_like L: only process workunits where xml_doc LIKE 'L'

If you store input and output files on different servers, you can improve performance by running separate file deleters, each one on the machine where the corresponding files are stored.
In some cases you may not want files to be deleted. There are three ways to accomplish this:

- Use the --preserve_wu_files and/or --preserve_result_files command-line options.
- Include <no_delete/> in the <file_info> element for a file in a workunit or result template. This lets you suppress deletion on a file-by-file basis (see the sketch after this list).
- Include "nodelete" in the workunit name.
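
For example, marking a single output file as permanent in a result template might look like the sketch below; the file name and size limit are placeholders, not values from a real GPUGRID template:

    <file_info>
      <name><OUTFILE_0/></name>
      <generated_locally/>
      <upload_when_present/>
      <max_nbytes>1000000</max_nbytes>
      <url><UPLOAD_URL/></url>
      <!-- Suppress server-side deletion of this file only. -->
      <no_delete/>
    </file_info>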

The Antique File Deleter

Runs as a periodic task. Removes 'antiques': output files that are older than the oldest WU in the database (not including "no_delete" WUs). These files are created when BOINC clients return results after the corresponding WU has already been deleted from the database.

The antique files are deleted by using a Unix 'find' command to locate files that are older than the oldest workunit. The find command will work on NFS mounted file systems, and will ignore .nfs stale file markers. The output of find is limited by a 'head' to 50000 files by default.

If the web-server account on your system is not 'apache', add a <httpd_user> element to your config.xml file. Otherwise antique deletion won't work.
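
For instance, on a system where the web server runs as www-data (a common default on Debian-based distributions), the element would look like this sketch:

    <boinc>
      <config>
        <!-- Account the web server runs under, so antique
             deletion can identify its files. -->
        <httpd_user>www-data</httpd_user>
      </config>
    </boinc>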

Command-line options:

- -d N: set debug output level (1/2/3/4)
- --dry_run: don't delete any files, just log what would be deleted
- --usleep N: sleep this number of microseconds after each examined file (throttles I/O if there are many files)
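
Putting it together, the antique file deleter could be scheduled as a periodic task in config.xml. A sketch, assuming the binary is installed as antique_file_deleter and a daily period (both assumptions, not taken from GPUGRID's configuration):

    <boinc>
      <tasks>
        <task>
          <!-- Assumed binary name; runs once a day and throttles I/O
               by sleeping 1000 microseconds between examined files. -->
          <cmd>antique_file_deleter -d 2 --usleep 1000</cmd>
          <period>24 hours</period>
          <output>antique_file_deleter.log</output>
        </task>
      </tasks>
    </boinc>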
