Author |
Message |
EdwardPF
Joined: 24 Nov 12 Posts: 17 Credit: 453,679,903 RAC: 0
|
I have been running SETI and using GPUGRID as my backup project for quite some time.
Since the demise of SETI I have been running GPUGRID on my GPUs and Rosetta on my CPUs, and all has been fine. I have been running 2 WUs (one each) on my 2 NVIDIA 1070s and 9 WUs on my 12 CPU AMD computer (leaving 1 CPU and 12 logical CPUs idle). That gave me 2 GPUs busy and 2 WUs "in the wings" ready to run. BOINC 7.14.2, Win-10.
BUT
Since the introduction of ACEMD 2.10 I am only running 1 WU (on GPU 0), nothing on GPU 1, and 3 WUs "in the wings".
I have plenty of memory (32 GB total), plenty of page file (64 GB) and plenty of free disk space (232 GB), as well as "13 process slots". I have scheduled 4.5 days of work plus 0.1 day of "additional" work.
As far as I know the only thing that changed is the move to V2.10.
Running Win-10 with the most current NVIDIA driver, 445.75.
BOINC is set to use 100% memory and 75% page/swap space.
Also ...
GPUGRID will run 2 NVIDIA GPUs (0 and 1) IFF I suspend Rosetta. (Now there's a twist ... some sort of memory shortage?? I cut Rosetta down to 4 CPUs and it didn't change the GPUGRID problem at all.)
Ed Frybarger
P.S.
Ed, you may need to add this line to your cc_config.xml file:
<use_all_gpus>1</use_all_gpus>
This line was already in cc_config as implied by:
I have been running 2 WUs (one each) on my 2 nvidia 1070s
but thanks for the thought
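For anyone following along, here is a minimal sketch of where that line goes: in the BOINC client's cc_config.xml, use_all_gpus sits inside the options element. This assumes the file has no other settings; if yours already has an options section, add the line there rather than creating a second one.

```xml
<cc_config>
  <options>
    <!-- Use every GPU found, not just the most capable one -->
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
```

Restart the BOINC client afterwards so the setting takes effect.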
|
|
|
Zalster
Joined: 26 Feb 14 Posts: 211 Credit: 4,496,324,562 RAC: 0
|
Without knowing how Rosetta works, it's hard to say. But if turning it off gets both GPUs to run, then you already have your answer. It might be how much memory each work unit is using. Maybe the work units need far more CPU cycles than they claim (I've seen this in several projects, where they say 1 thread but end up using 2-3 per work unit). Maybe it's how much is being written to the hard drive each time. There could be a number of reasons. One method would be to start with only 1 Rosetta work unit running and increase by 1 each time until the second GPU drops off. Then go down by 1 to keep both GPUs running.
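One way to do that stepping without touching the CPU percentage preference is an app_config.xml placed in the Rosetta project folder under the BOINC projects directory. This is only a sketch: it assumes your client supports project_max_concurrent, and the value 1 is just the first step of the test.

```xml
<app_config>
  <!-- Cap on simultaneously running tasks for this project; raise by 1 per test -->
  <project_max_concurrent>1</project_max_concurrent>
</app_config>
```

After each change, have the client re-read its config files (or restart it) and check whether both GPUs are still busy.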
____________
|
|
|
|
How much memory is actually being used on the system while it's running?
I've heard that Rosetta can use quite a lot of memory.
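If it helps with gathering that, the BOINC client can log task memory use and its scheduling decisions to the Event Log. A sketch of the cc_config.xml log flags, assuming they are merged into any existing cc_config.xml rather than replacing it:

```xml
<cc_config>
  <log_flags>
    <!-- Log memory usage of running tasks -->
    <mem_usage_debug>1</mem_usage_debug>
    <!-- Log the client's decisions about which tasks to run -->
    <cpu_sched_debug>1</cpu_sched_debug>
  </log_flags>
</cc_config>
```

These flags are chatty, so it is probably worth turning them off again once the idle-GPU case has been captured.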
____________
|
|
|
|
Since the introduction of ACEMD 2.10 I am only running 1 WU (on GPU 0), nothing on GPU 1, and 3 WUs "in the wings".
I see that you have set your preferences to leave enough CPU threads unused, and your system has as much as 32 GB RAM.
For a while, try adding this "app_config.xml" to your C:\ProgramData\BOINC\projects\www.gpugrid.net directory and restart.
<app_config>
  <app>
    <name>acemd3</name>
    <gpu_versions>
      <gpu_usage>1</gpu_usage>
      <cpu_usage>0.49</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
ACEMD3 will still use two full CPU threads through the wrapper, but BOINC Manager will "think" it needs only one CPU thread for two concurrent GPU tasks...
Another, more exotic remedy to try:
If your monitor has two inputs, you can try connecting each graphics card to one of them (one to the DVI input and the other to the HDMI input, for example), so that both cards are headed. |
|
|
EdwardPF
Joined: 24 Nov 12 Posts: 17 Credit: 453,679,903 RAC: 0
|
I have tried running with just 1 Rosetta WU and BOINC will only run 1 GPUGRID WU.
I'll hold my breath, BUT for now specifying 0.49 CPUs for each GPUGRID WU IS WORKING for me. Two CPUs AND 2 GPUs running!!
THANKS!!
If anything new crops up I'll give a call.
Ed F
|
|
|
EdwardPF
Joined: 24 Nov 12 Posts: 17 Credit: 453,679,903 RAC: 0
|
Almost!
GPUGRID runs 2 GPU WUs until one of them finishes, and then hangs with 1 WU running and 3 "Ready to Start".
If I suspend GPUGRID and restart it ... no joy.
If I suspend Rosetta ... no joy. (Not as good as before.)
If I shut down BOINC and restart it ... Joy!!
If I set the CPU load from .49 to .24 and shut down BOINC and restart it ... no joy.
Ideas??
Ed F |
|
|
|
Time for some other kind user with experience on Windows multi-GPU systems to step in.
I have three of them currently running with no such problems, but all of them are Linux systems. |
|
|
Keith Myers
Joined: 13 Dec 17 Posts: 1284 Credit: 4,928,881,959 RAC: 6,466,228
|
I found Rosetta CPU tasks tie up memory, even after you have suspended them or reduced the number running. And that was without the "leave tasks in memory" setting enabled in preferences.
I had umpteen tries at reducing the number of CPU tasks running and was still getting the "not enough memory" message and GPUs idling.
I finally found that running 50% of the number I was originally running freed up enough memory for all the GPUs to start crunching.
Running Rosetta alongside any GPU project is a challenge, and you won't be able to run the number of tasks you think your CPU and memory should be capable of running. |
|
|
Erich56
Joined: 1 Jan 15 Posts: 1090 Credit: 6,603,906,926 RAC: 18,783,925
|
I found Rosetta cpu tasks tie up memory ...
My experience with Rosetta, over the years, is that the maximum RAM a task has ever used was about 1 GB (mostly far below).
Of course, the more Rosetta tasks are being processed concurrently, the more RAM they take. |
|
|
EdwardPF
Joined: 24 Nov 12 Posts: 17 Credit: 453,679,903 RAC: 0
|
Rosetta takes a lot of RAM ... that's for sure ... On occasion I had 6 of them running in the 1 to 1.5 GB range at the same time.
I would think (what do I know) that 32 GB of memory, 64 GB of swap on an NVMe M.2 and 13 free CPUs would just shrug off the load ... I'll keep looking ...
Please keep feeding ideas to me!!
Thanks
Ed F |
|
|
Jim1348
Joined: 28 Jul 12 Posts: 819 Credit: 1,591,285,971 RAC: 0
|
I am running Rosettas on 9 Ubuntu machines and 1 Win7 at the moment (over 100 cores). There is no problem running them with a GPU; you just reserve a core, as usual. I switch between GPUGrid and Folding on the GPU, depending on who has work (GPUGrid at the moment).
The Rosettas take up a lot of memory when a new series is first introduced. They reduce that after a few days to a more manageable level, usually less than 500 MB. But at the moment, I have one running at 2979 MB, and another at 2923 MB; a few more around 2 GB. All my machines have 32 GB memory, except for a Ryzen 2700 with only 16 GB. On that one, I run a mix of Rosettas (100% resource share) and TN-Grid (60% resource share) to keep the memory within bounds. They actually run quite well. |
|
|
EdwardPF
Joined: 24 Nov 12 Posts: 17 Credit: 453,679,903 RAC: 0
|
Since my last post I have done nothing but wait ...
This past A.M., 4/16/2020, I checked in on GPUGRID and it was running on 2 GPUs with 2 Ready to start!!
Now, as I retire, things are STILL running fine ... must be some kind of counter that needed to fix itself(??!!)
Ed F |
|
|