
Message boards : Graphics cards (GPUs) : No GPU load on 3080?

Geethebluesky
Joined: 12 Nov 14
Posts: 2
Credit: 165,799,186
RAC: 0
Message 59256 - Posted: 15 Sep 2022 | 2:09:00 UTC
Last modified: 15 Sep 2022 | 2:16:14 UTC

Hello! I'd appreciate some help ensuring I have the right config to use as much of my card as possible (well, any of it would be nice, since it's not being used at all...). I haven't run BOINC since 2019 and am pretty much out of the loop now.

I have an RTX 3080 that shows 0% GPU utilization, with a work unit that was at 64% after about 12 hours of runtime, with 19 hours left. That was before I started tweaking things (rebooted with SWAN_SYNC = 1, tried an app_config.xml).
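For reference, a minimal app_config.xml for the GPUGrid Python tasks might look like the sketch below. The app name `PythonGPU` is an assumption on my part — check the `<name>` fields in your client_state.xml for the exact one. The file goes in the GPUGrid project directory, then Options → Read config files in the BOINC Manager:

```xml
<app_config>
  <app>
    <name>PythonGPU</name>
    <gpu_versions>
      <!-- reserve one full GPU and one CPU core per task -->
      <gpu_usage>1.0</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```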

I was using BAM! since I have more than one PC running BOINC and more than one project, but I canceled or suspended everything else. Now GPUGrid is running alone, with a restarted WU at 1.5% after 20 minutes of runtime, and still 0% GPU utilization according to GPU-Z. I'm at a loss.

Wanted to add: I'm on Windows 10, the CUDA graph in Task Manager also shows 0%, and I've disabled both options to suspend computing or GPU usage for any reason.

Thanks for any help!

Keith Myers
Joined: 13 Dec 17
Posts: 1300
Credit: 5,492,556,959
RAC: 10,137,164
Message 59257 - Posted: 15 Sep 2022 | 2:24:48 UTC - in response to Message 59256.

Since your hosts are hidden, I'm just going to guess that you are running the Python apps for GPU hosts tasks?

What you are seeing is normal for these tasks. They are primarily a CPU task with small bursts of GPU use.

You can stop using the SWAN_SYNC environment variable, as that was only helpful with the old tasks from several years ago.
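If you set it as a per-user Windows environment variable, one way to clear it again is from an elevated command prompt (a sketch, assuming the variable lives under the per-user Environment registry key; the System Properties → Environment Variables dialog does the same thing):

```shell
:: Delete the user-level SWAN_SYNC variable; takes effect for newly started processes
reg delete "HKCU\Environment" /f /v SWAN_SYNC
```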

You should get up to speed by reading the posts in the News forum, specifically this thread:

https://www.gpugrid.net/forum_thread.php?id=5233

Geethebluesky
Joined: 12 Nov 14
Posts: 2
Credit: 165,799,186
RAC: 0
Message 59258 - Posted: 15 Sep 2022 | 2:30:53 UTC - in response to Message 59257.

Yes, I see my WU as a "Python apps for GPU hosts 4.03 (cuda1131)" application.

Thanks for the news thread!

Igor Misic
Joined: 12 Apr 11
Posts: 4
Credit: 1,051,731,835
RAC: 2,320,001
Message 59687 - Posted: 7 Jan 2023 | 19:47:10 UTC

Since SWAN_SYNC = 1 doesn't help anymore with the Python app, I was hoping to run 3 tasks in parallel, since this host has plenty of memory, both RAM and GPU VRAM. But when I added a third task with an additional BOINC instance, the GPUGRID server started rejecting my tasks, which are visible with the status "Abandoned".
Did anyone figure out how to do it?

http://www.gpugrid.net/results.php?hostid=601014
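For what it's worth, the usual way to run several tasks per GPU inside a single BOINC client (rather than spinning up extra instances) is a fractional gpu_usage in app_config.xml. A sketch, assuming the app name is `PythonGPU` (verify against your client_state.xml); note this only tells the local scheduler to co-schedule three tasks on one card, it does not lift any server-side per-GPU task limit:

```xml
<app_config>
  <app>
    <name>PythonGPU</name>
    <gpu_versions>
      <!-- 0.33 => the client schedules 3 of these tasks per GPU -->
      <gpu_usage>0.33</gpu_usage>
      <cpu_usage>1.0</cpu_usage>
    </gpu_versions>
  </app>
</app_config>
```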

Keith Myers
Joined: 13 Dec 17
Posts: 1300
Credit: 5,492,556,959
RAC: 10,137,164
Message 59688 - Posted: 8 Jan 2023 | 0:29:48 UTC - in response to Message 59687.

I am assuming you have the 12GB version of the 3060. I can't tell by looking at your host, since you are running an older version of BOINC that can only report 4GB of VRAM on Nvidia cards.

But these tasks can take as much as 4GB of GPU memory each, and around 60GB of system memory at 3X utilization.

So, on the face of that assumption, you don't have enough GPU VRAM or system memory to run 3X.

My teammate was able to run 3X on his 12GB 3060s with the CUDA MPS server on his 128GB Epyc hosts. Those 3060s are now A4000s, so there is plenty of memory on the GPU now.
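For anyone curious, starting the CUDA MPS control daemon on Linux looks roughly like the sketch below (the pipe/log directories are arbitrary choices of mine; the daemon must be started by the user who will own the GPU contexts, or by root):

```shell
# Choose pipe/log directories for the MPS daemon (any writable paths work)
export CUDA_MPS_PIPE_DIRECTORY=/tmp/nvidia-mps
export CUDA_MPS_LOG_DIRECTORY=/tmp/nvidia-mps-log

# Start the control daemon in the background; CUDA apps launched afterwards
# with the same CUDA_MPS_PIPE_DIRECTORY share the GPU through MPS
nvidia-cuda-mps-control -d

# To shut it down later:
echo quit | nvidia-cuda-mps-control
```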

Igor Misic
Joined: 12 Apr 11
Posts: 4
Credit: 1,051,731,835
RAC: 2,320,001
Message 59692 - Posted: 8 Jan 2023 | 12:18:27 UTC - in response to Message 59688.
Last modified: 8 Jan 2023 | 12:21:17 UTC

Thanks for helping. You are right, it is the 12GB 3060 version.

I'll write up my observations; maybe they will help someone else too.

Previously, when I had only 16GB of RAM, I observed that tasks would crash and BOINC would report errors. So I added another 32GB (48GB total), and 2 tasks in parallel work fine.

Then I added 2 more tasks in parallel (4 tasks total) and started seeing errors again. So I figured this can't fit in both RAM and GPU VRAM.

Then I tried 3 tasks in parallel.
I watched the GPU's VRAM, and the current set of tasks never went over 10GB.
I was hoping that RAM plus swap could cover the memory usage, but after exactly 1 hour of running all 3 tasks in parallel, the task that started first got aborted, and 10 minutes later the 2 that started later were aborted as well, each also having run for a total of 1 hour.

Then I started changing the configuration a bit, hoping I had simply misconfigured something, but I noticed in the GPUGRID statistics that tasks are aborted even before BOINC gets any information about it.
So what the exact reason is, I don't know.
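One way to see whether VRAM, rather than system RAM, is the bottleneck is to log GPU memory while the tasks run. A sketch using nvidia-smi (the query fields are standard; the 5-second interval is an arbitrary choice):

```shell
# Print a timestamped CSV line with used/total VRAM and GPU utilization every 5 s
nvidia-smi --query-gpu=timestamp,memory.used,memory.total,utilization.gpu \
           --format=csv -l 5
```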

Ian&Steve C.
Joined: 21 Feb 20
Posts: 1038
Credit: 40,148,957,483
RAC: 47,466,315
Message 59694 - Posted: 8 Jan 2023 | 15:02:04 UTC - in response to Message 59688.

My teammate was able to run 3X on his 12GB 3060's with the CUDA MPS server on his 128GB Epyc hosts. Those 3060's are now A4000's so plenty of memory on the gpu now.


With some further tweaking, I'm actually now running 4x on the 3060, and 5x on the A4000s.


zooxit
Joined: 4 Jul 21
Posts: 23
Credit: 6,355,945,392
RAC: 20,811,548
Message 59726 - Posted: 17 Jan 2023 | 14:56:37 UTC

Last time I tried running more than 2 tasks on the same GPU, it only ran 2 tasks (I understood that GPUGRID limits users to 2 tasks per GPU).

Did something change, or did I misunderstand something? (I must confess I haven't had time to read the newer posts yet...)

Keith Myers
Joined: 13 Dec 17
Posts: 1300
Credit: 5,492,556,959
RAC: 10,137,164
Message 59731 - Posted: 17 Jan 2023 | 20:29:21 UTC - in response to Message 59726.

That's still the limit, AFAIK. And there's a total limit of 16 tasks per host.

You can get around that by spoofing the number of cards a host has, up to that 16-task-per-host limit.

zooxit
Joined: 4 Jul 21
Posts: 23
Credit: 6,355,945,392
RAC: 20,811,548
Message 59732 - Posted: 17 Jan 2023 | 20:43:30 UTC - in response to Message 59731.

Thanks. I didn't know that; I'll look into it.
