Author |
Message |
Aurum Send message
Joined: 12 Jul 17 Posts: 401 Credit: 16,755,010,632 RAC: 121,512 Level
Scientific publications
|
I built a new Rig-27 last night with an E5-2603v4 with 6c/6t (no hyperthreading). The MSI X99 motherboard has 3 slots with an EVGA 1080 Ti in each. Without thinking about it I just set it up like I did pre-SWAN_SYNC but with SWAN_SYNC enabled:
<app_config>
<app>
<name>acemdlong</name>
<gpu_versions>
<cpu_usage>1.0</cpu_usage>
<gpu_usage>0.5</gpu_usage>
</gpu_versions>
</app>
<app>
<name>acemdshort</name>
<gpu_versions>
<cpu_usage>1.0</cpu_usage>
<gpu_usage>0.5</gpu_usage>
</gpu_versions>
</app>
</app_config>
The System Monitor shows all 6 CPUs pegged at 100%. I looked at all their stderr files and see no signs of problems.
Am I wasting 3 CPUs???
Is there a way to tell if each CPU is synced (or mapped) to a single WU???
Or are 3 CPUs mapped to 3 GPUs and the other 3 are just sitting there repeating, "Is anyone home? Are we anywhere yet?", i.e. wasted. |
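As a rough check on the numbers in this configuration, here is a minimal sketch of the arithmetic. It assumes the usual BOINC reading of `app_config.xml`: a `gpu_usage` of 0.5 lets two tasks share one GPU, and `cpu_usage` is what the client budgets per task, not what the task actually consumes. The function name `cpus_budgeted` is hypothetical.

```shell
#!/bin/sh
# cpus_budgeted GPUS GPU_USAGE CPU_USAGE
# Prints how many tasks run at once and how many CPUs are budgeted for them.
cpus_budgeted() {
    awk -v g="$1" -v gu="$2" -v cu="$3" 'BEGIN {
        per_gpu = int(1 / gu + 0.000001)   # tasks sharing one GPU
        tasks = g * per_gpu                # tasks running at once
        printf "%d tasks, %.1f CPUs budgeted\n", tasks, tasks * cu
    }'
}

# Rig-27: 3 GPUs, gpu_usage 0.5, cpu_usage 1.0
cpus_budgeted 3 0.5 1.0   # prints: 6 tasks, 6.0 CPUs budgeted
```

With three GPUs at two tasks each, six tasks run at once, which matches six cores being busy on a 6c/6t CPU.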
|
|
Aurum Send message
Joined: 12 Jul 17 Posts: 401 Credit: 16,755,010,632 RAC: 121,512 Level
Scientific publications
|
aurum@Rig-27:~$ nvidia-smi -a
==============NVSMI LOG==============
Timestamp : Mon Dec 31 11:01:30 2018
Driver Version : 415.25
CUDA Version : 10.0
Attached GPUs : 3
GPU 00000000:01:00.0
Product Name : GeForce GTX 1080 Ti
PCI
Bus : 0x01
Device : 0x00
Domain : 0x0000
Device Id : 0x1B0610DE
Bus Id : 00000000:01:00.0
Sub System Id : 0x63913842
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 16x
Current : 8x
Processes
Process ID : 2950
Type : C
Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
Used GPU Memory : 663 MiB
Process ID : 24944
Type : C
Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
Used GPU Memory : 665 MiB
GPU 00000000:02:00.0
Product Name : GeForce GTX 1080 Ti
PCI
Bus : 0x02
Device : 0x00
Domain : 0x0000
Device Id : 0x1B0610DE
Bus Id : 00000000:02:00.0
Sub System Id : 0x65933842
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 16x
Current : 16x
Processes
Process ID : 28280
Type : C
Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
Used GPU Memory : 665 MiB
Process ID : 28304
Type : C
Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
Used GPU Memory : 665 MiB
GPU 00000000:03:00.0
Product Name : GeForce GTX 1080 Ti
PCI
Bus : 0x03
Device : 0x00
Domain : 0x0000
Device Id : 0x1B0610DE
Bus Id : 00000000:03:00.0
Sub System Id : 0x63933842
GPU Link Info
PCIe Generation
Max : 3
Current : 3
Link Width
Max : 16x
Current : 16x
Processes
Process ID : 999
Type : G
Name : /usr/lib/xorg/Xorg
Used GPU Memory : 41 MiB
Process ID : 1283
Type : C
Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
Used GPU Memory : 665 MiB
Process ID : 1573
Type : G
Name : cinnamon
Used GPU Memory : 13 MiB
Process ID : 2086
Type : C
Name : ../../projects/www.gpugrid.net/acemd.919-80.bin
Used GPU Memory : 673 MiB |
|
|
Aurum Send message
Joined: 12 Jul 17 Posts: 401 Credit: 16,755,010,632 RAC: 121,512 Level
Scientific publications
|
Looking for a Linux command analogous to nvidia-smi for a CPU report.
Tried to install CPU-X but it didn't run.
GPU 1 Process ID: 2950 and 24944
GPU 2 Process ID: 28280 and 28304
GPU 3 Process ID: 1283 and 2086
Now if I knew a way to map these processes to their CPUs... |
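One possible way to map those PIDs to CPUs, sketched here on the assumption of a Linux `/proc` filesystem: field 39 of `/proc/PID/stat` holds the CPU the process last ran on. The helper name `last_cpu` is hypothetical, and the PIDs are the ones from the nvidia-smi listing, so they will differ on any other host or after a restart.

```shell
#!/bin/sh
# last_cpu PID: print the CPU number the process last ran on.
# Reads field 39 ("processor") of /proc/PID/stat; Linux only.
last_cpu() {
    stat=$(cat "/proc/$1/stat") || return 1
    # The command name (field 2) may contain spaces, so drop everything
    # up to the final ") " and count from field 3 (state) onward.
    rest=${stat##*) }
    # Field 39 overall is field 37 of the remainder.
    echo "$rest" | awk '{print $37}'
}

# PIDs from the nvidia-smi listing; replace with your own.
for pid in 2950 24944 28280 28304 1283 2086; do
    if [ -r "/proc/$pid/stat" ]; then
        echo "pid $pid last ran on CPU $(last_cpu "$pid")"
    fi
done
```

The value is only a snapshot: unless affinity is set, the scheduler is free to move a process to another CPU between two reads.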
|
|
Aurum Send message
Joined: 12 Jul 17 Posts: 401 Credit: 16,755,010,632 RAC: 121,512 Level
Scientific publications
|
aurum@Rig-27:~$ inxi -t cm10
Processes: CPU: % used - top 10 active
1: cpu: 96.0% command: ..acemd.919-80.bin pid: 28280
2: cpu: 95.9% command: ..acemd.919-80.bin pid: 24944
3: cpu: 95.9% command: ..acemd.919-80.bin pid: 1283
4: cpu: 95.8% command: ..acemd.919-80.bin pid: 28304
5: cpu: 95.8% command: ..acemd.919-80.bin pid: 2086
6: cpu: 94.3% command: ..acemd.919-80.bin pid: 2950
7: cpu: 8.8% command: Xorg pid: 999
8: cpu: 8.2% daemon: ~kworker/3:2~ pid: 2961
9: cpu: 3.1% command: nxnode.bin pid: 1711
10: cpu: 2.5% daemon: ~kworker/3:0~ pid: 24148
Memory: MB / % used - Used/Total: 3703.6/7876.0MB - top 10 active
1: mem: 425.96MB (5.4%) command: ..acemd.919-80.bin pid: 2086
2: mem: 417.73MB (5.3%) command: ..acemd.919-80.bin pid: 2950
3: mem: 411.29MB (5.2%) command: ..acemd.919-80.bin pid: 1283
4: mem: 404.81MB (5.1%) command: ..acemd.919-80.bin pid: 24944
5: mem: 404.81MB (5.1%) command: ..acemd.919-80.bin pid: 28304
6: mem: 404.75MB (5.1%) command: ..acemd.919-80.bin pid: 28280
7: mem: 315.56MB (4.0%) command: nxnode.bin pid: 1711
8: mem: 234.98MB (2.9%) command: firefox pid: 3943
9: mem: 214.26MB (2.7%) command: cinnamon pid: 1573
10: mem: 175.70MB (2.2%) command: firefox pid: 4035 |
|
|
Aurum Send message
Joined: 12 Jul 17 Posts: 401 Credit: 16,755,010,632 RAC: 121,512 Level
Scientific publications
|
aurum@Rig-27:~$ inxi -v 3
System: Host: Rig-27 Kernel: 4.15.0-43-generic x86_64
bits: 64 gcc: 7.3.0
Desktop: Cinnamon 3.8.9 (Gtk 3.22.30-1ubuntu1)
Distro: Linux Mint 19 Tara
Machine: Device: desktop Mobo: MSI model: X99S MPOWER (MS-7885) v: 4.0 serial: N/A
UEFI: American Megatrends v: M.C0 date: 06/14/2018
CPU: 6 core Intel Xeon E5-2603 v4 (-MT-MCP-)
arch: Broadwell rev.1 cache: 15360 KB
flags: (lm nx sse sse2 sse3 sse4_1 sse4_2 ssse3 vmx) bmips: 20399
clock speeds: max: 1700 MHz 1: 1699 MHz 2: 1699 MHz 3: 1699 MHz
4: 1699 MHz 5: 1699 MHz 6: 1699 MHz
Graphics: Card-1: NVIDIA GP102 [GeForce GTX 1080 Ti] bus-ID: 01:00.0
Card-2: NVIDIA GP102 [GeForce GTX 1080 Ti] bus-ID: 02:00.0
Card-3: NVIDIA GP102 [GeForce GTX 1080 Ti] bus-ID: 03:00.0
Display Server: x11 (X.Org 1.19.6 )
drivers: modesetting,nvidia,nouveau (unloaded: fbdev,vesa)
Resolution: 640x480
OpenGL: renderer: GeForce GTX 1080 Ti/PCIe/SSE2
version: 4.6.0 NVIDIA 415.25 Direct Render: Yes
Network: Card: Intel I210 Gigabit Network Connection
driver: igb v: 5.4.0-k port: b000 bus-ID: 05:00.0
IF: enp5s0 state: up speed: 1000 Mbps duplex: full
mac: d8:cb:8a:1c:62:79
Drives: HDD Total Size: 500.1GB (2.2% used)
ID-1: model: WDC_WDS500G1B0B
Info: Processes: 231 Uptime: 3:47 Memory: 3708.8/7876.0MB
Init: systemd runlevel: 5 Gcc sys: 7.3.0
Client: Shell (bash 4.4.191) inxi: 2.3.56 |
|
|
kksplace Send message
Joined: 4 Mar 18 Posts: 53 Credit: 2,623,894,005 RAC: 6,112,756 Level
Scientific publications
|
Newbie here to this but a shot at a reply:
1. With your setup, all 6 cores at 100% makes sense to me. You have two WUs running per GPU. To my knowledge, SWAN_SYNC doesn't just 'reserve' a CPU core for a GPU; it dedicates a core to each GPU job so that the job doesn't have to interrupt the core every time it needs it. I saw this on a Windows machine running Milkyway@Home with SWAN_SYNC enabled: two cores were used but only one GPU.
2. I am not sure of the best way to see which CPU is doing which job, but try running "top" in the terminal. It will show what is being executed and the CPU% each process is using. I would expect you to see "acemd.919-80." on all six cores.
3. Regarding hyperthreading: one of my hosts has an i7-7820x with 8 cores. When I first used it, I did not enable H-T due to reading posts on this and other BOINC related forums. However, out of curiosity, I enabled H-T after a couple of weeks, but limited BOINC core usage to 50% to see what happened. It seemed to help a little. My theory is that the Linux scheduler is able to use the H-T 'cores' for the little extra stuff going on without interrupting the BOINC tasks as much. (Not being a techy guy, I am standing by for the critiques of that statement!) |
|
|
Erich56 Send message
Joined: 1 Jan 15 Posts: 1132 Credit: 10,363,897,676 RAC: 29,196,793 Level
Scientific publications
|
3. Regarding hyperthreading: one of my hosts has an i7-7820x with 8 cores. When I first used it, I did not enable H-T due to reading posts on this and other BOINC related forums. However, out of curiosity, I enabled H-T after a couple of weeks, but limited BOINC core usage to 50% to see what happened. It seemed to help a little.
My experience on a Windows 10 PC is similar: my CPU has 6 cores, and 6 BOINC tasks (2 GPUGRID + 4 LHC) seem to run somewhat faster with HT on than with HT off.
|
|
|
Aurum Send message
Joined: 12 Jul 17 Posts: 401 Credit: 16,755,010,632 RAC: 121,512 Level
Scientific publications
|
The Xeon E5-2603 v4 has no hyperthreading capability. It's only 6c/6t. I just wanted to mention that to eliminate a variable.
Trying to figure out how to define my own columns in top to show %cpu,psr,pid.
top -u boinc -H
This is interesting but not quite enough:
ps -u boinc -o pid,%cpu,sgi_p,psr,fname
If I watch them running, it appears that acemd.91 uses about 15% of a CPU. It seems to me that a single core could service 2, 3, 4 or even 5 acemd WUs. That would free up other CPUs to do CPU WUs. My guess is that it must be programmed that way, or that each WU requires babysitting as it starts. |
|
|
rod4x4 Send message
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0 Level
Scientific publications
|
To select and order columns in top...
Press 'f' (without quotes) whilst top is running.
Then use the arrow keys and space bar to select, move, and display the desired columns.
Press q when done.
If you display the 'Last used cpu' column, you will see that a CPU is not dedicated to a process. This is by design. In Linux, processor affinity can be set using taskset. Use at your own risk.
Whilst top is running press h for more options. |
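A minimal sketch of inspecting and setting affinity, assuming Linux with `/proc` and `taskset` from util-linux. The helper names `allowed_cpus` and `pin_to_cpu` are hypothetical. The pinning helper is defined but deliberately not called, in keeping with the warning above.

```shell
#!/bin/sh
# allowed_cpus PID: print the CPUs the scheduler may use for PID,
# as reported by the kernel in /proc/PID/status (Linux only).
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ {print $2}' "/proc/$1/status"
}

# pin_to_cpu PID CPU: restrict PID to a single CPU with taskset.
# Defined but not called here, since pinning is at your own risk.
pin_to_cpu() {
    taskset -cp "$2" "$1"
}

echo "this shell may run on CPUs: $(allowed_cpus $$)"
```

On an unpinned process the list normally covers every CPU (for example `0-5` on a 6-core host), which is why the 'Last used cpu' column keeps changing.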
|
|
rod4x4 Send message
Joined: 4 Aug 14 Posts: 266 Credit: 2,219,935,054 RAC: 0 Level
Scientific publications
|
The System Monitor shows all 6 CPUs pegged at 100%. I looked at all their stderr files and see no signs of problems
I suspect that SWAN_SYNC and running multiple jobs per GPU should not be combined. Using both together would result in 6 CPUs at 100%. Inspecting your task run times will indicate whether you are wasting 3 CPUs.
My understanding of SWAN_SYNC is that the processor is SPINning on the one GPU task process, waiting for any CPU work.
As per this link, the difference between BLOCKING and SPIN is described for CUDA:
https://www.cs.cmu.edu/afs/cs/academic/class/15668-s11/www/cuda-doc/html/group__CUDART__DEVICE_g18074e885b4d89f5a0fe1beab589e0c8.html
and also here, discussion taken from Nvidia Dev forum:
https://devtalk.nvidia.com/default/topic/794833/100-cpu-usage-when-running-cuda-code/
Nvidia Moderator stated:
busy in a polling loop inside the driver function associated with `cudaDeviceSynchronize()`, waiting for the GPU to finish
In your case either turn off SWAN_SYNC or only run 1 task per GPU, depending on your preferences. |
|
|
|
Am I wasting 3 CPUs???
Yes. There's no point in running two GPUGrid workunits simultaneously per GPU while using SWAN_SYNC under a non-WDDM OS.
To be more specific: I would rather use SWAN_SYNC than run two GPUGrid workunits per GPU.
|
|
|