WW
Joined: 20 Sep 09 Posts: 3 Credit: 17,302 RAC: 0
Hi,
recently, I've added a new project - GPUGRID - to my BOINC and I've discovered some weird behavior.
Crunching works without any problems for tens of hours, but as soon as I connect to my desktop via Windows Remote Desktop (RDP), the work unit crashes instantly (that is, once I've been idle in the RDP session for 3 minutes and BOINC starts computing). This repeats as BOINC tries to start the unit over and over again.
OS: Windows Vista Business x64 SP2 CZ
GPU: nVidia GeForce 9600GT 512MB (drivers 190.62)
BOINC: 6.6.36 x64
Crashing app:
acemd_6.71_windows_intelx86__cuda
Task name:
19-KASHIF_HIVPR_n1_for_1hhp_open_ba2-49-100-RND1280_0 using acemd version 671
While I don't generally care if the computation stops while I work via RDP (there are other CUDA projects in my BOINC that run OK), what bothers me most is the application crash itself and the annoying Windows "oh, we are sorry!" message popping up every 2 minutes or so.
If anyone has any idea how to deal with this problem, I'd be more than happy to hear the solution.
|
|
|
Known problem on Vista. Remote Desktop loads different (Microsoft) video drivers, and wrecks the nVidia (CUDA) drivers - so it will be the GPUGrid (CUDA) tasks which crash, other CPU BOINC projects should be OK.
Only solution is to use local desktop only, or an alternative remote access solution - people have reported success with VNC. |
|
|
|
> Known problem on Vista. Remote Desktop loads different (Microsoft) video drivers, and wrecks the nVidia (CUDA) drivers - so it will be the GPUGrid (CUDA) tasks which crash, other CPU BOINC projects should be OK.
> Only solution is to use local desktop only, or an alternative remote access solution - people have reported success with VNC.
RD will also crash GPU work on XP machines ...
Just about any of the VNC versions will work ... I used TinyVNC off and on for some time ...
There are a couple other remote control programs that work if you don't want to use VNC but I can't recall which ones at the moment ... there are also some that will do what RD does and crash the GPU tasks.
If you load RD, you will usually have to restart your machine to get BOINC to work with the GPU again.
|
|
|
WW
Thanks for your fast replies!
I thought that different drivers (for virtual desktop) could be the reason, but on the other hand, I wasn't sure if CUDA applications (like GPUGRID) are "RDP-aware".
Unfortunately, as much as I enjoy some VNC advantages (like the fact that it doesn't resize your desktop when you connect from a different resolution), using RDP is (probably) the only thing I can't change in this setup.
So the question now is: what's the best approach to solving this problem? Is it possible to send BOINC a signal? Because afaik Windows is able (through the new Vista/7 Task Manager) to react to a user connecting via RDP (i.e. creating an RDP session) and to run a chosen task in response.
Also - is this a known-and-not-going-to-be-solved problem, or can we expect a fix at some point in the future?
|
|
|
One idea for getting past this problem: modify acemd so that it watches for a signal that RD or RDP is in use and, if so, suspends its use of the GPU until the signal ends, then restarts from the last checkpoint.
Another idea: have acemd check frequently what type of GPU it is connected to, and treat a change to the RD or RDP type as the needed signal.
Glad to know that at least some versions of VNC avoid this problem, though. |
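To make the first idea a bit more concrete, here is a minimal sketch in Python (not acemd's own code, which I haven't seen). `GetSystemMetrics` with `SM_REMOTESESSION` is a real Win32 call, but the `in_remote_session` and `actions` helpers are hypothetical names made up for illustration, and polling is only one possible approach. Note that the call reports on the session the calling process belongs to, so a process started as a service may not see an RDP login at all.

```python
import ctypes
import sys

# Win32 GetSystemMetrics index: nonzero if the process runs in a remote session.
SM_REMOTESESSION = 0x1000


def in_remote_session():
    """Return True if this process runs inside an RDP session (Windows only)."""
    if sys.platform != "win32":
        return False  # assumption: treat non-Windows hosts as local
    return ctypes.windll.user32.GetSystemMetrics(SM_REMOTESESSION) != 0


def actions(samples):
    """Turn a sequence of remote/local readings into suspend/resume actions.

    Hypothetical helper: yields "suspend" when a remote session appears and
    "resume" when it goes away; repeated identical readings yield nothing.
    """
    suspended = False
    for remote in samples:
        if remote and not suspended:
            suspended = True
            yield "suspend"
        elif not remote and suspended:
            suspended = False
            yield "resume"
```

A polling loop would feed `in_remote_session()` readings into `actions` every few seconds and checkpoint before acting on "suspend". Whether the GPU is actually usable again on "resume" is a separate question this sketch does not answer.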
|
|
|
> Thanks for your fast replies!
> I thought that different drivers (for virtual desktop) could be the reason, but on the other hand, I wasn't sure if CUDA applications (like GPUGRID) are "RDP-aware".
> Unfortunately, as much as I enjoy some VNC advantages (like the fact that it doesn't resize your desktop when you connect from a different resolution), using RDP is (probably) the only thing I can't change in this setup.
> So the question now is: what's the best approach to solving this problem? Is it possible to send BOINC a signal? Because afaik Windows is able (through the new Vista/7 Task Manager) to react to a user connecting via RDP (i.e. creating an RDP session) and to run a chosen task in response.
It probably depends on why, and how often, the remote desktop sessions are run.
If it's only to manage and monitor BOINC itself - don't! Use other tools, like the tried-and-trusted BoincView, the in-development BoincTasks, or even just BOINC Manager itself.
If you have to connect via RDP for other purposes, but not too often, you could still use those BOINC-specific tools to suspend BOINC before you connect, and resume it afterwards. Otherwise, your idea of hooking it up to RDP session detection sounds good: look up 'boinccmd' (usable from batch files) as a possible mechanism.
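A rough sketch of the 'boinccmd' route, written in Python rather than a batch file. It relies only on boinccmd's `--project URL suspend|resume` operation; the project URL below is an assumption and has to match the URL your client is attached with, and the `boinccmd_args` and `run` helpers are hypothetical names. It also assumes `boinccmd` is on the PATH and can reach the local client.

```python
import subprocess

# Assumption: must match the project URL the BOINC client is attached with.
GPUGRID_URL = "http://www.gpugrid.net/"


def boinccmd_args(op, url=GPUGRID_URL, exe="boinccmd"):
    """Build the boinccmd command line to suspend or resume one project."""
    if op not in ("suspend", "resume"):
        raise ValueError("op must be 'suspend' or 'resume'")
    return [exe, "--project", url, op]


def run(op):
    """Run boinccmd; raises CalledProcessError if it exits non-zero."""
    subprocess.run(boinccmd_args(op), check=True)
```

You would call `run("suspend")` before connecting (or from a task triggered when the RDP session starts) and `run("resume")` afterwards. Suspending only the one project leaves the other BOINC projects running.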
> Also - is this a known-and-not-going-to-be-solved problem, or can we expect a fix at some point in the future?
I think you're relying on collaborative nVidia-Microsoft development. Your guess is as good as mine. |
|
|
|
My recollection is that to recover the video drivers after RD sessions you have to reboot ... so you would not just be able to end the session and then restart BOINC. |
|
|
WW
Hi,
sorry for the late reply; I've had other problems to deal with (the crashing computation in BOINC is fairly marginal to me).
1) I'd like to state that I don't use RDP only to manage BOINC.
2) I've observed BOINC's behaviour over the past few days, and I've noticed that other CUDA-enabled projects have no problem computing while I am connected to my desktop via RDP. So it seems to be a GPUGRID-only problem after all.
My conclusion is that I'll stop GPUGRID computation for a while, until a solution for this problem comes out. Switching drivers (or rather display devices) doesn't seem to be a problem in itself, as Vista can restart the graphics environment without a reboot (switching between Aero and the classic desktop, for example).
|
|