Message boards : Graphics cards (GPUs) : Desktop freezes
Author | Message |
---|---|
While I'm crunching for GPUGRID my rig freezes. I mean the whole system, not just the BOINC client or some other program. It makes me a "bit" pissed off, because I cannot work or even browse the internet. | |
ID: 12815 | Rating: 0 | |
OK, forgetting about BOINC and CUDA for a minute... | |
ID: 12820 | Rating: 0 | |
Sure I know that 190.xx is beta, but I never heard that it's buggy. | |
ID: 12822 | Rating: 0 | |
There is also a bug with the "use GPU while computer is in use" option that may mean that BOINC does not get out of the way correctly. You will not see a fix for that until after 6.10.9 ... | |
ID: 12825 | Rating: 0 | |
Sure I know that 190.xx is beta, but I never heard that it's buggy. All Linux nVidia releases are buggy to various degrees! (I don't say that to be argumentative or derogatory, that's just the way it is.) Being an early adopter when nVidia releases new drivers is asking for trouble. Aside from some vdpau fixes or unless you need CUDA 2.3 (which you don't for GPUGRID) and especially as you are wanting to use the machine interactively, if you don't need 190.xx, go back to 185.18.36 and CUDA 2.2. I inserted this string into xorg.conf but it became even worse: freezes happen all the time (or better to say - it's one huge freeze) and sometimes the mouse starts to move, but it's not clicking at all, so I had to reset the rig with the button on the case. I didn't actually want you to insert that string into the xorg.conf, just wanted to know if it is there. ;) If it is there it will exacerbate or actually cause freezes! There are two things which make me think that the reason behind it is BOINC: the 1st - if I suspend GPUGRID WU's (but rosetta continues to work) I never have a single freeze whatever I'm doing (even watching a movie). And the 2nd - even if I deselect "use GPU..." the elapsed time continues to run and the status says "running" rather than "waiting to run", so it looks like BOINC is not intercepting this event and is not suspending GPUGRID... Right, as Paul points out (in the post above) there is a bug related to 'in use' config options in BOINC, so that's why I said let's leave BOINC and CUDA out of this for the moment. So there are 2 options: either to wait for a good driver (A) or to install 185.xx with CUDA 2.2 (B). Am I right in understanding what u r trying to say? :-) Or should I continue to try to be patient and (when I'm completely pissed off) just suspend GPUGRID WU's? Firstly, I wouldn't recommend running the 190.xx series. If it wasn't for the fact that GPUGRID now requires CUDA 2.2, I'd still be running 180.60 on all my Linux machines. 
This has been by far the most stable nVidia driver release on Linux for years! (My experience with the first 190.xx driver release was a hard lock within 5 mins. That tells me all I need to know about it. I'll come back again and try again when it is out of BETA and is promoted to the current release.) Secondly, the video card can only do so much. I do not run desktop effects on any of my crunchers. With a single video card you are asking for and expecting too much. The fancy GUI effects put load on the video card at the same time as it is loaded by the CUDA app. And you expect the desktop to be responsive? My advice, don't do it. I can make my desktop "stutter" with a CUDA app running, desktop effects enabled, and spinning the mouse wheel to scroll in Firefox. Compositing, hardware acceleration, OpenGL, etc. etc. at the same time as CUDA is asking too much. Thirdly, as Paul pointed out, there is a fix for the 'in use' option not working correctly. Can you install the latest 6.10.9 BOINC release? Summary: downgrade to nVidia driver release 185.18.36. Turn off desktop effects. Update to BOINC 6.10.9. ____________ Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2 | |
ID: 12826 | Rating: 0 | |
There is also a bug with the "use GPU while computer is in use" option that may mean that BOINC does not get out of the way correctly. You will not see a fix for that until after 6.10.9 ... so it's not my stupidity? and there is a bug in BOINC. Hope one day it will be fixed... Paul, where can I get 6.10.9? I tried from here: http://boinc.berkeley.edu/dl/ but there are versions for windows and MacOS but not for linux... Anyway I'm checking this link every day; when it is available, I'll post the result here immediately. ____________ | |
ID: 12841 | Rating: 0 | |
it's a pity that all nvidia drivers are buggy, but at least they are not making the "normal life" of linux users as terrible as the ATI ones do. should I remove CUDA completely from the computer and install CUDA 2.2 on top of the 185.xx driver, or do I not need CUDA at all? stupid question, I'm really new to CUDA and nvidia... I didn't actually want you to insert that string into the xorg.conf, just wanted to know if it is there. ;) oops... At least I found out how I can replace xorg.conf with the previous version from the console :) Right, as Paul points out (in the post above) there is a bug related to 'in use' config options in BOINC, so that's why I said let's leave BOINC and CUDA out of this for the moment. Hope the 6.10.9 version will be available soon and I can check if the bug is still there or not. Firstly, I wouldn't recommend running the 190.xx series. If it wasn't for the fact that GPUGRID now requires CUDA 2.2, I'd still be running 180.60 on all my Linux machines. This has been by far the most stable nVidia driver release on Linux for years! (My experience with the first 190.xx driver release was a hard lock within 5 mins. That tells me all I need to know about it. I'll come back again and try again when it is out of BETA and is promoted to the current release.) Secondly, the video card can only do so much. I do not run desktop effects on any of my crunchers. With a single video card you are asking for and expecting too much. The fancy GUI effects put load on the video card at the same time as it is loaded by the CUDA app. And you expect the desktop to be responsive? My advice, don't do it. I can make my desktop "stutter" with a CUDA app running, desktop effects enabled, and spinning the mouse wheel to scroll in Firefox. Compositing, hardware acceleration, OpenGL, etc. etc. at the same time as CUDA is asking too much. Thirdly, as Paul pointed out, a fix for the 'in use' option not working correctly. Can you install the latest 6.10.9 BOINC release? 
Summary: downgrade to nVidia driver release 185.18.36. Turn off desktop effects. Update to BOINC 6.10.9. 1. I'll install the 185.xx version 2morrow (it's a bit late now :-) ) 2. This is the very last thing I'd like to do. I do understand that the effects cause these freezes, at least for now. I've got used to being patient (ha-ha) and if I need to do a lot on my rig I'm just suspending GPUGRID WU's. I know it's not the right way to do it, but .. this is life. If for some reason this option does not work I'll turn the effects off. In fact, I'm almost ready to do it. 3. When 6.10.9 appears somewhere - trust me, I'll be the very 1st guy in the line to get it :-) and one more thing. thanx a lot for your help to a noob in GPUGRID :-) ____________ | |
ID: 12842 | Rating: 0 | |
@CTAPbli here is the address | |
ID: 12845 | Rating: 0 | |
The fix for "in use" bug is NOT in 6.10.9 ... 6.10.9 has minor tweaks for ATI usage only (and these are not usable unless server side changes are made by the project). | |
ID: 12849 | Rating: 0 | |
OK, Paul I can wait :-) | |
ID: 12855 | Rating: 0 | |
The fix for "in use" bug is NOT in 6.10.9 ... Paul, I'm confused. Before I open my mouth again, exactly what are you referring to, so I'm on the same page. The original change to check for a running graphics app....

    +// check whether each GPU is running a graphics app (assume yes)
    +// return true if there's been a change since last time
    +//
    +bool COPROC_CUDA::check_running_graphics_app() {
    +    int retval, j;
    +    bool change = false;
    +    for (j=0; j<count; j++) {
    +        bool new_val = true;
    +        int device, kernel_timeout;
    +        retval = (*__cuDeviceGet)(&device, j);
    +        if (!retval) {
    +            retval = (*__cuDeviceGetAttribute)(&kernel_timeout, CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT, device);
    +            if (!retval && !kernel_timeout) {
    +                new_val = false;
    +            }
    +        }
    +        if (new_val != running_graphics_app[j]) {
    +            change = true;
    +        }
    +        running_graphics_app[j] = new_val;
    +    }
    +    return change;
    +}

Or the "I: BM 6.10.4 - Cuda task doesn't suspend - The same as in BM 6.10.5" message on boinc_alpha that talks of backing out the check for a running graphics app and reverting to previous behaviour, (if it doesn't work as expected) - which hasn't been done yet? PS. I had to chuckle this morning. It appears your posts to the boinc mailing list are listened to. ;)

    +    if (display_driver_version) {
    +        sprintf(vers, "%d", display_driver_version);
    +    } else {
    +        strcpy(vers, "unknown");
    +    }

____________ Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2 | |
ID: 12856 | Rating: 0 | |
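For readers who want to see the change-detection idea from the quoted patch in isolation, here is a minimal sketch. It assumes nothing about BOINC internals: `refresh_graphics_flags` and `TimeoutQuery` are hypothetical names, and the callback merely stands in for the `cuDeviceGetAttribute` query of `CU_DEVICE_ATTRIBUTE_KERNEL_EXEC_TIMEOUT`, so it compiles without a CUDA install.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for the driver query in the quoted patch.
// Returns 0 on success and writes a non-zero value to kernel_timeout when
// the device has a kernel execution timeout (i.e. a display is attached).
using TimeoutQuery = std::function<int(std::size_t device, int& kernel_timeout)>;

// Refresh the per-device "running a graphics app" flags.
// Returns true if any flag differs from its previous value.
bool refresh_graphics_flags(std::vector<bool>& running_graphics_app,
                            const TimeoutQuery& query) {
    assert(query);  // a callable must be supplied
    bool change = false;
    for (std::size_t j = 0; j < running_graphics_app.size(); ++j) {
        bool new_val = true;  // assume yes unless the query proves otherwise
        int kernel_timeout = 0;
        if (query(j, kernel_timeout) == 0 && kernel_timeout == 0) {
            new_val = false;
        }
        if (new_val != running_graphics_app[j]) {
            change = true;
        }
        running_graphics_app[j] = new_val;
    }
    return change;
}
```

As in the patch, a failed query leaves the flag at the cautious default of "yes", and the return value only reports that something changed, not which device.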
should I remove CUDA completely from the computer and install CUDA 2.2 on top of the 185.xx driver, or do I not need CUDA at all? stupid question, I'm really new to CUDA and nvidia... If you use the 185.xx driver you need the CUDA 2.2 toolkit. oops... At least I found out how I can replace xorg.conf with the previous version from the console :) Did your editor create a backup file, xorg.conf~, that you could move back? Alternatively, just edit it again and remove the "UseEvents" config option. It defaults to "false" if it's not there at all. Hope the 6.10.9 version will be available soon and I can check if the bug is still there or not. I just asked Paul for clarification in the post above. 1. I'll install the 185.xx version 2morrow (it's a bit late now :-) ) OK. 2. This is the very last thing I'd like to do. I do understand that the effects cause these freezes, at least for now. I've got used to being patient (ha-ha) and if I need to do a lot on my rig I'm just suspending GPUGRID WU's. I know it's not the right way to do it, but .. this is life. I understand why you'd rather not turn off desktop effects. But the thing is, they tend to expose driver bugs like nothing else can, as well as consuming GPU resources. 3. When 6.10.9 appears somewhere - trust me, I'll be the very 1st guy in the line to get it :-) I wasn't aware that a generic 6.10.9 hadn't been built for Linux. In any case, hold off from changing BOINC software versions at the moment. I'd like to understand exactly what Paul is talking about. ____________ Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2 | |
ID: 12857 | Rating: 0 | |
If you use the 185.xx driver you need the CUDA 2.2 toolkit. OK. I'll do it tonight and I'll post the results. oops... At least I found out how I can replace xorg.conf with the previous version from the console :) Did your editor create a backup file, xorg.conf~, that you could move back? Alternatively, just edit it again and remove the "UseEvents" config option. It defaults to "false" if it's not there at all. it's pretty easy: I deleted xorg.conf and renamed xorg.conf~ to xorg.conf. Nothing special, really :-) I understand why you'd rather not turn off desktop effects. But the thing is, they tend to expose driver bugs like nothing else can, as well as consuming GPU resources. So, effects are really helpful for finding different bugs :-) If the drivers do not help I'll do it. I'm pretty pissed off. Anyway, I consider turning these effects off as temporary, because in my understanding they should not interfere with BOINC while I wasn't aware that a generic 6.10.9 hadn't been built for Linux. In any case, hold off from changing BOINC software versions at the moment. I'd like to understand exactly what Paul is talking about. OK. ____________ | |
ID: 12860 | Rating: 0 | |
So, effects are really helpful for finding different bugs :-) If the drivers do not help I'll do it. I'm pretty pissed off. Desktop effects use hardware acceleration. The same hardware that's running the CUDA code. You're asking the same GPU to compute spinning the desktop cube at the same time as it is crunching numbers with CUDA. It's great having OpenGL hardware acceleration for desktop effects, offloading the decoding to hardware with vdpau when playing a movie, and being able to crunch with CUDA. But ask a GPU to do all 3 at the same time and it won't do any of them particularly well! I'm not saying don't do it. What I'm trying to say is that if you want to crunch, then dedicate the resource (as much as you can) to crunching. Being able to use a consumer grade graphics card for CUDA crunching as well as displaying your desktop is good to have, but there is a reason that nVidia sells dedicated CUDA hardware solutions that do not have graphics output. ____________ Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2 | |
ID: 12862 | Rating: 0 | |
First, 6.10.10 for windows is out. client/scheduler/web: add per-project preferences for whether Changeset 19198: client: fix bug in CPU prefs enforcement: Jack, As to that code segment and the backing out of the "fix", that is not something I have been referring to anywhere. There *IS* a change in that neck of the woods but it is a change in the logic and involves two flags and is Changeset 19137. Or I have completely lost the whole thread ... | |
ID: 12873 | Rating: 0 | |
As to that code segment and the backing out of the "fix" that is not something I have been referring to anywhere. Well, from the limited testing I've done - check_running_graphics_app(), which is being used for 'Use GPU while computer in use' is not having the desired effect. I need to do some more testing before reporting this. Or I have completely lost the whole thread ... No, I don't think so. Too many changes in a short time period. ____________ Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2 | |
ID: 12874 | Rating: 0 | |
I installed 185.36.128 and CUDA Toolkit 2.2, but freezes still there. | |
ID: 12875 | Rating: 0 | |
and for MacOS also, but not for linux. I can wait | |
ID: 12882 | Rating: 0 | |
I installed 185.36.128 and CUDA Toolkit 2.2, but freezes still there. Yeah, get a whiz-bang system and then you have to shut off everything to make BOINC work... BOINC is supposed to be working in the background as idle and not interfere with anything ... then again, what do I know ... :) "As deeper your head in the sand, as more unprotected your ass" I had never heard that one ... or at least I cannot recall hearing it ... Hope Paul will manage to fix this bug one day. Sadly guys I can fix nothing... and for the most part my reports are like whistling in the wind ... the good news and bad news is that it looks like my disability is on the upswing again so there will be much rejoicing as I will not be able to do as much ... of course I get the benefit of feeling like I am drunk all the time free of charge ... BTW Paul, there is a 6.10.10 version for windows only. Hopefully one day a linux one will also be available. I mentioned that some place here ... in addition to what I noted above, my attention span is also shot so that is why I am not sure I was tracking this thread well ... Anyway, based on the notes for the 6.10.10 release, aside from the UI changes (that may or may not be working) there is nothing in .10 over .7 for us here ... the main development issues in that release are on ATI cards and how they interact with the server. If you want to follow there are notes on Collatz as to what is going on (though they may have referred out to the BOINC board). In that I am not all that pressed and my set-up is working (with the exception of the FIFO bug, still unacknowledged as far as I know) 6.10.7 would be what I would recommend. The FIFO bug will only get you if you run TWO GPU projects at the same time ... and then the most annoying thing is that it kills the resource share allocations... run time will be biased towards (as near as I can tell) the project that will download the most work for your queue size ... 
In my case I am running with 0.1 extra work and it seems to respect shares, sort of ... if MW goes off the air I get several hours of Collatz work instead of only one or two tasks ... I run those off and then I get, if they are back MW work till it goes off the air again ... | |
ID: 12891 | Rating: 0 | |
I installed 185.36.128 and CUDA Toolkit 2.2, but freezes still there. Not a lot about CUDA apps? ;) The CPU component of the GPU task may be idling but the GPU is not idling when a CUDA kernel is executing on it. Desktop effects (compositing) also take resource. (Far more than they should for the sake of eye candy.) Hit the GPU with 'texture from pixmap' while executing CUDA kernels and expect the desktop to stutter. As an aside, if you read the CUDA release notes, they tell you that individual CUDA kernel launches are limited to a 5 sec run time restriction when a display is attached to the GPU. For this reason it is recommended that CUDA is run on a GPU that is NOT attached to an X display. If you choose to ignore the recommendation, I'd suggest doing everything possible not to add extra load to a GPU while it's running CUDA and connected to a display, like turning off desktop effects. ____________ Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2 | |
ID: 12894 | Rating: 0 | |
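The 5 sec limit mentioned above is commonly dealt with by keeping each kernel launch short, which also gives the display a chance to repaint between launches. Below is a rough host-side sketch of that idea only. `Chunk`, `plan_launches`, and the throughput figure are hypothetical and made up for illustration; nothing here is taken from the GPUGRID application, which may work quite differently.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// One planned launch: process `count` items starting at index `first`.
struct Chunk {
    std::uint64_t first;
    std::uint64_t count;
};

// Split total_items into launches sized so that each one, at the assumed
// throughput items_per_ms, should finish within budget_ms.
std::vector<Chunk> plan_launches(std::uint64_t total_items,
                                 std::uint64_t items_per_ms,
                                 std::uint64_t budget_ms) {
    assert(items_per_ms > 0 && budget_ms > 0);
    const std::uint64_t per_launch = items_per_ms * budget_ms;
    std::vector<Chunk> plan;
    for (std::uint64_t first = 0; first < total_items; first += per_launch) {
        const std::uint64_t left = total_items - first;
        plan.push_back({first, left < per_launch ? left : per_launch});
    }
    return plan;
}
```

A caller would loop over the returned plan and issue one launch per chunk. Picking a budget well below the watchdog limit (tens of milliseconds rather than seconds) is the part that matters for desktop responsiveness, at the cost of some launch overhead.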
I installed 185.36.128 and CUDA Toolkit 2.2, but freezes still there. Yep, never said I did... but I do know a lot about the conceptual idea of how BOINC should operate... has operated, and does operate... The CPU component of the GPU task may be idling but the GPU is not idling when a CUDA kernel is executing on it. Desktop effects (compositing) also take resource. (Far more than they should for the sake of eye candy.) Hit the GPU with 'texture from pixmap' while executing CUDA kernels and expect the desktop to stutter. This I know As an aside, if you read the CUDA release notes, they tell you that individual CUDA kernel launches are limited to a 5 sec run time restriction when a display is attached to the GPU. For this reason it is recommended that CUDA is run on a GPU that is NOT attached to an X display. If you choose to ignore the recommendation, I'd suggest doing everything possible not to add extra load to a GPU while it's running CUDA and connected to a display, like turning off desktop effects. On the other hand, though some will say it is not BOINC's fault but the project's ... there is a wide variance with the way BOINC is operating with the various projects in that for most I have no issues at all and see significant effects with one, maybe two ... my point being that as usual the UCB team is abdicating the responsibility to help the projects with the notion that this kind of thing is a project responsibility ... Maybe so, but that only means that we now have 50 teams that have to figure this stuff out on their own instead of one ... | |
ID: 12916 | Rating: 0 | |
As an aside, if you read the CUDA release notes, they tell you that individual CUDA kernel launches are limited to a 5 sec run time restriction when a display is attached to the GPU. For this reason it is recommended that CUDA is run on a GPU that is NOT attached to an X display. If you choose to ignore the recommendation, I'd suggest doing everything possible not to add extra load to a GPU while it's running CUDA and connected to a display, like turning off desktop effects. I understand what you are saying, but at the end of the day the boinc core is a glorified launcher, (the middleware, if you like), and the projects are responsible for the diverse clients. The UCB team can't really be responsible for the projects, and their client software. The one size fits all approach does not work so well, (eg. FIFO GPU scheduling), and to be frank individual projects are going to want to see optimizations that suit their own purposes rather than generalizations. Sounds like it is a no win situation to me. CUDA (and GPGPU in general) is such a young technology that how many people do know it inside out and backwards? I mean, at least the boinc core can set the process priorities for the CPU task clients. That's pretty tricky to do with the GPU side of the GPU tasks. ;) They're either on or off. Unlike the multitude of options that are built into eg. the Linux kernel CPU scheduler, you just don't have that functionality available on the GPU. (And the jury is still out on whether the 'GPU in use' config property and underlying code is actually doing what the developer expects that it is doing. I'm not 100% convinced it is but was too busy today to spend more time testing this.) Anyway, I hope I made the point I was trying to make. If people would change their expectations, and think of running CUDA on a GPU that's doing something else, (like driving a display via X) as a less than optimal way of doing things, that would go some way towards it. 
(If you want optimal on a consumer grade card, forget about whether desktop effects are switched on or off and don't use it to drive a display at all!) Giving it a chance, by not using other hardware acceleration functionality (desktop effects) at the same time as using CUDA computing capability, seems obvious to me. Lobbing bricks in the general direction of nVidia drivers and BOINC because the desktop stutters, when they have no understanding of how their hardware actually works or what is a reasonable expectation, just shows a lack of education. (I expect to get bashed for that last sentence, and I'm not trying to be insulting, but it does seem that some people's expectations are set way beyond what their hardware is actually capable of. YMMV.) ____________ Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2 | |
ID: 12921 | Rating: 0 | |
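On the CPU side, the priority knob mentioned above is available through the standard POSIX calls `setpriority` and `getpriority`. A minimal sketch, assuming a Linux or other POSIX system; `lower_priority` is a hypothetical helper written for this note and is not BOINC's actual code.

```cpp
#include <cassert>
#include <cerrno>
#include <sys/resource.h>

// Lower the CPU priority of the calling process to the given nice value.
// Returns the nice value in effect afterwards, or -1 with errno set on failure.
int lower_priority(int nice_value) {
    assert(nice_value >= 0 && nice_value <= 19);  // this helper only lowers priority
    if (setpriority(PRIO_PROCESS, 0, nice_value) != 0) {
        return -1;
    }
    errno = 0;  // getpriority can legitimately return -1, so check errno too
    const int nice_now = getpriority(PRIO_PROCESS, 0);
    if (nice_now == -1 && errno != 0) {
        return -1;
    }
    return nice_now;
}
```

Calling `lower_priority(19)` at the start of a CPU worker asks the kernel scheduler to run it only when nothing else wants the CPU. There is no comparable call for work already queued on the GPU, which is the asymmetry the post above describes.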
On my system I have noticed that when running 1 or 2 apps at high priority in Windows Task Manager, I get a freeze of 1-2 secs every so often, but I live with it... crunch time is decreased by 5-10% depending on the app and WU. | |
ID: 12922 | Rating: 0 | |
OK, so it's clear that there will be no quick fix for "use GPU..." in the nearest future, right? | |
ID: 12927 | Rating: 0 | |
@Jack, I sure wish I had enough left to answer you ... I understand your point, but, as middleware BOINC has more responsibilities when more than one project is affected or needs a feature. Then, that is exactly where middleware is supposed to step up to avoid reinventing the wheel ... | |
ID: 12930 | Rating: 0 | |
OK, so it's clear that there will be no quick fix for "use GPU..." in the nearest future, right? I'm kind of limited in what I can and cannot say, having signed a rather draconian NDA. We use CUDA commercially in a software product. We actually outsource our software development now, so I've asked someone who I consider to be a CUDA expert to take a look at what that code currently does and if there is a better way of achieving the objective. When I get a response I'll pass it to UCB. In fact, this is the 2nd day I'm surviving w/o desktop effects and you know what? I'm still alive :-) sure it's less functional, but there are NO freezes, which made me so pissed off. So, thx JackOfAll and Paul for your help. Glad you can live without the 'bling' for the moment. May I ask a couple of questions while such people are around? in Q4 this year nvidia will present GT300 cards. so, here are actually two questions: Details are still a little thin on the ground and depending on who you believe we might not even see the new architecture cards until next year. Right now, IMHO, is a bad time to be buying new nVidia cards. I'd advocate holding off for a couple of months. (Especially with the high end cards, > GTX275.) - will the BOINC app for GPUGRID work on SLI? I mean I'd like to get 2 cards and I'm not sure if I should use the SLI bridge or not (as in Folding, where I must NOT connect the cards with the bridge); Paul answered this above. The 190.xx driver series and CUDA 2.3 allow you to access individual GPUs (for CUDA purposes) whilst the cards are in SLI mode. Not tried it personally, but I know it does work. ____________ Crunching on Linux: Fedora 11 x86_64 / nVidia 185.18.36 driver / CUDA 2.2 | |
ID: 12935 | Rating: 0 | |
I've seen something similar to the display freeze problem under the 64-bit Windows versions of BOINC (at least 6.10.3), but the freeze is permanent enough that I've been unable to check if all the other software freezes as well. Seems to occur only when running both a GPUGRID workunit and a CPU workunit from some other project, and only if the CPU workunit has graphics that are big enough to fill the screen. I'm not familiar with the terms used to describe avoiding any use of the screensaver that comes with recent BOINC versions, if you want to do this under Linux versions, but I'd suggest trying this if you know how. | |
ID: 13035 | Rating: 0 | |