Message boards : Graphics cards (GPUs) : Beware: BOINC 7.0.45 and later
Author | Message |
---|---|
I was wondering why my GPU WUs take longer than before, then I found it. | |
ID: 28533 | Rating: 0 | rate: / Reply Quote | |
Maybe open a ticket on the BOINC bugpage? ____________ Team Belgium | |
ID: 28534 | Rating: 0 | rate: / Reply Quote | |
Oh believe me we requested it. Unfortunately Dr A didn't want to do it properly and have a separate preference for GPU. The more people that ask for it the more likely it will get added. | |
ID: 28539 | Rating: 0 | rate: / Reply Quote | |
I think I read those requests for a GPU throttle, and I believe the response was that it's not possible through BOINC, which to me translates to "it's possible but it's so messy we don't want to do it". We need to keep in mind that BOINC needs to run on three different OSs, so what seems trivial at first isn't always so. | |
ID: 28540 | Rating: 0 | rate: / Reply Quote | |
The simple and blatantly obvious answer to 'can't separate CPU and GPU crunching in Boinc' is two different programs: one for CPU and one for GPU. | |
ID: 28542 | Rating: 0 | rate: / Reply Quote | |
Two or even more instances of BOINC are possible on Linux, and in the T4T forums Crystal Pellet claims he is able to do so on Windows as well, though I've never been able to replicate his results, probably because I didn't try very hard. | |
ID: 28549 | Rating: 0 | rate: / Reply Quote | |
Many GPU projects are now telling Boinc that they need a full CPU. This avoids a lot of unwanted issues that had previously arisen, but it requires cross-project participation and adherence to 'the rules' (there aren't any). The situation is improving but some projects still operate in a way that stresses other projects. | |
ID: 28555 | Rating: 0 | rate: / Reply Quote | |
This time-based throttling is very ineffective anyway. It's like driving your car at 6000 rpm, then seeing that the load on the engine is too high... and instead of running it at 5500 or 5000 rpm, you still run at 6000 rpm most of the time and just lift off the throttle for a moment every few seconds. | |
ID: 28556 | Rating: 0 | rate: / Reply Quote | |
While the time-based throttling might not be the best method in terms of scheduling and task processing, it's universal for any kind of CPU. Moreover, even if the CPU is idle 1/10 of the time, it's able to cool down during this short period. cTDP is only available on a few specific CPU models, and there are other mechanisms that limit CPU power (PL1, PL2). | |
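(For what it's worth, the time-based throttling being discussed is just a duty cycle: run flat out for a slice of each period, then suspend for the rest. A rough sketch of the idea in Python, purely illustrative and not BOINC's actual code, showing why the instantaneous load still swings between 100% and 0% even though the average matches the limit:)

```python
import time

def throttled_run(work_step, cpu_limit_pct=60, period_s=1.0, periods=10):
    """Crude time-based (duty-cycle) throttling: the average load matches
    cpu_limit_pct, but within each period the load is either 100% or 0%."""
    busy = period_s * cpu_limit_pct / 100.0
    for _ in range(periods):
        deadline = time.monotonic() + busy
        while time.monotonic() < deadline:
            work_step()                 # full speed: the core runs flat out
        time.sleep(period_s - busy)     # idle: load (and heat output) drops

# Example "work": a bit of arithmetic to keep one core busy.
throttled_run(lambda: sum(range(1000)), cpu_limit_pct=60)
```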
ID: 28568 | Rating: 0 | rate: / Reply Quote | |
Many GPU projects are now telling Boinc that they need a full CPU. This avoids a lot of unwanted issues that had previously arisen, but it requires cross-project participation and adherence to 'the rules' (there aren't any). The situation is improving but some projects still operate in a way that stresses other projects. I have a somewhat related question. I am running both a GTX 560 and a GTX 650 Ti on the same motherboard, with only long queue jobs selected. The GTX 560 gets along with only 5% CPU utilization (E8400 dual core at 3.0 GHz), and allows WCG jobs to run on that core at the same time. But the GTX 650 Ti reserves a whole core (50% CPU), and does not allow other projects to run on that core. Does anyone know the reason why? (I am running BOINC 7.0.52 x64 on Win7 64-bit, Nvidia 310.90 drivers.) | |
ID: 28571 | Rating: 0 | rate: / Reply Quote | |
Are you allowing BOINC to use all CPU time? | |
ID: 28573 | Rating: 0 | rate: / Reply Quote | |
Different GPU architecture. On the GTX600 cards more CPU usage is required than with the previous generation of GPU. You would need to ask the researchers exactly what they are running on the CPU. | |
ID: 28574 | Rating: 0 | rate: / Reply Quote | |
I have a GTX 650 Ti too and the same BOINC version. Long tasks (NOELIA) reserve only 0.594 CPUs. So I'm currently running WCG CPU tasks on all CPU cores + NOELIA on the GPU. Maybe you have an app_config.xml file that isn't set up properly? | |
ID: 28575 | Rating: 0 | rate: / Reply Quote | |
While the time-based throttling might not be the best method in terms of scheduling and task processing, it's universal for any kind of CPU. True but it still sucks so bad even the BOINC devs recommend using TThrottle instead. It works at the OS level rather than the app level. It's the kind of throttling the BOINC devs would like to do but it requires different code for each OS so they decided to not do it and give us the crappy app level throttling instead. If the president of Ford Motor Company said "our cars suck, don't buy them, buy a Chevy instead" would you then buy a Ford? Moreover, even if the CPU is idle 1/10 of time, it's able to cool down during this short period. Yeah but then it heats up again when it's not idle and you get a continuous cycling between hot and cold which induces cyclic expansion and contraction which is a known cause of hardware failure. TThrottle gives much finer grained on/off or idle/run cycles which yields a far more even temperature and virtually eliminates expansion/contraction. And it will allow you to run any version of BOINC and still have CPU throttling independent of GPU throttling. Unfortunately it only runs on Windows but that seems to be your OS anyway. ____________ BOINC <<--- credit whores, pedants, alien hunters | |
ID: 28576 | Rating: 0 | rate: / Reply Quote | |
Are you allowing BOINC to use all CPU time? Yes, 100% on both cores. | |
ID: 28577 | Rating: 0 | rate: / Reply Quote | |
I have a GTX 650 Ti too and the same BOINC version. Long tasks (NOELIA) reserve only 0.594 CPU. So I'm currently running WCG CPU tasks on all CPU cores + NOELIA on GPU. I have no app_config.xml, but am using a cc_config.xml to get both cards to run:
<cc_config>
  <options>
    <use_all_gpus>1</use_all_gpus>
  </options>
</cc_config>
That might have something to do with it. Whether it is a bug or feature I have no idea. | |
ID: 28578 | Rating: 0 | rate: / Reply Quote | |
Different GPU architecture. On the GTX600 cards more CPU usage is required than with the previous generation of GPU. You would need to ask the researchers exactly what they are running on the CPU. Yes, I have found that on my CPU, leaving it free is a good idea for various reasons. Even my Video LAN player does not like it when I use both cores. The deck may be stacked differently when Haswell comes along. | |
ID: 28579 | Rating: 0 | rate: / Reply Quote | |
Sure, cTDP is not yet the solution. However, it would only be a matter of marketing (i.e. the will to implement it); the technology is already there and is in the chips anyway. | |
ID: 28580 | Rating: 0 | rate: / Reply Quote | |
Edit@Jim: if I remember correctly GPU-Grid decided that the 600 series GPUs were becoming so fast that not reserving an entire core would slow the GPU down too much. Thanks. I vaguely remember seeing something along those lines too, but couldn't find it in a search. It will all be irrelevant when Haswell comes along and provides all the cores I need anyway. | |
ID: 28581 | Rating: 0 | rate: / Reply Quote | |
PL1/PL2 are the long- and short-duration power limits that constrain turbo boost in Intel CPUs. They work dynamically by limiting clocks based on an exponentially weighted moving average (EWMA) of the measured actual CPU power (IMON). Additionally, there's an "On-demand clock modulation" feature in most CPUs, which provides a kind of static throttling (unlike the dynamic power throttling done via the PLs).
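(Side note: on Linux the currently programmed PL1/PL2 values can be read through the powercap sysfs interface. A minimal sketch, assuming the stock intel_rapl driver and the usual /sys/class/powercap/intel-rapl:0 package domain; paths may differ on other kernels:)

```python
from pathlib import Path

# Package power domain of the first socket, as exposed by the intel_rapl driver.
rapl = Path("/sys/class/powercap/intel-rapl:0")

# Constraint 0 is normally the long-term limit (PL1), constraint 1 the short-term one (PL2).
for n in (0, 1):
    name = (rapl / f"constraint_{n}_name").read_text().strip()
    limit_w = int((rapl / f"constraint_{n}_power_limit_uw").read_text()) / 1e6
    window_s = int((rapl / f"constraint_{n}_time_window_us").read_text()) / 1e6
    print(f"{name}: {limit_w:.1f} W over a {window_s:.3f} s window")
```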
Then why does it reserve "0.594 CPUs + 1 NVIDIA GPU" for me on the 650 Ti? | |
ID: 28582 | Rating: 0 | rate: / Reply Quote | |
Mumak, | |
ID: 28584 | Rating: 0 | rate: / Reply Quote | |
Mumak, That could be the reason... Maybe those tasks depend on CPU type (or certain features) and then the system decides how much CPU resources are required. My CPU is an i5-750 (Lynnfield, no HT). But I don't have much experience with GPUGrid yet - I joined only 5 days ago... | |
ID: 28585 | Rating: 0 | rate: / Reply Quote | |
Many GPU projects are now telling Boinc that they need a full CPU. This avoids a lot of unwanted issues that had previously arisen, but it requires cross-project participation and adherence to 'the rules' (there aren't any). The situation is improving but some projects still operate in a way that stresses other projects.

I've been saying it for a long time... we crunchers hold more power than all of the projects and David Anderson combined, but we don't exercise it. All we have to do is open discussions, hear the concerns, establish some rules and then ostracize any project that doesn't wanna play nice. It's our hardware, we pay the power bills, we spend time installing, configuring and fixing stuff when the scheduler throws a wobbly. We should have a major say in how things work and what projects are allowed to do, and if they don't like it they can go get their CPU cycles from someone else. I think David would support that along with 95% of the projects. The 5% that won't will when they realize the consequences. I think a lot of people are starting to realize the train is off the rails and something needs to be done.

The difficult part will be getting crunchers to change their attitude from the prevailing "oh I am just so privileged to have you scientist gods use and abuse my hardware, power and time any way you want" to something more realistic that recognizes the needs of all the projects and all the crunchers. And in the end maybe it doesn't matter if they change. The power belongs to those who take it, and rightfully so. We attempt to establish a participatory democracy, we nurture that and grow it always, but if it doesn't happen then a benevolent cadre takes power and wields it as it sees fit in consultation with the projects. Get the right players in the cadre and there will be no arm that cannot be twisted.

CPU project adherence to good principles of crunching is an issue, but the demands from different GPU projects are a different type of problem. There are very different GPU project system requirements (high vs low CPU requirements, high or low PCIe requirements, system memory bandwidth, GPU GDDR usage...). The impacts of these on each other and on CPU projects aren't something that Boinc can readily ascertain, never mind accommodate. You would need Boinc to be kitted out with a tool that can read these things (something I asked for years ago).

Let's get the discussions going and let all the players explain what they need and what their concerns are. There can be a consensus and agreement upon what needs to be done, rules established, and procedures for punting rogues off the playing field. I think if we give David Anderson that then he will respond with appropriate code and kit the client out as you describe. The situation now is mayhem... how can he code for that? It needs rules as well as code. And if he doesn't rise to the task then we fork a branch and code it ourselves. Easy? Hell no, it'll take a lot of time, effort and dialogue. But it needs doing, and soon. We can have anything we want, including top-notch built-in throttling for CPU and GPU independently. The only limit is our imagination, time and the willingness to maintain it after it's coded. Not easy, but we are a big community chock full of talent.

So if you run several CPU and GPU projects from one instance of Boinc, you're going to land in all sorts of trouble.
For example, a GPU project such as POEM starts running, is set up to run several tasks, suddenly starts to use all the CPU cores, does some disk reading and writing (just when the CPU projects want to do this) and pop, the system or Boinc crashes. Run one GPU project and you can watch video, play games... Run another GPU project and you struggle to even web browse.

I don't have the technical expertise to supply the answers/fixes, but I suspect many others here do. And if there is nobody then we recruit the talent we need, or we form a study group to research and find the answers ourselves. We call in AMD, nVIDIA and Intel if we have to. The other option is to do nothing and continue to let the train plough up the dirt beside the rails it's supposed to be on.

An easy way to have two instances configured in Linux is to tell one to only run GPU tasks and set it to use a specific number of CPU cores. Then tell the other client to use the remaining CPU cores. This way the scheduler doesn't do stupid things like stop running GPU tasks so that badly packaged CPU tasks can run. If your GPU is only attached to one project then you're not going to experience 'not highest priority' issues. If you attach to more projects (say with an ATI card) then increase the cache a bit, to allow Boinc to find its feet. With a very low cache Boinc has 'bar-stool moments' (when someone who appears OK tries to walk after sitting on a bar-stool too long). Too high a cache and it's all over the place: disk I/O soars, RAM usage rises and some projects don't get a look in.

That would be an option too, except that 90% or more of the crunchers are lemmings firmly in the control of the Gates Crime Family and won't have anything to do with Linux, won't even discuss it. As you are well aware they won't even run Linux in a VM unless it's at T4T where they don't have to stick their little paws in it.

I guess you could even set a GPU app to be an exclusive application on one instance, to stop the GPU being used in another; use different Boinc instances for different GPU projects and thus exclude GPU crunching when one GPU app is running and you start a video, but allow the other GPU app to run.

Very clever. I see you've been giving this all a lot of thought. I might tinker with some of those ideas myself soon. Got my GTX 570 and now my AMD 7970 to experiment with, and I'm putting together an order for more.

Anyway, you have to know your projects and the demands they make on your system. I struggle to work these out, most people don't know much, and Boinc hasn't a notion.

That's good stuff, I like it! That chart should be developed further and published in a prominent place like the official BOINC wiki. I can do that. Actually anybody can do it, but I happen to have the account and password already. Want it in the wiki? I'll write it, you proofread it, let me know when it needs updating. Tell me what to test and how and I'll help with that too. ____________ BOINC <<--- credit whores, pedants, alien hunters | |
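(For anyone who wants to try the two-instance setup described above, here is a minimal sketch. The option names are standard BOINC cc_config.xml options as I understand them; the core counts and the application name are only placeholders for your own setup, and each client has to be started from its own data directory, on recent clients with the --allow_multiple_clients switch.)

```xml
<!-- cc_config.xml for the GPU-only instance: pretend there is 1 CPU core,
     use every GPU, and (as an example) suspend GPU work while a video player runs -->
<cc_config>
  <options>
    <ncpus>1</ncpus>
    <use_all_gpus>1</use_all_gpus>
    <exclusive_gpu_app>vlc</exclusive_gpu_app>   <!-- placeholder app name -->
  </options>
</cc_config>
```

```xml
<!-- cc_config.xml for the CPU-only instance: the remaining cores, no GPUs -->
<cc_config>
  <options>
    <ncpus>5</ncpus>
    <no_gpus>1</no_gpus>
  </options>
</cc_config>
```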
ID: 28587 | Rating: 0 | rate: / Reply Quote | |
Then why does it reserve "0.594 CPUs + 1 NVIDIA GPU" for me on the 650 Ti? How much CPU is it actually using? I bet one full logical core, as is the case on my i7 with a GTX 660 Ti. BOINC says "0.69 CPU + 1 GPU" but those are meaningless numbers. It's enough to make the BOINC scheduler not assign another task to this core. Not sure if it would be better if GPU-Grid actually set this to 1. MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 28599 | Rating: 0 | rate: / Reply Quote | |
Hard to say how much CPU is really utilized, because I'm running WCG CPU tasks there too. But the CPU has 4 cores (no HT) and WCG runs on all CPU cores + NOELIA on the GPU. Then why does it reserve "0.594 CPUs + 1 NVIDIA GPU" for me on the 650 Ti? | |
ID: 28603 | Rating: 0 | rate: / Reply Quote | |
Dagorath, I think we have hijacked this thread for long enough. Certainly warrants another thread, if not site. | |
ID: 28604 | Rating: 0 | rate: / Reply Quote | |
With app_info.xml or app_config.xml you can manually specify the CPU/GPU count per task. app_info is pretty confusing, but app_config seems to be much simpler. I don't use it myself; I've just seen it being discussed over in the XtremeSystems WCG section. | |
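(For reference, app_config.xml is just a small file dropped into the project's directory under projects/. A sketch of what it might look like here; the layout follows BOINC's documented app_config format, but the app name and the numbers are only illustrative, not GPUGRID's recommended values:)

```xml
<!-- projects/www.gpugrid.net/app_config.xml (illustrative values only) -->
<app_config>
  <app>
    <name>acemdlong</name>           <!-- check the real app name in client_state.xml -->
    <gpu_versions>
      <gpu_usage>1.0</gpu_usage>     <!-- run one task per GPU -->
      <cpu_usage>1.0</cpu_usage>     <!-- reserve a full CPU core per task -->
    </gpu_versions>
  </app>
</app_config>
```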
ID: 28609 | Rating: 0 | rate: / Reply Quote | |
So I suspended CPU WCG tasks and let only ACEMD_LONG run. Now this is interesting - the task utilizes 2 cores: 1 for ~50% and another one ~30% (up/down). | |
ID: 28610 | Rating: 0 | rate: / Reply Quote | |
So I suspended CPU WCG tasks and let only ACEMD_LONG run. Now this is interesting - the task utilizes 2 cores: 1 for ~50% and another one ~30% (up/down). Everybody talks as if apps start running on a certain core and never change to another core but that isn't the way it actually works. The task scheduler in the OS shifts apps/tasks around from one core to another according to a very complicated scheduling algorithm designed to use resources in an optimal way to keep the work flowing as fast/efficiently as possible. Not talking about the BOINC scheduler, talking about the OS's task scheduler. BOINC can specify to the OS that an app should run on 2 cores, 1 core, 5 cores or whatever but it cannot say which cores. In a pre-emptive multi-tasking OS it is possible that for brief periods of time an app, any app not just BOINC apps, actually isn't executing on any cores at all. ____________ BOINC <<--- credit whores, pedants, alien hunters | |
ID: 28611 | Rating: 0 | rate: / Reply Quote | |
There is a way to assign a software thread to a particular CPU thread using the CPU affinity mask (SetThreadAffinityMask, SetThreadIdealProcessor, SetThreadIdealProcessorEx, SetThreadGroupAffinity), so I disagree that BOINC is unable to do this. The question is whether it actually does so, but that's easy to determine using Task Manager...
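(Purely to illustrate what the API named here looks like in use, a small ctypes sketch that pins the calling thread to CPU 0 via SetThreadAffinityMask. This is not something BOINC is known to do, just a Windows-only demonstration of the call:)

```python
import ctypes

kernel32 = ctypes.windll.kernel32
kernel32.GetCurrentThread.restype = ctypes.c_void_p
kernel32.SetThreadAffinityMask.restype = ctypes.c_size_t
kernel32.SetThreadAffinityMask.argtypes = [ctypes.c_void_p, ctypes.c_size_t]

thread = kernel32.GetCurrentThread()                     # pseudo-handle, no close needed
previous = kernel32.SetThreadAffinityMask(thread, 0x1)   # bit mask: CPU 0 only
if previous == 0:
    raise ctypes.WinError()                              # the call failed
print(f"previous affinity mask: {previous:#x}")
```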
| |
ID: 28612 | Rating: 0 | rate: / Reply Quote | |
Fixing affinity usually doesn't yield any benefits for BOINC crunching. The reason is that the OS scheduler works on a time scale of milliseconds, which is an eternity from the viewpoint of a CPU (a million clock cycles at 1 GHz). There are corner cases where manually fixed core affinity did help considerably, but this mainly applied to ill-balanced older multi-CPU hardware. | |
ID: 28618 | Rating: 0 | rate: / Reply Quote | |
There's an opportunity to assign a software thread to a particular CPU thread using the CPU Affinity Mask (SetThreadAffinityMask, SetThreadIdealProcessor, SetThreadIdealProcessorEx, SetThreadGroupAffinity), so I disagree that BOINC is unable to do this. The question is whether it does that, but that's easy to determine using Task Manager... Thank you. I stand corrected. ____________ BOINC <<--- credit whores, pedants, alien hunters | |
ID: 28620 | Rating: 0 | rate: / Reply Quote | |
I still don't understand where the "0.594 CPUs" number comes from.. Any idea? | |
ID: 28621 | Rating: 0 | rate: / Reply Quote | |
I still don't understand where the "0.594 CPUs" number comes from.. Any idea? I've always thought the number referred to how much of 1 core's time is needed to drive the GPU? Also, what does the "Maximum CPU % for graphics" setting in GPUGRID preferences exactly mean? Many projects have that setting and I think it refers to how much CPU time should be allocated to their screensaver. ____________ BOINC <<--- credit whores, pedants, alien hunters | |
ID: 28624 | Rating: 0 | rate: / Reply Quote | |
I still don't understand where the "0.594 CPUs" number comes from.. Any idea? It's automatically generated, I think by the GPU-Grid app by some algorithm, and passed to BOINC to display it. MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 28641 | Rating: 0 | rate: / Reply Quote | |
I still don't understand where the "0.594 CPUs" number comes from.. Any idea? I think that the number "0.594" means that 1 GPU also uses a bit more than 50% of a CPU to crunch correctly. For example, my own rig uses about the same amount of CPU power. I am only crunching the 6.17 long-run (8-12 hour) WU versions. My GTX 690 is crunching 2 WUs at once and uses 1 CPU in total for that. Here is a copy of the BoincTasks output:
I am using the program BoincTasks to get an overview of all my crunching, and to keep the temperature of the CPU/GPU in hand I am also using the add-on program TThrottle. BoincTasks - http://www.efmer.eu/boinc/boinc_tasks/index.html TThrottle - http://www.efmer.eu/boinc/index.html | |
ID: 28701 | Rating: 0 | rate: / Reply Quote | |
I still don't understand where the "0.594 CPUs" number comes from.. Any idea? It means that amount of CPU resource is allocated (on a per core basis), or reserved if you like. It's probably actually using much less. So on a 6 core machine that particular BOINC process is reserving .592 of 1 CPU core and 5.418 cores are still available for other processes. In practical terms you could still have 6 BOINC CPU processes running in addition to the GPU process. If you have two GPUs with processes claiming .592 CPU then only 5 BOINC CPU processes will be allowed to run as 2 x .592 > 1. | |
ID: 28704 | Rating: 0 | rate: / Reply Quote | |
Unfortunately the reality looks different. When I turn off all CPU tasks, the GPUGrid (Noelia) task utilized almost a full core (total CPU load on 4 cores was 25%, which means an entire core is under full load). So I believe that allocation number (0.594 CPU) is not correct and these tasks should reserve 1 CPU (thread/core). | |
ID: 28715 | Rating: 0 | rate: / Reply Quote | |
This is a strange phenomenon with this project. Older GPUs use a very low amount of CPU time while the newer GPUs such as your 650 TI use a lot. I can see it with my various machines. Skgiven also has this going on with his GPUs: the 470 uses very little CPU while the 660 TI uses a great deal. In fact, looking through the database it looks like only the 6xx series GPUs exhibit this high CPU usage. Maybe he can give us an idea why this is happening. | |
ID: 28721 | Rating: 0 | rate: / Reply Quote | |
As I said above: "GPU-Grid decided that the 600 series GPUs were becoming so fast that not reserving an entire core would slow the GPU down too much." | |
ID: 28735 | Rating: 0 | rate: / Reply Quote | |
It's also about stability; too many people were trying to run high-end cards while using every last percentage of their CPUs. The result was system and Boinc instability, task failures and more problems for the projects to deal with. Failures aside, this CPU over-commitment also resulted in GPU performance reductions and in some cases a decline in CPU project performance. | |
ID: 28744 | Rating: 0 | rate: / Reply Quote | |
As I said above: "GPU-Grid decided that the 600 series GPUs were becoming so fast that not reserving an entire core would slow the GPU down too much." I missed that. MrS, thanks for the explanation. Question: why does it say 0.481 CPUs but really reserve 1 CPU core? How does that work? And "I don't think it's a good choice for GK107 based cards" I would agree. I would much rather manage and reserve the core myself if it helps the speed. | |
ID: 28745 | Rating: 0 | rate: / Reply Quote | |
A few posts above: "GPU-Grid decided that the 600 series GPUs were becoming so fast that not reserving an entire core would slow the GPU down too much."
The number shown is probably based on some estimate used for the older cards with SWAN_SYNC and not yet updated for Keplers. Reserving a CPU core yourself is fine, but not everyone running a fast GPU (in need of this) will know he/she should do this. I guess that's where the idea came from to do it automatically. MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 28747 | Rating: 0 | rate: / Reply Quote | |
Thanks guys, now it makes sense to me :-) | |
ID: 28750 | Rating: 0 | rate: / Reply Quote | |