Message boards : Number crunching : GPUGRID active users are falling dramatically!
Author | Message |
---|---|
Why are active users falling on this project? | |
ID: 47358 | Rating: 0 | rate: / Reply Quote | |
Maybe the occasional very long work units are discouraging some people? I don't think the bonuses are that important, but it is a psychological thing. Also, the reduced output of the app under Windows may be a problem, though I am mainly on Linux now and am glad to have the work. | |
ID: 47359 | Rating: 0 | rate: / Reply Quote | |
It is called summer in the northern hemisphere. High temps = air conditioning, and air conditioning doesn't work well with high-end computers crunching for the better future... | |
ID: 47361 | Rating: 0 | rate: / Reply Quote | |
It is called summer in the northern hemisphere. High temps = air conditioning, and air conditioning doesn't work well with high-end computers crunching for the better future... I have had ambient temperatures up to 32 °C in summer and still run computers without air conditioning. Are you trying to tell me that's responsible for a 33% drop in users that started in April? I would like to believe you but I can't just yet. | |
ID: 47362 | Rating: 0 | rate: / Reply Quote | |
What happened between 21st & 22nd of April? (actually a month before this date) | |
ID: 47363 | Rating: 0 | rate: / Reply Quote | |
Why are active users falling on this project? Really surprised? I am not! It all started around mid-April, when all of a sudden the crunching software became invalid and stopped working (which should never happen that way). So new software needed to be put together, and since this had to be done in a hurry, it was rather buggy. Here are just a few examples of what one could read in various threads of the forum, and what I myself have been experiencing since: - the new software is around 30% slower :-( - GPU overclocking is much less possible than before (at least for Maxwell cards; no idea how it is with Pascals). - tasks stop for unknown reasons, and only continue if they are suspended and resumed manually (so if a cruncher does not notice such a stop for, say, 10 hours, the system runs idle for 10 hours). - the new software does not work well with BOINC: when pushing the "Suspend" button in the BOINC Manager (either in "Tasks" or in "Projects"), it takes several minutes until the task reacts and stops. In the recent past, GPUGRID tasks have become even more GPU-straining and long-lasting; for example "ADRIA_FOLDGREED10_crystal_ss_contacts_100_ubiquitin" (also the _50_ubiquitin): on a GTX750Ti (with some involuntary stops in between, as mentioned above), it can take 3 days or more until this task is finished. That's why I had suggested that on the Project Preferences page, besides "short runs" (which virtually don't exist any more) and "long runs", a third category like "extra long runs" (or whatever wording suits) be implemented, so that the many GTX750Ti crunchers can exclude such long tasks from download. And here we are at the next problem: back at GPUGRID, no one really seems to care which problems the crunchers have and which suggestions they are presenting. 
Reading a lot in the forum, I can think of so many other people writing about all kinds of problems, making useful suggestions and also asking questions now and then. However: NO REACTION AT ALL! So, coming back to the beginning of this posting: I am NOT surprised that people are turning away from this project. Sorry to say this :-( By chance, in the forum of another BOINC project I participate in, yesterday I read a statement from the project people there: "Of course having happy volunteers is very important for the health of a project; so it is something that should be addressed ..." Why is this different with GPUGRID? | |
ID: 47364 | Rating: 0 | rate: / Reply Quote | |
I have been waiting a long time to get a task. | |
ID: 47367 | Rating: 0 | rate: / Reply Quote | |
I have been waiting a long time to get a task. I guess this is kind of not quite the right thread to post your problem. A lot of statements and opinions about the problem of not getting tasks are contained in this thread here: http://gpugrid.net/forum_thread.php?id=4574 you may look this up, perhaps you get an idea what's wrong. | |
ID: 47368 | Rating: 0 | rate: / Reply Quote | |
... I am curious how much longer it will take the GPUGRID people to acknowledge that the current software is buggy and needs to be repaired! More or less every day, I get annoyed by these bugs cited above :-( | |
ID: 47404 | Rating: 0 | rate: / Reply Quote | |
Fixing bugs with BOINC is relatively pointless from our perspective (and time-intensive). We are considering rather other options like moving out of it, but don't ask when or how as it's more an idea than a scheduled plan. | |
ID: 47407 | Rating: 0 | rate: / Reply Quote | |
Thanks for replying. | |
ID: 47411 | Rating: 0 | rate: / Reply Quote | |
Fixing bugs with BOINC is relatively pointless from our perspective (and time-intensive). We are considering rather other options like moving out of it, but don't ask when or how as it's more an idea than a scheduled plan. Sorry, Stefan, for contradicting you. I don't think that any of the deficits in the crunching software 9.18 have to do with BOINC. So blaming BOINC, at least the way I see it, is simply wrong. As said before, this software was obviously compiled in a hurry, overnight so to speak, without much (thorough) testing. None of these bugs existed with the previous software. The content of the second paragraph of your posting makes me worry even more. Again, as I said in another posting, a project of the magnitude of GPUGRID definitely needs a certain amount of infrastructure expertise. Just having the scientists there is not enough. If, for example, no one at GPUGRID is able to reply to my posting http://gpugrid.net/forum_thread.php?id=4561&nowrap=true#47204 from a month ago, then something needs to be improved. Definitely so. Otherwise, GPUGRID really risks losing more and more crunchers. Which would be too bad - I personally feel that GPUGRID is a fantastic project! And that's why I am participating :-) So, please put your heads together to come up with a solution! | |
ID: 47433 | Rating: 0 | rate: / Reply Quote | |
I don't think that any of the deficits in the crunching software 9.18 have to do with BOINC. So blaming BOINC, at least the way I see it, is simply wrong. FWIW, I have always advocated optimizing the apps for the latest hardware, since I think you get more bang for the crunching buck that way. If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here. I usually have fairly new cards, and you will get a lot of complaints from people with older cards that they are being abandoned, or that they are being "forced" to buy new cards (I love that one). So you have a choice. Make the one that is best for the science. | |
ID: 47434 | Rating: 0 | rate: / Reply Quote | |
If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here. One thing that's interesting: the GTX750Ti in the host with Windows10 now shows problems with the new software. The GTX750Ti in the host with WindowsXP does NOT show any problems - although this software is also new, it is not the same as for Windows10. I guess there won't be many crunchers using WindowsXP; so many of the crunchers using their GTX750Ti with Windows10 might have problems now. And I also guess that there are many crunchers with a GTX750Ti. What can be done: throw the GTX750Ti's away? :-( Last year, I bought two GTX780Ti just for GPUGRID crunching, Euro 700 each. So far, they work perfectly with WindowsXP. When GPUGRID support for XP ends in April of next year, I'll need to change to Windows10. And then all the problems will begin. However, I don't think that I will exchange them for two new Pascals. Paying some 1400 Euros every two years just to have the latest generation of cards in order to have GPUGRID running smoothly? | |
ID: 47435 | Rating: 0 | rate: / Reply Quote | |
the GTX750Ti in the host with Windows10 now shows problems with the new software. I guess that says something about WDDM, but I don't know what. It would be fun to trace it down, but GPUGrid just does not have the staff it seems. That is why they have to avoid unnecessary risks if they can. It is not a perfect solution, but seems to be the best under the circumstances. I was planning to wait for Volta, but that will be a long time, so I migrated out of the lower-end cards into a few Pascals for higher efficiency in the warmer months, though it is still a mix. The prices are much more reasonable in the U.S., especially on sales. But everything has gone through the roof now, apparently with high demand for AMD cards even spilling over into Nvidia. | |
ID: 47436 | Rating: 0 | rate: / Reply Quote | |
If it leaves the older cards behind, so be it. You avoid precisely the type of problems that we are seeing here. On other projects some people have gone back to older drivers for their older gpu's and that brings back the gpu's under Win10 again. In short try older drivers and see if your Win10 machine can crunch again, it may just work for you too. | |
ID: 47470 | Rating: 0 | rate: / Reply Quote | |
For what it is worth: no issues with Linux / CentOS on my 980Ti,1080 and 1080Ti ... come over to the bright side of life ;-) | |
ID: 47471 | Rating: 0 | rate: / Reply Quote | |
On other projects some people have gone back to older drivers for their older gpu's and that brings back the gpu's under Win10 again. In short try older drivers and see if your Win10 machine can crunch again, it may just work for you too. The new crunching software acemd 918.80 only works with the latest drivers. My two Windows 10 machines had run with 376.53 before, and with the new crunching software I had to update to 381.65 to get GPUGRID to run. Furthermore, Matt pointed out clearly that the new software requires the newest drivers. In other words: no way to install older drivers to get the problems solved :-( | |
ID: 47473 | Rating: 0 | rate: / Reply Quote | |
I stopped your project; for my part, far too many units end up in error ... especially after 12 hours of calculations (Titan 2013). Is it not possible to still get points for the calculated time? Moreover, I have no problems with Asteroids, Folding, milkyway etc. | |
ID: 47505 | Rating: 0 | rate: / Reply Quote | |
I stopped your project; for my part, far too many units end up in error ... Your GPU is too hot (88°C); that's the reason for the many errors (see your stderr.txt output, which I've attached at the end of this post). You should improve the cooling of your card: increase the airflow by increasing the RPM of the GPU's fan, or install extra fans in your PC case, or remove its side panel. Alternatively, you can reduce the clock speed (or the power target) of your card to decrease its power consumption (= its heat output). especially after 12 hours of calculations (Titan 2013). Is it not possible to still get points for the calculated time? No. A partial result is useless for GPUGrid, as it can't be used for generating the next step of the simulation, so it has to be calculated again (on another host). Moreover, I have no problems with Asteroids, Folding, milkyway etc. Those projects do not stress the GPU as much as GPUGrid does. <core_client_version>7.6.33</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code -55 (0xffffffc9)
</message>
<stderr_txt>
# GPU [GeForce GTX TITAN] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0 :
# Name : GeForce GTX TITAN
# ECC : Disabled
# Global mem : 6144MB
# Capability : 3.5
# PCI ID : 0000:02:00.0
# Device clock : 875MHz
# Memory clock : 3004MHz
# Memory width : 384bit
# Driver version : r382_48 : 38253
# GPU 0 : 70C
# GPU 0 : 77C
# GPU 0 : 80C
# GPU 0 : 81C
# GPU 0 : 82C
# GPU 0 : 83C
# GPU 0 : 86C
# GPU 0 : 87C
# GPU [GeForce GTX TITAN] Platform [Windows] Rev [3212] VERSION [80]
# SWAN Device 0 :
# Name : GeForce GTX TITAN
# ECC : Disabled
# Global mem : 6144MB
# Capability : 3.5
# PCI ID : 0000:02:00.0
# Device clock : 875MHz
# Memory clock : 3004MHz
# Memory width : 384bit
# Driver version : r382_48 : 38253
# GPU 0 : 77C
# GPU 0 : 80C
# GPU 0 : 82C
# GPU 0 : 83C
# GPU 0 : 84C
# GPU 0 : 87C
# GPU 0 : 88C
SWAN : FATAL : Cuda driver error 702 in file 'swanlibnv2.cpp' in line 1965.
# SWAN swan_assert 0
</stderr_txt>
]]> | |
ID: 47506 | Rating: 0 | rate: / Reply Quote | |
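For anyone who wants to check their own failed tasks the same way, here is a minimal sketch that scans stderr output for the temperature lines shown in the log above and reports the peak per GPU. The function names and the 85 °C warning threshold are assumptions made for illustration; the log only shows that this card failed at 88 °C. The second helper shows how the signed exit code -55 maps to the hexadecimal form printed next to it.

```python
import re

# Matches the temperature lines in the log above, e.g. "# GPU 0 : 88C".
# Header lines such as "# GPU [GeForce GTX TITAN] ..." do not match.
TEMP_LINE = re.compile(r"^#\s*GPU\s+(\d+)\s*:\s*(\d+)C\s*$")


def peak_temperatures(stderr_text):
    """Return {gpu_index: highest temperature in degrees C} found in the text."""
    peaks = {}
    for line in stderr_text.splitlines():
        match = TEMP_LINE.match(line.strip())
        if match:
            gpu, temp = int(match.group(1)), int(match.group(2))
            peaks[gpu] = max(temp, peaks.get(gpu, temp))
    return peaks


def as_unsigned32(exit_code):
    """Format a signed exit code as the 32-bit hex value shown in the log."""
    return f"0x{exit_code & 0xFFFFFFFF:08x}"


# A few lines copied from the end of the log above.
sample = """\
# GPU 0 : 84C
# GPU 0 : 87C
# GPU 0 : 88C
SWAN : FATAL : Cuda driver error 702 in file 'swanlibnv2.cpp' in line 1965.
"""

WARN_AT_C = 85  # hypothetical threshold, not a value published by the project

for gpu, peak in peak_temperatures(sample).items():
    status = "too hot" if peak >= WARN_AT_C else "ok"
    print(f"GPU {gpu}: peak {peak}C ({status})")

print(as_unsigned32(-55))
```

To use it on a real task, paste the stderr text from the task's result page into a string or read it from a saved file, then pass it to `peak_temperatures`.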
Thank you for your reply, I am considering changing the case for a Corsair Obsidian 900D (15 fans), or waiting for the outside temperature to return to normal (30 ° presently). | |
ID: 47507 | Rating: 0 | rate: / Reply Quote | |
Thank you for your reply, I am considering changing the case for a Corsair Obsidian 900D (15 fans), or waiting for the outside temperature to return to normal (30 ° presently). A case with better airflow will do a lot more than a change of a few degrees Celsius in ambient temperature. | |
ID: 47508 | Rating: 0 | rate: / Reply Quote | |
All good advice. It is possible to run them without error. | |
ID: 47509 | Rating: 0 | rate: / Reply Quote | |
The outside temperature dropped by 10 ° Celsius and I managed to finish a Gpugrid unit. | |
ID: 47519 | Rating: 0 | rate: / Reply Quote | |
As I prefer medical research projects, I will move on to Folding ... This I did several days ago with one of my hosts - the PC with Windows10 and a GTX750Ti inside. As I wrote in some other postings in the forum, the problem was that since the new crunching software (acemd 918.80) was implemented at the beginning of April, GPUGRID crunching on this host did not work well any longer. Probably for several reasons: - the crunching software is buggy: once in a while, crunching stops, and can only be resumed by manually pausing and then resuming the task (if the task stops in the late evening or during the night, the host runs idle till next morning). - overclocking is a lot less possible than before. In fact, with the GTX750Ti I even had to underclock - and still crunching stopped once in a while. - this happens very often with one of the newer, extremely demanding WU's like ADRIA_FOLDGREED50_crystal_ss_contacts_100_ubiquitin_ (and some others). Crunching such a WU, with numerous interruptions in between, takes up to 3 days, which does not make a lot of sense. Hence, some time ago, I suggested that a third tier of tasks (like "extra long runs" or so) be introduced in the personal settings page, so that people with cards like a GTX750Ti can exclude such heavy tasks from being downloaded. I also suggested that the buggy software (which had to be put together in a hurry, and was probably optimized for Pascal cards) be improved. So far, there was NO REACTION AT ALL from the GPUGRID people - no comment, nothing. As I and others here stated several times: they don't listen to their crunchers; they are not interested in receiving any useful suggestions, in learning about problems the crunchers may face, etc. - which is really too bad. I am wondering how this project will survive long term. So, I was simply curious, and with this host (Windows10 and GTX750Ti) I changed to Folding@Home. Just to see whether the same problems show up there as well. They did NOT! 
Crunching works smoothly, no interruptions, nothing ... And questions to the Folding@Home team are answered within half a day or less (BTW - by reading in their forums, I realized that quite a number of people there had changed from GPUGRID to Folding@Home in the past weeks and months - I guess they wouldn't have done that without reason). GPUGRID crunching still runs well on my two Windows XP hosts with two GTX980Ti and one GTX750Ti - most likely because the XP crunching software (acemd 849.65) did not need to be optimized for Pascal cards (they don't work with XP). However, it was already officially announced that this crunching software will only be available until April 2018. This date will then be the end of GPUGRID crunching on XP (which is too bad in a sense, as crunching with XP is some 15-20% faster due to the non-existent WDDM overhead). So I am curious what will happen when I change my two hosts which now run on XP to Windows10. Maybe improved software will exist by then. Or maybe not. In any case, should I face all the problems I am having now with the Windows10 host, I can always change to Folding@Home. At the end, I'd like to repeat what I posted before, after having read it in the forum of another BOINC project: Of course having happy volunteers is very important for the health of a project, so this is something that should be addressed | |
ID: 47531 | Rating: 0 | rate: / Reply Quote | |
I do both FAH and GPUgrid these days; but seems no more work needed here so can also switch all cards back to FAH (there are problems too; at times) | |
ID: 47532 | Rating: 0 | rate: / Reply Quote | |
... FAH (there are problems too; at times) I guess any project has problems now and then. However, it makes a difference whether they are being solved within short time or not at all. I am using part of my CPUs for LHC@Home - and yes, they have (all kinds of) problems about once or even twice a week. However, these problems get solved within a few hours. And when bringing things up in their forum, the team reacts within a few hours. | |
ID: 47534 | Rating: 0 | rate: / Reply Quote | |
We are trying to fix the issue with the WUs now | |
ID: 47562 | Rating: 0 | rate: / Reply Quote | |
We are trying to fix the issue with the WUs now Thanks for replying. Which issue are you trying to fix? | |
ID: 47563 | Rating: 0 | rate: / Reply Quote | |
Erich56, | |
ID: 47568 | Rating: 0 | rate: / Reply Quote | |
ahhh, the dry period is over; I got again assignments on my linux-based GPUs; thanks for fixing and giving my GPU meaningful work | |
ID: 47569 | Rating: 0 | rate: / Reply Quote | |
Erich56, but, as far as I understand, no SWAN_SYNC possible with Linux :-( | |
ID: 47570 | Rating: 0 | rate: / Reply Quote | |
Erich56, Yes that's true. You should try it after support for XP with gpugrid stops. Even with swan_sync on Vista, 7, 8, 10, I notice it doesn't use as much of the gpu as Linux did for me. For Windows 7 I personally get GPU usage in the 80's (percent) WITH swan_sync. On Linux, I get the low to mid 90's. ____________ Cruncher/Learner in progress. | |
ID: 47571 | Rating: 0 | rate: / Reply Quote | |
Erich56, That's because it's not needed in Linux. ;) | |
ID: 47574 | Rating: 0 | rate: / Reply Quote | |
Some time back (couple of months) I was processing WU's like crazy. I received notice that GPUGRID was moving to a new server - we'll be right back. I've never received another WU. | |
ID: 47614 | Rating: 0 | rate: / Reply Quote | |
Some time back (couple of months) I was processing WU's like crazy. I received notice that GPUGRID was moving to a new server - we'll be right back. I've never received another WU. You need to upgrade to the latest drivers and then you should get WU's | |
ID: 47615 | Rating: 0 | rate: / Reply Quote | |
What happened between 21st & 22nd of April? (actually a month before this date) Apologies for being late to the party. Assuming boincstats are accurate (and noticing other projects are losing active hosts and users as well) the Win10 Creator's Edition update must have occurred around that time. That certainly broke some hosts. Number continues to fall :( | |
ID: 47666 | Rating: 0 | rate: / Reply Quote | |
Let's not blame a Windows Update unless we have proof, please. | |
ID: 47667 | Rating: 0 | rate: / Reply Quote | |
What happened between 21st & 22nd of April? (actually a month before this date) The Creator's update is not a mandatory update (yet), and it is distributed over a long period of time. I think that those who forced this update could also manage the problems which could prevent GPUGrid from working correctly; if the problems persist they could even revert to the previous version. The only issue I've faced after updating to the Creator's update is that Windows Search broke down in Outlook; but it turned out later that it wasn't the Creator's update (as the error also showed up on PCs that had not been updated). So my experiences do not support this explanation. Number continues to fall :( No, the number of active GPUGrid users has been fluctuating around 1800 for two weeks. I think it will rise again in September. | |
ID: 47673 | Rating: 0 | rate: / Reply Quote | |
When I installed the Creator's update, I had to re-install NVidia's driver to get CUDA and OpenCL support back. Just re-ran the installer for the driver I'd downloaded already and had been using before the update. | |
ID: 47676 | Rating: 0 | rate: / Reply Quote | |
When I installed the Creator's update, I had to re-install NVidia's driver to get CUDA and OpenCL support back. Same happened to me after the recent large Windows10 Updates. The reason is that the graphic driver that comes with the update is crippled in order to keep the update size to a minimum. Only re-installing the original driver from NVIDIA helps. | |
ID: 47681 | Rating: 0 | rate: / Reply Quote | |
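A quick way to confirm that CUDA support is back after re-installing the driver is to ask the driver library directly. The sketch below is only an illustration: it assumes the driver library is named nvcuda.dll on Windows and libcuda.so.1 on Linux, and it uses the CUDA Driver API calls cuInit and cuDriverGetVersion through ctypes. The function name is made up for this example.

```python
import ctypes
import sys


def cuda_driver_version():
    """Return (major, minor) of the CUDA version the driver supports, or None."""
    try:
        if sys.platform.startswith("win"):
            lib = ctypes.WinDLL("nvcuda.dll")    # assumed library name on Windows
        else:
            lib = ctypes.CDLL("libcuda.so.1")    # assumed library name on Linux
    except OSError:
        return None  # driver library missing or not loadable

    # cuInit(0) must succeed before the driver can be used; 0 means success.
    if lib.cuInit(ctypes.c_uint(0)) != 0:
        return None

    version = ctypes.c_int(0)
    if lib.cuDriverGetVersion(ctypes.byref(version)) != 0:
        return None

    # The driver encodes the version as 1000 * major + 10 * minor.
    return version.value // 1000, (version.value % 1000) // 10


result = cuda_driver_version()
if result is None:
    print("CUDA driver not available; re-installing the NVIDIA driver may help")
else:
    print(f"CUDA driver supports CUDA {result[0]}.{result[1]}")
```

If this reports that the driver is not available right after a large Windows update, that matches the situation described above, where the driver delivered with the update lacked CUDA and OpenCL support.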
I did: OS FC25, NVIDIA 381.22. I have 3 other projects that use my GPU but GpuGrid keeps saying "got 0 new tasks". | |
ID: 47699 | Rating: 0 | rate: / Reply Quote | |
PS3s were used for crunching here, before Sony locked down third party operating systems. | |
ID: 47701 | Rating: 0 | rate: / Reply Quote | |
problem is that none of the links ever returns, doesn't ping (IP), nslookup returns an IP. It's like everybody just walked. Also emails to "contact us" go unanswered. looks very much like GPUGRID :-( | |
ID: 47704 | Rating: 0 | rate: / Reply Quote | |
06/03/2018 | |
ID: 49591 | Rating: 0 | rate: / Reply Quote | |
Just curious, what is the brand and make of your card. Laptop or desktop? | |
ID: 49592 | Rating: 0 | rate: / Reply Quote | |
Are you trying to run multiple WU's at once? I only ask because you have 4 in progress with a 1 card host. | |
ID: 49593 | Rating: 0 | rate: / Reply Quote | |