Message boards : Graphics cards (GPUs) : GPU work units [network connection issue]
Author | Message |
---|---|
Are GPU work units dependent on the internet for anything other than sending and/or receiving? I ask because our internet went out three times over the last month. The workunits being processed at the time continued on until completion, but they were reported as client error/compute error. During the last internet outage, I turned off the computer and restarted it after the outage, and the work unit processed correctly. Are the workunits dependent on a good internet connection while being processed? I have lost 6 workunits due to this. | |
ID: 8812 | |
Um, no, and, um, well, yes ... | |
ID: 8828 | |
Thanks for that. Coincidentally, I was having latency issues last night when BOINC lost connection to the client. | |
ID: 8834 | |
Thanks for the info, I'll shut down the internet connection next time and let the application continue to process. | |
ID: 8838 | |
I think the reference server it tries is Google. I think I saw it somewhere but have no idea where. | |
ID: 8857 | |
"I think the reference server it tries is Google. I think I saw it somewhere but have no idea where." The default reference site is indeed Google. You can change that if you wish; I believe it's an option you can put in the cc_config.xml file (or whatever the filename is). The place you saw that information was probably the documentation for the cc_config.xml file. Mike | |
ID: 8882 | |
Mhh, I also had occasional network failures, but I didn't see the apps losing connection to the BOINC client, even though I didn't suspend network activity (after all, there's always the chance the connection is restored and I won't have to run dry..). | |
ID: 8910 | |
"Mhh, I also had occasional network failures, but I didn't see the apps losing connection to the BOINC client, even though I didn't suspend network activity (after all, there's always the chance the connection is restored and I won't have to run dry..)." That is why it is such a hard problem to find. It does not happen to all people all the time. But if you look into the past, the "no Heartbeat" has been an annoying bug for a long time in the BOINC world. Just like the "Can't acquire lockfile" ... another pest ... | |
ID: 8931 | |
I don't think it's a problem with the Windows setup. This is just a recent problem, and it only happens when the internet in the area goes down. It has gone down three times in the last month, and the two work units being processed at the time crashed and burned. The only problem I can think of is a bug in the BOINC Application 6.6.20 files (which were also installed about the time the problems started). I will revert back to a previous version and see what happens. I am also running the most current NVIDIA drivers for the video cards. | |
ID: 8939 | |
"I don't think it's a problem with the Windows setup. This is just a recent problem, and it only happens when the internet in the area goes down. It has gone down three times in the last month, and the two work units being processed at the time crashed and burned. The only problem I can think of is a bug in the BOINC Application 6.6.20 files (which were also installed about the time the problems started). I will revert back to a previous version and see what happens. I am also running the most current NVIDIA drivers for the video cards." No Heartbeat has been around for years ... but trying another client may work ... :) | |
ID: 8941 | |
Yes, if inet was really broken then it surely was not an issue with Windows and the installed programs. However, what I was thinking: what if some program went berserk and blocked your inet access as well as your local servers, and thus caused the no heartbeat issue? In this case it would also look like a broken inet from your point of view. | |
ID: 9013 | |
"Except if you have different computers and/or you know the neighbours' inet is also gone or you see the service guys working or whatever. I don't know your situation, so this was just an idea.. maybe a crazy one ;)" Not really. ANOTHER bug I am chasing causes OS X versions of BOINC Manager to lose connection to the client. The client continues to run, apparently properly ... but the manager cannot connect to it. I have sent Charlie Fenton, I think, 4 reports now of what I have discovered and what I suspect ... nothing back from him yet ... (sadly) ... But Charlie is a good guy, I think, so patience is a virtue, which is probably why I don't have much of it ... {edit} Forgot to mention, it looks like a TCP/IP bug also ... | |
ID: 9024 | |
Well, I reverted back to BOINC 6.4.7 and everything has been running properly for the last two days. No problems to report at all. In my opinion there is a problem in the BOINC 6.6.20 code as it is applied to GPU/CUDA functions. BOINC 6.6.20 runs fine on my other computers, however they are not running GPU/CUDA functions. | |
ID: 9150 | |
"In my opinion there is a problem in the BOINC 6.6.20 code as it is applied to GPU/CUDA functions." There is a problem? Boy, we could all finally be happy again if it was only one ;) MrS ____________ Scanning for our furry friends since Jan 2002 | |
ID: 9158 | |
Maybe I should have said "some problems" instead of "a problem." Bad choice of words on my part. However, these problems have not manifested themselves on my computers that are not using CUDA. | |
ID: 9162 | |
There are two major problems with 6.6.20, neither of which is recognized by UCB as far as I know. One of them has been fixed in 6.6.23 and later, though 6.6.24 introduced a new issue, addressed in 6.6.25 (multi-GPU users only). | |
ID: 9168 | |
Could the problem with long running tasks be only for those tasks that don't have fairly frequent checkpoints available yet? | |
ID: 9233 | |
"Could the problem with long running tasks be only for those tasks that don't have fairly frequent checkpoints available yet?" No, because it is universal with tasks from more than one project. I saw it only with GPU Grid for sure (it may have affected other tasks, I just did not see it). Another user saw the estimated times of his SaH AP tasks go from 187 hours plus to 63 hours by going back to 6.4.7 (I think) ... I have not seen it with 6.6.23. There was a change in the resource scheduler (in the release notes), and though I forget what it said, it certainly sounded like the issue. I have been running 6.6.23 and 6.6.25 for several weeks so I can get at the other issues (if you watched the mailing list this weekend, I sure did fill that up). The only way you can get the developers' attention is to run the latest versions (or close to it; there is no significant change in 6.6.26, so I have not tried it yet). | |
ID: 9237 | |
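
For anyone who wants to try the cc_config.xml change Mike mentions above, here is a minimal Python sketch that generates such a file. The option name `network_test_url` is an assumption based on my recollection of the BOINC client configuration documentation; it is not stated anywhere in this thread, so check the docs for your client version before relying on it. The URL below is only a placeholder, and `build_cc_config` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def build_cc_config(test_url: str) -> str:
    """Return cc_config.xml text that sets the network reference site.

    Assumption: the client reads the reference URL from an option named
    <network_test_url> inside <options>. Verify this against the BOINC
    client configuration docs for your version.
    """
    root = ET.Element("cc_config")
    options = ET.SubElement(root, "options")
    ET.SubElement(options, "network_test_url").text = test_url
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder URL; substitute a site that is reliably reachable for you.
    print(build_cc_config("http://www.example.com/"))
```

The printed text would be saved as cc_config.xml in the BOINC data directory, and the client would need to re-read its configuration (or be restarted) before the change takes effect. Whether a different reference site has any effect on the "no heartbeat" errors discussed here is untested.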