Message boards : Graphics cards (GPUs) : Don't understand why it is failing
Author | Message |
---|---|
Greetings, <core_client_version>7.6.31</core_client_version> <![CDATA[ <message> process exited with code 197 (0xc5, -59) </message> <stderr_txt> # SWAN Device 0 : # Name : GeForce GTX 470 # ECC : Disabled # Global mem : 1279MB # Capability : 2.0 # PCI ID : 0000:02:00.0 # Device clock : 1250MHz # Memory clock : 1701MHz # Memory width : 320bit #SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 200 </stderr_txt> ]]> I have read and searched online but have not found anything that is relevant to my case. Can someone point out where things might be going wrong, please? Here is the host: https://www.gpugrid.net/show_host_detail.php?hostid=362128 Also, another bit of relevant data if you are digging through the host logs. I picked up two NVIDIA cards: a 470 and a 460. After a few weeks of crunching, the 460 crapped out hard core yesterday and shat out all of the GPU work the computer had queued up. It will work for an hour or two after a reboot, then die again. It has since been removed. It was only $30, what can I expect? *sigh* :-) Anyway, the point is to ignore the 460 workloads; they are not relevant. The 470, however, is doing great on other projects. Thanks! | |
ID: 44203 | Rating: 0 | rate: / Reply Quote | |
#SWAN: FATAL: cannot find image for module [.nonbonded.cu.] for device version 200 This error message says that the application does not include the code needed for compute capability 2.0 GPUs (called "version 200" above). As the GPUGrid app works with CC3.0~CC5.2 cards, your card is too old for this project. I don't recommend crunching on these very old cards, as their energy efficiency is terrible compared to recent cards. | |
ID: 44215 | Rating: 0 | rate: / Reply Quote | |
Thanks for that info! I tried searching for that message but never found a good explanation of what it was trying to tell me. | |
ID: 44219 | Rating: 0 | rate: / Reply Quote | |
Thanks for that info! I tried searching for that message but never found a good explanation of what it was trying to tell me. There are a couple of useful threads for novices in the FAQ, for example: FAQ - Recommended GPUs for GPUGrid crunching However, this error message is not listed there. You should try the advanced search and extend the search time limit to get more results. Out of curiosity, BOINC tells GPUGRID what card I have. Why doesn't GPUGRID throw an error /before/ it sends the work? I feel kinda bad that I took up work that just errored out like that. That's just sloppiness on GPUGrid's part; you should not feel bad. The same behavior applies to the brand new GTX 10X0 cards: they are CC6.1 cards and fail every workunit with the same error, yet the GPUGrid scheduler keeps sending them work (until their daily quota reaches 0). | |
ID: 44221 | Rating: 0 | rate: / Reply Quote | |
Thanks, I've updated the FAQ. | |
ID: 44228 | Rating: 0 | rate: / Reply Quote | |
Just an FYI: I get the same error on a shiny new Tesla V100 using Ubuntu 16.04 and NVIDIA driver 390.30, which is fairly new: | |
ID: 49296 | Rating: 0 | rate: / Reply Quote | |
I could be wrong, but I don't think CUDA 9.0 is supported yet in this version of ACEMD, which is the application for this project. | |
ID: 49297 | Rating: 0 | rate: / Reply Quote | |
I believe you are correct. I spun up an Ubuntu instance with a P100 and CUDA 8, and it is now working. | |
ID: 49343 | Rating: 0 | rate: / Reply Quote | |
And now also for a GTX 1660 Ti. Einstein@Home takes it, which is my answer for now. | |
ID: 51778 | Rating: 0 | rate: / Reply Quote | |
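
For anyone who lands here with the same "cannot find image for module" error, the compatibility check described earlier in the thread can be sketched in a few lines of Python. This is only an illustration, not anything taken from the GPUGrid application or the BOINC scheduler: the function names `parse_capability` and `is_supported` are made up for this example, and the CC3.0~CC5.2 range is simply the one quoted in the reply above, so it may not match later versions of the app.

```python
import re
from typing import Optional, Tuple

# Range quoted in the thread ("CC3.0~CC5.2"). It reflects the app at the time
# of that reply; treat it as an assumption for any later app version.
SUPPORTED_MIN = (3, 0)
SUPPORTED_MAX = (5, 2)


def parse_capability(stderr_text: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a '# Capability : X.Y' entry, if present."""
    match = re.search(r"Capability\s*:\s*(\d+)\.(\d+)", stderr_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def is_supported(
    cc: Tuple[int, int],
    lo: Tuple[int, int] = SUPPORTED_MIN,
    hi: Tuple[int, int] = SUPPORTED_MAX,
) -> bool:
    """Return True if the compute capability lies inside the inclusive range."""
    # Tuples compare element by element, so (5, 2) < (6, 1) and (2, 0) < (3, 0).
    return lo <= cc <= hi


if __name__ == "__main__":
    # Excerpt of the stderr output posted at the top of the thread.
    sample = "# Name : GeForce GTX 470 # Capability : 2.0 # PCI ID : 0000:02:00.0"
    cc = parse_capability(sample)
    if cc is None:
        print("No compute capability found in the log")
    else:
        print(f"CC {cc[0]}.{cc[1]} supported: {is_supported(cc)}")
```

With the range above, both the GTX 470 (CC 2.0) and the GTX 10X0 cards (CC 6.1) fall outside the supported window, which matches the failures reported in this thread.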