Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)

Message boards : Graphics cards (GPUs) : Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)

Author	Message
Dayle Diamond Joined: 5 Dec 12 Posts: 84 Credit: 1,663,883,415 RAC: 0	Message 57382 - Posted: 26 Sep 2021 | 15:09:52 UTC
	3090 FE. Driver Date Aug. 27th, 2021
----------------------------------
Name: e1s247_I282-ADRIA_AdB_KIXCMYB_HIP-1-2-RND4280_1
Workunit: 27079548
Created: 26 Sep 2021 | 7:33:50 UTC
Sent: 26 Sep 2021 | 7:33:56 UTC
Received: 26 Sep 2021 | 7:36:06 UTC
Server state: Over
Outcome: Computation error
Client state: Compute error
Exit status: 195 (0xc3) EXIT_CHILD_FAILED
Computer ID: 140554
Report deadline: 1 Oct 2021 | 7:33:56 UTC
Run time: 10.12
CPU time: 0.00
Validate state: Invalid
Credit: 0.00
Application version: New version of ACEMD v2.18 (cuda101)

Stderr output
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
(unknown error) - exit code 195 (0xc3)</message>
<stderr_txt>
00:34:25 (21456): wrapper (7.9.26016): starting
00:34:25 (21456): wrapper: running bin/acemd3.exe (--boinc --device 0)
ACEMD failed: Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
00:34:28 (21456): bin/acemd3.exe exited; CPU time 0.000000
00:34:28 (21456): app exit status: 0x1
00:34:28 (21456): called boinc_finish(195)
0 bytes in 0 Free Blocks.
268 bytes in 4 Normal Blocks.
1144 bytes in 1 CRT Blocks.
0 bytes in 0 Ignore Blocks.
0 bytes in 0 Client Blocks.
Largest number used: 0 bytes.
Total allocations: 190200 bytes.
Dumping objects ->
{323252} normal block at 0x0000018D079E9B30, 126 bytes long.
Data: <<project_prefere> 3C 70 72 6F 6A 65 63 74 5F 70 72 65 66 65 72 65
..\api\boinc_api.cpp(309) : {323249} normal block at 0x0000018D079A6B10, 8 bytes long.
Data: < > 00 00 95 07 8D 01 00 00
{322607} normal block at 0x0000018D079E9A70, 126 bytes long.
Data: <<project_prefere> 3C 70 72 6F 6A 65 63 74 5F 70 72 65 66 65 72 65
{321996} normal block at 0x0000018D079A6E80, 8 bytes long.
Data: <ÀÄ > C0 C4 9E 07 8D 01 00 00
..\zip\boinc_zip.cpp(122) : {147} normal block at 0x0000018D079ADD40, 260 bytes long.
Data: < > 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
{134} normal block at 0x0000018D079A7290, 16 bytes long.
Data: <p« > 70 AB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{133} normal block at 0x0000018D079AAB70, 40 bytes long.
Data: < r conda-pa> 90 72 9A 07 8D 01 00 00 63 6F 6E 64 61 2D 70 61
{126} normal block at 0x0000018D079AA9B0, 48 bytes long.
Data: <--boinc --device> 2D 2D 62 6F 69 6E 63 20 2D 2D 64 65 76 69 63 65
{125} normal block at 0x0000018D079A6930, 16 bytes long.
Data: <8ì > 38 EC 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{124} normal block at 0x0000018D079A7330, 16 bytes long.
Data: < ì > 10 EC 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{123} normal block at 0x0000018D079A76A0, 16 bytes long.
Data: <èë > E8 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{122} normal block at 0x0000018D079A68E0, 16 bytes long.
Data: <Àë > C0 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{121} normal block at 0x0000018D079A6F20, 16 bytes long.
Data: < ë > 98 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{120} normal block at 0x0000018D079A7600, 16 bytes long.
Data: <pë > 70 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{119} normal block at 0x0000018D079A6F70, 16 bytes long.
Data: <Pë > 50 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{118} normal block at 0x0000018D079A6FC0, 16 bytes long.
Data: <(ë > 28 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{117} normal block at 0x0000018D079A6840, 16 bytes long.
Data: < ë > 00 EB 9A 07 8D 01 00 00 00 00 00 00 00 00 00 00
{116} normal block at 0x0000018D079AEB00, 496 bytes long.
Data: <@h bin/acem> 40 68 9A 07 8D 01 00 00 62 69 6E 2F 61 63 65 6D
{66} normal block at 0x0000018D079A6890, 16 bytes long.
Data: < êfè÷ > 80 EA 66 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{65} normal block at 0x0000018D079A75B0, 16 bytes long.
Data: <@éfè÷ > 40 E9 66 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{64} normal block at 0x0000018D079A6D40, 16 bytes long.
Data: <øWcè÷ > F8 57 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{63} normal block at 0x0000018D079A7560, 16 bytes long.
Data: <ØWcè÷ > D8 57 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{62} normal block at 0x0000018D079A6B60, 16 bytes long.
Data: <P cè÷ > 50 04 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{61} normal block at 0x0000018D079A6A20, 16 bytes long.
Data: <0 cè÷ > 30 04 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{60} normal block at 0x0000018D079A71F0, 16 bytes long.
Data: <à cè÷ > E0 02 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{59} normal block at 0x0000018D079A7740, 16 bytes long.
Data: < cè÷ > 10 04 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{58} normal block at 0x0000018D079A6AC0, 16 bytes long.
Data: <p cè÷ > 70 04 63 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
{57} normal block at 0x0000018D079A6C00, 16 bytes long.
Data: < Àaè÷ > 18 C0 61 E8 F7 7F 00 00 00 00 00 00 00 00 00 00
Object dump complete.
</stderr_txt>
]]>
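Most of the stderr output above is a debug-heap dump; the two lines that matter are the "ACEMD failed:" message and the "called boinc_finish(...)" exit code. As a rough illustration only, the Python sketch below pulls those two items out of a wrapper log. The helper name summarize_failure is hypothetical (it is not part of BOINC or ACEMD), and the sample text is copied from the log lines in this post.

```python
import re

# Sample lines copied from the stderr output posted above.
STDERR = """\
00:34:25 (21456): wrapper (7.9.26016): starting
00:34:25 (21456): wrapper: running bin/acemd3.exe (--boinc --device 0)
ACEMD failed: Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
00:34:28 (21456): bin/acemd3.exe exited; CPU time 0.000000
00:34:28 (21456): app exit status: 0x1
00:34:28 (21456): called boinc_finish(195)
"""


def summarize_failure(stderr_text):
    """Hypothetical helper: return (failure reason, BOINC exit code).

    Either element is None when the corresponding line is absent.
    """
    # The reason follows "ACEMD failed:" on the same or the next line.
    reason = re.search(r"ACEMD failed:\s*(.+)", stderr_text)
    # The wrapper reports the final exit code via boinc_finish(N).
    code = re.search(r"called boinc_finish\((\d+)\)", stderr_text)
    return (
        reason.group(1).strip() if reason else None,
        int(code.group(1)) if code else None,
    )


reason, exit_code = summarize_failure(STDERR)
print(reason)
print(exit_code, hex(exit_code))
```

The exit code 195 is the same value shown as 0xc3 (EXIT_CHILD_FAILED) in the task summary, so the real cause is the nvrtc message rather than the exit status itself.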

Keith Myers Joined: 13 Dec 17 Posts: 1340 Credit: 7,653,123,724 RAC: 13,404,739	Message 57384 - Posted: 26 Sep 2021 | 16:53:51 UTC - in response to Message 57382.
	Known issue. The CUDA101 app will fail on Ampere cards. See this thread. https://www.gpugrid.net/forum_thread.php?id=5246

Dayle Diamond Joined: 5 Dec 12 Posts: 84 Credit: 1,663,883,415 RAC: 0	Message 57386 - Posted: 26 Sep 2021 | 18:43:25 UTC
	Maybe I'm missing some context, but the link shows that the issue had been fixed and does not mention my error code specifically. This host has run several successful tasks since then, but perhaps they were from another GPUGrid application. I'm surprised there are still known issues with Ampere cards. While getting mine was a struggle, the architecture has been out for over a year at this point.

Keith Myers Joined: 13 Dec 17 Posts: 1340 Credit: 7,653,123,724 RAC: 13,404,739	Message 57388 - Posted: 26 Sep 2021 | 18:57:27 UTC - in response to Message 57386. Last modified: 26 Sep 2021 | 18:58:28 UTC
	The thread does in fact mention the exact error message in this thread's title in its latest posts.
https://www.gpugrid.net/forum_thread.php?id=5246&nowrap=true#57363
ACEMD failed: Error compiling program: nvrtc: error: invalid value for --gpu-architecture (-arch)
The CUDA1121 application runs fine on Ampere cards. The tasks fail only when the scheduler sends a task assigned to the CUDA101 application. The issue is that the driver level does not match the CUDA101 application. The simplest solution is to remove the CUDA101 app from the scheduler and force all hosts to use the CUDA1121 application, which requires drivers at the CUDA 11.2 level or newer.
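One plausible reading of the nvrtc error is that the runtime compiler bundled with the CUDA 10.1 build predates Ampere and therefore rejects the architecture value for an RTX 3090. The Python sketch below illustrates that reasoning with a small lookup table. The compute-capability limits are quoted from memory of NVIDIA's release notes and should be treated as assumptions, and the names PLAN_CLASS_TOOLKIT and can_compile_for are hypothetical, not part of BOINC or ACEMD.

```python
# Highest GPU compute capability each CUDA toolkit can compile for.
# ASSUMPTION: values recalled from NVIDIA release notes, not verified here.
MAX_COMPUTE_CAPABILITY = {
    (10, 1): (7, 5),  # CUDA 10.1: up to Turing
    (11, 2): (8, 6),  # CUDA 11.2: includes consumer Ampere
}

# Hypothetical mapping from the plan-class names seen in this thread.
PLAN_CLASS_TOOLKIT = {"cuda101": (10, 1), "cuda1121": (11, 2)}


def can_compile_for(plan_class, compute_capability):
    """Return True if the app's toolkit knows the GPU's architecture."""
    toolkit = PLAN_CLASS_TOOLKIT[plan_class]
    # Tuples compare element by element, so (8, 6) > (7, 5).
    return compute_capability <= MAX_COMPUTE_CAPABILITY[toolkit]


RTX_3090 = (8, 6)  # Ampere
GTX_1660 = (7, 5)  # Turing

for app in ("cuda101", "cuda1121"):
    print(app, "RTX 3090:", can_compile_for(app, RTX_3090))
    print(app, "GTX 1660:", can_compile_for(app, GTX_1660))
```

Under these assumptions only the cuda101 and RTX 3090 pairing is rejected, which matches the behavior reported here: cuda1121 works on Ampere, and cuda101 still works on older cards.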

GDF Volunteer moderator Project administrator Project developer Project tester Volunteer developer Volunteer tester Project scientist Joined: 14 Mar 07 Posts: 1957 Credit: 629,356 RAC: 0	Message 57412 - Posted: 1 Oct 2021 | 10:11:17 UTC - in response to Message 57388.
	We have now changed the scheduler, let's see if now it's better. gdf

PDW Joined: 7 Mar 14 Posts: 15 Credit: 4,909,794,525 RAC: 31,560,326	Message 57418 - Posted: 1 Oct 2021 | 13:50:46 UTC - in response to Message 57412.
	"We have now changed the scheduler, let's see if now it's better. gdf"
Is this a result of the scheduler changes or something else?
The result http://gpugrid.net/result.php?resultid=32646962 failed (see below) to launch CUDA, which isn't surprising as the host doesn't show a GPU.
Host: http://gpugrid.net/show_host_detail.php?hostid=514156
New version of ACEMD v2.18 (cuda1121)

Stderr output
<core_client_version>7.16.11</core_client_version>
<![CDATA[
<message>
process got signal 67</message>
<stderr_txt>
14:40:06 (57305): wrapper (7.7.26016): starting
14:40:06 (57305): wrapper (7.7.26016): starting
14:40:06 (57305): wrapper: running /bin/tar (xf conda-pack.tar.bz2)
14:42:47 (57305): /bin/tar exited; CPU time 127.344146
14:42:47 (57305): wrapper: running bin/acemd3 (--boinc --device 0)
ACEMD failed: Error invoking kernel: CUDA_ERROR_LAUNCH_FAILED (719)
19:16:23 (57305): bin/acemd3 exited; CPU time 6047.267986
19:16:23 (57305): app exit status: 0x1
19:16:23 (57305): called boinc_finish(195)
</stderr_txt>
]]>

Richard Haselgrove Joined: 11 Jul 09 Posts: 1620 Credit: 8,822,866,430 RAC: 19,442,844	Message 57424 - Posted: 2 Oct 2021 | 8:49:10 UTC
	I'm seeing my Linux machines receive the cuda1121 plan class more consistently, but my Windows machines receive cuda101 - I don't think I've ever seen cuda1121 under Windows. Cards are from the same range (GTX 1660), and drivers are up-to-date - Linux 470.63, Windows 472.12
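To make the driver comparison concrete, here is a small Python sketch that checks both reported driver versions against a minimum. It assumes, as a rough rule that is not confirmed anywhere in this thread, that CUDA 11.2 needs a driver from the R460 branch or newer; the threshold and the helper name parse_driver are illustrative only.

```python
def parse_driver(version):
    """Turn a dotted driver string such as '470.63' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


# ASSUMPTION: CUDA 11.2 needs the R460 driver branch or newer.
ASSUMED_MIN_DRIVER_CUDA112 = (460,)

# Driver versions as reported in the post above.
hosts = {"Linux": "470.63", "Windows": "472.12"}

eligible = {
    os_name: parse_driver(version) >= ASSUMED_MIN_DRIVER_CUDA112
    for os_name, version in hosts.items()
}
print(eligible)
```

If that assumption holds, both hosts clear the driver requirement, which would suggest the Linux/Windows difference comes from how plan classes are assigned per platform rather than from the drivers.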
