
Message boards : News : New beta Nvidia application 60% faster.

Author Message
Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 14947 - Posted: 3 Feb 2010 | 10:02:00 UTC

The new Nvidia application is now out in beta. It is available by accepting beta work from GPUGRID.

scrap
Joined: 3 Jan 10
Posts: 3
Credit: 29,391,330
RAC: 0
Level
Val
Message 14948 - Posted: 3 Feb 2010 | 10:39:07 UTC - in response to Message 14947.

Really good news!

Profile ocgbargas
Joined: 18 Jun 09
Posts: 12
Credit: 4,327,530
RAC: 0
Level
Ala
Message 14952 - Posted: 3 Feb 2010 | 14:52:07 UTC - in response to Message 14948.
Last modified: 3 Feb 2010 | 14:56:32 UTC

Will this solve the problem with the GTX260s?

Here is what I wrote in a post on my team's forum:
"A great many of us have complained on your forums about the GTX260, but there seems to be no will to fix it.
I have managed to complete 4 units, 3 of them beta and one normal. But it has also failed 4 betas and one normal unit on me. I have been unable to crunch for your project for more than 6 months.
It is hard to believe that people who want to support your project and crunch for it get no answer from the project admins. And don't tell me stories, because my statistics show more than 386,000 points with GPUGRID, and ever since you made some change to your work units (when you moved to CUDA 2.3) that was the end of it: I cannot process anything. I wonder whether you are really interested in reaching the goal you are after, because if I were the admin of a project and had, say, 2000 people wanting to lend me their graphics cards to advance my project (the only medical one on BOINC right now), I would bend over backwards to fix this for them, given that they are donating their money (card + electricity + time).
On the other hand, Folding is getting all that work, at least from me, although I would also like to earn more BOINC points overall, but it is not possible."

I don't mean to stir up bad feelings or anything like that, but I do think that at the very least a solution has to be sought.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 14955 - Posted: 3 Feb 2010 | 15:06:20 UTC - in response to Message 14952.
Last modified: 3 Feb 2010 | 15:07:25 UTC

The GTX260 problem is due to a bug in the Nvidia FFT library, which Nvidia has not had time to fix. It is probably very rare. We don't know exactly what it is, because there is no source code for the Nvidia FFT. In any case, there is nothing more we can do about it.

Does it work with the new application? We don't know. There have been so many changes that the specific situation which triggers the problem may have disappeared.
Some users are now trying the beta application with a GTX260.

gdf

ignasi
Joined: 10 Apr 08
Posts: 254
Credit: 16,836,000
RAC: 0
Level
Pro
Message 14956 - Posted: 3 Feb 2010 | 15:19:55 UTC - in response to Message 14955.

We submitted 1000 production WUs to the beta application.

Profile ocgbargas
Joined: 18 Jun 09
Posts: 12
Credit: 4,327,530
RAC: 0
Level
Ala
Message 14961 - Posted: 3 Feb 2010 | 17:16:19 UTC
Last modified: 3 Feb 2010 | 17:18:14 UTC

And suddenly it is an Nvidia bug? As I said in my message, I have more than 386,000 points earned with the project without a single error.
The change was CUDA, and that was the end of processing.

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 14964 - Posted: 3 Feb 2010 | 17:47:17 UTC - in response to Message 14961.

ocgbargas,
I think the FFT bug is related to CUDA version 2.3, and it only caused errors on some cards, with some applications, running some tasks.

There are things you can do to reduce the likelihood of errors on older cards. For example:
Configure BOINC not to use the GPU while the computer is in use, resuming after 1 minute of inactivity.
If you are watching video, either shut BOINC down and start it up later, or use the Snooze button.
You only need to do these things on some older cards (G92 cards and 65nm G200 first release cards)!
As the applications have changed, the FFT bug might not be such a problem any more, because the way the application uses CUDA has changed.
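[Editor's note: for anyone who prefers to set the first tip locally rather than through the web preferences, it can be expressed in BOINC's override file. A minimal sketch, assuming the tag names used by BOINC 6.x clients; verify against your client's documentation:]

```
<!-- global_prefs_override.xml (in the BOINC data directory):
     keep the GPU idle while the computer is in use, and wait
     1 minute of inactivity before resuming GPU work.
     Tag names assumed from BOINC 6.x clients. -->
<global_preferences>
   <run_gpu_if_user_active>0</run_gpu_if_user_active>
   <idle_time_to_run>1</idle_time_to_run>
</global_preferences>
```

After saving the file, choose "Read local prefs file" in the BOINC Manager (or restart the client) for it to take effect.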

Profile Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 60,073,744
RAC: 0
Level
Thr
Message 14966 - Posted: 3 Feb 2010 | 18:30:49 UTC - in response to Message 14964.

You only need to do these things on some older cards (G92 cards and 65nm G200 first release cards)!


To be a bit more specific, it doesn't appear to be all 65nm G200 GPUs; I've only seen people reporting trouble with GTX260 cards. I haven't seen any reports of errors with the GTX280, which uses the same 65nm G200 chip but with all 240 shaders activated. The GTX 260 uses chips that have some faulty cores or shaders and are cut down to either 192 or 216 shaders; the GTX 280 uses the 65nm G200 chips that are fully functional.

Considering it's the same chip, I find it odd that the problem doesn't seem to affect the GTX 280.

FWIW, I have a GTX 280 and have never had a problem. Mine's factory over-clocked too, so it's running a bit hotter and faster than nominal.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 14967 - Posted: 3 Feb 2010 | 18:45:08 UTC - in response to Message 14966.

The FFT bug only affected the GTX260 (mostly the older revision), and only from driver 182 onwards. So CUDA 2.1 was fine; CUDA 2.2 was not.

Let's see whether, with luck, the beta has improved the situation.

gdf

Profile ocgbargas
Joined: 18 Jun 09
Posts: 12
Credit: 4,327,530
RAC: 0
Level
Ala
Message 14978 - Posted: 4 Feb 2010 | 7:30:11 UTC
Last modified: 4 Feb 2010 | 7:31:18 UTC

Well, the first unit I tried errored out after running for 9 hours. It happened overnight, when nobody was touching the PC. Here are the details as reported by the Windows 7 64-bit event viewer:

Faulting application name: acemdbeta_6.08_windows_intelx86__cuda, version: 0.0.0.0, time stamp: 0x4b680f5f
Faulting module name: acemdbeta_6.08_windows_intelx86__cuda, version: 0.0.0.0, time stamp: 0x4b680f5f
Exception code: 0x40000015
Fault offset: 0x0003274d
Faulting process id: 0x1b24
Faulting application start time: 0x01caa51e0458ad34
Faulting application path: D:\boinc\programData\projects\www.gpugrid.net\acemdbeta_6.08_windows_intelx86__cuda
Faulting module path: D:\boinc\programData\projects\www.gpugrid.net\acemdbeta_6.08_windows_intelx86__cuda
Report Id: d4517c31-1125-11df-845c-00158307a3d0

Profile ocgbargas
Joined: 18 Jun 09
Posts: 12
Credit: 4,327,530
RAC: 0
Level
Ala
Message 14987 - Posted: 4 Feb 2010 | 14:41:32 UTC

Second unit errored as well. I'm giving it up as hopeless.

Profile GDF
Volunteer moderator
Project administrator
Project developer
Project tester
Volunteer developer
Volunteer tester
Project scientist
Joined: 14 Mar 07
Posts: 1957
Credit: 629,356
RAC: 0
Level
Gly
Message 14988 - Posted: 4 Feb 2010 | 15:22:47 UTC - in response to Message 14987.

Yes,
your errors are due to the FFT bug, which therefore did not go away by sheer luck.
We are reimplementing the FFT with our own code, so you will have to wait for that to get rid of this problem.

gdf
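[Editor's note: for readers wondering what "reimplementing the FFT with our own code" involves, here is a minimal radix-2 Cooley-Tukey FFT sketch in Python. This is purely illustrative; GPUGRID's actual replacement is CUDA code and is not public.]

```python
import cmath

def fft(x):
    """Radix-2 Cooley-Tukey FFT (recursive); len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    # Split into even- and odd-indexed halves and transform each recursively.
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor combines the two half-size transforms.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        out[k + n // 2] = even[k] - t
    return out
```

A production GPU version would flatten the recursion into iterative butterfly passes so each pass maps onto one kernel launch, but the arithmetic is the same.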

Profile ocgbargas
Joined: 18 Jun 09
Posts: 12
Credit: 4,327,530
RAC: 0
Level
Ala
Message 14993 - Posted: 4 Feb 2010 | 18:34:35 UTC

But why does it fail with GPUGRID when, for example, those failures don't happen with Collatz, Milkyway, or Folding?

Profile skgiven
Volunteer moderator
Volunteer tester
Joined: 23 Apr 09
Posts: 3968
Credit: 1,995,359,260
RAC: 0
Level
His
Message 14999 - Posted: 5 Feb 2010 | 0:33:04 UTC - in response to Message 14993.

They use different code, and they have their own problems!

Profile robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 762,585,692
RAC: 365,458
Level
Glu
Message 15022 - Posted: 5 Feb 2010 | 15:29:59 UTC

I've noticed that ACEMD beta 6.08 (cuda) workunits leave my 9800 GT GPU running about 2°C cooler than the older application did. I'm not sure whether to consider that an improvement or a problem.

Profile robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 762,585,692
RAC: 365,458
Level
Glu
Message 15023 - Posted: 5 Feb 2010 | 15:34:28 UTC - in response to Message 14966.

You only need to do these things on some older cards (G92 cards and 65nm G200 first release cards)!


To be a bit more specific, it doesn't appear to be all 65nm G200 GPUs; I've only seen people reporting trouble with GTX260 cards. I haven't seen any reports of errors with the GTX280, which uses the same 65nm G200 chip but with all 240 shaders activated. The GTX 260 uses chips that have some faulty cores or shaders and are cut down to either 192 or 216 shaders; the GTX 280 uses the 65nm G200 chips that are fully functional.

Considering it's the same chip, I find it odd that the problem doesn't seem to affect the GTX 280.

FWIW, I have a GTX 280 and have never had a problem. Mine's factory over-clocked too, so it's running a bit hotter and faster than nominal.


Could it be a problem in handling the way faulty shaders or cores are deactivated? If so, I would not expect to see the same problem on a GTX 280, since none are deactivated.

Profile robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 762,585,692
RAC: 365,458
Level
Glu
Message 15024 - Posted: 5 Feb 2010 | 15:39:19 UTC - in response to Message 14967.
Last modified: 5 Feb 2010 | 16:24:53 UTC

The FFT bug only affected the GTX260 (mostly the older revision), and only from driver 182 onwards. So CUDA 2.1 was fine; CUDA 2.2 was not.

Let's see whether, with luck, the beta has improved the situation.

gdf


Is it possible to do a separate build for the GTX260 (and any other machines using the old drivers) using the CUDA 2.1 SDK, but with the same source code otherwise?

You could then test whether this build, combined with the newer drivers, works with the problem GTX260s, or whether machines with the problem GTX260s need to switch back to the older drivers. I'd expect similar results for any other types of Nvidia GPU boards that the newer CUDA versions do not support well enough.

If your server can detect which driver the machine is using, it could use this information to determine which application version to include; otherwise, you could just include both application versions and add a wrapper to determine which one is actually used. That would allow people with a GTX260, but one without the problem, to use the other application program instead if it runs faster.
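[Editor's note: the wrapper idea could be sketched like this in Python. The filenames and the detection rule are hypothetical, and BOINC's real wrapper/scheduler mechanisms work differently; this only illustrates selecting a build per card.]

```python
# Hypothetical build filenames -- not real GPUGRID binaries.
CUDA21_APP = "acemd_cuda21.exe"  # fallback built against the CUDA 2.1 SDK
CUDA23_APP = "acemd_cuda23.exe"  # the current build

def pick_app(gpu_name: str) -> str:
    """Choose which science-application build a wrapper would launch.

    Per the thread, GTX 260 cards hit the FFT bug from driver 182 /
    CUDA 2.2 onwards, so fall back to the CUDA 2.1 build for them.
    """
    if "GTX 260" in gpu_name:
        return CUDA21_APP
    return CUDA23_APP
```

A wrapper like this would also let a GTX260 owner whose card happens not to exhibit the bug override the choice and use the faster build.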

Another idea you could try, if you think the problem is in a DLL, is to try some builds with a mixture of CUDA 2.1 DLLs and more recent CUDA version DLLs.

Another thought: Is it possible to mark parts of the results with which cores and shaders were used to calculate them? If so, you could then have the server build a file to specify additional cores and shaders that a particular machine should avoid using for any future workunits.

Profile Michael Goetz
Joined: 2 Mar 09
Posts: 124
Credit: 60,073,744
RAC: 0
Level
Thr
Message 15026 - Posted: 5 Feb 2010 | 16:49:16 UTC - in response to Message 15024.

Another thought: Is it possible to mark parts of the results with which cores and shaders were used to calculate them? If so, you could then have the server build a file to specify additional cores and shaders that a particular machine should avoid using for any future workunits.


I don't think that's possible under CUDA. You just tell CUDA, "Here, run these 10,000 copies of this kernel and tell me when you're done." All the fun stuff of assigning that work to the individual multiprocessors is done under the hood by the CUDA libraries. I don't recall ever seeing anything that would let you know where something actually ran.
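[Editor's note: the launch model described above can be illustrated with a CPU thread-pool analogy in Python: you submit N copies of a task, the runtime decides which worker executes each one, and nothing reports back where each item ran. This is a loose analogy only, not CUDA itself.]

```python
from concurrent.futures import ThreadPoolExecutor

def kernel(i):
    # Stand-in for one GPU kernel instance: compute something per index.
    return i * i

# "Run these 10,000 copies and tell me when you're done" -- the pool,
# like the CUDA scheduler, assigns work to workers under the hood, and
# the caller only ever sees the collected results.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(kernel, range(10_000)))
```

Just as here you cannot tell which of the 8 worker threads squared a given index, a CUDA caller cannot tell which multiprocessor ran a given block.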

Profile robertmiles
Joined: 16 Apr 09
Posts: 503
Credit: 762,585,692
RAC: 365,458
Level
Glu
Message 15027 - Posted: 5 Feb 2010 | 17:05:04 UTC - in response to Message 15026.
Last modified: 5 Feb 2010 | 17:12:52 UTC

Then you may need to use a different CUDA library, such as one from the CUDA 2.1 SDK.

You could also look for a way to tell the software to exclude some of the cores and shaders marked as usable from actually being used. Going from the results of doing that to determining which ones to exclude on a particular machine would be slower, though.


