Posts by Yeti

1) Message boards : AQUA@home : Wrong message for "no work available" (Message 3690)
Posted 404 days ago by Yeti
That's the question I started this thread with.


Oops, I missed that.

They have written that single-core machines will no longer get work because of the new multi-CPU usage. So I guess BOINC 5.x cannot serve all the needs of the current project scheduler.
2) Message boards : AQUA@home : Wrong message for "no work available" (Message 3688)
Posted 404 days ago by Yeti
Perhaps because of BOINC 5.x?
3) Message boards : AQUA@home : Wrong message for "no work available" (Message 3686)
Posted 404 days ago by Yeti
As far as I know, there are no CUDA cards among your boxes, so set all options to yes ...
4) Message boards : AQUA@home : Errors (Message 3084)
Posted 420 days ago by Yeti
While researching the problem mentioned above, I checked whether other crunchers had successfully processed my faulty WUs, and I found few that had gone through fine.

And then I found this: less than 3 hours of crunching and 75,000 credits?
Can this really be right?

Here is the WU: http://aqua.dwavesys.com/workunit.php?wuid=710159
5) Message boards : AQUA@home : Errors (Message 3083)
Posted 420 days ago by Yeti
Now I'm back, with bad news.

I have checked everything I could think of; nothing has helped.

- changed BOINC install mode from Service to normal Application
- upgraded BOINC to 6.6.37
- went back to an older NVIDIA driver

Nothing has helped; WUs still crash after some crunching time.

As this is not happening on Seti, I think it must have something to do with Aqua.

I have checked and logged what I could find in output_file_

Here it is:

AQUAPT started on Tue Jul 14 12:28:40 2009

Executable file: projects/aqua.dwavesys.com/aqua_3.31_windows_intelx86__cuda.exe
----------------------------
input_file = 14_240_4_Ising_ndg.txt_08jul09-240-5M-64-a_14_30
N_space = 240
instance = 0
n_sweep = 5000000
stopTime = 0.000
Ltau = 64
Dtau = 1.000000
maxPhi = -0.280000
precision_bits = 4
iseed = 0
num_meas_phis = 26
Phi_meas_min = -0.300000
Phi_meas_max = -0.280000
Phi_sim_min = -0.305000
n_bet_meas = 20
num_meas_phis = 26
max_PT_chains = 309
PT_exchange_period = 5
PT_measurement_period = 20
alpha_swap_min = 0.250000
n_moves_btwn_phi_updt = 3000
n_moves_burn_in = 10000
conv_check_period = 2147483647
cuda_device = 0
checkpoint_file = AQUAcheckpoint

Warning: GPU lacks resources. Increasing number of blocks to 4, decreasing packed_chains to 8. Restarting the kernel.

...

Warning: GPU lacks resources. Increasing number of blocks to 19, decreasing packed_chains to 8. Restarting the kernel.
AQUAPT started on Tue Jul 14 21:17:00 2009

Executable file: projects/aqua.dwavesys.com/aqua_3.31_windows_intelx86__cuda.exe
----------------------------
input_file = 14_240_4_Ising_ndg.txt_08jul09-240-5M-64-a_14_30
N_space = 240
instance = 0
n_sweep = 5000000
stopTime = 0.000
Ltau = 64
Dtau = 1.000000
maxPhi = -0.280000
precision_bits = 4
iseed = 0
num_meas_phis = 26
Phi_meas_min = -0.300000
Phi_meas_max = -0.280000
Phi_sim_min = -0.305000
n_bet_meas = 20
num_meas_phis = 26
max_PT_chains = 309
PT_exchange_period = 5
PT_measurement_period = 20
alpha_swap_min = 0.250000
n_moves_btwn_phi_updt = 3000
n_moves_burn_in = 10000
conv_check_period = 2147483647
cuda_device = 0
checkpoint_file = AQUAcheckpoint
----------------------------


***********************
Number of discovered CUDA devices : 1

The Properties of the Device with ID 0 are:

Device Name : Quadro NVS 290
Device Revision : 1.1
multiProcessorCount : 2
clockRate : 0.918 GHz
regsPerBlock : 8192
maxThreadsPerBlock : 512
maxThreadsDim : 512 x 512 x 64
maxGridSize : 65535 x 65535 x 1
warpSize : 32
deviceOverlap (concurrent memory copy and execution) : Yes

totalGlobalMem : 268107776 bytes
sharedMemPerBlock : 16384 bytes
memPitch : 262144 bytes
totalConstMem : 65536 bytes
textureAlignment : 256 bytes
***********************


Warning: Failed to set CUDA to blocking mode!

***********************
Number of discovered CUDA devices : 1

The Properties of the Device with ID 0 are:

Device Name : Device Emulation (CPU)
Device Revision : 9999.9999
multiProcessorCount : 16
clockRate : 1.350 GHz
regsPerBlock : 8192
maxThreadsPerBlock : 512
maxThreadsDim : 512 x 512 x 64
maxGridSize : 65535 x 65535 x 1
warpSize : 1
deviceOverlap (concurrent memory copy and execution) : No

totalGlobalMem : -1 bytes
sharedMemPerBlock : 16384 bytes
memPitch : 262144 bytes
totalConstMem : 65536 bytes
textureAlignment : 256 bytes
***********************

Error! kernel start failed: no CUDA-capable device is available. File c:\Kamran\AQUAParallelTempering_V2\cuda_genrand.inc, Line: 44
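
For anyone comparing their own output file, here is a minimal Python sketch that reads the "Properties of the Device" dumps and flags the point where the real card is replaced by the "Device Emulation (CPU)" entry. In the dump above that happens right before the "no CUDA-capable device is available" error. The function names and the abbreviated sample text are assumptions made for illustration; this is not part of the AQUA application, and the parsing relies only on the "key : value" layout visible in the dump.

```python
import re

# Abbreviated excerpt of the two device dumps from the output file above.
SAMPLE = """\
***********************
Number of discovered CUDA devices : 1

The Properties of the Device with ID 0 are:

Device Name : Quadro NVS 290
Device Revision : 1.1
multiProcessorCount : 2
***********************

Warning: Failed to set CUDA to blocking mode!

***********************
Number of discovered CUDA devices : 1

The Properties of the Device with ID 0 are:

Device Name : Device Emulation (CPU)
Device Revision : 9999.9999
multiProcessorCount : 16
***********************
"""

PROPERTY = re.compile(r"^(.*?)\s+:\s+(.*)$")


def parse_device_dumps(text):
    """Return one dict of 'key : value' properties per device dump (hypothetical helper)."""
    devices = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("The Properties of the Device with ID"):
            current = {}          # a new dump begins here
            devices.append(current)
            continue
        if current is None:
            continue              # ignore everything outside a dump
        if line.startswith("*****"):
            current = None        # the row of asterisks closes the dump
            continue
        match = PROPERTY.match(line)
        if match:
            current[match.group(1)] = match.group(2)
    return devices


def is_emulation(device):
    """True if the dump describes the CPU emulation entry rather than a real GPU."""
    return (device.get("Device Name") == "Device Emulation (CPU)"
            or device.get("Device Revision") == "9999.9999")


if __name__ == "__main__":
    for index, device in enumerate(parse_device_dumps(SAMPLE)):
        kind = "emulation fallback" if is_emulation(device) else "real GPU"
        print(index, device.get("Device Name"), "->", kind)
```

If the first dump in a run shows the real card and a later one shows the emulation entry, the card was visible at start-up and was lost later in the run. That would fit the observation that the crashes come only after some crunching time, though the sketch itself cannot tell why the device disappeared.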
6) Message boards : AQUA@home : Errors (Message 2764)
Posted 424 days ago by Yeti
Yeti have you tried switching AQUA's CUDA driver with SETI's?


We had done this already together in the past and it didn't help.

At the moment I am investigating another possible reason. I will come back when I have a result or a trend.
7) Message boards : AQUA@home : Errors (Message 2722)
Posted 425 days ago by Yeti
HM, crashed again :-((

It ran successfully for 2 or 3 hours. Then, when BOINC tried to resume the application, the following happened:

09/07/2009 22:36:58 AQUA@home Restarting task 08jul09-240-5M-64-a_26_56_0 using AQUA_CUDA version 331
09/07/2009 22:36:59 AQUA@home Task 08jul09-240-5M-64-a_26_56_0 exited with zero status but no 'finished' file
09/07/2009 22:36:59 AQUA@home If this happens repeatedly you may need to reset the project.
....
This repeats 100 or more times, then:

07/09/09 22:38:39 AQUA@home If this happens repeatedly you may need to reset the project.
07/09/09 22:38:39 AQUA@home Restarting task 08jul09-240-5M-64-a_26_56_0 using AQUA_CUDA version 331
07/09/09 22:38:40 AQUA@home Computation for task 08jul09-240-5M-64-a_26_56_0 finished
07/09/09 22:38:44 AQUA@home Started upload of 08jul09-240-5M-64-a_26_56_0_0
07/09/09 22:38:51 AQUA@home Finished upload of 08jul09-240-5M-64-a_26_56_0_0

The result says:

<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>

</stderr_txt>
]]>



Because of these errors, I have switched the client to crunch Seti CUDA WUs; it already crunched 3 WUs successfully last night.

So the problem seems to be not my card or my driver, but something within the Aqua CUDA application.
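
To see how often a task hit this restart loop before the client gave up with "too many exit(0)s", the message log can be counted per task. The following is a minimal Python sketch under the assumption that the log lines look like the ones quoted above; the function name and the sample text are made up for illustration and are not part of BOINC.

```python
import re
from collections import Counter

# Three lines copied from the message log quoted above.
SAMPLE_LOG = """\
09/07/2009 22:36:58 AQUA@home Restarting task 08jul09-240-5M-64-a_26_56_0 using AQUA_CUDA version 331
09/07/2009 22:36:59 AQUA@home Task 08jul09-240-5M-64-a_26_56_0 exited with zero status but no 'finished' file
09/07/2009 22:36:59 AQUA@home If this happens repeatedly you may need to reset the project.
"""

# Matches the client message and captures the task name.
NO_FINISHED = re.compile(
    r"Task (\S+) exited with zero status but no 'finished' file"
)


def count_silent_exits(log_text):
    """Count 'exited with zero status' messages per task (hypothetical helper)."""
    counts = Counter()
    for line in log_text.splitlines():
        match = NO_FINISHED.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


if __name__ == "__main__":
    for task, exits in count_silent_exits(SAMPLE_LOG).most_common():
        print(task, exits)
```

Run against the full log rather than the three-line excerpt, this would show whether every affected task reaches roughly the same number of restarts before it is reported as finished.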

8) Message boards : AQUA@home : Errors (Message 2678)
Posted 425 days ago by Yeti
HM, crashed again :-((

It ran successfully for 2 or 3 hours. Then, when BOINC tried to resume the application, the following happened:

09/07/2009 22:36:58 AQUA@home Restarting task 08jul09-240-5M-64-a_26_56_0 using AQUA_CUDA version 331
09/07/2009 22:36:59 AQUA@home Task 08jul09-240-5M-64-a_26_56_0 exited with zero status but no 'finished' file
09/07/2009 22:36:59 AQUA@home If this happens repeatedly you may need to reset the project.
....
This repeats 100 or more times, then:

07/09/09 22:38:39 AQUA@home If this happens repeatedly you may need to reset the project.
07/09/09 22:38:39 AQUA@home Restarting task 08jul09-240-5M-64-a_26_56_0 using AQUA_CUDA version 331
07/09/09 22:38:40 AQUA@home Computation for task 08jul09-240-5M-64-a_26_56_0 finished
07/09/09 22:38:44 AQUA@home Started upload of 08jul09-240-5M-64-a_26_56_0_0
07/09/09 22:38:51 AQUA@home Finished upload of 08jul09-240-5M-64-a_26_56_0_0

The result says:

<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>

</stderr_txt>
]]>

9) Message boards : AQUA@home : Errors (Message 2570)
Posted 425 days ago by Yeti
Yeti and Allen, did you recently upgrade your drivers? Could you run the CUDA apps at any time in the past?


How can I upgrade the driver? I wrote that I already installed the latest one.

And no, I have never managed to run a CUDA job successfully.

My box has downloaded a fresh unit and it has run for 30 minutes without a crash. Let's see what happens this evening, when the box can resume the CUDA task.

Yeti
10) Message boards : AQUA@home : Errors (Message 2550)
Posted 426 days ago by Yeti
CUDA 3.31 errored out with:

09/07/2009 09:39:20 AQUA@home Starting 08jul09-240-5M-64-a_20_51_0
09/07/2009 09:39:21 AQUA@home Starting task 08jul09-240-5M-64-a_20_51_0 using AQUA_CUDA version 331
09/07/2009 09:39:23 AQUA@home Task 08jul09-240-5M-64-a_20_51_0 exited with zero status but no 'finished' file
09/07/2009 09:39:23 AQUA@home If this happens repeatedly you may need to reset the project.
09/07/2009 09:39:23 AQUA@home Restarting task 08jul09-240-5M-64-a_20_51_0 using AQUA_CUDA version 331
09/07/2009 09:39:24 AQUA@home Task 08jul09-240-5M-64-a_20_51_0 exited with zero status but no 'finished' file
09/07/2009 09:39:24 AQUA@home If this happens repeatedly you may need to reset the project.
... (the same lines several hundred times)
09/07/2009 09:41:03 AQUA@home Task 08jul09-240-5M-64-a_20_51_0 exited with zero status but no 'finished' file
09/07/2009 09:41:03 AQUA@home If this happens repeatedly you may need to reset the project.
09/07/2009 09:41:03 AQUA@home Restarting task 08jul09-240-5M-64-a_20_51_0 using AQUA_CUDA version 331
09/07/2009 09:41:04 AQUA@home Computation for task 08jul09-240-5M-64-a_20_51_0 finished
09/07/2009 09:41:07 AQUA@home Started upload of 08jul09-240-5M-64-a_20_51_0_0
09/07/2009 09:41:11 AQUA@home Finished upload of 08jul09-240-5M-64-a_20_51_0_0

The WU shows this on the project page:

<core_client_version>6.6.36</core_client_version>
<![CDATA[
<message>
too many exit(0)s
</message>
<stderr_txt>

</stderr_txt>
]]>

--------------------------

I'm using BOINC 6.6.36; my CUDA driver is 186.18 (the latest).




Copyright © 2010 D-Wave Systems Inc.