Original poster |
Posted 2016-9-9 16:14:29
Last edited by horst1981 on 2016-9-9 16:19
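I don't know which file I should be looking at.
Could you tell me which file to check?
Here is what I observed:
The watchdog (guard) process restarted the client,
which generated log-20160909-111819.txt.
The configuration parameters in that log are correct (red circle 1 in the screenshot),
but in the client itself the opencl-index of both GPU slots had already changed to -1.
As a result, the matching slot could not be found (red circle 2 in the screenshot):
I did not notice this at the time, so everything after that point in the log is the same error repeated.
Going back through the logs just now, I found that I forgot to upload one log yesterday: the one generated after I spotted the problem and re-entered the settings.
log-20160909-012610.txt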
11:18:19:***********************************************************************
11:18:19:<config>
11:18:19: <!-- Network -->
11:18:19: <proxy v=':8080'/>
11:18:19:
11:18:19: <!-- Slot Control -->
11:18:19: <power v='full'/>
11:18:19:
11:18:19: <!-- User Information -->
11:18:19: <passkey v='********************************'/>
11:18:19: <team v='3213'/>
11:18:19: <user v='horst1981'/>
11:18:19:
11:18:19: <!-- Folding Slots -->
11:18:19: <slot id='0' type='GPU'>
11:18:19: <cuda-index v='2'/>
11:18:19: <gpu-index v='1'/>
11:18:19: </slot>
11:18:19: <slot id='1' type='GPU'>
11:18:19: <cuda-index v='1'/>
11:18:19: <gpu-index v='2'/>
11:18:19: </slot>
11:18:19:</config>
11:18:19:Trying to access database...
11:18:19:Successfully acquired database lock
11:18:19:Enabled folding slot 00: READY gpu:1:Hawaii [Radeon R9 200 Series]
11:18:19:Enabled folding slot 01: READY gpu:2:Ellesmere XT [Radeon RX 480]
11:18:19:ERROR:No compute devices matched GPU #2 ATI:5 Ellesmere XT [Radeon RX 480]. You may need to update your graphics drivers.
11:18:19:WU00:FS00:Starting
11:18:19:WU00:FS00:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 704 -lifeline 10476 -checkpoint 15 -opencl-platform 0 -gpu-vendor ati -gpu 0
11:18:19:WU00:FS00:Started FahCore on PID 18636
11:18:20:WU00:FS00:Core PID:22716
11:18:20:WU00:FS00:FahCore 0x21 started
11:18:20:WU01:FS01:Starting
11:18:20:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found
11:18:20:WU01:FS01:Starting
11:18:20:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found
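(In the section above the client settings have not been changed yet. The config has no opencl-index entry, and the client actually shows opencl-index as -1, so it still fails. Below, I pause the slots and then start changing the settings by hand.)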
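For anyone who wants to check their own log for the same failure, here is a minimal Python sketch that counts the "OpenCL device matching slot N not found" errors per slot. The regular expression is written only against the error lines quoted in this post, and the function name `count_unmatched_slots` is made up for illustration; other client versions may word the message differently.

```python
import re
from collections import Counter

# Matches the error line as it appears in the log quoted in this post, e.g.
# 11:18:20:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found
SLOT_ERROR = re.compile(
    r"ERROR:WU\d+:FS\d+:Failed to start core: "
    r"OpenCL device matching slot (\d+) not found"
)


def count_unmatched_slots(log_lines):
    """Return a Counter mapping slot number -> number of failed core starts."""
    failures = Counter()
    for line in log_lines:
        match = SLOT_ERROR.search(line)
        if match:
            failures[int(match.group(1))] += 1
    return failures


# Lines copied from the log above.
sample = [
    "11:18:20:WU01:FS01:Starting",
    "11:18:20:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found",
    "11:18:20:WU01:FS01:Starting",
    "11:18:20:ERROR:WU01:FS01:Failed to start core: OpenCL device matching slot 1 not found",
]

for slot, count in sorted(count_unmatched_slots(sample).items()):
    print(f"slot {slot}: {count} failed start(s)")
```

To run it on a real file, pass the open file object instead of `sample`, for example `count_unmatched_slots(open("log.txt", encoding="utf-8"))`, where the path is whatever log you want to inspect.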
11:18:20:WU00:FS00:0x21:*********************** Log Started 2016-09-08T11:18:20Z ***********************
11:18:20:WU00:FS00:0x21:Project: 10495 (Run 14, Clone 21, Gen 30)
11:18:20:WU00:FS00:0x21:Unit: 0x000000298ca304f556ba63c83d0aad0b
11:18:20:WU00:FS00:0x21:CPU: 0x00000000000000000000000000000000
11:18:20:WU00:FS00:0x21:Machine: 0
11:18:20:WU00:FS00:0x21:Digital signatures verified
11:18:20:WU00:FS00:0x21:Folding@home GPU Core21 Folding@home Core
11:18:20:WU00:FS00:0x21:Version 0.0.17
11:18:22:WU00:FS00:0x21: Found a checkpoint file
11:18:25:FS00:Paused
11:18:25:FS01:Paused
11:18:26:FS00:Shutting core down
11:18:26:WU00:FS00:0x21:WARNING:Console control signal 1 on PID 22716
11:18:26:WU00:FS00:0x21:Exiting, please wait. . .
11:18:36:WU00:FS00:0x21:Completed 500000 out of 5000000 steps (10%)
11:18:36:WU00:FS00:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
11:18:36:WU00:FS00:0x21:Folding@home Core Shutdown: INTERRUPTED
11:18:36:WU00:FS00:FahCore returned: INTERRUPTED (102 = 0x66)
11:18:46:Removing old file 'configs/config-20160830-004440.xml'
11:18:46:Saving configuration to config.xml
11:18:46:<config>
11:18:46: <!-- Network -->
11:18:46: <proxy v=':8080'/>
11:18:46:
11:18:46: <!-- Slot Control -->
11:18:46: <power v='full'/>
11:18:46:
11:18:46: <!-- User Information -->
11:18:46: <passkey v='********************************'/>
11:18:46: <team v='3213'/>
11:18:46: <user v='horst1981'/>
11:18:46:
11:18:46: <!-- Folding Slots -->
11:18:46: <slot id='0' type='GPU'>
11:18:46: <cuda-index v='2'/>
11:18:46: <gpu-index v='1'/>
11:18:46: <opencl-index v='2'/>
11:18:46: <paused v='true'/>
11:18:46: </slot>
11:18:46: <slot id='1' type='GPU'>
11:18:46: <cuda-index v='1'/>
11:18:46: <gpu-index v='2'/>
11:18:46: <opencl-index v='1'/>
11:18:46: <paused v='true'/>
11:18:46: </slot>
11:18:46:</config>
11:18:49:FS00:Unpaused
11:18:49:FS01:Unpaused
11:18:49:WU00:FS00:Starting
11:18:49:WU00:FS00:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 704 -lifeline 10476 -checkpoint 15 -opencl-platform 0 -gpu-vendor ati -gpu 2
11:18:49:WU00:FS00:Started FahCore on PID 1264
11:18:49:WU00:FS00:Core PID:25320
11:18:49:WU00:FS00:FahCore 0x21 started
11:18:49:WU00:FS00:0x21:*********************** Log Started 2016-09-08T11:18:49Z ***********************
11:18:49:WU00:FS00:0x21:Project: 10495 (Run 14, Clone 21, Gen 30)
11:18:49:WU00:FS00:0x21:Unit: 0x000000298ca304f556ba63c83d0aad0b
11:18:49:WU00:FS00:0x21:CPU: 0x00000000000000000000000000000000
11:18:49:WU00:FS00:0x21:Machine: 0
11:18:49:WU00:FS00:0x21:Digital signatures verified
11:18:49:WU00:FS00:0x21:Folding@home GPU Core21 Folding@home Core
11:18:49:WU00:FS00:0x21:Version 0.0.17
11:18:51:WU00:FS00:0x21: Found a checkpoint file
11:19:04:WU00:FS00:0x21:Completed 500000 out of 5000000 steps (10%)
11:19:05:WU00:FS00:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
11:19:20:Removing old file 'configs/config-20160830-004642.xml'
11:19:20:Saving configuration to config.xml
11:19:20:<config>
11:19:20: <!-- Network -->
11:19:20: <proxy v=':8080'/>
11:19:20:
11:19:20: <!-- Slot Control -->
11:19:20: <power v='full'/>
11:19:20:
11:19:20: <!-- User Information -->
11:19:20: <passkey v='********************************'/>
11:19:20: <team v='3213'/>
11:19:20: <user v='horst1981'/>
11:19:20:
11:19:20: <!-- Folding Slots -->
11:19:20: <slot id='0' type='GPU'>
11:19:20: <cuda-index v='2'/>
11:19:20: <gpu-index v='1'/>
11:19:20: <opencl-index v='2'/>
11:19:20: </slot>
11:19:20: <slot id='1' type='GPU'>
11:19:20: <cuda-index v='1'/>
11:19:20: <gpu-index v='2'/>
11:19:20: <opencl-index v='1'/>
11:19:20: </slot>
11:19:20:</config>
11:19:20:WU01:FS01:Starting
11:19:20:WU01:FS01:Running FahCore: "C:\Program Files\FAHClient/FAHCoreWrapper.exe" C:\ProgramData\FAHClient\cores/web.stanford.edu/~pande/Win32/AMD64/ATI/R600/Core_21.fah/FahCore_21.exe -dir 01 -suffix 01 -version 704 -lifeline 10476 -checkpoint 15 -gpu-vendor ati -gpu 1
11:19:20:WU01:FS01:Started FahCore on PID 23724
11:19:20:WU01:FS01:Core PID:19108
11:19:20:WU01:FS01:FahCore 0x21 started
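(Above: I saved the changes without restarting the client, and both cards started folding normally. Everything below this point is normal.)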
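The difference between the failing and the working configuration dumps above is the opencl-index entry in each GPU slot. As a rough sketch, the Python below lists GPU slots that have no opencl-index element. It assumes the `<config>` block has been copied out of the log with the timestamp prefixes removed, and the function name `gpu_slots_missing_opencl_index` is hypothetical. It only tests for the presence of the element; it does not know which index value is right for a given card.

```python
import xml.etree.ElementTree as ET


def gpu_slots_missing_opencl_index(config_xml):
    """Return the ids of GPU slots that have no <opencl-index> child."""
    root = ET.fromstring(config_xml)
    missing = []
    for slot in root.iter("slot"):
        if slot.get("type") == "GPU" and slot.find("opencl-index") is None:
            missing.append(slot.get("id"))
    return missing


# Slot section of the config as dumped at 11:18:19 (before the manual fix).
before = """<config>
  <slot id='0' type='GPU'>
    <cuda-index v='2'/>
    <gpu-index v='1'/>
  </slot>
  <slot id='1' type='GPU'>
    <cuda-index v='1'/>
    <gpu-index v='2'/>
  </slot>
</config>"""

# Slot section of the config as saved at 11:19:20 (after the manual fix).
after = """<config>
  <slot id='0' type='GPU'>
    <cuda-index v='2'/>
    <gpu-index v='1'/>
    <opencl-index v='2'/>
  </slot>
  <slot id='1' type='GPU'>
    <cuda-index v='1'/>
    <gpu-index v='2'/>
    <opencl-index v='1'/>
  </slot>
</config>"""

print(gpu_slots_missing_opencl_index(before))
print(gpu_slots_missing_opencl_index(after))
```

With the two dumps from this log, the first call reports both slots and the second reports none.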
11:19:21:WU01:FS01:0x21:*********************** Log Started 2016-09-08T11:19:20Z ***********************
11:19:21:WU01:FS01:0x21:Project: 11703 (Run 0, Clone 414, Gen 72)
11:19:21:WU01:FS01:0x21:Unit: 0x0000005d8ca304f35689621c4cb67036
11:19:21:WU01:FS01:0x21:CPU: 0x00000000000000000000000000000000
11:19:21:WU01:FS01:0x21:Machine: 1
11:19:21:WU01:FS01:0x21:Reading tar file core.xml
11:19:21:WU01:FS01:0x21:Reading tar file system.xml
11:19:21:WU01:FS01:0x21:Reading tar file integrator.xml
11:19:21:WU01:FS01:0x21:Reading tar file state.xml
11:19:22:WU01:FS01:0x21:Digital signatures verified
11:19:22:WU01:FS01:0x21:Folding@home GPU Core21 Folding@home Core
11:19:22:WU01:FS01:0x21:Version 0.0.17
11:19:39:WU01:FS01:0x21:Completed 0 out of 5000000 steps (0%)
11:19:39:WU01:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
11:25:22:WU00:FS00:0x21:Completed 550000 out of 5000000 steps (11%)
11:25:49:WU01:FS01:0x21:Completed 50000 out of 5000000 steps (1%)
11:31:39:WU00:FS00:0x21:Completed 600000 out of 5000000 steps (12%)
11:31:57:WU01:FS01:0x21:Completed 100000 out of 5000000 steps (2%)
11:37:56:WU00:FS00:0x21:Completed 650000 out of 5000000 steps (13%)
11:38:05:WU01:FS01:0x21:Completed 150000 out of 5000000 steps (3%)
11:44:11:WU00:FS00:0x21:Completed 700000 out of 5000000 steps (14%)
11:44:13:WU01:FS01:0x21:Completed 200000 out of 5000000 steps (4%)
11:50:21:WU01:FS01:0x21:Completed 250000 out of 5000000 steps (5%)
11:50:27:WU00:FS00:0x21:Completed 750000 out of 5000000 steps (15%)