Original poster
Posted on 2016-8-10 13:02:50
I can't see anything abnormal in the log file:
04:40:02:Enabled folding slot 00: PAUSED cpu:15 (by user)
04:40:02:Enabled folding slot 01: PAUSED gpu:1:GP104 [GeForce GTX 1080] (by user)
04:40:07:16:127.0.0.1:New Web connection
04:40:16:FS01:Unpaused
04:40:16:WU00:FS01:Starting
04:40:16:WU00:FS01:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/NVIDIA/Fermi/beta/Core_21.fah/FahCore_21.exe -dir 00 -suffix 01 -version 704 -lifeline 4036 -checkpoint 3 -gpu 0 -gpu-vendor nvidia
04:40:16:WU00:FS01:Started FahCore on PID 10464
04:40:16:WU00:FS01:Core PID:15824
04:40:16:WU00:FS01:FahCore 0x21 started
04:40:18:WU00:FS01:0x21:*********************** Log Started 2016-08-10T04:40:17Z ***********************
04:40:18:WU00:FS01:0x21:Project: 11416 (Run 3, Clone 22, Gen 0)
04:40:18:WU00:FS01:0x21:Unit: 0x000000008ca304f156e81eaa737d8a33
04:40:18:WU00:FS01:0x21:CPU: 0x00000000000000000000000000000000
04:40:18:WU00:FS01:0x21:Machine: 1
04:40:18:WU00:FS01:0x21:Digital signatures verified
04:40:18:WU00:FS01:0x21:Folding@home GPU Core21 Folding@home Core
04:40:18:WU00:FS01:0x21:Version 0.0.17
04:40:18:WU00:FS01:0x21: Found a checkpoint file
04:41:03:Removing old file 'configs/config-20160808-202713.xml'
04:41:03:Saving configuration to config.xml
04:41:03:<config>
04:41:03: <!-- Folding Core -->
04:41:03: <checkpoint v='3'/>
04:41:03:
04:41:03: <!-- Folding Slot Configuration -->
04:41:03: <client-type v='beta'/>
04:41:03:
04:41:03: <!-- Network -->
04:41:03: <proxy v=':8080'/>
04:41:03:
04:41:03: <!-- Slot Control -->
04:41:03: <power v='FULL'/>
04:41:03:
04:41:03: <!-- User Information -->
04:41:03: <passkey v='********************************'/>
04:41:03: <team v='3213'/>
04:41:03: <user v='Azurewind'/>
04:41:03:
04:41:03: <!-- Folding Slots -->
04:41:03: <slot id='0' type='CPU'>
04:41:03: <paused v='true'/>
04:41:03: </slot>
04:41:03: <slot id='1' type='GPU'/>
04:41:03:</config>
04:41:20:WU00:FS01:0x21:Completed 625000 out of 5000000 steps (12%)
04:41:20:WU00:FS01:0x21:Temperature control disabled. Requirements: single Nvidia GPU, tmax must be < 110 and twait >= 900
04:43:45:WU00:FS01:0x21:Completed 650000 out of 5000000 steps (13%)
04:46:17:FS00:Unpaused
04:46:17:WU01:FS00:Starting
04:46:17:WU01:FS00:Running FahCore: "C:\Program Files (x86)\FAHClient/FAHCoreWrapper.exe" C:/ProgramData/FAHClient/cores/web.stanford.edu/~pande/Win32/AMD64/beta/Core_a4.fah/FahCore_a4.exe -dir 01 -suffix 01 -version 704 -lifeline 4036 -checkpoint 3 -np 15
04:46:17:WU01:FS00:Started FahCore on PID 14564
04:46:17:WU01:FS00:Core PID:11632
04:46:17:WU01:FS00:FahCore 0xa4 started
04:46:18:WU01:FS00:0xa4:
04:46:18:WU01:FS00:0xa4:*------------------------------*
04:46:18:WU01:FS00:0xa4:Folding@Home Gromacs GB Core
04:46:18:WU01:FS00:0xa4:Version 2.27 (Dec. 15, 2010)
04:46:18:WU01:FS00:0xa4:
04:46:18:WU01:FS00:0xa4:Preparing to commence simulation
04:46:18:WU01:FS00:0xa4:- Looking at optimizations...
04:46:18:WU01:FS00:0xa4:- Files status OK
04:46:18:WU01:FS00:0xa4:- Expanded 732547 -> 2132948 (decompressed 291.1 percent)
04:46:18:WU01:FS00:0xa4:Called DecompressByteArray: compressed_data_size=732547 data_size=2132948, decompressed_data_size=2132948 diff=0
04:46:18:WU01:FS00:0xa4:- Digital signature verified
04:46:18:WU01:FS00:0xa4:
04:46:18:WU01:FS00:0xa4:Project: 8617 (Run 0, Clone 10, Gen 0)
04:46:18:WU01:FS00:0xa4:
04:46:18:WU01:FS00:0xa4:Assembly optimizations on if available.
04:46:18:WU01:FS00:0xa4:Entering M.D.
04:46:24:WU01:FS00:0xa4:Using Gromacs checkpoints
04:46:24:WU01:FS00:0xa4:Mapping NT from 15 to 15
04:46:24:WU01:FS00:0xa4:Resuming from checkpoint
04:46:24:WU01:FS00:0xa4:Verified 01/wudata_01.log
04:46:24:WU01:FS00:0xa4:Verified 01/wudata_01.trr
04:46:24:WU01:FS00:0xa4:Verified 01/wudata_01.xtc
04:46:24:WU01:FS00:0xa4:Verified 01/wudata_01.edr
04:46:24:WU01:FS00:0xa4:Completed 315080 out of 2500000 steps (12%)
04:48:12:WU01:FS00:0xa4:Completed 325000 out of 2500000 steps (13%)
04:48:27:WU00:FS01:0x21:Completed 700000 out of 5000000 steps (14%)
But the debug file contains this:
[0809/134837:ERROR:process_info.cc(608)] range at 0xf3fc3ce400000000, size 0x19b fully unreadable
[0809/134837:ERROR:process_info.cc(608)] range at 0xf3fc3d4800000000, size 0x19b fully unreadable
[0809/134837:ERROR:process_info.cc(608)] range at 0x0, size 0x19b fully unreadable
[0809/134837:WARNING:in_range_cast.h(38)] value 1769324942362 out of range
[0809/134837:WARNING:in_range_cast.h(38)] value 1769324943780 out of range
[0809/134837:WARNING:in_range_cast.h(38)] value 1769324944120 out of range
[0809/134837:WARNING:in_range_cast.h(38)] value 1769324940872 out of range
[0809/134837:WARNING:in_range_cast.h(38)] value 1769324941090 out of range
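This is what it looks like once the problem occurs.
However, in the background I can see that the compute processes are still running.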
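
One way to confirm from the log itself that both slots are still advancing is to pull out the "Completed N out of M steps" lines and compute a rough rate per slot. The following is only a minimal sketch: it assumes the line format shown in the log above, and the names `parse_progress` and `steps_per_second` are made up for illustration and are not part of FAHClient.

```python
import re
from datetime import datetime

# Matches progress lines in the format seen in the log above, e.g.
# 04:41:20:WU00:FS01:0x21:Completed 625000 out of 5000000 steps (12%)
PROGRESS = re.compile(
    r"^(\d{2}:\d{2}:\d{2}):(WU\d+):(FS\d+):0x[0-9a-f]+:"
    r"Completed (\d+) out of (\d+) steps"
)


def parse_progress(lines):
    """Return {slot: [(time, steps_done, steps_total), ...]} (hypothetical helper)."""
    progress = {}
    for line in lines:
        m = PROGRESS.match(line.strip())
        if m:
            stamp, _wu, slot, done, total = m.groups()
            # The log only carries a time of day, so a run that crosses
            # midnight is not handled by this sketch.
            t = datetime.strptime(stamp, "%H:%M:%S")
            progress.setdefault(slot, []).append((t, int(done), int(total)))
    return progress


def steps_per_second(samples):
    """Average rate between the first and last sample, or None if unknown."""
    if len(samples) < 2:
        return None
    (t0, d0, _), (t1, d1, _) = samples[0], samples[-1]
    elapsed = (t1 - t0).total_seconds()
    return (d1 - d0) / elapsed if elapsed > 0 else None


# Progress lines copied from the log above.
sample_log = [
    "04:41:20:WU00:FS01:0x21:Completed 625000 out of 5000000 steps (12%)",
    "04:43:45:WU00:FS01:0x21:Completed 650000 out of 5000000 steps (13%)",
    "04:46:24:WU01:FS00:0xa4:Completed 315080 out of 2500000 steps (12%)",
    "04:48:12:WU01:FS00:0xa4:Completed 325000 out of 2500000 steps (13%)",
    "04:48:27:WU00:FS01:0x21:Completed 700000 out of 5000000 steps (14%)",
]

progress = parse_progress(sample_log)
for slot, samples in sorted(progress.items()):
    print(slot, samples[-1][1], "steps done, rate:", steps_per_second(samples))
```

A rate that stays positive across successive log reads would be consistent with the processes still working even though the front end looks broken; it says nothing about what causes the errors in the debug file.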