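This post was last edited by 金鹏 on 2020-4-27 21:46.
Keeping an eye on the computational efficiency of the new 0.0.5 core.
Neither the 1070 Ti nor the 1070 can run it; the new core always crashes.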
COVID (GPU, core22 0.0.5) projects 13400-13401 to FAH
We've just released two new projects, 13400 and 13401, that validate new features of the new core22 release 0.0.5 that we will use for prioritization of compounds for COVID-19 experimental collaborators to make and test!
Project descriptions: https://stats.foldingathome.org/project?p=13400
We've restricted these projects to Linux only because we're testing out some new custom integrators that currently seem to perform poorly on Windows. We're working on improving that for the next batch!
Project 13400 : core22 0.0.5 : Linux only [due to inefficiencies on Windows]
Stats Credit = 205000
timeout = 1.5
deadline = 2.0
Project 13401 : core22 0.0.5 : Linux only [due to inefficiencies on Windows]
Stats Credit = 65392
timeout = 0.8
deadline = 1.0

JohnChodera
Pande Group Member
Posts: 145
Joined: Sat Feb 23, 2013 6:59 am
Re: COVID (GPU, core22 0.0.5) projects 13400-13401 to FAH
I've done some analysis of the higher failure rates:
Out of 1551 returned WUs:
A. 411 contain ERROR:exception: There is no registered Platform called "OpenCL"
B. 151 contain Following exception occured: Particle coordinate is nan
C. 8 contain ERROR:exception: There is no registered Platform called "CPU"
We're investigating A and C, which shouldn't happen if the client and core use the same criteria for determining eligibility for core22 projects.
I'm also trying to reproduce the failures in B, which I haven't seen on our local GPU cluster full of GTX 1080, GTX 1080Ti, and RTX 2080s.
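For anyone who wants to sort their own logs into the same three buckets, here is a minimal sketch. It is not the script used for the analysis above. It assumes each returned WU log is available as a plain string, matches only the error text quoted in this post, and assigns a log to the first signature it contains. The names `SIGNATURES`, `classify`, `tally`, and `percent` are made up for illustration.

```python
from collections import Counter

# Error signatures quoted in the post; the letters follow the post's A/B/C labels.
SIGNATURES = {
    "A": 'There is no registered Platform called "OpenCL"',
    "B": "Particle coordinate is nan",
    "C": 'There is no registered Platform called "CPU"',
}

# Counts reported in the post.
REPORTED = {"A": 411, "B": 151, "C": 8}
TOTAL_RETURNED = 1551


def classify(log_text):
    """Return the label of the first signature found in the log, or None."""
    for label, needle in SIGNATURES.items():
        if needle in log_text:
            return label
    return None


def tally(logs):
    """Count logs per label; the None key collects logs with no known error."""
    return Counter(classify(text) for text in logs)


def percent(count, total):
    """Share of the total as a percentage, rounded to one decimal place."""
    return round(100.0 * count / total, 1)


if __name__ == "__main__":
    # Express the reported counts as a share of all returned WUs.
    for label, count in REPORTED.items():
        print(label, count, f"{percent(count, TOTAL_RETURNED)}%")
```

To use it on real logs, read each log file into a string and pass the list to `tally`; a log that contains more than one signature is counted only under the first match, which is a simplification.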
For now, we've collected a ton of useful data to examine, so I've set 13400/13401 to collect-only.
Thanks for your help, everyone!
~ John Chodera // MSKCC