
[Help] Bad State detected... attempting to resume from last good checkpoint

Posted 2013-9-29 00:16:31
14:34:20:WU01:FS00:0x17:Completed 300000 out of 2500000 steps (12%)
14:35:29:WU01:FS00:0x17:Bad State detected... attempting to resume from last good checkpoint
14:38:01:WU00:FS01:0xa3:Completed 130000 out of 500000 steps  (26%)
14:42:20:WU01:FS00:0x17:Completed 275000 out of 2500000 steps (11%)
14:47:14:WU00:FS01:0xa3:Completed 135000 out of 500000 steps  (27%)
14:49:11:WU01:FS00:0x17:Completed 300000 out of 2500000 steps (12%)
14:56:18:WU01:FS00:0x17:Completed 325000 out of 2500000 steps (13%)
14:56:26:WU00:FS01:0xa3:Completed 140000 out of 500000 steps  (28%)
15:03:09:WU01:FS00:0x17:Completed 350000 out of 2500000 steps (14%)
15:05:40:WU00:FS01:0xa3:Completed 145000 out of 500000 steps  (29%)
15:10:15:WU01:FS00:0x17:Completed 375000 out of 2500000 steps (15%)
15:14:52:WU00:FS01:0xa3:Completed 150000 out of 500000 steps  (30%)
15:17:06:WU01:FS00:0x17:Completed 400000 out of 2500000 steps (16%)
15:24:05:WU00:FS01:0xa3:Completed 155000 out of 500000 steps  (31%)
15:29:46:WU01:FS00:0x17:Completed 425000 out of 2500000 steps (17%)
15:33:18:WU00:FS01:0xa3:Completed 160000 out of 500000 steps  (32%)
15:42:31:WU00:FS01:0xa3:Completed 165000 out of 500000 steps  (33%)
15:48:05:WU01:FS00:0x17:Completed 450000 out of 2500000 steps (18%)
15:49:13:WU01:FS00:0x17:Bad State detected... attempting to resume from last good checkpoint
15:51:45:WU00:FS01:0xa3:Completed 170000 out of 500000 steps  (34%)
15:56:04:WU01:FS00:0x17:Completed 425000 out of 2500000 steps (17%)
16:00:58:WU00:FS01:0xa3:Completed 175000 out of 500000 steps  (35%)
16:02:54:WU01:FS00:0x17:Completed 450000 out of 2500000 steps (18%)
16:10:01:WU01:FS00:0x17:Completed 475000 out of 2500000 steps (19%)
16:10:12:WU00:FS01:0xa3:Completed 180000 out of 500000 steps  (36%)
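
For anyone who wants to count these rollbacks instead of reading the log by eye, here is a minimal Python sketch. It assumes every line has the `HH:MM:SS:WUxx:FSxx:0xNN:message` shape seen above; the function name `find_rollbacks` and the tuple layout are made up for this example and are not part of the FAH client. Note that the "after" value is only the first progress line printed after the resume, so the real checkpoint sits at or below that step count.

```python
import re

# Line layout assumed from the log above: time:WU:slot:core:message
LINE_RE = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}):(?P<wu>WU\d+):(?P<slot>FS\d+):"
    r"(?P<core>0x[0-9a-fA-F]+):(?P<msg>.*)$"
)
STEPS_RE = re.compile(r"Completed (\d+) out of (\d+) steps")


def find_rollbacks(lines):
    """Return (time, slot, steps_before, steps_after) for each Bad State event.

    steps_before is the last progress reported before the error;
    steps_after is the first progress reported after the resume.
    """
    last_steps = {}  # (wu, slot) -> last reported step count
    pending = {}     # (wu, slot) -> (time, steps_before), waiting for next progress line
    events = []
    for line in lines:
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # skip anything that is not a client log line
        key = (m["wu"], m["slot"])
        if "Bad State detected" in m["msg"]:
            pending[key] = (m["time"], last_steps.get(key))
            continue
        s = STEPS_RE.search(m["msg"])
        if s:
            steps = int(s.group(1))
            if key in pending:
                when, before = pending.pop(key)
                events.append((when, m["slot"], before, steps))
            last_steps[key] = steps
    return events


# First four lines of the log posted above.
sample = """\
14:34:20:WU01:FS00:0x17:Completed 300000 out of 2500000 steps (12%)
14:35:29:WU01:FS00:0x17:Bad State detected... attempting to resume from last good checkpoint
14:38:01:WU00:FS01:0xa3:Completed 130000 out of 500000 steps  (26%)
14:42:20:WU01:FS00:0x17:Completed 275000 out of 2500000 steps (11%)
"""

events = find_rollbacks(sample.splitlines())
for when, slot, before, after in events:
    print(f"{when} {slot}: reported {before} steps, next report after resume was {after}")
```

Run against the full log above, this should flag both errors on FS00 and none on FS01, which matches the pattern of one slot being unstable while the other runs clean.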

Posted 2013-9-29 00:29:19 from mobile
Did you overclock the graphics card too far, so it's unstable?

Posted 2013-9-29 08:55:00
GPU 0 is throwing errors. Lower GPU 0's clock a bit, or raise its voltage slightly.

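OP | Posted 2013-9-29 12:09:01
金鹏 wrote on 2013-9-29 08:55:
GPU 0 is throwing errors. Lower GPU 0's clock a bit, or raise its voltage slightly.

This only happens with 8900 work units; both 7810 work units ran without problems. The 580 is at 891 MHz core. I'll try adding a little voltage first. In theory 1.1 V should be more than enough. I've always run it this way and never had a problem, but I had never picked up an 8900 work unit before.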

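OP | Posted 2013-9-29 13:18:31
I suspect the cause is the memory being overclocked to 4200. The memory on this 580 overclocks very poorly, and when the memory is pushed too far, the core also becomes unstable at the same clock and voltage. I've set the memory back to default to test. This 580 has an ASIC quality of 89.3%, so in theory 1.088 V should be enough for 891 MHz core.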

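OP | Posted 2013-9-29 16:42:56
It looks like the memory was the cause. I just finished a 7810 work unit with no problems, then picked up another 8900 work unit, and it hasn't errored so far.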

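Posted 2013-9-29 16:56:23
machou wrote on 2013-9-29 13:18:
I suspect the cause is the memory being overclocked to 4200. The memory on this 580 overclocks very poorly, and when the memory is pushed too far, the core also becomes unstable at the same clock and voltage. Now ...

FAH is not sensitive to memory clock, and PPD is almost unaffected by memory frequency. I always run with the core overclocked and the memory underclocked.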