
[Help] CoreStatus = 7A (122): urgently need an expert to help find the cause

Posted on 2012-11-9 14:03:33
Last edited by 金鹏 on 2012-11-9 14:14

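After flashing the new 3206 BIOS on my Z9PE D8 WS, I went back to a 103 base clock with the same settings as before. Less than a day into the run I got this CoreStatus = 7A (122). Could it be caused by unbalanced load across the two 2687W ES CPUs?
Core voltage is on auto. The BIOS shows CPU1 at 1.12V and CPU2 at 1.06V, memory voltage 1.51V, memory at 1600 with timings 9-9-9-24-128-1, memory interleaving and NUMA enabled, power limit 185 W, service level set to high performance, and both CPUs at 50-52°C.
Could the experts here please help me analyze the cause?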
(40%)
[04:13:22] Completed 102500 out of 250000 steps  (41%)
[04:29:15] Completed 105000 out of 250000 steps  (42%)
[04:45:09] Completed 107500 out of 250000 steps  (43%)

-------------------------------------------------------
Program Gromacs, VERSION 4.5.3
Source code file: /vspm58/VM/fah-converted/mnt/fah_windows_build/LinuxBuilds/gromacs-4.5.3/src/mdlib/pme.c, line: 534

Fatal error:
6 particles communicated to PME node 16 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
This usually means that your system is not well equilibrated.
For more information and tips for troubleshooting, please check the GROMACS
website at http://www.gromacs.org/Documentation/Errors
-------------------------------------------------------

Thanx for Using GROMACS - Have a Nice Day

[04:54:22] mdrun returned 255
[04:54:22] Going to send back what have done -- stepsTotalG=250000
[04:54:22] Work fraction=162.9188 steps=250000.
[04:54:26] logfile size=89648 infoLength=89648 edr=25 trr=1
[04:54:26] logfile size: 89648 info=89648 bed=25 hdr=1
[04:54:26] - Writing 90186 bytes of core data to disk...
[04:54:26] Done: 89674 -> 11605 (compressed to 12.9 percent)
[04:54:26]   ... Done.
[04:59:27]
[04:59:27] Folding@home Core Shutdown: UNSTABLE_MACHINE
[04:59:27] CoreStatus = 7A (122)
[04:59:27] Sending work to server
[04:59:27] Project: 8101 (Run 21, Clone 0, Gen 71)

[04:59:27] + Attempting to send results [November 9 04:59:27 UTC]
[04:59:27] - Reading file work/wuresults_01.dat from core
[04:59:27]   (Read 12117 bytes from disk)
[04:59:27] Connecting to http://128.143.231.201:8080/
[04:59:35] Posted data.
[04:59:35] Initial: 0000; - Uploaded at ~1 kB/s
[04:59:35] - Averaged speed for that direction ~147 kB/s
[04:59:35] + Results successfully sent
[04:59:35] Thank you for your contribution to Folding@Home.
thekraken: The Kraken 0.7-pre15 (compiled Sun Jul  1 20:38:22 MST 2012 by root@Test)
thekraken: Processor affinity wrapper for Folding@Home
thekraken: The Kraken comes with ABSOLUTELY NO WARRANTY; licensed under GPLv2
thekraken: PID: 1963
thekraken: Logging to thekraken.log
[04:59:35] Trying to send all finished work units
[04:59:35] + No unsent completed units remaining.
[04:59:35] - Preparing to get new work unit...
[04:59:35] Cleaning up work directory
[05:01:50] + Attempting to get work packet
[05:01:50] Passkey found
[05:01:50] - Will indicate memory of 32132 MB
[05:01:50] - Connecting to assignment server
[05:01:50] Connecting to http://assign.stanford.edu:8080/
[05:01:53] Posted data.
[05:01:53] Initial: 8F80; - Successful: assigned to (128.143.231.201).
[05:01:53] + News From Folding@Home: Welcome to Folding@Home
[05:01:53] Loaded queue successfully.
[05:01:53] Sent data
[05:01:53] Connecting to http://128.143.231.201:8080/
[05:02:01] Posted data.
[05:02:01] Initial: 0000; - Receiving payload (expected size: 30306108)
[05:13:04] - Downloaded at ~44 kB/s
[05:13:04] - Averaged speed for that direction ~211 kB/s
[05:13:04] + Received work.
[05:13:04] Trying to send all finished work units
[05:13:04] + No unsent completed units remaining.
[05:13:04] + Closed connections
[05:13:09]
[05:13:09] + Processing work unit
[05:13:09] Core required: FahCore_a5.exe
[05:13:09] Core found.
[05:13:09] Working on queue slot 02 [November 9 05:13:09 UTC]
[05:13:09] + Working ...
[05:13:09] - Calling './FahCore_a5.exe -dir work/ -nice 19 -suffix 02 -np 32 -checkpoint 15 -verbose -lifeline 1844 -version 634'

thekraken: The Kraken 0.7-pre15 (compiled Sun Jul  1 20:38:22 MST 2012 by root@Test)
thekraken: Processor affinity wrapper for Folding@Home
thekraken: The Kraken comes with ABSOLUTELY NO WARRANTY; licensed under GPLv2
thekraken: PID: 1967
thekraken: Logging to thekraken.log
[05:13:09]
[05:13:09] *------------------------------*
[05:13:09] Folding@Home Gromacs SMP Core
[05:13:09] Version 2.27 (Thu Feb 10 09:46:40 PST 2011)
[05:13:09]
[05:13:09] Preparing to commence simulation
[05:13:09] - Looking at optimizations...
[05:13:09] - Created dyn
[05:13:09] - Files status OK
[05:13:11] - Expanded 30305596 -> 33158020 (decompressed 109.4 percent)
[05:13:11] Called DecompressByteArray: compressed_data_size=30305596 data_size=33158020, decompressed_data_size=33158020 diff=0
[05:13:11] - Digital signature verified
[05:13:11]
[05:13:11] Project: 8101 (Run 25, Clone 6, Gen 56)
[05:13:11]
[05:13:11] Assembly optimizations on if available.
[05:13:11] Entering M.D.
                         :-)  G  R  O  M  A  C  S  (-:

                   Groningen Machine for Chemical Simulation

                            :-)  VERSION 4.5.3  (-:

        Written by Emile Apol, Rossen Apostolov, Herman J.C. Berendsen,
      Aldert van Buuren, Pär Bjelkmar, Rudi van Drunen, Anton Feenstra,
        Gerrit Groenhof, Peter Kasson, Per Larsson, Pieter Meulenhoff,
           Teemu Murtola, Szilard Pall, Sander Pronk, Roland Schulz,
                Michael Shirts, Alfons Sijbers, Peter Tieleman,

               Berk Hess, David van der Spoel, and Erik Lindahl.

       Copyright (c) 1991-2000, University of Groningen, The Netherlands.
            Copyright (c) 2001-2010, The GROMACS development team at
        Uppsala University & The Royal Institute of Technology, Sweden.
            check out http://www.gromacs.org for more information.


                               :-)  Gromacs  (-:

Reading file work/wudata_02.tpr, VERSION 4.5.5-dev-20120903-d64b9e3 (single precision)
[05:13:18] Mapping NT from 32 to 32
Starting 32 threads
Making 2D domain decomposition 8 x 4 x 1
starting mdrun 'FP_membrane in water'
14250000 steps,  57000.0 ps (continuing from step 14000000,  56000.0 ps).
[05:13:24] Completed 0 out of 250000 steps  (0%)

NOTE: Turning on dynamic load balancing

[05:29:25] Completed 2500 out of 250000 steps  (1%)
[05:45:01] Completed 5000 out of 250000 steps  (2%)
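
The time per frame can be read straight off the "Completed ... steps" lines in a log like the one above. The following is only a minimal sketch, not part of the original post: the function name `frame_times` and the sample string are made up for illustration, and it assumes every progress line carries an `[HH:MM:SS]` timestamp in the format shown.

```python
import re

# Matches client progress lines such as
# "[04:13:22] Completed 102500 out of 250000 steps  (41%)"
PROGRESS = re.compile(r"\[(\d\d):(\d\d):(\d\d)\] Completed (\d+) out of (\d+) steps")

def frame_times(log_text):
    """Return the seconds elapsed between consecutive progress lines of one work unit."""
    stamps = []
    for h, m, s, done, _total in PROGRESS.findall(log_text):
        stamps.append((int(done), int(h) * 3600 + int(m) * 60 + int(s)))
    deltas = []
    for (prev_done, prev_t), (done, t) in zip(stamps, stamps[1:]):
        if done <= prev_done:
            # The step counter went backwards: a new work unit started, skip this pair.
            continue
        deltas.append((t - prev_t) % 86400)  # modulo copes with a midnight rollover
    return deltas

# Three progress lines copied from the posted log.
sample = """\
[04:13:22] Completed 102500 out of 250000 steps  (41%)
[04:29:15] Completed 105000 out of 250000 steps  (42%)
[04:45:09] Completed 107500 out of 250000 steps  (43%)
"""
print(frame_times(sample))
```

For the three sample lines this gives frames of a little under 16 minutes each. The first frame of a fresh work unit also includes start-up time, so it is better left out of any average.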

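Posted on 2012-11-9 17:31:21
Reply to 1# 金鹏

This error is common on heavily overclocked machines, although a 103 base clock is not really high. I suggest watching it for a while longer.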

Posted on 2012-11-9 17:40:00
UNSTABLE_MACHINE: as the message says, the machine is unstable. It is probably caused by the overclock.
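
The failure markers mentioned here can be pulled out of a client log automatically. This is a rough sketch only: `summarize_failure` and the dictionary keys are hypothetical names, and the patterns are based solely on the three lines visible in the posted log. In that log the status appears to be printed twice, as hex `7A` and as decimal `122`, and the sketch checks that the two agree.

```python
import re

def summarize_failure(log_text):
    """Collect the failure markers that appear in a Folding@home client log."""
    summary = {}
    m = re.search(r"mdrun returned (\d+)", log_text)
    if m:
        summary["mdrun_exit"] = int(m.group(1))
    m = re.search(r"Core Shutdown: (\w+)", log_text)
    if m:
        summary["shutdown"] = m.group(1)
    m = re.search(r"CoreStatus = ([0-9A-Fa-f]+) \((\d+)\)", log_text)
    if m:
        code_hex, code_dec = m.group(1), int(m.group(2))
        summary["core_status"] = code_dec
        # Assumption: the first field is hexadecimal and the parenthesised one decimal.
        summary["hex_matches_decimal"] = int(code_hex, 16) == code_dec
    return summary

# Lines copied from the posted log.
sample = """\
[04:54:22] mdrun returned 255
[04:59:27] Folding@home Core Shutdown: UNSTABLE_MACHINE
[04:59:27] CoreStatus = 7A (122)
"""
print(summarize_failure(sample))
```

Run over a long log, a summary like this makes it easier to see how often a machine drops work units after a settings change.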

Posted on 2012-11-9 17:46:49
Going from 100 to 103 hardly seems worth it, does it?

Original poster | Posted on 2012-11-9 18:39:39
Reply to 3# wpf999
Reply to 2# cuda
Program Gromacs, VERSION 4.5.3
Source code file: /vspm58/VM/fah-converted/mnt/fah_windows_build/LinuxBuilds/gromacs-4.5.3/src/mdlib/pme.c, line: 534

Fatal error:
6 particles communicated to PME node 16 are more than 2/3 times the cut-off out of the domain decomposition cell of their charge group in dimension x.
This usually means that your system is not well equilibrated.
I have ruled out the core voltage and suspect the memory voltage instead. The eight 4 GB 1600 DIMMs have a stock voltage of 1.65V, and I had them set to 1.51V. I have now set them to 1.55V and will run for a few days to see how it goes.

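Original poster | Posted on 2012-11-9 18:41:23
Last edited by 金鹏 on 2012-11-9 19:08
Going from 100 to 103 hardly seems worth it, does it?
zflowers, posted on 2012-11-9 17:46


At 103, project 8101 work units run about 30 seconds per frame faster than at 100. Over a whole work unit that saves about as much time as one or two checkpoint, upload and download cycles, which is equivalent to roughly 30K more PPD.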
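
The 30-second figure can be sanity-checked against the posted log. The sketch below is a back-of-the-envelope estimate, not a measurement: it assumes frame time scales inversely with the base clock, that the 953 s and 954 s frames in the log were run at 103, and that a work unit has 100 frames because progress is reported in 1% steps. `frame_time_at` is a hypothetical helper name.

```python
def frame_time_at(base_clock, measured_clock, measured_seconds):
    """Estimate frame time at another base clock, assuming time scales as 1/clock."""
    return measured_seconds * measured_clock / base_clock

measured = 953.5                               # mean of the 953 s and 954 s frames in the posted log
at_100 = frame_time_at(100, 103, measured)     # estimated frame time at a stock 100 base clock
saved_per_frame = at_100 - measured            # seconds gained per frame at 103
frames_per_wu = 100                            # assumption: one frame per 1% of progress
saved_per_wu_min = saved_per_frame * frames_per_wu / 60

print(round(saved_per_frame, 1), "s per frame,", round(saved_per_wu_min, 1), "min per work unit")
```

Under these assumptions the estimate comes out just under 30 seconds per frame and a little under 50 minutes per work unit. For comparison, in the posted log about 19 minutes pass between the failed unit returning (04:54:22) and the next one starting (05:13:09).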

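Posted on 2012-11-9 19:21:03
Reply to 5# 金鹏

Undervolting the memory probably does no good and will not save much power either, so I suggest going back to the default voltage. Running this Kingston memory at 1.51V seems a bit risky. Did you run it at this voltage before as well?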

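Original poster | Posted on 2012-11-9 19:28:43
Reply to 7# cuda

It ran at this voltage on the previous board too, without any problems at all.

On this new board, neither the CPU core voltage nor the memory voltage is very accurate, and both droop slightly.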