China Distributed Computing Forum

Original poster: vmzy

[Project News] United Devices [Ended]

Original poster | Posted 2005-12-22 23:41:22
12/21/2005
We have made some great progress on getting the new cancer job ready. There was an issue with the format of the data we received from Oxford and we believe that we have resolved this. I have been able to get successful cancer runs using the new data on an internal system. Hopefully we will be able to get some of the data loaded onto grid.org today or tomorrow.

In the past we have used the beta server and the beta tester team to verify that things were working correctly before moving to grid.org. Due to hardware issues, this is not going to be possible this time. If you are one of the beta testers, please do not be offended by this. We plan to reinstate the beta tester group when we migrate to the new system. I know everyone is anxious about the cancer data, so we will not wait.

This means there may be a problem with the job when we submit it. I will be keeping an eye on the system and will be able to flip back to the current job quickly if necessary. Since the current job is basically stale data anyway, this should not upset anyone. I will post again once the new job is submitted.

I have submitted (one of) the new cancer jobs. I am currently processing ligand 29 of 600 and it appears to be working correctly. We will keep a close eye on it for any problems. The old cancer job has been set to not dispatch any more workunits, but is still active so that any outstanding results can be returned. As I mentioned, there may be a need to do some additional tweaking on the new job, but as of now it looks good.

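December 21, 2005
We have made good progress in preparing the new cancer job. There was a problem with the format of the data we received from Oxford, but I believe we have resolved it. I have successfully run the new cancer data on an internal system. We hope to load part of the data onto grid.org today or tomorrow.

Previously, we used the beta server and the beta tester team to verify jobs before loading them. Because of hardware problems, beta testing is not possible this time. If you are a member of the beta tester team, please do not take offense. We plan to reinstate the beta tester group once we migrate to the new system. I know everyone is eagerly awaiting the new cancer data, so we will not wait.

This means the newly submitted job may have problems. I will keep an eye on the system and, if necessary, quickly switch back to the current job. Since the current job is essentially stale data, this should not bother anyone. I will post again once the new job has been submitted.

I have submitted one of the new cancer jobs. I am currently processing ligand 29 of 600, and it appears to be running correctly. We will watch it for problems. The old cancer job has been set to stop dispatching workunits, but it remains active so that all outstanding results can be returned. As I mentioned, the new job may still need some additional tweaking, although for now everything looks fine.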

Original poster | Posted 2005-12-30 22:43:22
12/29/2005
Unable to connect to UD server

The City of Austin had a transformer blow that affected power to UD during the last 24 hours. This caused Grid.org to be unavailable temporarily. The problem has since been remedied and I have verified that I am able to pull a workunit now from the UD server. Since there are many devices all trying to hit the UD server at the same time, there are still some backoff messages occurring. It took me about 10 minutes to get a successful connection and a download to occur. These will go away as the load subsides. Give it a little time and everyone should be crunching away normally again.
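The "backoff" behavior mentioned above, where many devices retry at once and gradually get through as the load subsides, can be sketched as exponential backoff with random jitter. This is only an illustration under assumptions: the function names, delay values, and retry count are hypothetical and are not taken from the UD agent.

```python
import random
import time


def backoff_delays(base=5.0, cap=600.0, attempts=8, rng=None):
    """Yield randomized delays (seconds) whose upper bound doubles up to a cap."""
    rng = rng or random.Random()
    for attempt in range(attempts):
        ceiling = min(cap, base * (2 ** attempt))
        # "Full jitter": pick a random delay below the ceiling so that many
        # clients retrying at once do not all hit the server together.
        yield rng.uniform(0, ceiling)


def fetch_with_backoff(fetch, sleep=time.sleep, **kwargs):
    """Call fetch() until it succeeds or the delays run out.

    fetch is a hypothetical callable that raises ConnectionError while the
    server is overloaded and returns a workunit once it gets through.
    """
    for delay in backoff_delays(**kwargs):
        try:
            return fetch()
        except ConnectionError:
            sleep(delay)
    raise ConnectionError("server still unavailable after all retries")
```

Spreading retries out at random is what lets the load drop after an outage instead of every client reconnecting in lockstep.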

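December 29, 2005
Unable to connect to the UD server

Over the past 24 hours, a transformer failure in the City of Austin affected the power supply to UD, which made Grid.org temporarily unavailable. The problem has been fixed, and I have verified that I can now download a workunit from the UD server. Because many devices are trying to reach the UD server at the same time, some backoff messages are still appearing. It took me about 10 minutes to connect and download successfully. These messages will go away as the load subsides. Please be patient a little longer, and everything should be back to normal soon.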

Original poster | Posted 2006-1-4 18:00:13
1/3/2006
I hope everyone had a happy new year. Here is the latest status:

Grid.org Outage - As I posted previously, we had a temporary outage due to a blown City of Austin transformer (a rogue squirrel is suspected). This outage should not have required any reregistrations. There was a temporary period where devices may have been "Backing off..." due to a high load after the servers became available, but this should no longer be occurring.

New Cancer Data - A small batch of the new cancer data has been uploaded and members are currently crunching away at it. This is a very small subset of the data we received, ~20 WUs, just so we can verify that we are getting the results we expect. I plan on sending some of the results to our Oxford contact today to verify these are the results they expect. Once that is confirmed, I will upload more data. There have been a few members complaining of occasional lost results. I am not sure yet how widespread this is, nor what causes the loss. Other members have reported complete success, so we will need to investigate this more closely. Note that the old cancer job has been disabled, so if you are working on the cancer project, you are crunching the new data.

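January 3, 2006
I hope everyone had a happy new year. Here is the latest status:

Grid.org outage - As I said in my earlier post, we had a temporary outage because a City of Austin transformer blew (a rogue squirrel is suspected). The outage should not have required anyone to reregister. For a while after the servers came back, high load may have caused devices to show "Backing off...", but this should no longer happen.

New cancer data - A small batch of the new cancer data has been uploaded, and members are crunching it now. It is a very small subset of the data we received, about 20 workunits, meant to verify that we get the results we expect. I plan to send some results to our Oxford contact today to check that they are what Oxford expects. Once that is confirmed, I will upload more data. A few members have reported occasionally losing results. I am not yet sure how widespread this is, nor what causes it. Other members report no problems at all, so we will need to investigate more closely. Note that the old cancer job has been disabled, so if you are working on the cancer project, you are crunching the new data.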

Original poster | Posted 2006-1-5 15:51:24
1/4/2006
Unable to connect for Cancer job

There was a small issue with the cancer job that caused devices to not be able to download a new cancer WU. This was an unforeseen side effect of trying to retrieve results so that we can have them verified with our Oxford contact. The process consists of running a script that verifies minimum successful results, retrieves them from the grid server, and aggregates them into a known format. Apparently the script marks a WU as complete during the final stages of processing. I have since reset the WUs so they will dispatch again.

To try to prevent this from happening again, I am going to submit another cancer job (a duplicate of the current one) that I can experiment with hopefully without affecting the current one.

Note that until I can successfully get the retrieval script to work and get the results to our Oxford contact for validation, I will not be able to upload the bulk of the new data. Please have a little patience during this phase of the project. As I mentioned, in the past we had the luxury of a beta test system to work these issues out, but unfortunately that is no longer the case. This requires us to do some testing on the production system which we really do not like to do.
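The side effect described in this post can be modeled with a small sketch. It is not the actual UD retrieval script, whose code is not shown here; the `WorkUnit` class, the status strings, and the `mark_complete` flag are all hypothetical and only illustrate how aggregating results can stop dispatch, and how a read-only mode would avoid that.

```python
from dataclasses import dataclass, field


@dataclass
class WorkUnit:
    wu_id: str
    results: list = field(default_factory=list)  # (status, payload) tuples
    complete: bool = False  # a complete workunit is no longer dispatched


def aggregate(workunits, min_successes=1, mark_complete=True):
    """Collect successful payloads per workunit into one dict.

    Workunits with fewer than min_successes successful results are skipped.
    With mark_complete=False the call is read-only, so dispatch continues.
    """
    collected = {}
    for wu in workunits:
        good = [payload for status, payload in wu.results if status == "ok"]
        if len(good) < min_successes:
            continue
        collected[wu.wu_id] = good
        if mark_complete:
            wu.complete = True  # the side effect that stops dispatching
    return collected


def reset(workunits):
    """Re-enable dispatch for every workunit."""
    for wu in workunits:
        wu.complete = False
```

In this model, experimenting on a duplicate job corresponds to calling `aggregate` on a copy of the workunit list, so the live list is never marked complete.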

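January 4, 2006
Unable to download cancer workunits

A small problem with the cancer job prevented devices from downloading new cancer workunits. It was an unforeseen side effect of trying to retrieve results so that our Oxford contact could verify them. The process consists of running a script that checks for a minimum number of successful results, retrieves them from the grid server, and aggregates them into a known format. Apparently the script marks a workunit as complete during the final stage of processing. I have since reset the workunits so that they will be dispatched again.

To try to prevent this from happening again, I am going to submit another cancer job (a duplicate of the current one) that I can experiment with, hopefully without affecting the current job.

Note: until I get the retrieval script working and the results validated by our Oxford contact, I will not be able to upload the bulk of the new data. Please be patient during this phase of the project. As mentioned in earlier posts, we used to have a beta test system for working out such issues, but it is no longer available. This forces us to test on the production system.

Translator's note: Shame on grid.org for forcing everyone to be guinea pigs! Then again, the test server is broken, so there is not much to be done and little to say.
But re-dispatching the same test data over and over absolutely deserves scorn.
grid.org should learn from FAD: have dedicated testers, and do not re-dispatch test data. If there is no data, just pause the project, since crunching duplicate data is wasted effort anyway. Rather than waste computing power here, I would sooner work on something else for now!
Strong disapproval of grid.org's wasteful behavior!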

Original poster | Posted 2006-1-11 23:18:54
January 10, 2006
Sorry for the late status, but I wanted to finish a test before posting:

New Cancer Data - I have been running some tests on an internal system and believe that I have some good news. It appears from my testing that everything is working as expected. There was a false alarm last week when I reported that we were ready to send results to our Oxford contact. Upon further investigation it was seen that there were some missing output files. This problem has been resolved internally so I am ready to grab the results from the new test job that has been running on Grid.org to see if the results are the same.

I will need to run the result aggregation script, which is always run after a job is complete. A side effect of running the script is that the workunits will all be marked complete and dispatching will stop. This may result in a lost workunit result or two. I will re-enable the job as soon as the result script completes, so the outage will be minimal.

Lost Workunits - There have been some complaints of lost workunits with the new data. Since we have many results for each workunit, this is not a problem with the new cancer data per se. We will keep investigating this issue until we understand what is happening.

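January 10, 2006
Sorry for the late status report; I wanted to finish a test before posting:

New cancer data - I have run some tests on an internal system and believe I have good news. From my testing, everything appears to work as expected. Last week there was a false alarm when I reported that we were ready to send results to our Oxford contact. Further investigation showed that some output files were missing. That problem has been resolved internally, so I am ready to pull the results from the new test job running on Grid.org and see whether they match.

I need to run the result aggregation script, which is always run after a job completes. A side effect of running it is that all workunits are marked complete and dispatching stops. This may cause one or two workunit results to be lost. I will re-enable the job as soon as the script finishes, to keep the outage to a minimum.

Lost workunits - There have been complaints about lost workunits with the new data. Because we have many results for each workunit, this is not a problem for the new cancer data itself. We will keep investigating until we understand what is happening.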

January 10, 2006
I have sent the results from our small test job to our Oxford contact. We will now have to wait to see what they say. As soon as I get confirmation, I can upload all of the new data. Note that we know that there is a problem with the format of the data. I had to manually convert the data in order to run our test. I am asking our contact to provide us with the correct format so that we know the input data is valid from their point of view.

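January 10, 2006
I have sent the results of our small test job to our Oxford contact. We now have to wait and see what they say. As soon as I get confirmation, I can upload all of the new data. Note: we know there is a problem with the data format. I had to convert the data by hand in order to run our test. I am asking our contact to supply the correct format so that we know the input data is valid from their point of view.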

Original poster | Posted 2006-1-17 14:58:14
January 16, 2006
Not a lot to mention this week:

New Cancer Data - The results from the current job have been sent to our Oxford contact. We must now wait for the results to be verified. Note that there is already one issue. Oxford is requesting an additional tag Molecule_ID be added to the input data. Hopefully this is something they will be able to deliver. As soon as I hear back, I will post.

Lost Workunits - One of our 3 servers that handle dispatch and result retrieval was not working properly. I do not see how that would cause workunits to be lost, but I guess it is possible. That server has been fixed and is working properly now. Let's keep an eye on the workunits and see if this has a positive effect for those experiencing the problem.

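January 16, 2006
Not much to report this week:

New cancer data - The results of the current job have been sent to our Oxford contact. We must now wait for them to be verified. Note: one issue has already come up. Oxford is asking for an additional Molecule_ID tag to be added to the input data. Hopefully they will be able to deliver this. I will post as soon as I hear back.

Lost workunits - One of our 3 servers that handle dispatch and result retrieval was not working properly. I do not see how that would cause workunits to be lost, but I suppose it is possible. The server has been fixed and is working properly now. Let's keep an eye on the workunits and see whether this has a positive effect for those experiencing the problem.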

Original poster | Posted 2006-2-11 15:06:25
January 20, 2006

We recently had a problem with the forums. A log file exceeded a maximum size, which caused the Apache web server to crash. It took a while to track down this issue, but the forums should be working correctly now. Note that this problem only affected the forums and not any of the Grid.org job processing.
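A common guard against this kind of failure is to rotate logs before they reach a size limit. The post does not say how the forum server was fixed, so the following is only a sketch of the general idea using Python's standard `logging.handlers.RotatingFileHandler`; the helper name and the size values are assumptions. For Apache itself, tools such as `rotatelogs` or `logrotate` are the usual choice.

```python
import logging
import logging.handlers


def make_capped_logger(path, max_bytes=1_000_000, backups=3):
    """Return a logger whose file is rolled over before it exceeds max_bytes.

    At most `backups` old files (path.1, path.2, ...) are kept, so total
    disk use stays bounded. Calling this twice with the same path would
    attach a second handler; the sketch does not guard against that.
    """
    logger = logging.getLogger(f"capped:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep these records out of the root logger
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    logger.addHandler(handler)
    return logger
```

A caller would create the logger once, for example `log = make_capped_logger("access.log")`, and then write with `log.info(...)` as usual.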

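January 20, 2006

We recently had a problem with the forums. A log file grew too large, which caused the Apache web server to crash. It took some time to track down, but the forums should be working properly now. Note: the problem affected only the forums and had no effect on Grid.org job processing.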

Original poster | Posted 2006-2-11 15:26:54
January 26, 2006

I apologize for not posting a status on Monday, but we are having a company conference this week which is consuming most of my time. The status of the new cancer data is the topic everyone is interested in so here is a very quick status.

I have a conference call with our Oxford contact tomorrow morning to discuss the new data and the results I sent a couple of weeks ago. After the call, I hope to have much more information to share. I know there is frustration about not having all of the new data loaded for members to work on. Since Oxford is the one that will ultimately be using the results, I have had to wait for them to respond as to the validity of the results. That is out of my control.

Regardless of what Oxford has to say, I will load some more of the new data next week. If the data must be recrunched later, so be it. I was trying to minimize the amount of data that must be reworked to cut down on the complaints about wasted time later. Since there are already complaints about time wasted crunching the same workunits, I guess it does not matter.

Please understand that having the data validated by Oxford is the bottleneck here and that United Devices is ready to upload all the new data as soon as the results are confirmed.

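January 26, 2006

My apologies for not posting a status on Monday. We have a company conference this week that is taking most of my time. Everyone is keen for news about the new cancer data, so here is a quick update.

Tomorrow morning I have a conference call with our Oxford contact to discuss the new data and the results I sent a couple of weeks ago. After the call, I hope to have much more to share. I know it is frustrating that not all of the new data has been loaded for members to work on. Because Oxford will ultimately use the results, I have had to wait for them to confirm that the results are valid. That is out of my control.

Whatever Oxford says, I will load more of the new data next week. If the data has to be recrunched later, so be it. I had been trying to minimize the amount of data that might need to be reworked, to cut down on later complaints about wasted time. Since there are already complaints about time wasted crunching the same workunits, I suppose it does not matter.

Please understand that validation by Oxford is the bottleneck here, and that United Devices is ready to upload all of the new data as soon as the results are confirmed.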

Original poster | Posted 2006-2-11 15:38:02
January 30, 2006

New Cancer Data - I had a conference call with our Oxford contact on Friday. There is an additional field they would like in the results so they have sent a sample data file containing this information. I will be uploading this later today or tomorrow. Hopefully this will produce the results they are expecting and we will be able to make all of the new data available for members later this week or next.

Rosetta - The current batch has been processed and we will be making a new batch available later this week. As always, there will be a two week period for any outstanding workunits to be credited.

Team Stats - For some reason these are missing for the 26th. The stats job will be rerun shortly to pick up this day.

Thanks to everyone for their contribution.

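January 30, 2006

New cancer data - I had a conference call with our Oxford contact on Friday. They would like an additional field in the results, so they have sent a sample data file containing this information. I will upload it later today or tomorrow. Hopefully it will produce the results they expect, and we will be able to make all of the new data available to members later this week or next.

Rosetta - The current batch has been processed, and a new batch will be made available later this week. As always, there will be a two-week period for outstanding workunits to be credited.

Team stats - For some reason, the stats for the 26th are missing. The stats job will be rerun shortly to pick up that day.

Thanks to everyone for their contribution.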

[ Last edited by vmzy on 2006-2-11 at 15:48 ]

Original poster | Posted 2006-2-11 15:49:33
February 02, 2006

I have just uploaded a portion of the latest data I received from Oxford. My internal tests looked good as far as the workunits being able to be processed. We will let this job run for a while before shipping the results back to Oxford for verification. Note that the previous cancer job is still running as well, so it is up to the dispatcher as to whether you get one of the new workunits or not. As soon as I see a few successful results, I will stop the previous job from dispatching so everyone will be crunching the new data.

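February 2, 2006

I have just uploaded part of the latest data I received from Oxford. My internal tests looked good as far as processing the workunits goes. We will let this job run for a while before sending the results back to Oxford for verification. Note: the previous cancer job is still running as well, so the dispatcher decides whether you receive one of the new workunits. As soon as I see a few successful results, I will stop the previous job from dispatching so that everyone is crunching the new data.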

Original poster | Posted 2006-2-11 15:56:34
February 03, 2006

I have received confirmation from our Oxford contact that the sample result data from the latest cancer job looks good. This means that we are ready to start crunching all of the new data. The new data will be uploaded in pieces (jobs) just like we have been doing with Rosetta. When a job has completed, I will allow at least a week for any outstanding results to be uploaded by members.

This is great news for all of us. Thank you for your continued patience and contribution.

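February 3, 2006

Our Oxford contact has confirmed that the sample result data from the latest cancer job looks good. This means we are ready to start crunching all of the new data. As with Rosetta, the new data will be uploaded in pieces (jobs). When a job has completed, I will allow at least a week for members to upload any outstanding results.

This is great news for all of us. Thank you for your continued patience and contribution.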

Original poster | Posted 2006-2-11 16:37:28
February 06, 2006

Cancer data - As I mentioned previously, Oxford has verified our previous results and has released all of the new data. There is plenty to keep us busy for a while. They also wish to run all of the new data against the previous protein when we are done with the current protein.

Some members are experiencing aborted WUs with the new data. We are currently investigating this issue. The problem is a bit elusive since not all members are experiencing it and some members are experiencing it much more frequently than others. I have verified that there are results for each and every WU and that the total is approximately the same for all. I have also verified that there are approximately the same (small) number of errors returned for each WU.

This tells me that there is not a problem with any particular WU, but some other issue. If there was a bad WU, we would see either a significantly lower number of results or a significantly greater number of errors. This is not the case. We will keep looking into this issue until we find a solution.
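The check described above, comparing each WU's result count and error count against the rest, can be sketched as follows. The data layout, the function name, and the tolerance are assumptions made for illustration; the post does not describe how the counts were actually compared.

```python
from statistics import median


def suspicious_workunits(stats, tolerance=0.5):
    """Flag workunits whose counts deviate strongly from the median.

    stats maps a workunit id to (results_returned, errors_returned).
    A workunit is flagged if it has far fewer results than the median
    or far more errors than the median.
    """
    med_results = median(r for r, _ in stats.values())
    med_errors = median(e for _, e in stats.values())
    min_results = med_results * (1 - tolerance)
    # The +1 keeps very small error counts from being flagged by chance.
    max_errors = med_errors * (1 + tolerance) + 1
    return [
        wu_id
        for wu_id, (results, errors) in stats.items()
        if results < min_results or errors > max_errors
    ]
```

An empty list corresponds to the situation in the post: no single WU stands out, so the cause is more likely on the client or server side than in the data.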

These WUs appear to be processing much faster than the previous batches. I am not sure of the reason for this and will rely on Oxford to tell us if something is not right. Note that there is a bunch of data to be processed. I am not quick to modify the job configuration (adding more Ligands) since the next ones I upload may take longer.

It has been noticed that the number of hits is high for some of the WUs. Again I do not know the reason for this and will have to rely on Oxford to tell us if something is wrong.

Thank you for your contribution.

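February 6, 2006

Cancer data - As mentioned earlier, Oxford has verified our previous results and released all of the new data. There is plenty to keep us busy for a while. They would also like to run all of the new data against the previous protein once we finish the current protein.

Some members are seeing aborted workunits with the new data. We are investigating. The problem is somewhat elusive, because not all members experience it and some experience it far more often than others. I have verified that every workunit has results and that the totals are roughly the same for all of them. I have also verified that each workunit returns roughly the same small number of errors.

This tells me that no particular workunit is at fault and that the cause lies elsewhere. A bad workunit would show either significantly fewer results or significantly more errors. That is not the case. We will keep looking into this until we find a solution.

These workunits appear to process much faster than the previous batches. I do not know why, and will rely on Oxford to tell us if something is wrong. Note that there is a lot of data to process. I am in no hurry to change the job configuration (adding more ligands), since the next workunits I upload may take longer.

It has been noticed that the number of hits is high for some workunits. Again, I do not know the reason and will have to rely on Oxford to tell us if something is wrong.

Thank you for your contribution.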

Original poster | Posted 2006-2-11 16:38:20
February 10, 2006

We have finished processing the current Rosetta job. A new one will be uploaded tonight. Some members may have experienced a "Cannot connect" message due to this. Until the new job is active, I have reset the current one to continue to dispatch to prevent these messages.

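February 10, 2006
We have finished processing the current Rosetta job. A new one will be uploaded tonight. Some members may have seen a "Cannot connect" message because of this. Until the new job is active, I have reset the current one to keep dispatching so that these messages do not appear.

Translator's note: I lack proper tools at the moment, so the accuracy of this translation may be low. Thank you for your understanding!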
