RNA World FAQ

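xuyongchen · posted 2011-1-7 17:43:59

Reply to 14# nekoko

The main thing is that seeing so much text while translating makes my head spin. I'm at school and can't use a computer, only a dictionary, and new words, compound words and the like can't be found in it... Translating makes me want to bang my head against the wall... so progress is slow. I'll try to give this one priority..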

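xuyongchen · posted 2011-1-7 17:45:13

Reply to 15# feynord

Time isn't a problem, there is definitely enough. As for software and hardware, as long as there is a computer around it's no big deal. I'll start translating and ask you for advice when I hit something hard to translate~[em02]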

xuyongchen · posted 2011-1-7 19:28:41

Reply to 14# nekoko

What is redundancy validation? I couldn't find it on Google...

姫海棠 果 · posted 2011-1-7 20:01:14

Reply to 18# xuyongchen

It's a technique for validating results. Have a look at this page: http://www.equn.com/wiki/%E5%BA% ... 5%E6%88%90%E6%9E%9C

xuyongchen · posted 2011-1-7 20:39:29

I see, so it's a bit like quality inspection for products~

feynord · posted 2011-1-7 22:50:32

Reply to 17# xuyongchen

I wouldn't presume to give advice. Keep it up :)

xuyongchen · posted 2011-1-8 22:39:32

Part four (let's count what feynord translated earlier as part three~): items 3, 26, 27, 30, 34

3. What types of work units are available?
Translation: What types of work units are available?
At present we have four types of work units, which are based on (1) CMBUILD, (2) CMCALIBRATE, (3) CMSEARCH and (4) InReAlyzer. CMBUILD work units produce an RNA covariance model from a text alignment of RNAs belonging to the same RNA family. CMCALIBRATE work units calibrate an RNA covariance model produced by CMBUILD such that it can be used to score the probability that a potential RNA identified as a member of a certain RNA family is indeed a true candidate of that family. CMSEARCH work units use the output of CMCALIBRATE, i.e. a calibrated covariance model, to search for an RNA family in the genome of a specified organism. InReAlyzer work units convert the somewhat cryptic text output of CMSEARCH work units into high-resolution PNG graphics that allow for convenient visual judgement of whether or not a given CMSEARCH candidate belongs to the RNA family under investigation.
Translation: We currently have four types of work units, based on (1) CMBUILD, (2) CMCALIBRATE, (3) CMSEARCH and (4) InReAlyzer. CMBUILD work units create different RNA models of the same family. CMCALIBRATE work units (are used to) calibrate the RNA model generated by CMBUILD, so as to check whether the model may be a potential RNA of that family. CMSEARCH work units use the output of CMBUILD, i.e. a calibrated model unlike earlier ones, to search for an RNA family in the genes of a particular organism. InReAlyzer work units convert some far-reaching work units from CMBUILD into high-resolution PNG images, so that one can conveniently judge by eye whether a given CMSEARCH candidate structure belongs to the RNA family under study.
26. What systems are going to be supported in the future?
Translation: Which systems will be supported in the future?
At present we support Linux, Windows and Mac wherever possible. PS3 most likely will not be supported due to its small RAM capacity, although it might be possible to use it in the future for other applications which we have not implemented yet. If we manage to establish a virtual machine approach, however, it might even be possible to support a number of additional systems.
Translation: At present we support Linux, Windows and Mac in any case. PS3 is not supported by us because of its small memory, although in the future there may be other sub-projects it can compute. If we succeed in setting up some virtual machines, it might even be possible to support some additional systems.

27. Are BOINC and the RNA World applications safe, i.e. free from viruses and other malware?
Translation: Are BOINC and the RNA World work safe? Free of viruses and other malware?
Yes. BOINC as well as all RNA World applications are open source, i.e. they can be inspected by anyone who is interested. The RNA World applications are compiled in-house using widely applied public domain compiler tools, which, for example, are used to produce the code of the majority of today's web servers. Consequently, if these were malicious, we would already face a much bigger problem.
Translation: Yes, BOINC and the RNA World work are open source, i.e. they can be inspected by anyone interested. The RNA World work is compiled with household compiler tools that are widely used in the public domain. These tools were used to write today's mainstream web applications (services). So if they were malicious, we would long since have faced a far more serious problem than the present one.

30. Can RNA World be operated in offline mode?
Translation: Can RNA World run in offline mode?
Currently not: we do not allow caching of large sets of work unit packages because we require a fast turnaround. In the case of CMCALIBRATE work units this is quite easy to understand, since their results are the basis for all subsequent CMSEARCH work units. Generally, CMBUILD results are the basis for CMCALIBRATE calculations, and CMCALIBRATE results are required as input data for CMSEARCH. The output of CMSEARCH in turn serves as input for InReAlyzer. However, since we are planning to add a set of additional applications in the future which lack these strict interdependencies, it is very likely that certain types of future work units will allow for offline computation.
Translation: Not for now, because we do not allow hoarding of large numbers of work unit packages. We need a long-term turnaround time. This is easy to understand taking CMCALIBRATE work units as an example, because their results are the basis of all CMSEARCH work units. In general, CMBUILD results are the basis of the CMCALIBRATE calculation, and CMCALIBRATE results are in turn the input data for CMSEARCH. The output of CMSEARCH then serves as input for InReAlyzer. However, we are considering adding a series of additional applications. These do not need such strict interdependencies. It is very likely that certain work in the future can be computed offline.
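The dependency chain described in item 30 can be written down as a small graph. The following is only an illustrative sketch: the `DEPENDS_ON` table and the `processing_order` helper are hypothetical names, not part of the RNA World server, and the only facts taken from the FAQ are which application needs which other application's results.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each application maps to the applications whose results it needs as input,
# as stated in the FAQ answer (the table itself is an illustration).
DEPENDS_ON = {
    "CMBUILD": set(),
    "CMCALIBRATE": {"CMBUILD"},
    "CMSEARCH": {"CMCALIBRATE"},
    "InReAlyzer": {"CMSEARCH"},
}


def processing_order(depends_on):
    """Return an order in which every stage runs after the stages it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


print(processing_order(DEPENDS_ON))
```

Because every stage waits for the previous one, a client that cached a large batch of work offline would hold up all later stages, which is the reason the answer gives for requiring a fast turnaround.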

34. According to the server status page, work units should be available, so why don't I get any?
Translation: According to the (project) server status page, work units should be available, so why can't I get any work?
Assuming you have activated, in your RNA World project profile, the type of work units announced to be available for processing, the reason is RNA World's homogeneous redundancy policy: a work unit delivered to a Linux x64 machine of a certain CPU type, for example, will only be sent for validation to another Linux x64 machine which has the same CPU type installed. If your system no longer gets any work units, then the remaining work indicated on the server status page can only be delivered to machines that provide an operating system and/or CPU different from yours.
Translation: Assuming you have enabled the type of work shown as available on the page and (applied) to run it in your RNA World project profile, the reason is RNA World's homogeneous redundancy policy: for example, work sent to a Linux 64-bit machine with a certain CPU model will only be sent to another identical Linux 64-bit machine for validation. If the server status page shows work but you cannot get any, the remaining work can only be sent to machines whose system or CPU model differs from yours.
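As a rough illustration of the homogeneous redundancy rule in item 34, the sketch below filters candidate machines by operating system and CPU type. The `Host` class, its fields and the `eligible_hosts` function are invented for this example; the real BOINC scheduler groups hosts into classes in its own way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Host:
    name: str
    os: str        # e.g. "Linux x64" (hypothetical label)
    cpu_type: str  # e.g. "Intel" (hypothetical label)


def eligible_hosts(first_host, candidates):
    """Hosts that may receive the validation copy of work first sent to first_host."""
    return [
        h for h in candidates
        if h != first_host
        and (h.os, h.cpu_type) == (first_host.os, first_host.cpu_type)
    ]


hosts = [
    Host("a", "Linux x64", "Intel"),
    Host("b", "Linux x64", "Intel"),
    Host("c", "Windows x64", "Intel"),
    Host("d", "Linux x64", "AMD"),
]
print([h.name for h in eligible_hosts(hosts[0], hosts)])  # prints ['b']
```

In this toy setup hosts "c" and "d" see work listed as available but never receive it, which matches the situation the question describes.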

xuyongchen · posted 2011-1-9 19:23:27

Part five: items 35, 36, 39, 40

35. I came home and my machine was basically unresponsive with multiple RNA World screensaver windows open - what is going on here?
Translation: I came home and found my machine unresponsive while the RNA World screensaver was running. What is going on?
This (rarely occurring) strange behaviour is not yet completely understood. For simplicity, our current screensaver makes use of Adobe FlashPlayer. Consequently, the problem you describe can occur only on machines where FlashPlayer is installed (on others, the screensaver function will not work). To resolve the issue, it seems you need to either upgrade to the latest FlashPlayer version or uninstall it completely from your machine (of course, uninstalling is not really a good suggestion as many websites make use of Flash).
Translation: This strange behaviour is not yet fully understood (by us). (The phenomenon is very rare.) Put simply, our screensaver uses the Adobe FlashPlayer software. So the problem you describe only appears on computers with FlashPlayer installed (on other computers the screensaver will not run). The way to solve it is to upgrade to the latest FlashPlayer version or remove it from the computer completely. (Obviously, removal is not a good suggestion, because a great many websites use Flash.)

36. It seems that the entire RNA World website is available only in German?
Translation: Is the content of the whole RNA World website available only in German?
No, you can individually customize your display language. Forum settings for example are found here. Setting BOINC pages to English (only necessary if it doesn't work properly with the browser's ACCEPT setting) can be done here. Since 18 January 2010 we have also incorporated the Boinc translation system (BTS).
Translation: No, you can customize the display language. For example, the forum settings are here (hyperlink). Set the BOINC pages (language) to English here (only when it does not work properly with the browser setting). Since 18 January 2010 we have built in the Boinc translation system.

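39. Why is RNA World not using the standard BOINC forum?
Translation: Why does RNA World not use the standard BOINC forum?
The RNA World forums are multilingual, which means there is more than just one forum, and these are located on a server different from the BOINC servers and from the RNA World server. The reason is that we need to make sure that forum communication remains intact even if the BOINC and the RNA World project servers are non-functional. It is actually surprising that several other DC projects do not do it the same way as we do. A single drawback is that you have to register on our forum server to make use of it, but we feel that, given the advantages, this drawback is bearable.
Translation: The RNA World forums are multilingual, which means there will be more than one forum, and the forums will be placed on a (separate) server distinct from the BOINC servers and the RNA World server. The reason is that we want to ensure forum communication stays intact when the BOINC and RNA World project servers are not working properly. It really surprises us that several other distributed computing projects are not set up as we are. One disadvantage is that you have to register on our forum to use it. But we feel that, considering the benefits it brings, this disadvantage is tolerable.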

40. It seems a long-named RNA World work unit is blocking my entire BOINC system
Translation: It looks like an RNA World work unit with a very long name is blocking my whole BOINC system.
This is a known issue which occurs only very rarely and relates to an as yet unresolved bug in the BOINC manager. It also happens exclusively on Windows-based machines. The source of the error is the fact that Windows allows at most 256 characters for the sum of path name plus file name length. RNA World uses explicit file names, i.e. from the long file names the user can easily derive what is being computed on his or her machine. We would like to keep it like that to allow third-party developers to conveniently construct RNA World monitoring programs. The point is that if such a long-named work unit is sent to your Windows machine, it will get stuck in the downloading process because it cannot be written to your hard drive. As a baffling consequence, your BOINC manager will stop downloading work units for any DC project it is hooked up to. To resolve the issue you just have to delete that WU from within the BOINC manager. We hope that the BOINC developers will fix this issue soon.
Translation: This is a known, rarely occurring problem caused by a bug in the BOINC manager. It is also a problem that appears only on the Windows platform. The cause of the error is that in Windows the total length of path name and file name is limited to at most 256 characters. RNA World uses detailed names, i.e. from the long file name users can easily tell what their computer is computing. We keep this to make it convenient for third parties to optimize RNA World monitoring programs. The problem is that if a work unit with such a long name is sent to your Windows computer, it will be stuck in the download stage because it cannot be written to the hard drive. The puzzling result is that your BOINC manager will not download any other distributed computing work. The solution is to delete that work unit from the BOINC manager. We hope the BOINC maintainers can solve this problem in the near future.
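To make the length limit in item 40 concrete, here is a minimal check, assuming the 256-character figure quoted in the FAQ and counting one extra character for the separator between directory and file name (that separator handling is an assumption of this sketch). The function name and the example directory are hypothetical.

```python
MAX_PATH_CHARS = 256  # limit as quoted in the FAQ; treated here as an assumption


def fits_windows_limit(directory, file_name, limit=MAX_PATH_CHARS):
    """True if directory + separator + file name stays within the limit."""
    # +1 accounts for the path separator between directory and file name
    return len(directory) + 1 + len(file_name) <= limit


project_dir = r"C:\ProgramData\BOINC\projects\example_project"  # hypothetical path
print(fits_windows_limit(project_dir, "short_name.txt"))
print(fits_windows_limit(project_dir, "x" * 250))
```

A monitoring tool could run a check like this before a download to flag work units whose explicit names would not fit on a given machine.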

xuyongchen · posted 2011-1-10 23:32:27

The last part: items 13, 14, 29, 37, 38

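13. Are work units generated automatically or manually?
Translation: Are work units generated automatically or manually?
BOINC-based work units are always generated in a fully automated manner from operator-supplied input files. RNA World currently relies on operator-curated input archives that will be automatically processed by the RNA World server to yield several thousand work units per archive. Archives can be placed in an on-hold queue such that once the server is running low on work units, it can process new ones from this supply.
We are currently working on implementing user job submission interfaces such that, under strict security guidelines, researchers can use the RNA World distributed supercomputer to process their own project files. Here, we do not plan to allow batch job processing for security reasons, and the users will have to register and use a digital certificate for clear identification.
It is also planned to derive work units fully automatically by regularly scanning RNA-relevant databases for novel sequences that could be analyzed.
Translation: BOINC-based work units are always generated fully automatically from operator-supplied input files. RNA World currently relies on operator-assisted output archives, which can be processed automatically by the RNA World server to generate several thousand work units per archive. Archives can be placed in a frozen queue so that new work can be generated from it once the server runs short of work. We are currently improving the user job submission interface, so that under strict security guarantees researchers can use RNA World distributed computing to run their own project files. For now, for security reasons we do not plan to introduce batch jobs, and users will also need to register and use a digital certificate for verification. We also plan to derive work units fully automatically by routinely scanning RNA-related databases, in order to find novel sequences that can be analyzed.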
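The on-hold queue from item 13 can be pictured with a short sketch. Everything here is hypothetical (the function name, the low-water mark, and the way work unit names are built); it only mirrors the idea that the server unpacks the next waiting archive when it runs low on work.

```python
from collections import deque


def refill(pending, on_hold, low_water_mark, units_per_archive):
    """Unpack archives from the on-hold queue until enough work units are pending."""
    while len(pending) < low_water_mark and on_hold:
        archive = on_hold.popleft()
        # One archive yields many work units (the FAQ mentions several thousand).
        pending.extend(f"{archive}_wu{i}" for i in range(units_per_archive))
    return pending


pending = []
on_hold = deque(["archive1", "archive2", "archive3"])
refill(pending, on_hold, low_water_mark=3, units_per_archive=2)
print(len(pending), list(on_hold))
```

With these toy numbers the server unpacks two archives to get above the mark and leaves the third one waiting in the queue.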

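14. Is a continuous work supply guaranteed?
Translation: Is continuous work guaranteed?
Our objective is to continuously recruit more and more RNA-relevant bioinformatic tools to RNA World. Moreover, the data sources containing RNA-relevant information that require analysis by RNA World are growing daily. Given these two facts, we expect that RNA World will require increasing compute capacity and consequently should not be expected to run out of work soon. However, we are computing on an individual project basis, and we also try to build up databases containing pre-computed results, e.g. for listing potential RNA candidate genes in any given organism. Once our objectives are reached, we will naturally stop sending out work units until we have new projects in store. This will be announced in time on the RNA World website to prevent machines from running idle.
Translation: Our goal is to keep bringing more and more RNA World-related bio-structural tools into RNA World. Moreover, the resources containing RNA-related information that need to be analyzed by RNA World grow by the day. To cope with these two facts, we expect that RNA World will need more computing power and will eventually be able to run stably. However, we are running on an individual project basis, and in addition we intend to build databases containing pre-computed results, i.e. listing potential RNA candidate genes of any given organism. Once our goals are met, we will naturally stop sending out work units, unless we have new projects again. This will be announced in good time on the RNA World website so that (users') machines do not run for nothing.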

29. What Internet traffic can be expected?
Translation: Roughly how much network traffic is there?
All files are transferred in compressed format and most files contain simple ASCII data, such that the compression rate is around 30%, i.e. files will be reduced to about 30% of their original size. In general, CMBUILD and CMCALIBRATE work units are the smallest and should require less than 1 MB (usually even less than 100 kB) of data traffic. CMSEARCH work units cause somewhat higher download traffic depending on the size of the genome that is going to be analyzed. Current upper limit: with a maximum of 512 MB for one of the chromosomes of an opossum (uncompressed file size), 150 MB would have to be transferred (compressed file size) for a CMSEARCH work unit, plus a few kB for additional control files. Of course, the upload traffic only contains the result file and not the genome that was searched for RNA presence, and consequently will be much, much smaller. Normal traffic: a typical bacterial genome such as that of E. coli is about 4.6 MB (uncompressed) in size. Hence, 1.3 MB (compressed) of data plus the control files (just a few kB) will be transferred. Lower limit: many viral genomes as well as plasmid sequences contain less than 10 kB of data in uncompressed format. However, note that small CMSEARCH work units are expected to complete quickly, such that your machine may request new data over and over again depending on your system's performance.
Translation: All transferred files are in compressed format; most contain basic ASCII data and the compression rate is about 30%, i.e. the file size will be 30% of the original file. Usually CMBUILD and CMCALIBRATE work units are the smallest and should need no more than 1 MB (generally less than 100 KB) of traffic. CMSEARCH work units involve somewhat larger downloads, depending on the size of the genes to be analyzed. The current upper limit: the size of an opossum chromosome set will not exceed 512 MB (raw data), and a 150 MB CMSEARCH work unit will be transferred together with some additional control files of less than 1 MB. Of course, the upload only contains the results and not the RNA genome being searched, so in the end it will be very small. Normal traffic: a typical bacterial genome (e.g. coli) is about 4.6 MB (uncompressed). So 1.3 MB (compressed) of data plus the control files (only under 1 MB) will be transferred. The lower limit: many viral genomes and plasmid sequences have less than 10 KB of data in uncompressed form. Note, however, that small CMSEARCH work units can be expected to finish faster, so depending on your computer's performance your machine may request work over and over.
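The traffic figures in item 29 follow from simple arithmetic, sketched below with the FAQ's round figure of 30%. The constant and function names are made up for the example, and control files (a few kB) are ignored.

```python
COMPRESSION_RATIO = 0.30  # compressed size / original size, the FAQ's round figure


def download_mb(uncompressed_mb, ratio=COMPRESSION_RATIO):
    """Approximate compressed download size in MB for a genome file."""
    return uncompressed_mb * ratio


for label, size_mb in [("opossum chromosome", 512), ("E. coli genome", 4.6)]:
    print(f"{label}: about {download_mb(size_mb):.1f} MB to download")
```

With exactly 30% this gives roughly 154 MB and 1.4 MB, slightly above the 150 MB and 1.3 MB quoted in the answer, which suggests the real compression is a little better than the round figure.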

37. I got the message "redundant result", what exactly does that mean?
Translation: I got the message "redundant result"; what does that mean?
First, a few remarks on the terms used in BOINC. A work unit is defined as a computational job which we would like participants to complete. A result, by contrast, is a collective term for the files which the server generates and sends to the participants. If enough results (quorum) are successful (this includes the data transfer to the participant, computation of the job, return of the result files to the server, etc.) and are validated (i.e. each is identical to at least one other result successfully returned to the server), then a work unit is complete. For example, in RNA World, for each CMSEARCH-based work unit three results are generated and sent to three different machines. If two of these (quorum) are successful and get validated, the work unit is completed. As a consequence, the third result is no longer required, i.e. it is redundant (redundant result). This third result then (1) will not be sent out (if it has not yet been sent out), (2) will be aborted on the client machine if it has been sent out but computation has not yet commenced, or (3) will be completed and receive credit if its computation has already started. We generate more results (three) per work unit than required for the quorum (two) in order to collect results more quickly. If we did not do it this way, we would always have to wait for the deadline to pass before the server detects that the clients are not sending anything else in. Only then would the server generate an additional result, send it out, and again wait for incoming data.
Translation: First, BOINC has few remarks on terminology. A work unit is defined as a computational task that we want to take part in and complete. A result, by contrast, is a collective end-point for files, which are generated by the server and sent to participants. If enough results (the "quorum") are successful (this includes transferring data to the participant, computing the task, returning the result to the server, and so on) and validated (i.e. checked against at least one other returned result), the work unit counts as complete. For example, in RNA World each CMSEARCH-based work unit is generated and sent to three different machines. If two of them (the "quorum") compute successfully and are validated, the work unit is complete. The third result is then no longer needed, i.e. it is a redundant result. This third result will (1) not be sent (if it has not been sent yet), (2) be deleted on the client, but only if it was sent and has not been computed, or (3) be granted credit as normal if it has already been computed. We generate more results per work unit than the number needed to judge it valid, in order to collect results faster. If we did not do this, only when the deadline arrived would the server detect that the users had sent no data and complete the work unit. Only in that case would the server generate additional results, send them out again and again, and wait for incoming data.
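The quorum logic in item 37 can be sketched as follows. This is a simplified model rather than BOINC's validator: outputs are compared as plain strings, and the state names passed to `handle_redundant` are hypothetical labels for the three cases listed in the answer.

```python
from collections import Counter


def work_unit_complete(returned_outputs, quorum=2):
    """True once at least `quorum` successfully returned outputs are identical."""
    counts = Counter(returned_outputs)
    return any(n >= quorum for n in counts.values())


def handle_redundant(state):
    """What happens to the leftover result once the quorum has been met."""
    actions = {
        "unsent": "do not send",                   # case (1)
        "sent_not_started": "abort on client",     # case (2)
        "in_progress": "finish and grant credit",  # case (3)
    }
    return actions[state]


print(work_unit_complete(["hit-list-A", "hit-list-A"]))  # two matching results
print(work_unit_complete(["hit-list-A", "hit-list-B"]))  # results disagree
print(handle_redundant("sent_not_started"))
```

Sending three results for a quorum of two means a single slow or failing machine does not force the server to wait for the deadline before the work unit can be completed.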

38. The progress bar is at 100% and seems to sit there for hours - what is happening here?
Translation: The progress bar has been stuck at 100% for a long time; what went wrong?
This is common behavior in BOINC projects, especially if you have just switched from another project to RNA World or if the work units of a given BOINC project are very heterogeneous compared to each other. RNA World work units are de facto extremely heterogeneous in their system requirements. For each computation, a series of mini simulations is run on the server to estimate the time required for completion on the server. Since your machine differs from our server hardware, information based on the benchmarks performed from time to time on your machine is used to scale the duration determined for that work unit on the server to your machine. This scaling process is good but not perfectly accurate. So, the first work units often differ detectably in completion time from what the progress bar indicates. But, with more and more work units of that type pouring in on your system, a BOINC-integrated calculation mechanism corrects for that deviation in a progressive manner. So, with time, this "sitting at 100%" should become more and more rare. However, if the incoming work units are extremely different from each other in type (as is often the case for RNA World work units even if based on the same application), this adjustment might again turn out inaccurate for these new work units and an automatic re-adjustment will take place. In the worst case scenario, this might lead to the perception of an apparently constant unreliability of the progress bar indicator.
The bottom line is that you should just expect a work unit to take longer than indicated and not conclude there is something wrong with the work unit or your hardware.
Translation: This is a very common state on the BOINC platform, especially when you have just switched from another project to RNA World or when a BOINC project's work units differ greatly from one another. RNA World work units are in fact extremely different in their system requirements. For each computation, a series of mini simulations run on the server completes the computation time requirement. Because your machine differs from our server hardware, information from benchmarks run in real time will be used by the server to measure the duration of the work. This measurement is very useful but not completely accurate. So the completion time of the first work units often does not match what the progress bar shows. However, as more and more work units of that type flow into your system, a BOINC-integrated calculation method will correct this deviation in a progressive way. But if the types of incoming work units differ radically (which does happen in RNA World even when they are based on one sub-project), this adjustment may again prove inaccurate for the new work, and a re-adjustment will be made (further correction). In the worst case, this may lead to distrust of the progress bar. At the least, you should know that the work will take more time than the value shown, rather than that the work unit itself is faulty or the hardware has a problem.
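The scaling and progressive correction described in item 38 can be illustrated with a toy model. This is not BOINC's actual estimator: the function names, the benchmark units and the smoothing weight are all assumptions made for the sketch.

```python
def estimate_runtime(server_seconds, server_benchmark, host_benchmark, correction=1.0):
    """Scale the server-side estimate to a host, then apply a learned correction."""
    return server_seconds * (server_benchmark / host_benchmark) * correction


def update_correction(correction, estimated, actual, weight=0.1):
    """Move the correction factor a small step toward the observed ratio."""
    target = correction * (actual / estimated)
    return correction + weight * (target - correction)


correction = 1.0
for _ in range(50):
    estimated = estimate_runtime(100, server_benchmark=2.0, host_benchmark=1.0,
                                 correction=correction)
    actual = 400.0  # hypothetical: this host really needs twice the scaled estimate
    correction = update_correction(correction, estimated, actual)
print(round(correction, 2))
```

While the correction factor is still too small, the progress bar reaches 100% well before the work unit ends. When a different kind of work unit arrives, the learned factor no longer fits and the adjustment starts over, as the answer explains.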

Finally finished.....

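Youth · posted 2011-1-11 20:11:47

xuyongchen and feynord, thank you for all the translation work~~[em05]

I have moved your translations to: http://www.equn.com/wiki/RNA_Wor ... 8%E8%A7%A3%E7%AD%94 . I lightly revised a few entries; most of the rest still needs everyone to proofread it together.[em03]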

xuyongchen · posted 2011-1-11 21:31:25

OK. This translation was rushed, and in many places I feel I translated without thinking it through, so please send your comments.

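feynord · posted 2011-1-22 14:45:32

Reply to 25# Youth

Thanks for your work, Youth~
Could these translations be put on the official website?
For example, as a Chinese version of this page: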
http://www.rechenkraft.net/wiki/index.php?title=RNA_World/FAQ/en

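Youth · posted 2011-1-22 15:34:59

All of these pages can be put up there.

What I mean is that we should proofread them a few more times among ourselves first, and move them over once they are in decent shape :)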

xuyongchen · posted 2011-1-22 18:00:01

More revision is better, more revision is better

