Original poster | Posted on 2007-4-12 14:00:47
One more thing to add:
In a second phase, the initial dataset is being updated with newly published genomic data, adding 393,999 new protein sequences. Additionally, a fully curated reference dataset was added (SwissProt - 254,609 sequences), contributing to controlled annotation and data cross-referencing. Finally, an experimental dataset of about 3 million potential protein sequences derived from Open Reading Frames (ORFs) lacking a classical computational coding prediction was added, in an attempt to discover additional protein sequences or coding patterns. This second phase of the project is expected to take an additional 4 months of WorldGrid processing.
(Translation: MatthewBB)
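In the second phase, the initial dataset was updated with the latest published genomic data, which added 393,999 new protein sequences. A fully manually curated reference dataset (SwissProt, 254,609 sequences) was also added, for controlled annotation and data cross-referencing. Finally, an experimental dataset of about 3 million potential protein sequences was added, derived from open reading frames (ORFs) that lack a conventional computational coding prediction; this dataset is meant to help discover additional protein sequences or coding patterns. The second phase of the project is expected to take another 4 months of WorldGrid processing time.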
I'm not sure whether I translated this correctly, but that's roughly the meaning.