
[Moved to wiki entry] [Wiki entry] Introduction to the XtremLab Project

Posted on 2013-4-12 12:32:25
Last edited by arthur200000 on 2013-5-5 17:26


@昂宿星团人 The tag has been changed.




Posted on 2013-4-12 14:00:06
Last edited by SaintLaser on 2013-4-14 18:35

== Quick Summary ==
Desktop grid (DG) systems use the idle computing power of many volunteered desktop PCs to support large-scale computation and storage. For over a decade, DG systems have been the largest and most powerful distributed computing systems, offering a plethora of computing power at a fraction of the cost of supercomputers. The volunteer desktops participating in DG projects are volatile and heterogeneous, but there is little detailed information about their volatility and heterogeneity. Yet this characterization is essential for the simulation and modelling of such systems. We are conducting a project whose short-term goal is to obtain a detailed picture of the DG landscape, and whose long-term goal is to create a large-scale testbed for networking and distributed computing research. To this end, we have deployed XtremLab, a BOINC-based project that actively measures host CPU and network availability on volunteer desktops. The resulting resource measurement data and characterization will be useful for a broad range of research areas, including distributed and peer-to-peer computing, and fault tolerance. Ultimately, we believe the results will help broaden the range of applications that can utilize desktop grid systems, and accelerate discovery in a variety of scientific domains.

== Background ==
Since the late 1990s, DG systems such as SETI@Home have been the largest and most powerful distributed computing systems in the world, offering an abundance of computing power at a fraction of the cost of dedicated, custom-built supercomputers. Many applications from a wide range of scientific domains -- including computational biology, climate prediction, particle physics, and astronomy -- have utilized the computing power offered by DG systems. DG systems have allowed these applications to execute at a huge scale, often resulting in major scientific discoveries that would otherwise not have been possible.

The computing resources that power DG systems are shared with the owners of the machines. Because the resources are volunteered, utmost care is taken to ensure that DG tasks do not obstruct the activities of each machine's owner; a DG task is suspended or terminated whenever the machine is in use by another person. As a result, DG resources are volatile in the sense that any number of factors can prevent a task of a DG application from completing. These factors include mouse or keyboard activity, the execution of other user applications, machine reboots, and hardware failures. Moreover, DG resources are heterogeneous in the sense that they differ in operating systems, CPU speeds, network bandwidth, and memory and disk sizes. Consequently, the design of systems and applications that utilize these resources is challenging.
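The computing resources that keep desktop grid systems running are provided by volunteers' computers. Because these resources are volunteered, we must ensure that desktop grid tasks do not interfere with the activities of any computer's owner; a desktop grid task is suspended or terminated whenever the computer is busy. This makes desktop grids volatile, since many factors can prevent a task of a desktop grid application from completing [Translator's note: for example, mining Bitcoin nonstop, LOL]. These factors include keyboard and mouse use, other running applications, computer reboots, and hardware errors. In addition, desktop grid computing resources are heterogeneous: they differ in operating system [Translator's note: from Windows 98, still in use in the third world, to the latest Windows 8 in Europe and America, every flavor of Linux, Unix, Mac OS, and the PS3; as of early 2013, with Asteroids@home, even Android devices with ARM processors and the Raspberry Pi with its ARM11 can join in], CPU speed, network bandwidth, and memory and disk size. [Translator's note: the original also omits that processor and coprocessor models vary.] Ultimately this poses a serious challenge for the design of the whole system and for running applications on it.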

== Goals ==
The long-term overall goal of XtremLab is to create a testbed for networking and distributed computing research. This testbed will allow for computing experiments at unprecedented scale (i.e., thousands of nodes or more) and accuracy (i.e., nodes that are at the "ends" of the Internet).
Currently, the short-term goal of XtremLab is to obtain a more detailed picture of the Internet computing landscape by measuring the network and CPU availability of many machines. While DG systems consist of volatile and heterogeneous computing resources, it is unknown exactly how volatile and heterogeneous these resources are. Previous characterization studies of Internet-wide computing resources have not taken into account causes of volatility such as mouse and keyboard activity, other user applications, and machine reboots. Moreover, these studies often report only coarse aggregate statistics, such as the mean time to failure of resources. Yet detailed resource characterization is essential for determining the utility of DG systems for various types of applications. This characterization is also a prerequisite for the simulation and modelling of DG systems, a research area where many results are obtained via simulation because it allows for controlled and repeatable experimentation.
For example, one direct application of the measurements is to create a better BOINC CPU scheduler, which is the software component responsible for distributing tasks of the application to BOINC clients. We plan to use our measurements to run trace-driven simulations of the BOINC CPU scheduler in an effort to identify ways it can be improved, and to test new CPU schedulers before they are widely deployed.
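As a rough illustration of what "trace-driven" means here, the sketch below replays a single task against a recorded availability trace for one host and reports when the task would finish. It is not the BOINC CPU scheduler or the XtremLab simulator. The trace format (a list of `(start, end, fraction)` intervals), the `peak_rate` parameter, and the function name `completion_time` are all assumptions made for this example.

```python
from typing import List, Optional, Tuple

# Hypothetical trace format: (start_time, end_time, available_cpu_fraction)
Interval = Tuple[float, float, float]


def completion_time(trace: List[Interval], peak_rate: float,
                    work: float) -> Optional[float]:
    """Replay one task of `work` operations against a host trace.

    `peak_rate` is the host's operations per second when fully available.
    Returns the time at which the task finishes, or None if the trace
    ends before the task completes.
    """
    remaining = work
    for start, end, fraction in trace:
        rate = peak_rate * fraction          # effective speed in this interval
        capacity = rate * (end - start)      # work the host can do here
        if rate > 0 and capacity >= remaining:
            return start + remaining / rate  # task finishes inside interval
        remaining -= capacity
    return None


if __name__ == "__main__":
    # Host fully available for 10 s, unavailable for 10 s, then half available.
    trace = [(0.0, 10.0, 1.0), (10.0, 20.0, 0.0), (20.0, 30.0, 0.5)]
    print(completion_time(trace, peak_rate=100.0, work=1200.0))
```

A scheduler study would replay many tasks over many host traces and compare policies; this sketch only shows the replay step for one task on one host.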
We conduct availability measurements by submitting real compute-bound tasks to the BOINC DG system. These tasks are executed only when the host is idle, as determined by the user's preferences and controlled by the BOINC client. These tasks continuously perform computation and periodically record their computation rates to file. These files are collected and assembled to create a continuous time series of CPU availability for each participating host. Utmost care will be taken to ensure the privacy of participants. Our simple, active trace method allows us to measure exactly what compute power a real, compute-bound application would be able to exploit. Compared to passive measurement techniques, our method is not as susceptible to OS idiosyncrasies (e.g., with process scheduling) and takes into account keyboard and mouse activity and host load, all of which directly impact application execution.
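The assembly step described above could look roughly like the following sketch. The on-disk record format used by XtremLab is not given in this text, so the `Sample` fields (a timestamp and a cumulative operation counter), the `peak_rate` normalization, and the function name `availability_series` are hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Sample:
    timestamp: float  # seconds since the start of the trace (assumed field)
    ops_done: float   # cumulative operations completed so far (assumed field)


def availability_series(samples: List[Sample],
                        peak_rate: float) -> List[Tuple[float, float, float]]:
    """Turn periodic samples into (start, end, cpu_fraction) intervals.

    The fraction is the observed computation rate between two consecutive
    samples divided by `peak_rate`, clamped to the range [0, 1].
    """
    ordered = sorted(samples, key=lambda s: s.timestamp)
    series = []
    for prev, cur in zip(ordered, ordered[1:]):
        elapsed = cur.timestamp - prev.timestamp
        if elapsed <= 0:
            continue  # skip duplicate or out-of-order timestamps
        rate = (cur.ops_done - prev.ops_done) / elapsed
        fraction = max(0.0, min(1.0, rate / peak_rate))
        series.append((prev.timestamp, cur.timestamp, fraction))
    return series


if __name__ == "__main__":
    samples = [Sample(0, 0), Sample(10, 1000), Sample(20, 1000), Sample(30, 1500)]
    for interval in availability_series(samples, peak_rate=100.0):
        print(interval)
```

An interval in which the counter does not advance shows up as a fraction of zero, which is how a suspended or preempted task would appear in such a series.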

== Impact ==
The results of this research will be useful to distributed computing research and other fields in a number of ways. First, the trace data will enable accurate simulation and modelling of DG systems. For example, the traces could be used either to directly drive simulation experiments or to create generative probability models of resource availability, which in turn can be used by simulators to explore a wide range of hypothetical scenarios.
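To make the idea of a generative model concrete, here is a minimal sketch that fits a distribution to observed availability-interval lengths and then draws synthetic lengths from it. The exponential distribution is chosen only because its maximum-likelihood fit is one line (rate = 1 / mean); the text does not say which distributions the project uses, and real availability data may well call for a different family. The function names are hypothetical.

```python
import random
from statistics import mean
from typing import List


def fit_exponential_rate(durations: List[float]) -> float:
    """Maximum-likelihood rate of an exponential distribution (1 / mean)."""
    if not durations:
        raise ValueError("need at least one observed duration")
    return 1.0 / mean(durations)


def sample_durations(rate: float, count: int, seed: int = 0) -> List[float]:
    """Draw `count` synthetic interval lengths from the fitted model."""
    rng = random.Random(seed)  # seeded so simulation runs are repeatable
    return [rng.expovariate(rate) for _ in range(count)]


if __name__ == "__main__":
    observed_hours = [2.0, 4.0, 6.0]  # made-up availability interval lengths
    rate = fit_exponential_rate(observed_hours)
    print(rate)
    print(sample_durations(rate, 5))
```

Seeding the generator keeps experiments repeatable, which matches the stated motivation for simulation as a controlled setting.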
Second, because the traces will contain the temporal structure of availability, they will enable the assessment of the utility of DG systems for a wide range of applications. Currently, the range of applications that utilize DG systems effectively is limited to applications with loosely-coupled tasks that are independent of one another; the volatility and heterogeneity of DG resources make the execution of tightly-coupled applications with complex task dependencies extremely challenging. With the traces, we could conduct a cost-benefit analysis for a wide range of applications; specifically, we could determine the limitations that prevent certain types of applications from utilizing DG systems effectively, and suggest new research directions to address these limitations.
In addition, we believe our measurements could be useful for other sub-domains in computer science such as fault tolerance, peer-to-peer computing, and Grid computing. For example, one issue relevant to the fault tolerance research community is how often resources crash and why. The data we collect will reflect the time to failure for each desktop resource and thus be a valuable data set for those researchers. We will make the traces publicly available to all these research communities.
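As an illustration of how time-to-failure figures might be derived from such a trace, the sketch below treats each availability interval of a host as one run that ends in a failure, then reports both the mean and an empirical survival fraction. The `(start, end)` interval format and the function names are assumptions for this example, and treating every interval end as a failure is a simplification.

```python
from statistics import mean
from typing import List, Tuple


def times_to_failure(intervals: List[Tuple[float, float]]) -> List[float]:
    """Length of each availability interval, read as one time to failure."""
    return [end - start for start, end in intervals if end > start]


def survival_fraction(ttfs: List[float], t: float) -> float:
    """Fraction of observed intervals that lasted at least `t`."""
    if not ttfs:
        raise ValueError("no intervals observed")
    return sum(1 for x in ttfs if x >= t) / len(ttfs)


if __name__ == "__main__":
    # Made-up availability intervals for one host, in seconds.
    intervals = [(0.0, 100.0), (150.0, 200.0), (300.0, 600.0)]
    ttfs = times_to_failure(intervals)
    print(mean(ttfs))
    print(survival_fraction(ttfs, 100.0))
```

The survival fraction keeps distributional detail that a single mean hides, such as how many intervals are too short for a task of a given length.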
Finally, we believe the results of this project will help improve performance and broaden the set of applications that can take advantage of DG systems. Currently, only applications with independent, compute-bound tasks can use desktop resources efficiently. We hope that the measurements collected in the near term will be useful in evaluating techniques for broadening the set of DG applications to include, for example, ones that are more communication-intensive and tightly coupled.
== Past Work ==
We have previously conducted a number of related research efforts. First, we measured and characterized several DG systems at the University of California at San Diego and the University of Paris-Sud. We obtained several months of traces of the availability of hundreds of desktop PCs within these organizations. We then characterized the DG systems by obtaining several aggregate and per-host statistics. This characterization formed the basis for a model describing the utility of the DG systems for different applications, and for developing efficient ways of scheduling tasks to DG resources. So that others could use our gathered trace data sets, we created a publicly accessible online DG trace archive. One limitation of this work, which we address in the XtremLab project, is that no measurements were taken of home desktop PCs, which contribute significantly to Internet-wide DG projects.
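We have carried out a series of related research efforts in the past. First, we measured and characterized several desktop grid systems at the University of California at San Diego and the University of Paris-Sud (Paris XI), obtaining availability data for hundreds of desktop computers in those systems. By aggregating these characteristics we built a model that describes how different applications behave on desktop grid systems and that lets desktop grid systems schedule tasks more efficiently. So that others can also use the data we collected, we created a public online desktop grid trace archive. One shortcoming of this archive is that it contains no data from home desktop PCs, which contribute significantly to desktop grid projects as a whole.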
== The XtremLab Team ==
The members of the XtremLab team belong to the Laboratoire de Recherche en Informatique (LRI, i.e., the computer science laboratory) of the University of Paris-Sud, XI. In particular, Mr. Paul Malecot is a graduate student interested in distributed and parallel computing, and is the primary developer of XtremLab. Dr. Derrick Kondo and Dr. Gilles Fedak both serve as academic advisors for the project. Dr. Derrick Kondo is an INRIA post-doctoral fellow interested in the simulation and modelling of large-scale distributed systems. Dr. Gilles Fedak is an INRIA research scientist interested in the design and implementation of distributed systems. Professor Franck Cappello is the director of the project and of the computer science laboratory.
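The members of the XtremLab team belong to the computer science laboratory (Laboratoire de Recherche en Informatique, LRI) of the University of Paris-Sud XI. In particular, Mr. Paul Malecot is a graduate student with a strong interest in distributed and parallel computing, and is the primary developer of XtremLab. Dr. Derrick Kondo and Dr. Gilles Fedak are the project's academic advisors. Dr. Derrick Kondo is a post-doctoral fellow at the French national institute for research in computer science and automation (Institut National de Recherche en Informatique et en Automatique, INRIA), interested in the simulation and modelling of large-scale distributed systems. Dr. Gilles Fedak is an INRIA research scientist interested in the design and implementation of distributed systems. Professor Franck Cappello is the director of the computer science laboratory and of this project.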
The XtremLab project is funded by the Institut National de Recherche en Informatique et en Automatique (INRIA), which is the non-profit national French institution for computer science research.


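Proofreading this may be a bit painful: first, the article itself is fairly long; second, the translation is rather rough.  Posted on 2013-4-15 10:30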





