Posted on 2005-10-18 09:07:56
THE FUTURE OF PEER-TO-PEER COMPUTING
An economical method for pumping up computing power by tapping
into P2P systems using Web server technologies.
By Alfred W. Loo
The client/server architecture [4] for computing systems was first proposed as an alternative to the conventional mainframe systems approach for large enterprises. In the mainframe approach, almost everything is done by mainframe computers. Processing in the mainframe quickly becomes a bottleneck in any information system. Enterprises are forced to keep pumping money into mainframe upgrades in order to maintain efficiency under increased processing demands.
Client/server models shift the processing burden to the client computer. A client is a computer that requests services from another computer (that is, the server), while a server is a dedicated computer providing services to clients in these models. For example, a client may request a database server to retrieve a record. After the server passes the record to the client, the client computer is responsible for further processing (calculating, formatting output, GUI preparation). Through workload sharing, client/server systems can improve overall efficiency while reducing the budget for computing resources.
Client/server models began gaining wide acceptance in the late 1980s when companies sought fresh competitive advantages in an ailing economy.
In the current global economic situation, companies are again searching for ways to improve their processing power without further investment in new hardware and software. Many client computers are idle most of the time and have unused disk storage capacity. The next logical step is to maximize the use of these client computers. The peer-to-peer (P2P) model is the answer.
In a P2P system [1], computers can now act as both clients and servers. Their roles in any task will be determined according to what is most appropriate for the system at the time. This approach minimizes the workload on servers and maximizes overall network performance.
P2P computing allows users to make use of the collective power in the network. It helps organizations tackle the kind of large computational jobs they could not handle before. P2P implementation is also cost effective for individuals as well as small companies. The benefits are lower costs and faster processing times for everyone involved.
Although P2P is one of the latest industry buzzwords, there are still problems in developing large scale P2P projects. Here, we look at two high-profile P2P networks as examples and present our solutions.
The Napster Model
Napster is a high-profile P2P network that gives its members the revolutionary ability to connect directly to other members’ computers and search their hard drives for digital music files to share and trade.
The operations of Napster are described in Figure 1. Members download a software package from Napster and install it on their computers. The Napster central computer maintains directories of the music files of members who are currently connected to the network. These directories are automatically updated when a member logs on or off the network. Whenever a member submits a request to search for a file, the central computer provides information to the requesting member, who can then establish a connection directly with another member’s computer possessing that particular file. The download of the target file takes place directly between the members’ computers, bypassing the central computer.
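The directory behavior described above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not Napster's actual software: the class name `CentralDirectory`, its methods, and the use of plain address strings are all hypothetical. The directory only answers "who has this file?"; the transfer itself would happen peer to peer.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical central directory: maps each file name to the peers sharing it.
class CentralDirectory {
    private final Map<String, Set<String>> peersByFile = new HashMap<>();
    private final Map<String, List<String>> filesByPeer = new HashMap<>();

    // Called when a member logs on: record every file the member shares.
    synchronized void logOn(String peerAddress, List<String> sharedFiles) {
        logOff(peerAddress); // drop any stale entries for this peer first
        filesByPeer.put(peerAddress, sharedFiles);
        for (String file : sharedFiles) {
            peersByFile.computeIfAbsent(file, k -> new TreeSet<>()).add(peerAddress);
        }
    }

    // Called when a member logs off: remove all of the member's entries.
    synchronized void logOff(String peerAddress) {
        List<String> files = filesByPeer.remove(peerAddress);
        if (files == null) return;
        for (String file : files) {
            Set<String> peers = peersByFile.get(file);
            if (peers != null) {
                peers.remove(peerAddress);
                if (peers.isEmpty()) peersByFile.remove(file);
            }
        }
    }

    // Returns the addresses of connected peers that hold the file.
    // The download would then take place directly between the peers.
    synchronized Set<String> search(String file) {
        Set<String> peers = peersByFile.get(file);
        return peers == null ? Collections.<String>emptySet() : new TreeSet<>(peers);
    }
}

class CentralDirectoryDemo {
    public static void main(String[] args) {
        CentralDirectory dir = new CentralDirectory();
        dir.logOn("10.0.0.1", Arrays.asList("song-a.mp3", "song-b.mp3"));
        dir.logOn("10.0.0.2", Arrays.asList("song-b.mp3"));
        System.out.println(dir.search("song-b.mp3")); // both peers are listed
        dir.logOff("10.0.0.1");
        System.out.println(dir.search("song-b.mp3")); // only the second peer remains
    }
}
```

Keeping a second map from peer to files makes log-off cheap, since the directory does not have to scan every file entry to find what a departing member was sharing.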
The power of Napster and similar applications is that they allow the sharing of widely dispersed information stores without the need for a central file server. Over 36 million people joined the Napster community, and it rapidly accelerated the development and implementation of other P2P models. The limitation is that it can only share music files; participants cannot share other resources. (It has also faced considerable legal challenges, unrelated to its technological model, from music publishing companies objecting to the free copying of copyrighted material. Napster is now working on a new model in order to avoid the legal problems.)
CPU Power Sharing
In addition to file sharing, computing power can be shared by P2P networks. In April 2001, Intel Corporation, the University of Oxford, the National Foundation for Cancer Research, and United Devices announced a joint P2P computing project aimed at combating cancer by linking millions of PCs in a vast P2P network. This network is far more powerful than any single supercomputer. The project’s implementation is quite simple. Each user downloads a small program to his/her computer via an Internet connection. The program works as a screen saver and runs only when the computer is idle. The objective of the program is to discover drugs for the treatment of cancer. It tests chemicals by “bending and flexing” each of hundreds of millions of molecular structures to determine if they interact with proteins involved in cancer therapy. When a given molecular structure triggers an interaction with a target protein, it is transmitted back to the coordinator through the Internet.
This project is succeeding in the sense that it has attracted over one million PC owners to participate and a total donation of about 200,000 years of CPU time as of last May. However, it is extremely difficult for other organizations or individuals to develop P2P projects similar to the anti-cancer program. The weaknesses and problems of this method are discussed here.
Security. Participants must completely trust the research organization before they download the programs. Allowing P2P programs to run on a computer greatly increases vulnerability to security breaches. Such breaches may include:
• Deleting files or directories on the computer;
• Reading from or writing to the files on the computer;
• Executing programs or commands such as making long-distance calls with a modem and telephone line; and,
• Connecting to other computers and performing illegal operations such as hacking.
It is very difficult to secure P2P applications against such misuses, especially where participating computers use operating systems like Microsoft Windows. Any organization less famous than Intel or the University of Oxford would have problems assuring participants of the safety of their network.
Motivation. Participants do not receive any tangible benefit from participating in this type of project. They donate computer time only because they believe in the project’s objective.
Many public or commercial organizations have thousands of PCs lying idle outside the 9 A.M. to 5 P.M. working hours. They are indeed ideal donors, but do not join such projects. Why? Above all, research results from this project will be the sole and exclusive property of Oxford University. Other academic institutions may not appreciate this notion. Although the other two partners in the anti-cancer project, Intel and United Devices, do not receive any direct benefit from their participation, many observers assume otherwise, even if only in terms of marketing and public relations coups. Such perceptions deter many organizations from participating in like-minded projects. In fact, it is highly unlikely potential donors will be attracted to projects with explicit commercial objectives.
Performance efficiency. In the anti-cancer project, participants must download a program and install it on their computers. If the participants want to donate processing time to a new project, they must repeat this process. It is also extremely difficult to maintain the system and perform tasks like upgrading the programs on the participants’ computers. What’s needed is an automatic method of storing and updating the programs on remote computers.
Compatibility. The software used in the anti-cancer project can only be executed on PCs. There are a large number of workstations that use Unix or other operating systems, many of which are enterprise-based and unused after office hours. It is a pity that these workstations cannot join the project due to compatibility problems. Another large-scale P2P project, SETI@home from the University of California (setiathome.berkeley.edu), analyzes signals picked up by radio telescopes in an attempt to detect extraterrestrial intelligence. The SETI@home project solves the compatibility problem by providing different versions for different platforms. Participants download the version that matches their operating system, such as MacOS, Unix, and so on. This approach makes more computer power available to the researchers, but increases the cost of maintenance as many versions must be kept and updated.
Enabling Technologies
All of the requirements mentioned here can be met. Some problems can be solved with the latest enabling software while others need to be handled by building infrastructure.
Java. In many P2P systems, we find heterogeneous systems with different operating systems and hardware platforms. Program portability is of the utmost importance in such structures. Java is the only language that delivers the “write once, run anywhere” promise.
Security managers. Java is designed with security in mind and a security manager is one of its special features. By specifying security policy in the security manager, users can control programs’ behavior according to predefined limits.
Web servers and servlets. When we surf the Net, we select a desired Web page via an embedded link and our browser sends a message to the remote Web server. The message is encoded in Hypertext Transfer Protocol (HTTP). The Web server then locates the appropriate Web page and sends it back to the browser. This simple model is sufficient as long as the user wants to view static Web pages.
Some applications require more processing, and a dynamic Web page is required. In a typical example, the Web server must access several records and perform some calculations before it can assemble a Web page. In this case, the HTTP message will invoke a program on the Web server. On the server side, Common Gateway Interface (CGI) programs can be written in almost any language, while servlets are Java programs. For this model, Java servlets are preferable to CGI programs because they are portable across platforms and run under Java's security controls.
Power Server Model
This ability to invoke a program on a Web server can be used in P2P applications. Figure 2 shows a power server model using a Web server and servlets. In this model we introduce a new concept: a single client computer using the computing power of many servers simultaneously. This differs from conventional networks where many clients work with one server.
In traditional client/server systems, servers usually serve data to the client. In this model, we define “power servers” in the sense that these computers serve CPU power to other users. Every computer in the system will become a power server by installing a Web server package and a few Java programs.
One computer acts as a client. A Java application program is executed on the client by dividing a single, computation-intensive task into many small subtasks and queuing them in the system. The application program invokes a servlet on the server and transfers a small part of the task to the servlet. The servlet can then complete the task on the server. The computed results are sent to the client. A performance test of this model is available in [6].
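The flow in this paragraph (divide the task, hand each subtask to a power server over HTTP, collect the results) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the implementation evaluated in [6]: the JDK's built-in `com.sun.net.httpserver.HttpServer` stands in for a Web server plus servlet, all "power servers" run on localhost, and the task (a sum of squares), the `/task` path, and every class and method name are hypothetical.

```java
import com.sun.net.httpserver.HttpServer;
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.InetSocketAddress;
import java.net.URI;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class PowerServerSketch {
    // The subtask a power server runs: sum of squares over [from, to].
    static long sumOfSquares(long from, long to) {
        long total = 0;
        for (long i = from; i <= to; i++) total += i * i;
        return total;
    }

    // Splits [from, to] into at most `parts` contiguous subranges.
    static List<long[]> split(long from, long to, int parts) {
        List<long[]> ranges = new ArrayList<>();
        long chunk = (to - from + parts) / parts; // ceiling of size / parts
        for (long start = from; start <= to; start += chunk) {
            ranges.add(new long[] {start, Math.min(to, start + chunk - 1)});
        }
        return ranges;
    }

    // Stand-in for a Web server plus servlet: answers GET /task?from=A&to=B.
    static HttpServer startPowerServer() throws IOException {
        HttpServer server = HttpServer.create(new InetSocketAddress("localhost", 0), 0);
        server.createContext("/task", exchange -> {
            String[] params = exchange.getRequestURI().getQuery().split("&");
            long from = Long.parseLong(params[0].substring("from=".length()));
            long to = Long.parseLong(params[1].substring("to=".length()));
            byte[] body = Long.toString(sumOfSquares(from, to)).getBytes(StandardCharsets.UTF_8);
            exchange.sendResponseHeaders(200, body.length);
            try (OutputStream out = exchange.getResponseBody()) {
                out.write(body);
            }
        });
        server.start();
        return server;
    }

    // Client side: send one subtask to one power server and read the result.
    static long invoke(int port, long[] range) throws IOException {
        URI uri = URI.create("http://localhost:" + port + "/task?from=" + range[0] + "&to=" + range[1]);
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(uri.toURL().openStream(), StandardCharsets.UTF_8))) {
            return Long.parseLong(in.readLine().trim());
        }
    }

    public static void main(String[] args) throws Exception {
        List<HttpServer> servers = new ArrayList<>();
        for (int i = 0; i < 3; i++) servers.add(startPowerServer());
        List<long[]> subtasks = split(1, 1000, servers.size());
        ExecutorService pool = Executors.newFixedThreadPool(servers.size());
        try {
            List<Future<Long>> pending = new ArrayList<>();
            for (int i = 0; i < subtasks.size(); i++) {
                int port = servers.get(i % servers.size()).getAddress().getPort();
                long[] range = subtasks.get(i);
                pending.add(pool.submit(() -> invoke(port, range)));
            }
            long total = 0;
            for (Future<Long> result : pending) total += result.get();
            System.out.println("distributed = " + total + ", local = " + sumOfSquares(1, 1000));
        } finally {
            pool.shutdown();
            for (HttpServer server : servers) server.stop(0);
        }
    }
}
```

In a real deployment each power server would be a separate machine running a servlet container, and the client would address it by IP rather than by a local port.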
The behavior of Web servers is well defined. Many good server software packages are available [5] and many of them are freeware/shareware. These packages are small and easy to install. Any computer user should be able to install a Web server package and allow their computer to act as a power server.
We need to update the server machines one-by-one in the power server model. If we have a large number of servers, this maintenance is very time consuming. The drawback can be overcome by automation. Because most Web servers have upload functions, the maintenance job can be alleviated by using this function. This can be achieved by uploading the new version of the servlets and is automated by use of a special program on the client computer. The maintenance job can be further reduced if all participants are within one organization and their computers are connected in a local area network. Only one copy of the Web server and servlet is installed on the network drive. Every computer invokes the Web server and servlet using the single version on the network drive and thus the maintenance job is streamlined.
Future Development of P2P Systems
Some hurdles cannot be overcome with software alone. Building reliable automobiles alone will not solve transportation problems. We also need highways and gas stations, or no one will be interested in using a car. We must build a computing infrastructure to realize the full benefits of P2P applications.
Infrastructure for distributed systems has been the subject of much research. One example is the Globus Grid Computing Project, which develops complex infrastructures for different kinds of distributed systems. Interested readers can find more information about the Globus project in [2, 3]. However, a simple infrastructure is sufficient for our power server model.
Coordinator. One problem of this model is the difficulty in finding power servers. We can solve this problem by adding a coordinator to the system as illustrated in Figure 3. The coordinator is a computer that stores application servlets and the IP addresses of all power servers.
The client stores the servlets on the coordinator.
Any computer owner who wants to donate computer power to the network must register with the coordinator and provide the following information: IP address; type of processor; when and how long it is available; and the amount of memory available.
The coordinator is not involved in the actual computation. It stores information about participants’ processing capabilities. The concept is similar to the work of [7]. Among the tasks of the coordinator:
• Allows new users to register;
• Maintains the database of power servers’ information;
• Matches user requirements with power servers and passes the IP addresses of power servers to the user; and
• Transfers servlets to the power servers.
The user contacts the coordinator to get the IP addresses of available power servers and uses the IP address to initiate the servlet on the power servers.
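The registration and matching duties of the coordinator can be sketched as a small in-memory registry. This is a hedged illustration only: the field names, the hour-based availability window, and the matching rule (available at a given hour, with at least a minimum amount of memory) are assumptions made for the example, and servlet transfer is not modeled.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// One registration record holding the items a donor supplies (names are hypothetical).
class PowerServerInfo {
    final String ipAddress;
    final String processorType;
    final int availableFromHour; // start of the availability window, 0-23
    final int availableHours;    // how long the machine stays available
    final int memoryMb;

    PowerServerInfo(String ipAddress, String processorType,
                    int availableFromHour, int availableHours, int memoryMb) {
        this.ipAddress = ipAddress;
        this.processorType = processorType;
        this.availableFromHour = availableFromHour;
        this.availableHours = availableHours;
        this.memoryMb = memoryMb;
    }

    // True if the machine is available at the given hour; the window may wrap past midnight.
    boolean isAvailableAt(int hour) {
        int offset = Math.floorMod(hour - availableFromHour, 24);
        return offset < availableHours;
    }
}

// The coordinator only keeps the registry and answers queries; it does no computation.
class Coordinator {
    private final Map<String, PowerServerInfo> registry = new LinkedHashMap<>();

    // Registers a new power server, or updates an existing entry for the same IP.
    synchronized void register(PowerServerInfo info) {
        registry.put(info.ipAddress, info);
    }

    // Returns up to maxServers IP addresses that satisfy the user's requirements.
    synchronized List<String> match(int hour, int minMemoryMb, int maxServers) {
        List<String> ips = new ArrayList<>();
        for (PowerServerInfo info : registry.values()) {
            if (ips.size() == maxServers) break;
            if (info.isAvailableAt(hour) && info.memoryMb >= minMemoryMb) {
                ips.add(info.ipAddress);
            }
        }
        return ips;
    }
}

class CoordinatorDemo {
    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();
        // An office PC that is idle from 17:00 for 16 hours, and a workstation idle all day.
        coordinator.register(new PowerServerInfo("192.168.0.10", "x86", 17, 16, 512));
        coordinator.register(new PowerServerInfo("192.168.0.11", "sparc", 0, 24, 128));
        System.out.println(coordinator.match(2, 256, 10));
        System.out.println(coordinator.match(12, 64, 10));
    }
}
```

The client would then use the returned addresses to invoke servlets on the matched power servers directly, so the coordinator stays out of the data path.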
For very large P2P systems, multiple levels of coordinators for each country, city, and organization might be necessary. The organization coordinator will record the IP addresses of computers in its organization. The city coordinator will record the addresses of all organization coordinators and power servers in the city. A global system will include many country, city, and organization coordinators.
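One way to picture the multi-level arrangement is as a tree in which each coordinator knows its own power servers and the coordinators one level below it. The sketch below is an assumption-laden illustration (the `CoordinatorNode` class, its methods, and the depth-first search order are all hypothetical), not a prescribed design.

```java
import java.util.ArrayList;
import java.util.List;

// A coordinator at any level: country, city, or organization.
class CoordinatorNode {
    final String name;
    private final List<CoordinatorNode> children = new ArrayList<>();
    private final List<String> powerServers = new ArrayList<>();

    CoordinatorNode(String name) {
        this.name = name;
    }

    // Records a lower-level coordinator (for example, an organization under a city).
    CoordinatorNode addChild(CoordinatorNode child) {
        children.add(child);
        return this;
    }

    // Records a power server registered directly with this coordinator.
    CoordinatorNode addPowerServer(String ipAddress) {
        powerServers.add(ipAddress);
        return this;
    }

    // Collects up to `limit` addresses: this coordinator's own servers first,
    // then those of lower-level coordinators, depth first.
    List<String> findPowerServers(int limit) {
        List<String> found = new ArrayList<>();
        collect(limit, found);
        return found;
    }

    private void collect(int limit, List<String> found) {
        for (String ip : powerServers) {
            if (found.size() >= limit) return;
            found.add(ip);
        }
        for (CoordinatorNode child : children) {
            if (found.size() >= limit) return;
            child.collect(limit, found);
        }
    }
}

class CoordinatorTreeDemo {
    public static void main(String[] args) {
        CoordinatorNode org = new CoordinatorNode("organization")
                .addPowerServer("10.1.0.1")
                .addPowerServer("10.1.0.2");
        CoordinatorNode city = new CoordinatorNode("city")
                .addPowerServer("10.2.0.1")
                .addChild(org);
        CoordinatorNode country = new CoordinatorNode("country").addChild(city);
        System.out.println(country.findPowerServers(10));
    }
}
```

A request can start at whichever level is closest to the user and only climb to a higher coordinator when the local pool is too small.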
Incentives. If we want to build a P2P system consisting of different organizations and individuals, we must provide incentives to the participants. One way to do this is to set up an association in which members (organizations and individuals) can share each other’s computing power. Members of the association commit to connecting to the Internet for agreed amounts of time, allowing other members access to computing resources.
It might also be possible to create a market for surplus computing power; individuals or organizations could sell their unused computing power and earn income, which would increase the return on investment and shorten the payback period of the computers.
Comparison. In addition to overcoming all the problems of the anti-cancer project, the power server model has the advantage that participants can initiate new projects at any time without having to upgrade or add software to the power servers. Moreover, each computer owner can be confident his/her computer is safe, as it is protected by the owner's own security manager configuration. The limitation of this model is that the programs must be written in Java, as it is the only language that can run across all platforms. (The differences between all three models are summarized in the table here.)
Conclusion
Many enterprises are waiting for vendors to develop software products that allow for sharing CPU power with each other. Existing Web server technologies can be easily extended using the model described here. In addition to saving the money not spent on expensive dedicated P2P software packages, P2P systems using Web server technologies will be easier to maintain as the features of Web servers are already well understood. The Web server will become a power server by adding a few Java programs.
We have presented an inexpensive way to dramatically increase computing power. P2P networks for power exchange will be popular in the future. Many organizations and individuals will discover they can do things in different, much more powerful, ways as they obtain increased access to computing power.
References
1. Barkai, D. P2P Computing. Intel Press, Santa Clara, CA 2002.
2. Foster, I., Kesselman, C., Nick, J. and Tuecke, S. The physiology of the grid: An open grid services architecture for distributed systems integration. Open Grid Service Infrastructure WG, Global Grid Forum, 2002.
3. Foster, I., Kesselman, C. and Tuecke, S. The anatomy of the grid: Enabling scalable virtual organizations. International J. Supercomputer Applications 15, 3 (2001).
4. Goldman, J., Rawles, P. and Mariga, J. Client/Server Information Systems. Wiley, Hoboken, NJ, 1999.
5. Hunter, J. and Crawford, W. Java Servlet Programming. O’Reilly, Sebastopol, CA, 2001.
6. Loo, A., Bloor, C. and Choi, C. Parallel computing using Web servers and servlets. Internet Research 10, 2 (2000).
7. Schwartz, D. Cooperating Heterogeneous Systems. Kluwer, Dordrecht, The Netherlands, 1995.
Alfred W. Loo ([email protected]) is an assistant professor in the Department of Information Systems at Lingnan University, Tuen Mun, Hong Kong. Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, to republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
© 2003 ACM 0002-0782/03/0900 $5.00