
How Peer-to-Peer File Sharing Works

8 September, 2015 - 15:12

Previously we noted the rise of client/server computing in organizations, including the use of “fat” clients containing many applications and some data, and “thin” clients that store no data and contain so little application software that they resemble a dumb terminal (though much nicer to work with, since a thin client would normally have a web browser and might support both audio and video). Some industry leaders, such as Oracle CEO Larry Ellison, predicted in the nineties that the PC would become obsolete because networks would offer such robust bandwidth and speed that users could work online with applications so transparently that they would barely be aware the data and applications were not stored on their thin client devices. Others, however, took a different view. They reasoned that since many users make regular use of a few key applications (such as word processing), there would continue to be a need, whether real or perceived, for those applications to reside directly on the user’s PC (making it a “fat” client). Since, thanks to Moore’s Law, those devices were getting ever more powerful yet less expensive, most users’ PCs would have a great deal of excess computing capacity. Instead of relying on a client/server architecture to serve up applications via the Internet, why not find ways to tap into the excess storage and processing capacity of the many PCs already connected to the net?

When FTP is used to share files, an FTP server must (at least temporarily) store the file before it can be forwarded to its destination. This is not necessary in a peer-to-peer network, in which resources (data and applications) are distributed throughout “peer” computers instead of being concentrated in server computers. Once I have obtained a particular file from another user, my machine can then provide it to other users on the network. Any peer computer can at different times act as either a client or a server. Some services are designed to ensure that every participant is both a “taker” and a “giver”; users who prevent access to files residing on their computers are eventually barred from further use of the network (a strategy known as “give to get”).
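To make the “every peer is both client and server” idea concrete, here is a minimal sketch in Python. It rests on assumptions that are not part of the text above: peers expose a shared directory over plain HTTP on a fixed port, and the requester already knows the other peer’s address and the file name. A real P2P application would add discovery, search, and access control on top of this.

```python
# Minimal sketch of a peer that is both a "giver" and a "taker".
# Assumptions (not stated in the text): peers share files over plain HTTP
# on port 8000, and the requester already knows the other peer's address
# and the file name.
import os
import threading
import urllib.request
from functools import partial
from http.server import HTTPServer, SimpleHTTPRequestHandler

SHARED_DIR = "shared"   # files in here are offered to other peers
PORT = 8000

def start_serving():
    """Act as a server: let other peers download anything in SHARED_DIR."""
    os.makedirs(SHARED_DIR, exist_ok=True)
    handler = partial(SimpleHTTPRequestHandler, directory=SHARED_DIR)
    server = HTTPServer(("", PORT), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server

def fetch_from_peer(peer_host, filename):
    """Act as a client: download a file from another peer, then keep it
    in SHARED_DIR so this machine can pass it on in turn ("give to get")."""
    with urllib.request.urlopen(f"http://{peer_host}:{PORT}/{filename}") as resp:
        data = resp.read()
    with open(os.path.join(SHARED_DIR, filename), "wb") as f:
        f.write(data)
    return len(data)

if __name__ == "__main__":
    start_serving()                                  # give
    # fetch_from_peer("192.0.2.10", "song.mp3")      # take (hypothetical peer)
```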

Early P2P applications worked as follows: users install a P2P software application on their machines. When a user wants a particular file, he issues a request for that file, and agent software then searches the Internet for machines running the P2P software. On each such machine found, the agent searches for the requested file. This process worked reasonably well at first, since computers operate very rapidly and the random path of searches meant that different users’ PCs were tapped at different times. However, it was inherently inefficient: if multiple users sought the same file, each user’s agent would repeat the same process of hunting for machines and then checking the contents of each one it encountered until it found the desired file. So, new P2P application software was soon developed that would create one or more indexes, stored on one or more computers. Most P2P software today involves creating at least one index, such as a directory of machines running the P2P software. This index can be kept on every participant’s machine or, as with the Gnutella P2P service, on a small number of machines that act as directory servers. Upon receiving a user request, the agent first examines the directory on one of these machines, then uses that information to contact the machines that hold the requested file.
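The shift from blind searching to directory lookup can be illustrated with a small sketch. The class name, methods, and peer addresses below are illustrative assumptions rather than any particular P2P protocol: a directory server keeps a mapping from file names to the peers that report holding them, so a requesting agent can ask the index once instead of probing machine after machine.

```python
# Sketch of the index (directory) idea: each entry maps a file name to the
# peers known to hold a copy. Names and addresses are illustrative only.
from collections import defaultdict

class FileIndex:
    """A directory server's view of which peers currently share which files."""

    def __init__(self):
        self._entries = defaultdict(set)   # filename -> set of peer addresses

    def register(self, peer, filenames):
        """A peer announces the files it is willing to share."""
        for name in filenames:
            self._entries[name].add(peer)

    def unregister(self, peer):
        """Remove a departing peer from every entry."""
        for holders in self._entries.values():
            holders.discard(peer)

    def lookup(self, filename):
        """Return the peers that report having the file (possibly none)."""
        return sorted(self._entries.get(filename, ()))

# Usage: the requesting agent consults the index, then contacts a peer directly.
index = FileIndex()
index.register("10.0.0.5", ["song.mp3", "lecture.mp4"])
index.register("10.0.0.9", ["song.mp3"])
print(index.lookup("song.mp3"))   # ['10.0.0.5', '10.0.0.9']
```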

Some P2P services also create an index of the music files stored on specific users’ machines. Since the directory/index itself is small (in comparison with the music files themselves), most experts don’t consider this approach a violation of the basic P2P principle: the music and video files to be shared are distributed across users rather than stored on a few centralized servers. Music files are large; when they are stored centrally, a server can easily become paralyzed by multiple simultaneous user requests. In contrast, when these files are distributed across the Internet, no one user’s machine is heavily burdened. Also, to reduce the burden on individual users, some services have different users’ machines provide portions of a large file rather than the entire multi-megabyte file. Thanks to metadata describing where each block fits, these portions are then reassembled at the recipient’s machine.
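The block-splitting idea can be sketched as follows. The 256 KB block size and the metadata fields are assumptions chosen for illustration; the point is simply that each block carries enough metadata (its position and the total block count) for the recipient to reassemble the file even when pieces arrive from different peers in arbitrary order.

```python
# Sketch of block-wise transfer: a large file is cut into fixed-size blocks
# that different peers can supply, and simple metadata (block index and
# total count) lets the recipient put them back in order.
# The 256 KB block size and field names are illustrative assumptions.
import random

BLOCK_SIZE = 256 * 1024

def split_into_blocks(data: bytes):
    """Return (metadata, block) pairs describing one file."""
    blocks = [data[i:i + BLOCK_SIZE] for i in range(0, len(data), BLOCK_SIZE)]
    return [({"index": i, "total": len(blocks)}, block)
            for i, block in enumerate(blocks)]

def reassemble(pieces):
    """Rebuild the original bytes from pieces that may arrive out of order."""
    ordered = sorted(pieces, key=lambda piece: piece[0]["index"])
    if len(ordered) != ordered[0][0]["total"]:
        raise ValueError("some blocks are still missing")
    return b"".join(block for _, block in ordered)

if __name__ == "__main__":
    original = bytes(3 * BLOCK_SIZE + 100)   # a dummy "large file"
    pieces = split_into_blocks(original)
    random.shuffle(pieces)                   # simulate out-of-order arrival
    assert reassemble(pieces) == original
```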