AGNUS - The Altruistic GNUtella Server
The use of peer-to-peer systems for sharing information, files and other resources has risen dramatically over the last three years. File sharing is the ‘killer app’ that has driven this explosion in popularity. The first generation of peer-to-peer file sharing systems, including Napster, Morpheus and Kazaa, followed the traditional client-server paradigm. However, concerns over the legality and scalability of such systems have driven the development of entirely decentralized file sharing protocols, the most popular of these being Gnutella. To date, decentralized file sharing systems have been unable to compete with those based on a centralized architecture in terms of the QoS (quality of service) they offer to users. AGnuS improves QoS across the Gnutella network by increasing file availability, improving network friendliness, increasing file quality and reducing file-acquisition time. These results are achieved by layering caching, load balancing, content-based routing and file filtering services on top of the core Gnutella protocol. AGnuS is implemented using the base Gnutella protocol in order to maintain compatibility with the large number of existing Gnutella hosts. Our experimental results show significant QoS improvements when compared to a network of generic Gnutella hosts.
A cursory analysis of the Gnutella protocol shows that it cannot scale to a very large number of users. The overhead incurred by processing messages from every other host grows exponentially with network size. Supporting one million users on Gnutella would require over 2 GBps of bandwidth [1]. This is great enough to bring the Internet infrastructure to its knees.
The current scalability policy used in Gnutella is to segment the network by affixing to each message a time-to-live (TTL) value, which limits how far the message will travel through the Gnutella network. This effectively creates a 'search horizon' around each host, limiting its access to the network. Typically a user will have access to 10,000 hosts at any one time. While this addresses the scalability issues encountered on Gnutella, it reduces network coverage to approximately one tenth of the total number of hosts.
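The search horizon created by the time-to-live (TTL) value can be illustrated with a small simulation. The following is a minimal sketch, not Gnutella's wire protocol: the `flood` function, the `overlay` graph and its host names are all hypothetical, and duplicate suppression is modelled with a simple set of already-visited hosts.

```python
from collections import deque


def flood(graph, origin, ttl):
    """Return the set of hosts reached by a query flooded from `origin`.

    `graph` maps each host to the list of peers it is connected to.
    The TTL is decremented on every hop; a message whose TTL has reached
    zero is not forwarded any further.
    """
    seen = {origin}          # hosts that have already handled this query
    reached = set()          # hosts inside the search horizon
    queue = deque([(origin, ttl)])
    while queue:
        host, remaining = queue.popleft()
        if remaining == 0:
            continue         # TTL expired: this host does not forward
        for peer in graph[host]:
            if peer not in seen:
                seen.add(peer)
                reached.add(peer)
                queue.append((peer, remaining - 1))
    return reached


# Hypothetical six-host overlay used only for illustration.
overlay = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "d"],
    "d": ["b", "c", "e"],
    "e": ["d", "f"],
    "f": ["e"],
}

horizon = flood(overlay, "a", 2)
```

With a TTL of 2, host `a` reaches `b`, `c` and `d`, while `e` and `f` lie beyond its horizon even though they are part of the same network.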
Agnus provides scalability through differentiation rather than segmentation. Each Agnus host connects to peers according to their content, making it possible to create a series of virtual networks running over Gnutella. By forwarding requests to the appropriate virtual network (e.g. audio, video, document), greater coverage of the network can be achieved. Each host still has access to only 10,000 peers. However, each of those peers is carrying the file type that the host is searching for, resulting in a far greater quantity of quality search hits.
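The idea of grouping peers into virtual networks by content can be sketched as follows. This is an illustrative assumption rather than the actual Agnus implementation: the extension table, the rule of classifying a peer by its most common file type, and all function and host names are hypothetical.

```python
import os
from collections import Counter, defaultdict

# Hypothetical mapping from file extension to virtual network.
CATEGORIES = {
    ".mp3": "audio", ".ogg": "audio",
    ".avi": "video", ".mpg": "video",
    ".pdf": "document", ".txt": "document",
}


def category_of(filename):
    """Classify a file name by its extension; unknown types map to 'other'."""
    ext = os.path.splitext(filename.lower())[1]
    return CATEGORIES.get(ext, "other")


def dominant_category(shared_files):
    """Return the category that accounts for most of a peer's shared files."""
    counts = Counter(category_of(name) for name in shared_files)
    return counts.most_common(1)[0][0] if counts else "other"


def build_virtual_networks(peers):
    """Group peers into virtual networks keyed by their dominant category."""
    networks = defaultdict(list)
    for peer, shared_files in peers.items():
        networks[dominant_category(shared_files)].append(peer)
    return dict(networks)


def route(wanted_filename, networks):
    """Return the peers a request for `wanted_filename` should be sent to."""
    return networks.get(category_of(wanted_filename), [])


# Hypothetical peers and their shared files.
peers = {
    "host1": ["a.mp3", "b.mp3", "c.pdf"],
    "host2": ["x.avi", "y.mpg"],
    "host3": ["song.ogg"],
}
networks = build_virtual_networks(peers)
audio_targets = route("track.mp3", networks)
```

In this example a request for an audio file is forwarded only to `host1` and `host3`, so none of the host's limited connections are spent on peers that mainly carry video.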
The decentralised and unstructured nature of the Gnutella network means that it is impossible to intelligently provision the network with resources: Users in one search horizon may find many more results than users in another. This inequality increases further when one is looking for an uncommon file type or a rare file.
The Xerox PARC paper "Free Riding on Gnutella" [2] found that 50% of files are served by only 1% of hosts. This makes the actual architecture of Gnutella much closer to the traditional client-server model than to a peer-to-peer one. It also makes the system as a whole vulnerable to the failure of these large servers and makes the servers themselves legitimate targets for legal action.
Agnus seeks to address the issues of resource distribution using file replication and content-based routing. Each Agnus host locates and caches popular files in order to serve them to the network, thus providing a small net gain to the Gnutella community while saving the user time via automatic downloads. Content-based routing is implemented by having each Agnus host scan for the local density of different file types and direct queries to the most appropriate area of the network.
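One way a host might decide which popular files to replicate is sketched below. The selection policy shown here (rank files by how often they appear in observed search results, then fill a fixed cache budget greedily) is an assumption made for illustration; the source does not specify how Agnus measures popularity, and `choose_cache` and its parameters are hypothetical names.

```python
from collections import Counter


def choose_cache(observed_hits, sizes, budget):
    """Pick files to cache, most popular first, within a size budget.

    observed_hits: file names seen in search results passing through the host
    sizes:         mapping from file name to size (any consistent unit)
    budget:        total cache capacity in the same unit
    """
    popularity = Counter(observed_hits)
    chosen, used = [], 0
    for name, _count in popularity.most_common():
        size = sizes[name]
        if used + size <= budget:   # skip files that no longer fit
            chosen.append(name)
            used += size
    return chosen


# Hypothetical observations; sizes are in megabytes.
hits = ["a.mp3", "b.avi", "a.mp3", "c.pdf", "a.mp3", "b.avi"]
sizes = {"a.mp3": 4, "b.avi": 700, "c.pdf": 1}
to_cache = choose_cache(hits, sizes, budget=10)
```

Here `a.mp3` is cached first because it is requested most often, `b.avi` is skipped because it exceeds the remaining budget, and `c.pdf` fills part of the leftover space.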
One of the broader conclusions of the Xerox paper was that rational users on Gnutella will not share or will share very little of their bandwidth. The average user will always seek to maximise the benefit they accrue from the system even if it is to the detriment of the community as a whole - the "Tragedy of the Digital Commons" [3].
Agnus provides a tangible reward for sharing: by automating the download of popular files, users can leave their PC running overnight, or at any time they are away from their computer, and return to find a selection of the most popular files, while also benefiting the Gnutella community.
1. "Why Gnutella Can't Scale. No, Really." - Jordan Ritter
2. "Free Riding on Gnutella" - Eytan Adar & Bernardo Huberman
3. "The Tragedy of the Commons" - Garrett Hardin
4. http://polo.lancs.ac.uk/p2p/DocumentsMain.htm