The Open P2P Tracing Project
Internet-scale public resource computing (PRC), in which users at the edge of the network donate resources to provide a service, has become massively popular, particularly in the domain of P2P file sharing. Popular file sharing systems are large-scale, anonymous and decentralised. The digital communities that they support are a significant and novel arena in which to study human social interaction.
By building a better understanding of how users behave and why they behave this way, it might be possible to build better P2P systems and, perhaps, to bridge the gap between the behaviour of P2P users and what society considers acceptable.
Despite the particularly critical role of user behaviour in PRC, and specifically in P2P, relatively little work has been conducted on accurately characterising and understanding user behaviour. More importantly, studies of user behaviour on such systems rarely go beyond the ‘bits and bytes’ to examine the social and psychological factors that drive user behaviour in these novel digital communities.
We are in the process of building an open system for P2P tracing which, we hope, will address some of these issues by providing a freely available, significant and continually updated body of trace data that can be easily accessed over the web.
Shortcomings of Existing Studies
There are a number of notable shortcomings in current studies of user behaviour on P2P file sharing systems:
Most studies of user behaviour on P2P file sharing networks concern themselves only with the technical aspects of user behaviour (files shared, bandwidth usage, etc.). While this information is critical for simulation of P2P traffic and for the development of approaches encouraging positive user behaviour, the next step, reasoning about the social and psychological factors which produce this behaviour, is rarely taken. Furthermore, most studies do not take into account the real-world factors which may affect user behaviour.
Another significant gap in the body of work on P2P user behaviour is the identification of underlying trends. For example, the data-point provided by Adar’s 2000 study of free riding [1] has been used in a significant body of research up until the present day. However, when we revisited this study in 2005, we found that free riding had increased, revealing a significant and (until that point) unidentified trend [2]. To reveal such trends, ongoing monitoring of user behaviour is required.
Finally, most studies of user behaviour use closed data sets. This is highly undesirable, as gathering a sufficiently significant trace can be an extremely time-consuming process that each group of researchers must then repeat for themselves. The use of closed data sets also prevents data-points from being re-visited using different methodologies.
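As a rough illustration of the kind of trend analysis that ongoing monitoring makes possible, the sketch below computes the fraction of free riders per measurement period. It is a minimal sketch under stated assumptions, not the project's tooling: the record layout (period, peer identifier, number of files shared), the function name and the definition of a free rider as a peer observed sharing no files are all assumptions made for illustration, and the sample values are invented and are not results from [1] or [2].

```python
from collections import defaultdict


def free_riding_by_period(records):
    """Return {period: fraction of peers observed sharing zero files}.

    `records` is an iterable of (period, peer_id, files_shared) tuples.
    This layout is a hypothetical trace format chosen for illustration.
    """
    shared_by_period = defaultdict(dict)
    for period, peer_id, files_shared in records:
        # A peer may be observed several times in one period; keep the
        # largest share count seen, so a peer counts as a free rider only
        # if it was never observed sharing anything in that period.
        previous = shared_by_period[period].get(peer_id, 0)
        shared_by_period[period][peer_id] = max(previous, files_shared)

    return {
        period: sum(1 for n in shared.values() if n == 0) / len(shared)
        for period, shared in shared_by_period.items()
    }


# Invented sample data, for illustration only (not measured values).
sample = [
    ("2000", "a", 0), ("2000", "b", 12), ("2000", "c", 3), ("2000", "d", 0),
    ("2005", "a", 0), ("2005", "b", 0), ("2005", "c", 5), ("2005", "d", 0),
]
fractions = free_riding_by_period(sample)
for period in sorted(fractions):
    print(period, fractions[period])
```

With an open data set, the same calculation could be re-run by anyone as new trace periods are published, or repeated with a different definition of free riding.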
The Open P2P Tracing Project aims to improve our understanding of user behaviour on P2P resource sharing networks by providing continually updated and historical network trace data, which will be made freely accessible to all interested parties. It is our hope that this open data set will grow over time into a resource capable of better informing P2P simulations and supporting research into encouraging positive user behaviour.
Locally, we intend to use the data produced by this system to explore the social and psychological factors which determine user behaviour. This will be done in collaboration with researchers at Lancaster and York St. Johns psychology department. We are also interested in the creation of tools to facilitate the interpretation of this data set, particularly the use of natural-language processing to categorise resource discovery traffic.
The project is currently in the early stages of development. As monitoring data becomes available, it will be made accessible to the research community through this website.
If you are interested in using this data set or participating in any other way, please contact us.
1. Free Riding on Gnutella (2000): http://www.firstmonday.dk/issues/issue5_10/adar/
2. Free Riding on Gnutella Revisited: The
3. Is Deviant Behaviour the Norm on P2P File Sharing Networks? (in press): On this site and at http://dsonline.computer.org/