Window, is proposed to address heterogeneity in large-scale P2P networks. The algorithm lets each node decide how much information to collect according to its own capacity, so that bandwidth is used efficiently. With PeerWindow, each node can gather information from thousands of other nodes while contributing only 1 kbps of bandwidth.
(2) An application-layer multicast algorithm, called Prefix-Matching Multicast, is introduced for heterogeneous environments. This algorithm guarantees message delivery to all relevant nodes without redundancy, ensuring each node receives a message only once. Theoretical analysis proves its completeness, and experimental results confirm its high multicast efficiency.
(3) The Tourist routing protocol is presented to overcome the lack of adaptivity in existing structured overlay network routing algorithms. Tourist optimizes routing efficiency by adapting to the available bandwidth resources across all nodes in the system. It automatically tunes itself to achieve optimal performance in a super-large-scale P2P network of up to 5 million nodes, where all messages can be routed within two hops.
(4) PB-link Tree, a P2P indexing management algorithm, is proposed to reduce bandwidth consumption during joint queries. By distributing a B+ tree across multiple nodes using hash-based localization, PB-link Tree eliminates the need to transmit large volumes of intermediate query results. Experimental evaluations show that PB-link Tree outperforms the conventional DB-link Tree, transmitting less data and completing queries in less time.
(5) Granary, a wide-area network distributed storage system, is developed and implemented, incorporating the research outcomes. Granary facilitates object-oriented data storage and management, supports attribute-based queries, offers improved data access patterns, and enhances query processing capabilities, simplifying the development of upper-layer applications.
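The exactly-once delivery property claimed for contribution (2) can be illustrated with a minimal sketch. It is not the Prefix-Matching Multicast algorithm itself, whose details are not given here. It assumes that nodes have unique, equal-length binary identifiers and that every node knows the full membership list. The names `forward` and `run_multicast` are hypothetical. Each node is responsible for all identifiers sharing a prefix with its own and hands each disjoint sub-region to a single delegate, so no node is reached twice.

```python
from typing import Dict, List


def forward(node: str, prefix_len: int, nodes: List[str], counts: Dict[str, int]) -> None:
    """Deliver to `node`, which is responsible for every ID sharing node[:prefix_len]."""
    counts[node] = counts.get(node, 0) + 1
    for i in range(prefix_len, len(node)):
        # IDs that agree with `node` on the first i bits but differ at bit i
        # form a region that no other branch of the recursion covers.
        flipped = "1" if node[i] == "0" else "0"
        region = node[:i] + flipped
        members = sorted(n for n in nodes if n.startswith(region))
        if members:
            # Assumption: hand the whole region to one delegate (the smallest ID).
            forward(members[0], i + 1, nodes, counts)


def run_multicast(source: str, nodes: List[str]) -> Dict[str, int]:
    """Return how many times each node received a message sent by `source`."""
    counts: Dict[str, int] = {}
    forward(source, 0, nodes, counts)
    return counts


if __name__ == "__main__":
    ids = ["0000", "0011", "0101", "0110", "1001", "1100", "1111"]
    print(run_multicast("0101", ids))
```

Because the regions handed out at each step are disjoint and together cover every identifier other than the sender's, each node in the example appears in the result with a count of one. A real heterogeneous deployment would replace the full membership list with whatever partial view each node maintains.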
In conclusion, this study examines the challenges that peer-to-peer architecture poses for wide-area network distributed storage systems. It presents solutions for node information collection, multicast in heterogeneous environments, routing efficiency, and indexing management. The implementation of Granary demonstrates that these approaches are practical and supports more reliable, available, and efficient data storage and retrieval in a global context. The findings provide a foundation for future research and development in P2P systems and distributed storage.