Hypercore is a distributed append-only log

Hypercore comes with a secure transport protocol, making it easy to build fast and scalable peer-to-peer applications. Think lightweight blockchain crossed with BitTorrent.

Hypercore currently serves as the foundation for a diverse range of P2P applications, including chat apps, filesystems, databases, and even a browser for the distributed internet. And the whole thing's MIT-licensed, so you can use it as you like!

If you are interested in trying it out, we suggest you go to the GitHub repository of the Node.js implementation.

To get in touch with us, you can join our Discord, or reach out to any of the core team members, @mafintosh, @pfrazee, and @andrewosh.

Append-only logs?

If you are not familiar with append-only logs, they are basically just lists you can only append to. If you think about it in terms of normal array operations, it is a log where you can only call get(index), push(data) and retrieve the log length, but where you can never overwrite old entries.
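To make the array analogy concrete, here is a minimal in-memory sketch. The class name AppendOnlyLog is hypothetical and this is not the Hypercore API; it only shows the three operations described above and the absence of any way to overwrite an entry.

```javascript
// Minimal in-memory append-only log (illustrative only, not the Hypercore API).
class AppendOnlyLog {
  #entries = [] // private, so callers cannot overwrite old entries

  // Append a new entry and return the index it was stored at.
  push(data) {
    this.#entries.push(data)
    return this.#entries.length - 1
  }

  // Read the entry at `index`, or undefined if nothing is stored there.
  get(index) {
    return this.#entries[index]
  }

  // Number of entries appended so far.
  get length() {
    return this.#entries.length
  }
}

const log = new AppendOnlyLog()
log.push('hello')
log.push('world')
console.log(log.get(0), log.length)
```

There is deliberately no set(index, data) method: the only way to change the log is to add to its end.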

Hypercore allows you to quickly distribute these kinds of logs in a peer-to-peer fashion. Each peer can choose to download only the section of the log it is interested in, without having to download everything from the beginning.

Using append-only logs, Hypercore can easily generate compressed bitfields describing which portions of a log a peer has. This, among other things, helps make the replication protocol light and efficient.
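As a rough illustration of why a bitfield over a log compresses well, the sketch below keeps one bit per entry and collapses it into runs. The Bitfield class and the run-length scheme are assumptions made for this example; the encoding Hypercore actually uses is not described here.

```javascript
// Track which log entries a peer has, one bit per entry (illustrative sketch).
class Bitfield {
  constructor(length) {
    this.length = length
    this.bytes = new Uint8Array(Math.ceil(length / 8))
  }

  // Mark entry `index` as downloaded.
  set(index) {
    this.bytes[index >> 3] |= 1 << (index & 7)
  }

  // Check whether entry `index` has been downloaded.
  has(index) {
    return (this.bytes[index >> 3] & (1 << (index & 7))) !== 0
  }

  // Compress into [bit, runLength] pairs. Long stretches of "have" or
  // "missing" collapse into a single pair each.
  toRuns() {
    const runs = []
    for (let i = 0; i < this.length; i++) {
      const bit = this.has(i) ? 1 : 0
      const last = runs[runs.length - 1]
      if (last && last[0] === bit) last[1]++
      else runs.push([bit, 1])
    }
    return runs
  }
}

// A peer that downloaded only entries 100..199 of a 1000-entry log.
const have = new Bitfield(1000)
for (let i = 100; i < 200; i++) have.set(i)
console.log(have.toRuns())
```

Because peers tend to download contiguous sections, the 1000 bits above reduce to three runs: 100 missing, 100 present, 800 missing.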

Secured by Merkle trees and cryptography

Hypercore verifies log contents by building a Merkle tree using the BLAKE2b-256 hash function. Peers can also use this Merkle tree to only download the parts of the log they are interested in, and redistribute these parts to others.

Merkle trees are widely used in many distributed systems. They are normally only used to verify static content, because when content changes the root of the tree changes as well. To support mutable datasets, Hypercore uses asymmetric cryptography to sign the root of the Merkle tree as data is appended to it.
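The idea of signing a Merkle root can be sketched with Node's built-in crypto module. This is a simplified model under stated assumptions: SHA-256 stands in for BLAKE2b-256, Ed25519 is an assumed signature scheme, the tree is a plain binary tree, and merkleRoot is a hypothetical helper. Hypercore's real tree layout and signed payload may differ.

```javascript
const crypto = require('crypto')

// Assumption: SHA-256 stands in for BLAKE2b-256 so only Node's stdlib is needed.
const hash = (...parts) => {
  const h = crypto.createHash('sha256')
  for (const p of parts) h.update(p)
  return h.digest()
}

// Distinct prefixes keep leaf hashes and parent hashes from being confused.
const LEAF = Buffer.from([0])
const PARENT = Buffer.from([1])

// Compute a Merkle root over the entries. An unpaired node is carried up unchanged.
function merkleRoot(entries) {
  let level = entries.map(e => hash(LEAF, Buffer.from(e)))
  if (level.length === 0) return hash(LEAF)
  while (level.length > 1) {
    const next = []
    for (let i = 0; i < level.length; i += 2) {
      next.push(i + 1 < level.length ? hash(PARENT, level[i], level[i + 1]) : level[i])
    }
    level = next
  }
  return level[0]
}

// The log owner holds the private key; readers only need the public key.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519')

const entries = ['a', 'b', 'c']
const root = merkleRoot(entries)
const signature = crypto.sign(null, root, privateKey)

// A reader recomputes the root from the data and checks the owner's signature.
const ok = crypto.verify(null, merkleRoot(entries), publicKey, signature)

// Tampered data yields a different root, so the same signature no longer verifies.
const tampered = crypto.verify(null, merkleRoot(['a', 'x', 'c']), publicKey, signature)
console.log(ok, tampered)
```

Each append changes the root, so the owner signs again after appending. Readers never need the private key to check that the data they received is the data the owner published.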

Only the owner of the private key can append to the log. But one Hypercore is rarely used on its own -- more powerful, multi-user data structures can be created by combining multiple cores. Hypercore's main purpose is to be a building block in other things.

Community

Hypercore is best used as a building block for creating powerful peer-to-peer applications and modules. It has served as the foundation of the Dat project for many years. Here are some of the groups using and building the technology.

Hyperdivision

Hypercore and cryptography related products and consultancy.

Beaker

Beaker is an experimental browser for exploring and building the peer-to-peer Web using Hyperdrives.

Digital Democracy

Digital Democracy works in solidarity with marginalized communities to use technology to defend their rights.

Geut

Custom JS Software Development Team. Code with opinions.

Cabal

Cabal is an experimental p2p community chat platform, focusing on group chat in channels.

Ara

Ara is a set of decentralized, open source software tools that handle real-world, user-level functionality in online identity, content distribution, and rights management.

arso

arso builds tools for decentralized archives of community media, currently developing Sonar, a p2p database and search engine.

Ink & Switch

We are an industrial research lab working on digital tools for creativity and productivity.

Wireline

Wireline includes open source protocols for identity, decentralized credentials, distributed data consistency, and the management of complex peer-to-peer networks.

DatCXX

DatCXX is a work in progress implementation of Dat in C++.

Playproject

The Playproject is an organization using dat for p2p search smart contracts and a distributed file storage network called datdot.

Peermaps

Peermaps is a distributed, offline-friendly alternative to commercial map providers such as Google Maps.

DatRS

Rustlang implementation of the Dat Project protocols. The goal is to eventually bind to C, WebAssembly, Swift and Java to support most native platforms.

Hyperpy

Python implementation of the Hypercore protocol.

Liberate Science

Liberate Science is a worker cooperative democratising research work.

Hyperdrive is a P2P filesystem

Hyperdrive is designed to help you share files quickly and safely, directly from your computer.

Hyperdrive is built using Hypercores. Using an append-only index, you can find just the files you are interested in from a large drive in milliseconds -- all P2P.

By default, readers only download the portions of files they need, on demand. You can stream media from friends without jumping through hoops! Seeking is snappy and there's no buffering.

If you want to learn more about the code, you can see the Node.js implementation in the GitHub repository.

If you want to download or share filesystems, we suggest you read on and try out the daemon. This website is even available as a drive, using the link at the bottom of the page!

The Hyperdrive Daemon

The cross-platform Hyperdrive daemon deals with all the storage and networking details for you. The daemon's CLI lets you create, share, and inspect Hyperdrives in your library, and its gRPC API provides full programmatic control over daemon-managed drives.

You can install the daemon today using the NPM package manager.

npm install -g hyperdrive-daemon

If you're on Mac or Linux, the daemon also offers FUSE support, meaning your drives will appear as normal folders on your computer -- moving files in and out of Hyperdrive is a breeze.

As with our other modules and data structures, you can read more about the daemon in the Hyperdrive Daemon GitHub repository.

How does Hyperdrive work?

Hyperdrive indexes filenames into a Hypercore. To avoid having to scan through this entire Hypercore when you want to find a specific file or folder, filenames are indexed using an append-only hash trie, which we call Hypertrie. The hash trie basically functions as a fast append-only key value store with listable folders.

If you are looking for a simple database abstraction on top of Hypercore, you can also use Hypertrie standalone, outside of Hyperdrive. If you want to learn more about how to use it, we suggest you look at the Hypertrie repository.

Since Hypertrie builds on top of the append-only log, it inherits the same guarantees: every change is versioned by default, making it easy to see historical changes and prevent accidental data loss.

In addition to the Hypertrie, which we refer to as the metadata log, Hyperdrive uses another Hypercore to store the binary file content of each file you insert. This dual-log design makes it easy to replicate or watch only the metadata log, without content, if that is what you are interested in.

Each entry in the Hypertrie links to the content log to signal where a file's binary data starts and ends. Additionally, the entry contains all the normal POSIX data you'd be interested in, such as modification time, creation time, file modes, etc.
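The dual-log design can be sketched with two plain arrays. Everything here is an assumption made for illustration: the helper names writeFile and readFile, the entry field names, the tiny block size, and the linear scan that stands in for the Hypertrie lookup. It is not Hyperdrive's actual entry format.

```javascript
// Two append-only logs, modelled as plain arrays for brevity (assumption).
const metadataLog = [] // one entry per file write: name, stat info, content range
const contentLog = []  // raw file data, split into blocks

// Hypothetical helper: append a file's blocks, then a metadata entry pointing at them.
function writeFile(name, data, blockSize = 4) {
  const start = contentLog.length
  for (let i = 0; i < data.length; i += blockSize) {
    contentLog.push(data.subarray(i, i + blockSize))
  }
  metadataLog.push({
    name,
    content: { start, end: contentLog.length }, // block range, end exclusive
    stat: { size: data.length, mode: 0o644, mtime: Date.now() }
  })
}

// Hypothetical helper: find the newest entry for `name` and read its block range.
// The backwards linear scan stands in for the much faster Hypertrie lookup.
function readFile(name) {
  for (let i = metadataLog.length - 1; i >= 0; i--) {
    const entry = metadataLog[i]
    if (entry.name === name) {
      return Buffer.concat(contentLog.slice(entry.content.start, entry.content.end))
    }
  }
  return null
}

writeFile('/hello.txt', Buffer.from('hello world'))
writeFile('/notes.txt', Buffer.from('p2p'))
writeFile('/hello.txt', Buffer.from('hello again')) // a new version, not an overwrite
console.log(readFile('/hello.txt').toString())
```

A peer that only wants to list files or watch for changes can replicate metadataLog alone. The older version of /hello.txt stays in both logs, which is where the versioning guarantee comes from.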

Mounts

For better composability and collaboration, an entry can also link to a completely different Hyperdrive or Hypercore. We call this feature mounts. Even though it's ostensibly simple, it can be used to build powerful collaborative features.

Internally we have been using Hyperdrive mounts for a concept we call "groupware", where each user mounts their own drive inside a single shared one, then applications render multi-user views over the group drive. The groupware pattern can be used to build lightweight, Dropbox-like applications, among others.
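The groupware pattern can be modelled in a few lines. This is a toy sketch: createDrive and read are hypothetical helpers, and drives are linked by object reference here, whereas a real mount would link to another drive by its key.

```javascript
// Hypothetical model: a drive has a Map of path -> contents and a Map of
// mount point -> another drive.
function createDrive() {
  return { files: new Map(), mounts: new Map() }
}

// Resolve `path`, following the longest matching mount point into the mounted drive.
function read(drive, path) {
  let best = null
  for (const mountPoint of drive.mounts.keys()) {
    const inside = path === mountPoint || path.startsWith(mountPoint + '/')
    if (inside && (best === null || mountPoint.length > best.length)) best = mountPoint
  }
  if (best !== null) {
    // Strip the mount point and continue the lookup inside the mounted drive.
    const rest = path.slice(best.length) || '/'
    return read(drive.mounts.get(best), rest)
  }
  return drive.files.get(path)
}

// Each user writes only to their own drive...
const alice = createDrive()
alice.files.set('/todo.txt', 'ship it')
const bob = createDrive()
bob.files.set('/todo.txt', 'review it')

// ...and the shared group drive mounts all of them.
const group = createDrive()
group.mounts.set('/users/alice', alice)
group.mounts.set('/users/bob', bob)

console.log(read(group, '/users/alice/todo.txt'), read(group, '/users/bob/todo.txt'))
```

An application can then render a multi-user view by reading paths under /users in the group drive, while each user remains the only writer of their own drive.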

We are always exploring new ways to enhance Hyperdrive! If you are interested in collaborating with us, feel free to open an issue on GitHub, reach out at @hypercoreproto, or join our Discord.

Hyperswarm is a DHT for the home

Hyperswarm combines a Kademlia-based distributed hash table (DHT) for global discovery with MDNS to discover peers on local networks.

Hyperswarm is part of the Hypercore protocol but can be used standalone as well. You can learn more on the Hyperswarm GitHub organisation and in the main Node.js implementation repository.

A Kademlia DHT

As with many P2P projects, Hyperswarm uses a DHT to discover peers on the internet. Hypercore maintains a hash of its public key, called a "discovery key", that is used as the topic with which peers share their IPs and ports to discover each other.
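A discovery key can be sketched as a one-way hash of the public key. The discoveryKey function below is hypothetical, SHA-256 is an assumed stand-in for whatever hash Hypercore actually uses, and random bytes stand in for a real public key.

```javascript
const crypto = require('crypto')

// Hypothetical derivation: SHA-256 over the public key bytes (assumption,
// not Hypercore's exact construction).
function discoveryKey(publicKey) {
  return crypto.createHash('sha256').update(publicKey).digest()
}

// Stand-in for a real 32-byte public key.
const publicKey = crypto.randomBytes(32)
const topic = discoveryKey(publicKey)

// Peers announce and look up `topic` on the DHT. The hash is one-way, so DHT
// nodes see the topic without learning the public key behind it.
console.log(topic.toString('hex'))
```

Anyone holding the public key can derive the same topic and find peers, but the topic alone does not reveal the key.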

Hypercore comes with a capability system wherein each peer has to verify they know the correct public key before they start sharing data. Even if you know a Hypercore's discovery key, you must also know the public key in order to download data from peers.
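One generic way to prove knowledge of a key without sending it is a keyed hash over a fresh challenge. The sketch below illustrates that general idea only; the capability function, the use of HMAC-SHA-256, and the nonce exchange are all assumptions, and the text above does not say that Hypercore works this way.

```javascript
const crypto = require('crypto')

// Hypothetical proof: HMAC over a nonce, keyed with the public key.
function capability(publicKey, nonce) {
  return crypto.createHmac('sha256', publicKey).update(nonce).digest()
}

// Stand-in for a Hypercore public key shared out of band.
const publicKey = crypto.randomBytes(32)

// The verifying peer sends a fresh random nonce as a challenge.
const nonce = crypto.randomBytes(32)

// A peer that knows the public key can answer the challenge.
const honest = capability(publicKey, nonce)
// A peer that only knows the discovery key has to guess the key.
const guess = capability(crypto.randomBytes(32), nonce)

// The verifier computes the expected answer and compares in constant time.
const expected = capability(publicKey, nonce)
const honestAccepted = crypto.timingSafeEqual(honest, expected)
const guessAccepted = crypto.timingSafeEqual(guess, expected)
console.log(honestAccepted, guessAccepted)
```

Because the nonce is fresh for every connection, an eavesdropper cannot replay an earlier answer.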

The DHT used in Hyperswarm is based on Kademlia, a UDP-based DHT which is used by many projects, including BitTorrent. You can learn a lot more about Kademlia in the Kademlia paper.
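The core idea in Kademlia is that the distance between two IDs is their bitwise XOR, interpreted as an integer, and lookups move toward the nodes closest to a target. The sketch below shows only that metric; the helper names and the 32-byte ID length are assumptions, and it leaves out routing tables and networking entirely.

```javascript
const crypto = require('crypto')

// XOR distance between two equal-length IDs. The resulting Buffer compares
// like a big-endian unsigned integer.
function distance(a, b) {
  const out = Buffer.alloc(a.length)
  for (let i = 0; i < a.length; i++) out[i] = a[i] ^ b[i]
  return out
}

// Return the `k` node IDs closest to `target` under the XOR metric.
function closest(nodes, target, k) {
  return [...nodes]
    .sort((x, y) => Buffer.compare(distance(x, target), distance(y, target)))
    .slice(0, k)
}

// 50 random node IDs and a random lookup target (for example, a discovery key).
const nodes = Array.from({ length: 50 }, () => crypto.randomBytes(32))
const target = crypto.randomBytes(32)
const nearest = closest(nodes, target, 3)
console.log(nearest.map(id => id.toString('hex').slice(0, 8)))
```

In a real lookup, a peer asks the closest nodes it knows for nodes that are closer still, and repeats until the results stop improving.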

DHTs scale well in general, but are notoriously hard to work with. Hyperswarm employs a series of heuristics to answer queries quickly and garbage collect stale data as fast as possible.

As a general note, most DHTs (Hyperswarm included) expose routing information such as your IP and port to help route requests. Privacy-preserving DHTs are an active area of research for many participants in the P2P ecosystem.

Distributed Holepunching

Ideal P2P networks should be able to connect any two peers together, wherever they are. In practice, this is a challenge due to firewalls which reject incoming connections and NATs which mask your IP. To help “break through” firewalls, a technique called UDP holepunching is used.

Traditionally, UDP holepunching requires centralised servers that every peer in the network knows in advance.

Hyperswarm expands on holepunching by making it a first-class feature: any peer in the DHT can help you holepunch to any other peer that it knows about.

After finding the relevant peers, Hyperswarm utilises the UTP transport protocol to make reliable connection streams between peers. For non-holepunching cases, TCP is used as well.

Importantly, Hyperswarm connections are unencrypted -- encryption must be added separately. As an example, once Hyperswarm has established a connection, Hypercore wraps it in a Noise protocol stream to add end-to-end encryption.