Meet Arken, A Decentralized Digital Archive Built for the World's Open Source and Scientific Data.


A Bit of Backstory

Many researchers, museums, and archivists are struggling to host and protect a vast amount of important public data. Although the price of high-capacity hard drives has decreased significantly in recent years, most open-source and scientific organizations don't have the ability to build and maintain their own hosting infrastructure. As a result, many groups get trapped relying on expensive cloud storage providers. For continuously well-funded projects, cloud storage is an expected line item, but for smaller projects with limited-term grants, researchers are often forced to move their data to offline external hard drives once the grant money runs out.
Moreover, many museums simply can't afford to digitize their collections because of the ongoing monthly expenses that cloud storage would require.
After working with a number of these grant-supported projects and watching them fail to find a sustainable storage solution, we decided to build Arken.

What is Arken?

A Non-Technical Version

Arken is a storage system built to lower the bar for sustainably and affordably hosting and backing up publicly accessible data. Instead of needing a powerful enterprise server, Arken can run on almost anything, including a $35 Raspberry Pi. By using IPFS (the InterPlanetary File System), Arken also makes content publicly accessible without requiring organizations to register a static IP address, learn port forwarding, or set up a web server on their own.
Truly backing up any piece of data means having multiple copies, ideally in geographically different regions. Although anyone can set up their own Arken Cluster (and we encourage you to do so if you're interested and have the means), we'd also like to encourage you to think about joining the Core Arken Cluster. The Core Arken Cluster is the official data repository maintained by the Arken Project, dedicated to preserving open source and scientific data. Besides being an archival system, Arken is also a community of researchers, organizations, and individuals contributing storage space to the Core Cluster. As members of the Core Cluster, organizations agree to host a subset of each other's data. Because the backup is geographically distributed, your data is not lost if any single node fails or loses its connection.

If you are an individual or organization without the ability to host your own system, you may apply for an allocation in the Core Arken Cluster by submitting your files as an application. The Arken Project typically gives allocations to researchers in need whose data is at risk of being lost and whose data will benefit the public by being openly accessible. If you believe your data falls into this category, please submit your application to the core-keyset using the Arken import tool.

A Technical Version

Arken is a management engine that runs on top of the IPFS (InterPlanetary File System) protocol. Each instance of Arken calculates which important files are hosted by the fewest other nodes on the network and should therefore be backed up locally to reduce the risk of data loss. Arken also tracks how much space it is using on your system and respects the limits you set by locally deleting data that is backed up more than a specified number of times across the cluster.
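
The selection logic described above can be sketched roughly as follows. This is an illustration only, not Arken's actual implementation: the `FileRecord` type, both function names, and the greedy ordering are assumptions made for the example, and `replicas` is assumed to mean the number of cluster nodes known to hold a file.

```python
from dataclasses import dataclass


@dataclass
class FileRecord:
    """Hypothetical record for one file tracked by a node (not Arken's real type)."""
    file_id: str
    size_bytes: int
    replicas: int  # assumed: number of cluster nodes known to hold this file


def choose_files_to_pin(remote_files, free_bytes):
    """Pick the least-replicated files first, skipping any that do not fit."""
    chosen = []
    # Fewest replicas first; file_id breaks ties so the order is deterministic.
    for f in sorted(remote_files, key=lambda f: (f.replicas, f.file_id)):
        if f.size_bytes <= free_bytes:
            chosen.append(f.file_id)
            free_bytes -= f.size_bytes
    return chosen


def choose_files_to_drop(local_files, used_bytes, limit_bytes, max_replicas):
    """While over the space limit, drop local copies of over-replicated files."""
    dropped = []
    # Most-replicated files are the safest to delete locally, so try them first.
    for f in sorted(local_files, key=lambda f: -f.replicas):
        if used_bytes <= limit_bytes:
            break
        if f.replicas > max_replicas:
            dropped.append(f.file_id)
            used_bytes -= f.size_bytes
    return dropped
```

In this sketch a file is only ever dropped when its replica count exceeds `max_replicas`, so a node that is over its limit but holds only rare files keeps them rather than putting data at risk.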

Interested in Uploading Data to the Core Arken Cluster?

If you are a researcher, collections curator, or open source publisher, check out our guide on using the Arken Import Tool to submit and upload your data to the Core Arken Cluster here.



Want to explore the data already stored in the Core Arken Cluster?

Check out the Core Keyset and open up a keyset file to explore the File IDs contained within. To view a file in your browser or download it, you can then paste its File ID into the Arken to Web Link like so:

https://link.arken.io/ipfs/REPLACE-ME-WITH-FILE-ID
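
If you have many File IDs to look up, the link can also be built in a script. The snippet below is a minimal sketch that assumes only the URL pattern shown above; the `arken_web_link` helper is a hypothetical name and is not part of any Arken tool.

```python
from urllib.parse import quote

# Base of the Arken to Web Link, taken from the URL pattern shown above.
GATEWAY_BASE = "https://link.arken.io/ipfs/"


def arken_web_link(file_id: str) -> str:
    """Return a browser link for a File ID copied from a keyset file."""
    file_id = file_id.strip()  # tolerate stray whitespace from copy and paste
    if not file_id:
        raise ValueError("file ID must not be empty")
    # Percent-encode the ID so unexpected characters cannot alter the URL path.
    return GATEWAY_BASE + quote(file_id, safe="")
```

For example, calling `arken_web_link` with a File ID from a keyset file returns the same link you would otherwise assemble by hand.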