Many researchers, museums, and archivists are struggling to host and
protect a vast amount of important public data. Although the price of
high-capacity hard drives has decreased significantly in recent years,
most open-source and scientific organizations don't have the ability to
build and maintain their own hosting infrastructure. As a result, many groups
get trapped relying on expensive cloud storage providers. For continuously
well-funded projects, cloud storage is an expected line item, but for smaller projects
with limited-term grants, researchers are often forced to move their data to
offline external hard drives once the grant money runs out.
Moreover, many museums simply can't afford to digitize their collections
because of the ongoing monthly expenses cloud storage would require.
After working with a number of these grant-supported projects and watching them fail to find
a sustainable storage solution, we decided to build Arken.
Arken is a storage system built to lower the barrier to sustainably and affordably
hosting and backing up publicly accessible data. Instead of needing a powerful enterprise server,
Arken can run on almost anything, including a $35 Raspberry Pi.
Using IPFS (the InterPlanetary File System), Arken also makes content publicly accessible
without requiring organizations to register a static IP address or learn to port forward and set up
a web server on their own.
Truly backing up any piece of data means having multiple copies, ideally in geographically different regions.
Although anyone can set up their own Arken Cluster (and we encourage you to do so if you're interested and have the means), we'd
also like to encourage you to think about joining the Core Arken Cluster. The Core Arken Cluster is the official data repository maintained by
the Arken Project, dedicated to preserving open source and scientific data. Besides being an archival system, Arken is also a community
of researchers, organizations, and individuals contributing storage space to the core cluster. As members of the Core Cluster,
organizations agree to host a subset of each other's data. Because this creates a geographically distributed backup, your data will not be lost
if any single node fails or loses its connection.
If you are an individual or organization without the ability to host your own system, you may apply for an allocation to the Core Arken Cluster by submitting your files as an application. The Arken Project typically gives allocations to researchers in need whose data is at risk of being lost and whose data will benefit the public by being openly accessible. If you believe your data falls into this category, please submit your application to the core-keyset using the Arken import tool.
Arken is a management engine that runs on top of the IPFS (InterPlanetary File System) protocol. Each instance of Arken calculates which important files are hosted by the fewest other nodes on the network and should therefore be backed up locally to reduce the risk of data loss. Arken also tracks how much space it's using on your system and will respect the limits you set by locally deleting data that is backed up more than a specified number of times across the cluster.
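The selection logic described above can be sketched roughly as follows. This is a minimal illustration and not Arken's actual implementation: the `TrackedFile` type, the `plan_rebalance` function, and its parameters are hypothetical names chosen for this example, and `replicas` is assumed to mean the number of other nodes currently hosting a file. The sketch only frees space when the node is over its limit, then pins the least-replicated files that still fit.

```python
from dataclasses import dataclass


@dataclass
class TrackedFile:
    # Hypothetical record for one file listed in a keyset.
    file_id: str
    size_bytes: int
    replicas: int  # assumed: number of OTHER nodes hosting this file
    pinned_locally: bool = False


def plan_rebalance(files, space_limit_bytes, replica_threshold):
    """Return (to_pin, to_drop) lists of file IDs for this node."""
    used = sum(f.size_bytes for f in files if f.pinned_locally)
    to_pin, to_drop = [], []

    # Step 1: if over the limit, drop local copies that are already
    # replicated more than `replica_threshold` times elsewhere,
    # starting with the most widely replicated.
    droppable = sorted(
        (f for f in files if f.pinned_locally and f.replicas > replica_threshold),
        key=lambda f: f.replicas,
        reverse=True,
    )
    for f in droppable:
        if used <= space_limit_bytes:
            break
        to_drop.append(f.file_id)
        used -= f.size_bytes

    # Step 2: pin files hosted by the fewest other nodes first,
    # as long as they fit within the remaining space.
    candidates = sorted(
        (f for f in files if not f.pinned_locally),
        key=lambda f: f.replicas,
    )
    for f in candidates:
        if used + f.size_bytes <= space_limit_bytes:
            to_pin.append(f.file_id)
            used += f.size_bytes

    return to_pin, to_drop
```

For example, with two local files of 40 bytes each (one hosted on 5 other nodes, one on 1), a 70-byte limit, and a threshold of 3, the sketch drops the widely replicated file and uses the freed space to pin a 30-byte file that no other node hosts.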
If you are a researcher, collections curator, or open source publisher, check out our guide on using the Arken Import Tool to submit and upload your data to the Core Arken Cluster.
Check out the Core Keyset and open a keyset file to explore the File IDs contained within. To view a file in your browser or download it, paste its File ID into the Arken to Web Link like so:
https://link.arken.io/ipfs/REPLACE-ME-WITH-FILE-ID
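If you have many File IDs to look up, the link can also be built in a script. The following is a small sketch in which the helper name `arken_web_link` is hypothetical; only the gateway prefix comes from the link format shown above.

```python
from urllib.parse import quote

# Gateway prefix taken from the link format shown above.
GATEWAY = "https://link.arken.io/ipfs/"


def arken_web_link(file_id: str) -> str:
    """Build a browser link for a File ID copied from a keyset file."""
    file_id = file_id.strip()
    if not file_id:
        raise ValueError("file_id must not be empty")
    # Percent-encode the ID in case stray characters were copied with it.
    return GATEWAY + quote(file_id, safe="")
```

Calling `arken_web_link` with a File ID copied from a keyset file returns a URL you can open in a browser or pass to a download tool.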