Uploading Data to the Cluster

Introduction

What is the Core Arken Cluster?

The Core Arken Cluster is the official data repository maintained by the Arken Project, dedicated to preserving open source and scientific data. Besides being an archival system, Arken is also a community of researchers, organizations, and individuals contributing storage space to the core cluster. As members of the Core Cluster, organizations agree to host a subset of each other's data. Because this creates a geographically distributed backup, your data stays available even if any single node fails or loses its connection.

Getting Started: How can I upload data to Arken?

If you're familiar with how GitHub works, submissions to the Core Arken Cluster are done through Pull Requests. If you're not familiar with GitHub, no worries, we'll walk you through it. To keep uploads to the Core Arken Cluster from being open to just anyone, we handle them through application submissions to the Core-Keyset repository.

To write an application you'll need:
1. All of your data available on your local computer (or connected through an external hard drive).
2. A GitHub account. (Click here to sign up for an account.)
3. The Arken Import Tool. (Click here to follow a tutorial to install the Arken Import Tool on your machine.)

If you've got any questions, reach out to us at [email protected].

Getting Started: How can I upload data to my own Arken Cluster?

Keysets don't have to be just for serious stuff, however. Anyone can create their own Keyset and use it to back up and share family photos, song recordings, and so on between their own Arken nodes.

The process for uploading data to your own Arken Cluster is nearly the same as uploading data to the Core Arken Cluster. However, you will need to create your own publicly accessible Git repository and use the link to that repository instead of the link to the Core Arken Keyset. (When creating your Git repository, make sure to create your own keyset.config file.) To add your Keyset to your Arken node, edit the file called .arken/sources by removing the Core Keyset and adding the link to your own Keyset.

The Arken Import Tool

Installing & Using the Arken Import Tool:

Whether you're uploading data to the Core Arken Cluster or to your own Cluster, you'll need to use AIT (the Arken Import Tool) to generate your file manifest and submit it to the Keyset. You can think of a Keyset in Arken as a file manifest for the whole cluster. A Keyset tells the Arken nodes which files to back up and keep track of so that, as nodes join and leave the cluster, the files are always backed up on multiple machines. If you've used Git before, AIT's syntax is very similar. AIT is a command line application, which means that it doesn't have a graphical interface. We know that sounds scary, but we'll walk you through setting it up and using it to upload your data.

First, let's download AIT to your computer. Go to the AIT GitHub page linked below; the latest release of AIT is at the top of the screen. Under the "Assets" section you'll find a number of differently named versions of AIT. Each name corresponds to the operating system and type of computer it's made to run on. For example, "darwin-amd64" is meant to run on an Intel-based Mac (any Mac made before the fall of 2020); here "darwin" means macOS. Right-click the package that matches your operating system and computer type and copy the link to it.

(Note: Windows packages are coming! But for now, please use the Windows Subsystem for Linux to run AIT. After installing WSL please follow the Linux instructions for installing AIT.)

AIT Releases

Now open the terminal on your computer.
If you're on Linux, we'll assume you know how to get to your terminal. If you're on a Mac, press cmd + space bar to open Spotlight search, then type "terminal" and hit return.
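If you're not sure which package matches your machine, the terminal can report your operating system and processor type. The snippet below is a small sketch that prints a name in the same "os-arch" style as "darwin-amd64". The `ait_platform` function is a hypothetical helper, not part of AIT, and it assumes the other release assets follow the same naming pattern (for example "linux-amd64"), so compare its output against the names actually listed under "Assets".

```shell
# Hypothetical helper (not part of AIT): build an "os-arch" name from
# the values reported by `uname -s` and `uname -m`.
# Assumption: release assets are all named in the "darwin-amd64" style.
ait_platform() {
  # Lowercase the OS name: "Darwin" -> "darwin", "Linux" -> "linux"
  os=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')

  # Map common machine names onto the amd64/arm64 style
  case "$2" in
    x86_64|amd64)  arch=amd64 ;;
    arm64|aarch64) arch=arm64 ;;
    *)             arch=$2 ;;   # unknown: pass through unchanged
  esac

  printf '%s-%s\n' "$os" "$arch"
}

# Print the name for the machine you're on
ait_platform "$(uname -s)" "$(uname -m)"
```

On an Intel-based Mac this prints darwin-amd64. If the printed name doesn't appear in the "Assets" list, pick the closest match by hand rather than trusting the helper.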

(If you're on a Mac) Create your /usr/local/bin directory, if it doesn't already exist, by running

$ sudo mkdir -p /usr/local/bin



Now download the AIT application to your local bin by running the following command, replacing "PATH-TO-THE-APPLICATION" with the link that we copied from the AIT GitHub page earlier.

$ sudo curl -L PATH-TO-THE-APPLICATION -o /usr/local/bin/ait

Now we must allow AIT to be executed as an application.

$ sudo chmod a+x /usr/local/bin/ait


(On Linux) Run

$ sudo ln -s /usr/local/bin/ait /usr/bin/ait 


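(On macOS) Run

$ echo 'export PATH=/usr/local/bin:$PATH' >> ~/.profile

(If your terminal uses zsh, the default shell on recent versions of macOS, add the same line to ~/.zprofile instead.)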
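Before changing your shell profile, it can be worth checking whether /usr/local/bin is already on your PATH, since many systems include it by default. The following is a minimal sketch; `path_contains` is a hypothetical helper written for this tutorial, not an AIT or system command.

```shell
# Hypothetical helper: succeed if directory $1 appears as an entry
# in the colon-separated list $2.
path_contains() {
  # Wrapping both sides in ":" ensures only whole entries match
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *)        return 1 ;;
  esac
}

if path_contains /usr/local/bin "$PATH"; then
  echo "/usr/local/bin is already on your PATH"
else
  echo "/usr/local/bin is not on your PATH yet"
fi
```

If the directory is already listed, the profile change is unnecessary and can be skipped.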


You may have to log out and back in, but after that you should be able to see the AIT menu by running:
$ ait 
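If running ait gives a "command not found" error instead of the menu, a quick check like the sketch below can show whether your shell can find the program at all. `check_installed` is a hypothetical helper for illustration; it relies only on the standard `command -v` shell builtin.

```shell
# Hypothetical helper: report where a command was found on the PATH,
# or return a non-zero status if the shell can't find it.
check_installed() {
  if found=$(command -v "$1"); then
    echo "$1 found at $found"
  else
    echo "$1 was not found on your PATH"
    return 1
  fi
}

# Check for AIT and suggest a next step if it's missing
check_installed ait || echo "Try logging out and back in, then re-check the steps above."
```

A "not found" result usually means either the download step failed or /usr/local/bin isn't on your PATH yet.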


If you're uploading to the Core Arken Cluster, click here to check out the AIT tutorial on uploading your data to the Core Arken Cluster.

Demo:

Coming Soon!