Zero Knowledge Cloud Storage

Cubbit's infrastructure revolves around three players:

The User: you access Cubbit directly via your chosen device (computer or phone).

The Swarm: a distributed, P2P network of Cubbit Cells where the data is stored.

The Coordinator: a suite of machine learning algorithms that optimize the payload distribution on the network, while also taking care of security and metadata. It’s also in charge of triggering the recovery procedure for files in the Swarm.

These three components interact to enable safe and private cloud storage inside a zero-knowledge architecture, ensuring that no one in the system, not even the Coordinator, can access the users’ data.

How it works

The Path of a File

Client-side encryption

The client generates a new AES-256 key and uses it to encrypt the file. To allow users to sign in and retrieve their keys from any device, this “file key” is stored on the Coordinator in an encrypted form, using a master key derived from the user's password. This “zero-knowledge” scheme ensures that no third party, not even the Coordinator, can access the user’s data and keys.

Client-side encryption illustration
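
A minimal sketch of this scheme, assuming Python's cryptography package; the function names, the KDF choice (PBKDF2), and the parameters are illustrative, not Cubbit's actual implementation:

    # pip install cryptography
    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.kdf.pbkdf2 import PBKDF2HMAC
    from cryptography.hazmat.primitives.ciphers.aead import AESGCM

    def derive_master_key(password: bytes, salt: bytes) -> bytes:
        # Master key derived from the user's password; it never leaves the client
        kdf = PBKDF2HMAC(algorithm=hashes.SHA256(), length=32,
                         salt=salt, iterations=600_000)
        return kdf.derive(password)

    def encrypt_file(plaintext: bytes):
        file_key = AESGCM.generate_key(bit_length=256)  # fresh AES-256 key per file
        nonce = os.urandom(12)
        ciphertext = AESGCM(file_key).encrypt(nonce, plaintext, None)
        return file_key, nonce, ciphertext

    def wrap_file_key(file_key: bytes, master_key: bytes):
        # Only this wrapped form is stored on the Coordinator, so the
        # Coordinator never sees the file key or the plaintext.
        nonce = os.urandom(12)
        return nonce, AESGCM(master_key).encrypt(nonce, file_key, None)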

File redundancy

The encrypted file is split into N chunks and then processed into K additional redundancy shards through Reed-Solomon error-correcting codes. This procedure allows the retrieval of the payload even if individual Cells go offline, as long as you can reach any set of N Cells. Parameters are dynamically chosen and optimized such that the probability of downtime is lower than 10^-6.

File redundancy illustration
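
A toy sketch of the N-of-(N+K) idea, assuming the Python reedsolo package and a simple byte-striping construction; the parameters (N = 4, K = 2) and helper names are illustrative, while Cubbit picks its codec parameters dynamically:

    # pip install reedsolo
    from reedsolo import RSCodec

    N, K = 4, 2          # any N of the N + K shards recover the file
    rsc = RSCodec(K)     # K parity bytes per codeword -> K erasures correctable

    def encode_shards(data: bytes):
        data += bytes(-len(data) % N)                # pad to a multiple of N
        shards = [bytearray() for _ in range(N + K)]
        for i in range(0, len(data), N):
            codeword = rsc.encode(data[i:i + N])     # N data + K parity bytes
            for j, byte in enumerate(codeword):
                shards[j].append(byte)               # byte j of each codeword -> shard j
        return shards

    def decode_shards(shards, lost):
        # 'lost' holds the indices of offline shards (at most K of them)
        data = bytearray()
        length = len(next(s for j, s in enumerate(shards) if j not in lost))
        for i in range(length):
            codeword = bytearray(0 if j in lost else shards[j][i]
                                 for j in range(N + K))
            message, _, _ = rsc.decode(codeword, erase_pos=list(lost))
            data += message
        return bytes(data)   # caller strips the padding via the stored file size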

Authorization and peer selection

Next, the owner of the file asks the Coordinator for authorization to upload it to Cubbit. The Coordinator, in addition to granting this authorization, assigns a location to the file inside the Swarm, determining which hosting peers are most suitable.

To do so, the Coordinator runs a fitness function designed both to minimize the probability of losing files to natural disasters and to ensure consistent network performance. In other words, the Coordinator spreads the chunks as far apart as possible, while also minimizing network latency and optimizing other factors (bandwidth usage, storage allocation, etc.).

Authorization and peer selection illustration
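
A hypothetical sketch of such a fitness function; the features, weights, and greedy selection below are illustrative, not Cubbit's actual model:

    from dataclasses import dataclass

    @dataclass
    class Peer:
        uptime_ratio: float           # fraction of time the Cell has been online
        latency_ms: float             # measured RTT from the uploader
        free_space_gb: float
        km_to_nearest_chosen: float   # geographic spread vs. peers already picked

    def fitness(p: Peer) -> float:
        return (2.0 * p.uptime_ratio                              # prefer reliable Cells
                + 1.0 * min(p.km_to_nearest_chosen / 500.0, 1.0)  # spread shards apart
                - 1.5 * p.latency_ms / 100.0                      # keep transfers fast
                + 0.5 * min(p.free_space_gb / 100.0, 1.0))        # balance storage

    def select_hosting_peers(candidates, n_plus_k):
        # Greedy pick of the N + K highest-scoring peers; a real selector would
        # re-score geographic spread after each pick rather than once up front.
        return sorted(candidates, key=fitness, reverse=True)[:n_plus_k]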

File Distribution

Each of the N+K shards is stored on a different Cubbit Cell, called a ‘hosting peer’. This means that Cells don’t contain the user's files but encrypted shards of other people's files.

To make this possible, the Coordinator facilitates peer-to-peer connections when needed, acting as a handshake server. Thanks to Reed-Solomon coding, the file remains retrievable as long as at least N hosting peers are online at the same time.

File Distribution illustration
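
A minimal sketch of this placement rule, with hypothetical names: each shard maps to a distinct hosting peer, and the file stays retrievable while at least N of those peers are online:

    def assign_shards(num_shards: int, peers: list):
        # one distinct Cubbit Cell ("hosting peer") per shard
        assert len(peers) >= num_shards, "need a distinct hosting peer per shard"
        return {shard: peer for shard, peer in zip(range(num_shards), peers)}

    def file_available(placement: dict, online_peers: set, n: int) -> bool:
        # Reed-Solomon guarantee: any N online shards suffice for retrieval
        return sum(peer in online_peers for peer in placement.values()) >= n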

Network self-healing

The Coordinator monitors the uptime status of each Cubbit Cell, triggering a recovery procedure when the total number of online shards falls to a threshold of N + K/2, that is, when more than half of the K redundancy shards are offline. When this happens, the Coordinator alerts the remaining hosting peers, which in turn contact other Cells via peer-to-peer, end-to-end encrypted channels to fully restore the number of online shards to the maximum level. It is worth noting that peers can rebuild the missing shards without the intervention of the original owner, since they work on encrypted payloads.

While redundancy parameters alone are tuned to guarantee a statistical uptime of approximately 99.999999999%, this recovery procedure pushes the uptime to virtually 100% by handling history effects such as permanently disconnected peers and by redistributing missing shards over new entries of the Swarm. This is how our zero-knowledge cloud storage works.

Network self-healing illustration
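
A sketch of the self-healing trigger described above, again with hypothetical names; rebuild_shard stands in for the peer-to-peer repair exchange, which runs entirely on encrypted payloads:

    def check_and_repair(placement, online_peers, n, k, rebuild_shard):
        online = [s for s, peer in placement.items() if peer in online_peers]
        if len(online) > n + k / 2:
            return                    # enough redundancy left; nothing to do
        # Any N online shards suffice to reconstruct the missing ones, so the
        # remaining peers can repair the file without involving its owner.
        for shard in set(placement) - set(online):
            placement[shard] = rebuild_shard(shard, sources=online[:n])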

Environmental impact

The internet infrastructure is responsible, as of today, for 10% of total worldwide energy demand [1,2]. Data centers account for one third of it, making “the Cloud”, despite the ephemeral name, an ecological monster that consumes as much energy as the entire United Kingdom (66 million inhabitants and the world's fifth-largest economy).

Cubbit is based on small, optimized single-board computers, whose impact per GB is up to 10 times smaller than that of data center racks. Moreover, it can leverage geographical proximity to avoid long data transfers, which, in certain cases, can consume as much energy as storage itself [3].

19,025 kg of CO2 saved

Environmental impact illustration

The result, detailed in our green paper, is that choosing Cubbit over traditional cloud storage for an average ~5 TB storage plan saves the equivalent of the energy consumption of an always-on fridge.

[1] Greenpeace International. How clean is your cloud? Catalysing an energy revolution. Technical report, 2012.
[2] Mark P. Mills. The cloud begins with coal. Digital Power Group, 2013.
[3] Jayant Baliga, Robert W. A. Ayre, Kerry Hinton, and Rodney S. Tucker. Green cloud computing: Balancing energy in processing, storage, and transport. Proceedings of the IEEE, 99(1):149–167, 2011.

Read our green paper

FAQ

How can I be sure that my data is safe?
Can Cubbit access the stored data?
Where is my data located once it is distributed?
Why this additional step of fragmentation and redundancy?
Why does this technology optimize secure access?
What are the levels of reliability and durability guaranteed by the network?
What do I need to retrieve my data?

71% of ransomware victims lose their data

Fight back now!
By 2026, the business data you'll need to manage will triple (Gartner), putting your ROI at serious risk from ransomware and natural disasters. That's why, together with 50+ leading companies, we've created the one and only cloud with geo-distributed, anti-ransomware S3 backup and instant disaster recovery. Don't pay the ransom: choose Cubbit.
Contact us