Nasuni. A blockchain-like system to store information

How to use Nasuni UniFS® as a blockchain-like system to solve blockchain's regulatory, scalability and storage problems.

--

Abstract

Nowadays everyone wants a blockchain. Its virtues are celebrated while its limitations are little known. Working around those limitations to apply blockchain to real problems has produced a wide variety of blockchain implementations and types that drift away from the original approach while keeping some of its virtues.

Now we have: public blockchains, federated blockchains, consortium blockchains, private blockchains, storage blockchains, blockchains with limited computing capacity, blockchains with more computing capacity, smart contract based blockchains, permissioned blockchains, permissionless blockchains, protocol 2 blockchains, sharded blockchains, Proof of Work based blockchains, Proof of Stake based blockchains, Proof of Space based blockchains and so on… But, can anyone tell me what the hell a blockchain is??

The objective of this article is to show that Nasuni implements a storage system based on the fundamental principles of blockchain technology, while improving on or even completely removing its most important issues for the use case of a permissioned consortium or private blockchain.

Blockchain

Blockchain data structures were popularized by Satoshi Nakamoto’s 2008 paper, “Bitcoin: A Peer-to-Peer Electronic Cash System”, and by the launch of the Bitcoin network in 2009. In this paper Satoshi uses a blockchain data structure as a ledger to securely store the history of bitcoin transactions. But the ideas behind the blockchain are quite old, and trace back to a paper by Haber and Stornetta in 1991.

Their proposal was a method for secure time stamping of digital documents, rather than a digital money scheme. The goal of time stamping is to give an approximate idea about when a document came into existence. More importantly, time stamping accurately conveys the order of creation of these documents: if one came into existence before the other, the timestamps will reflect that. The security property requires that a document’s timestamp can’t be changed after the fact.

In short, the blockchain uses cryptographic hashes and public key signatures to create an append-only, immutable, timestamped chain of content. A copy of the blockchain is kept on each participating node in the network.
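To make the idea concrete, here is a minimal Python sketch of a hash-linked, timestamped, append-only chain. It is an illustration only, not Bitcoin's actual block format: the names `Block`, `append_block` and `verify_chain` are hypothetical, and signatures and distribution across nodes are left out.

```python
import hashlib
import json
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Block:
    index: int
    timestamp: float
    content: str
    prev_hash: str

    def digest(self) -> str:
        # Hash every field, including the previous block's hash,
        # so each block commits to the entire history before it.
        payload = json.dumps(
            [self.index, self.timestamp, self.content, self.prev_hash]
        ).encode()
        return hashlib.sha256(payload).hexdigest()


GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def append_block(chain: list, content: str) -> None:
    prev_hash = chain[-1].digest() if chain else GENESIS_PREV
    chain.append(Block(len(chain), time.time(), content, prev_hash))


def verify_chain(chain: list) -> bool:
    expected = GENESIS_PREV
    for block in chain:
        if block.prev_hash != expected:
            return False
        expected = block.digest()
    return True


chain = []
for doc in ("first document", "second document", "third document"):
    append_block(chain, doc)

print(verify_chain(chain))  # the untouched chain verifies

# Rewriting an old block changes its digest, which breaks the link
# stored in the block that follows it.
tampered = list(chain)
tampered[1] = Block(1, chain[1].timestamp, "forged document", chain[1].prev_hash)
print(verify_chain(tampered))
```

Because every block stores the hash of its predecessor, changing any past content invalidates every later link, which is what makes the order and content of the history tamper-evident.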

Blockchain systems are theoretically ideal for storing highly sensitive information for three reasons.

  • The first is that they maintain a very high level of replication of this information, which makes it extremely resilient and durable.
  • The second is that the information remains immutable once it is written. From this (timestamped) moment on, the information only evolves through changes, which are stored separately and recomposed when the information is requested as of a given time.
  • The third is that every change is cryptographically signed by its author, adding a layer of non-repudiation and ownership.

That said, there are a number of issues that make it practically impossible to use a blockchain system to store information at production scale.

Consensus and Proof of Work: Cash-transaction-oriented blockchains were made possible by distribution: no single party has control. This distribution was made real through consensus algorithms and a noise reduction mechanism called proof-of-work.

In Satoshi’s proposal, to gain the right to propose to the other nodes a new piece of information to be saved in the chain, you have to solve a puzzle (a hash puzzle) that is computationally complex and costly, and include the proof that you have solved it in the proposal. This information is then validated by each node in the network based on the rules of a consensus protocol, and all the nodes that follow the same protocol will eventually store the piece of information and build the same chain.
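The asymmetry of the hash puzzle (expensive to solve, cheap to check) can be sketched in a few lines of Python. This is a simplified illustration and not Bitcoin's exact scheme, which uses double SHA-256 over a block header and a numeric target. The function names and the toy difficulty of 16 bits are assumptions chosen so the example finishes quickly.

```python
import hashlib


def leading_zero_bits(digest: bytes) -> int:
    """Count the zero bits at the start of a digest."""
    bits = 0
    for byte in digest:
        if byte == 0:
            bits += 8
            continue
        bits += 8 - byte.bit_length()
        break
    return bits


def pow_hash(proposal: bytes, nonce: int) -> bytes:
    return hashlib.sha256(proposal + nonce.to_bytes(8, "big")).digest()


def solve(proposal: bytes, difficulty_bits: int) -> int:
    """Costly part: try nonces until the hash has enough leading zero bits."""
    nonce = 0
    while leading_zero_bits(pow_hash(proposal, nonce)) < difficulty_bits:
        nonce += 1
    return nonce


def check(proposal: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Cheap part: any node verifies the proof with a single hash."""
    return leading_zero_bits(pow_hash(proposal, nonce)) >= difficulty_bits


DIFFICULTY_BITS = 16  # toy value: about 65,000 attempts on average
proposal = b"block 42: alice pays bob 1 coin"  # hypothetical payload
nonce = solve(proposal, DIFFICULTY_BITS)
print(nonce, check(proposal, nonce, DIFFICULTY_BITS))
```

Each extra bit of difficulty doubles the expected work for the proposer, while verification stays at one hash. That is why the mechanism filters noise well but makes every write slow and costly.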

This mechanism, which fits electronic cash systems perfectly and made them possible, is extremely slow and costly for almost everything else.

Cost of Storage: Blockchain networks try to gather as many nodes as possible to improve their reliability. They also try to keep the storage of these nodes as small as possible (the latest trend is to run nodes on Raspberry Pi devices). Everything that is written to a blockchain is replicated to every single node of the network; for this reason, writing more than 1 MB is extremely expensive or simply not allowed.

Operation scalability: This is probably the major problem, and the main source of dispute and research, in blockchain systems right now. Currently, in most blockchain protocols each node stores all state and processes all transactions. This provides great levels of security, but also greatly limits scalability: a blockchain cannot process more transactions than a single node can. In part because of this, Bitcoin is limited to ~3–7 transactions per second, Ethereum to 7–15, etc. This rate is insufficient even for a single company, where there can easily be hundreds of records per second.
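The Bitcoin figure can be sanity-checked with back-of-the-envelope arithmetic. The sketch below assumes the original 1 MB block size limit and the roughly 10-minute block target; the average transaction sizes (250 and 500 bytes) are illustrative assumptions, not measured values.

```python
BLOCK_SIZE_BYTES = 1_000_000   # Bitcoin's original 1 MB block size limit
BLOCK_INTERVAL_S = 600         # target of roughly one block every 10 minutes


def max_tps(avg_tx_bytes: int) -> float:
    """Upper bound on transactions per second for a given average tx size."""
    tx_per_block = BLOCK_SIZE_BYTES // avg_tx_bytes
    return tx_per_block / BLOCK_INTERVAL_S


for size in (250, 500):  # assumed average transaction sizes, in bytes
    print(f"{size} B/tx -> {max_tps(size):.1f} tx/s")
```

With these assumptions the estimate lands at roughly 3.3 to 6.7 transactions per second, in line with the range quoted above.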

Regulatory compliance: If the blockchain system is going to store sensitive or personal data, the way the storage, access and retrieval of that information are implemented and managed has to be aligned with regulations such as GDPR, HIPAA, etc.

Hard to integrate: Almost all applications used by enterprises today are compatible with the CIFS/SMB and NFS file sharing protocols. As a result, applications can read from and write to any file server or NAS device that supports these standards. This is not the case with blockchain.

Cloud Object Storage

Object storage is a computer data storage architecture that manages data as objects, as opposed to other storage architectures like file systems, which manage data as a file hierarchy, and block storage, which manages data as blocks within sectors and tracks. Object storage treats each piece of information (the object) as a chain of ones and zeros that can be up to 5TB in size, plus a series of labels or metadata, including a globally unique identifier, associated with that chain. These objects are usually files, even if they are not treated as such, and once stored, the bulk is called “unstructured file data”.

Simple, beautiful and strongly reliable.

Given that simplicity, these objects are easily replicated among disk cabinets, data centers or locations, and since the objects have no relationship with each other, the infrastructure that supports them is simple and highly scalable.

Public cloud providers typically offer a range of object storage services designed for around 99.999999999% durability and 99.99% availability of objects over a given year. That’s huge. Customers usually choose the levels of replication and availability they want through different storage classes or tiers.
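To get a feel for what those percentages mean, here is a small worked calculation in Python. It assumes a simplified reading of the figures: durability is treated as an independent per-object, per-year survival probability, and availability as a fraction of the minutes in a year. The object count of ten million is a hypothetical example.

```python
from fractions import Fraction

DURABILITY = Fraction("0.99999999999")   # eleven nines, per object per year
AVAILABILITY = Fraction("0.9999")        # four nines, per year
MINUTES_PER_YEAR = 365 * 24 * 60


def expected_losses_per_year(object_count: int) -> Fraction:
    """Expected number of objects lost in one year (simplified model)."""
    return object_count * (1 - DURABILITY)


def allowed_downtime_minutes() -> Fraction:
    """Minutes per year of unavailability allowed by the availability figure."""
    return MINUTES_PER_YEAR * (1 - AVAILABILITY)


losses = expected_losses_per_year(10_000_000)  # hypothetical object count
print(float(losses))                      # expected objects lost per year
print(float(1 / losses))                  # average years between single losses
print(float(allowed_downtime_minutes()))  # minutes of downtime per year
```

Under this reading, storing ten million objects would mean losing one object every 10,000 years on average, and 99.99% availability allows a bit under 53 minutes of unavailability per year.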

Additionally, they offer encryption of the information at rest with keys supplied either by the provider or by the client. The client can even encrypt the information before uploading it and have it encrypted again once it is there.

For our purposes, this gives us a robust backend for raw data storage with high replication and confidentiality levels, but we still have problems to solve. How do we deal with all this information in an efficient way? How do we get the basic functionalities of a conventional file system? How do we make the information easy to integrate with existing applications or other file systems?

Let’s add some sugar on top.

Nasuni

Nasuni® Cloud File Services™ is powered by UniFS®, downloadable software that is the first global file system designed for modern cloud object storage, and which enables users to access their third-party file storage providers from any location rapidly, safely, and securely. Using Nasuni Cloud File Services, organizations can store, protect, synchronize, and collaborate on unstructured file data, from actively used to inactive, across all locations.

UniFS is designed around WORM principles and never overwrites an object once it is written. This means that files in the file system are kept immutable by UniFS. This holds true for file versions as well: every file change is time-stamped as its own object to provide complete data protection, eliminating the need for separate file backup and replication tools or processes.
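The write-once, version-per-change idea can be illustrated with a toy in-memory store in Python. This is a sketch under stated assumptions and says nothing about how UniFS is actually implemented: the class `WormStore`, its methods, and the choice of SHA-256 content hashes as object identifiers are all hypothetical.

```python
import hashlib
import time
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class WormStore:
    """Toy write-once store: objects can be added but never replaced."""
    _objects: dict = field(default_factory=dict)   # object id -> bytes
    _versions: dict = field(default_factory=dict)  # path -> [(timestamp, id)]

    def write(self, path: str, data: bytes,
              timestamp: Optional[float] = None) -> str:
        ts = time.time() if timestamp is None else timestamp
        object_id = hashlib.sha256(data).hexdigest()
        # setdefault never replaces an existing entry, so stored bytes
        # stay untouched once written.
        self._objects.setdefault(object_id, data)
        # A change to a file is recorded as a new timestamped version.
        self._versions.setdefault(path, []).append((ts, object_id))
        return object_id

    def read(self, path: str, at: Optional[float] = None) -> bytes:
        """Return the newest version written at or before `at`."""
        candidates = [
            (ts, oid) for ts, oid in self._versions.get(path, [])
            if at is None or ts <= at
        ]
        if not candidates:
            raise KeyError(f"{path} has no version at {at}")
        _, object_id = max(candidates, key=lambda version: version[0])
        return self._objects[object_id]


store = WormStore()
store.write("report.txt", b"draft", timestamp=100.0)
store.write("report.txt", b"final", timestamp=200.0)
latest = store.read("report.txt")              # newest version
as_of_150 = store.read("report.txt", at=150.0)  # state as of t=150
print(latest, as_of_150)
```

Because nothing is ever overwritten, reading a file "as of" an earlier time is just a lookup over its recorded versions, which is the property that replaces a separate backup process.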

Each node/appliance includes Nasuni Continuous (and endless) File Versioning, software that is a high-performance cache that takes periodic snapshots of the file system. This continuous snapshotting captures file changes as they occur and transmits only those changes to the third-party cloud storage system, so that the third-party cloud storage system always contains the latest version of every customer file. It also provides highly granular file-level data protection that offers improved recovery points and recovery times compared to traditional file backup, eliminating the need for backup hardware, software, and maintenance.
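The idea of sending only what changed between snapshots can be sketched as a comparison of content hashes. This is a simplified, file-level illustration in Python; the function names are hypothetical and the real product's change detection and transfer granularity are not described in this article.

```python
import hashlib


def fingerprint(files: dict) -> dict:
    """Map each path to the SHA-256 hash of its content."""
    return {path: hashlib.sha256(data).hexdigest()
            for path, data in files.items()}


def changed_paths(previous: dict, current: dict) -> list:
    """Paths that are new or whose content differs since the last snapshot."""
    return sorted(p for p, h in current.items() if previous.get(p) != h)


def deleted_paths(previous: dict, current: dict) -> list:
    """Paths present in the last snapshot but missing now."""
    return sorted(p for p in previous if p not in current)


snapshot_1 = fingerprint({"a.txt": b"alpha", "b.txt": b"beta"})
snapshot_2 = fingerprint(
    {"a.txt": b"alpha", "b.txt": b"beta v2", "c.txt": b"gamma"}
)

to_send = changed_paths(snapshot_1, snapshot_2)
print(to_send)  # only the modified and the new file need to be transmitted
```

In this toy example the unchanged file is skipped entirely, so the amount of data transmitted depends on how much changed, not on the size of the file system.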

Every change to every file is securely transmitted to the cloud-native file system, UniFS, which keeps the immutable record of every file version in public cloud object stores such as Azure or Amazon Web Services.

Having said that, comparing Nasuni with a blockchain we have:

  • Immutable information (state machine): With Nasuni, once data is written, it is never modified. Blockchain systems go from a state N to a state N + 1 each time a block is mined. With Nasuni's continuous versioning, we may say the file system goes from a state N to a state N + 1 every time a snapshot is taken or a file is modified. All changes are recorded and leave a trace.
  • A high level of information replication: With Nasuni, there is a “by-design” high level of information replication, both in the cloud backend and in the caches of the different edge nodes / appliances, which maximizes both durability and availability. These levels of replication are controllable both in the backend (through the native object storage replication mentioned earlier) and by adding or removing edge cache nodes. Blockchain systems have a simpler architecture and store information in every node. A lot of nodes are required to maintain the consensus, so information is ludicrously replicated.
  • Fine grained authentication and authorization: Nasuni volumes and shares can be integrated with a directory service such as Microsoft AD or OpenLDAP, which allows a wide range of authentication and authorization methods for accessing the information. Blockchain systems use public keys as identities, so real identities are in principle hidden. Nor do they have the concept of a shared space or “share”, or a way to manage authorization beyond having or not having write access. Each user has their own space and things are passed from one to another.
  • Easy integration with all applications: Nasuni shares information through common protocols such as CIFS or NFS, so it can be easily integrated with existing apps. Blockchain systems require bespoke development to integrate with an app (aka Dapps).
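The integration point is worth a small illustration: when storage is exposed as a file share, an application uses ordinary file APIs and needs no storage-specific client library. In the Python sketch below a temporary directory stands in for a mounted share; the function name, the file name and the log line are hypothetical.

```python
import tempfile
from pathlib import Path


def append_record(share_root: Path, name: str, line: str) -> Path:
    """Write with ordinary file APIs; no storage-specific SDK is involved."""
    target = share_root / name
    with target.open("a", encoding="utf-8") as handle:
        handle.write(line + "\n")
    return target


# Stand-in for a mounted share (hypothetical; in practice this would be
# wherever the SMB or NFS share is mounted on the client).
with tempfile.TemporaryDirectory() as mount_point:
    path = append_record(Path(mount_point), "audit.log",
                         "user=alice action=read")
    contents = path.read_text(encoding="utf-8")

print(contents)
```

The same code would have to be rewritten against a node API or a smart contract interface to store that record on a blockchain.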

Conclusion

Blockchains, in their most basic nature, are append-only information state machines that ultra-replicate the stored information to improve its availability and resiliency. Due to this high level of replication, storing information on them tends to be very expensive or even impossible.

With Nasuni in combination with cloud object storage, we can leverage the fundamental advantages that blockchains are praised for, while getting rid of the drawbacks.

Not for electronic cash systems, at least, but for information storage systems. But remember, information today is even more valuable than money.

So it makes sense to store it in a safe.

Many industries with highly critical and regulated information processes, such as healthcare, pharmaceuticals, banking, automotive, infrastructure or transport, may benefit from a blockchain-like, distributed, secure and highly available information storage system that allows them to meet regulatory requirements and evolve in a healthy way.
