Are You Keeping an Immutable Copy of Your Unstructured Data in The Cloud?

More and more enterprises are struggling with a situation I have been talking about for some time: the accelerating growth of unstructured data. With this tremendous growth comes a requirement to protect this data in almost all instances. The challenges of protecting these large volumes of business-critical information are forcing companies to become creative in order to meet data protection requirements. As many customers move away from traditional NDMP backups and adopt snapshots and replication as their data protection strategy, it is increasingly common for them to express major concerns about meeting their business continuity requirements. This is especially so given the rise in devastating and debilitating ransomware attacks.

Many of my financial services customers must, for regulatory reasons, retain data long term on reliable media with a verification checksum for every file. Granular recovery from any point in time to any file system technology is the type of flexibility customers need in order to adequately protect these massive volumes of unstructured data. And they are seeing firsthand the importance of the immutability and integrity of their data.

The year 2020, as challenging as it was, brought many exciting new announcements from Datadobi. The most exciting of these, in my opinion, was DobiProtect. The software allows our customers to sync file system data from any SMB or NFS storage platform to a number of private and/or public cloud storage platforms. In addition, DobiProtect allows you to restore data from its object format to any SMB or NFS file system technology. Let’s look a little deeper.

Dissecting the Technology Behind DobiProtect

DobiProtect leverages the same proven Datadobi architecture to sync file system data to an object format. The copy engine has proven time and time again to be the fastest and most accurate on the market. As an example, one Datadobi customer with large genomics datasets copied an entire petabyte (PB) of data in just five days, and the Datadobi copy engine was a critical component in achieving those numbers. As one might hope, DobiProtect calculates a checksum on all source files and target objects to verify the integrity of the data being synced. And since DobiProtect uses standard protocols, there are no limitations regarding the recall of synchronized data. Any point-in-time image created by DobiProtect from any given source system can be recalled to any desired target system. Gone are the restrictions imposed by outdated protocols such as NDMP, which requires parity between the original source system and the target of the recall.
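To make the verification idea concrete, here is a minimal Python sketch of comparing a checksum of a source file with a checksum of the object written from it. I am not describing DobiProtect internals here: the choice of SHA-256, the chunk size, and the function names `sha256_of_stream` and `verify_copy` are all assumptions made for illustration.

```python
import hashlib
import io


def sha256_of_stream(stream, chunk_size=1024 * 1024):
    """Hash a binary stream in fixed-size chunks so a large file never has to fit in memory."""
    digest = hashlib.sha256()
    # iter() with a sentinel keeps calling read() until it returns b"" (end of stream).
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def verify_copy(source_stream, target_stream):
    """Return True when the source and target content produce the same digest."""
    return sha256_of_stream(source_stream) == sha256_of_stream(target_stream)


if __name__ == "__main__":
    # In-memory streams stand in for a source file and the object read back from the store.
    source = io.BytesIO(b"quarterly results")
    target = io.BytesIO(b"quarterly results")
    print("copy verified:", verify_copy(source, target))
```

In practice the two streams would be an open source file and the body of the object read back from the store. Hashing in chunks matters at PB scale, because individual files can be far larger than available memory.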

As for the objects being created during copy operations, DobiProtect takes a unique approach to storing content on the object store. Many people think it’s a trivial task to take a file and store it as an object. As it turns out, the characteristics of file systems do not map particularly well to object storage. How the files are named and how that translates into an object key can be problematic. How do you maintain permissions? What about alternate data streams (ADS)? How do you maintain settings like SMB share and NFS export definitions? All of these are important to maintain in the event of a catastrophe. Simply recalling data without the permissions, associated data streams, and/or access points makes the backup somewhat meaningless.
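The naming problem alone is easy to underestimate. The sketch below shows one possible way to turn a file path into an object key and back without losing information. It is a hypothetical scheme, not the layout DobiProtect uses: the `data` prefix, the percent-encoding of each path component, and the function names `path_to_key` and `key_to_path` are my own assumptions.

```python
from urllib.parse import quote, unquote


def path_to_key(components, prefix="data"):
    """Build an object key from path components.

    Each component is percent-encoded on its own (safe="" also encodes "/"),
    so the only literal "/" characters in the key are separators.
    """
    encoded = [quote(component, safe="") for component in components]
    return "/".join([prefix] + encoded)


def key_to_path(key, prefix="data"):
    """Reverse path_to_key and return the original list of path components."""
    parts = key.split("/")
    if parts[0] != prefix:
        raise ValueError(f"key does not start with prefix {prefix!r}: {key!r}")
    return [unquote(part) for part in parts[1:]]


if __name__ == "__main__":
    original = ["finance", "Q3 report 100%.xlsx"]
    key = path_to_key(original)
    print(key)
    print(key_to_path(key) == original)
```

Even a reversible key scheme like this covers only the name. Permissions, ADS, and share or export definitions have no natural home in an object key, which is why they need to be stored separately.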

The DobiProtect Approach

Once a bucket or container is created on the object store and presented to DobiProtect, the software creates a multi-level abstraction layer. This unique abstraction layer allows DobiProtect to protect all file system items along with all their associated metadata: the NTFS permissions, ADS, POSIX mode bits, NFSv4 ACLs, the timestamps (ctime, mtime, atime), etc. DobiProtect can do this while avoiding the pitfalls of incompatible file system and object store semantics. And as mentioned before, critical source NAS settings such as SMB share and NFS export definitions are maintained. These settings are absolutely critical when you are protecting a large NAS system with PBs of data and, literally, thousands of shares/exports defined. We all tend to focus on the file content itself, but there are other highly critical aspects of our datasets that must be maintained both accurately and in an immutable fashion to avoid the disaster ransomware can inflict.
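As a rough illustration of what "associated metadata" means on the POSIX side, the following Python sketch captures mode bits, ownership, and timestamps for a file and serializes them as JSON. It covers only what `os.stat` exposes; NTFS permissions, ADS, NFSv4 ACLs, and share/export definitions need platform-specific APIs that are not shown. The JSON "sidecar" format and the names `capture_posix_metadata` and `to_sidecar` are hypothetical and are not a description of how DobiProtect stores metadata.

```python
import json
import os
import stat


def capture_posix_metadata(path):
    """Collect the POSIX attributes that would have to be restored along with the content."""
    st = os.stat(path, follow_symlinks=False)
    return {
        "mode": stat.S_IMODE(st.st_mode),  # permission bits only, e.g. 0o640
        "uid": st.st_uid,
        "gid": st.st_gid,
        "size": st.st_size,
        "atime_ns": st.st_atime_ns,
        "mtime_ns": st.st_mtime_ns,
        "ctime_ns": st.st_ctime_ns,
    }


def to_sidecar(metadata):
    """Serialize the metadata so it can be stored next to the object (hypothetical format)."""
    return json.dumps(metadata, sort_keys=True)


if __name__ == "__main__":
    # Describe this script itself as a quick demonstration.
    print(to_sidecar(capture_posix_metadata(__file__)))
```

On restore, the same record could be replayed with calls such as `os.chmod` and `os.utime`. Note that ctime cannot be set directly on most systems, so it can be recorded but not restored this way.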

Not only does DobiProtect’s architecture abstract the underlying object storage platform, it also leverages versioning to provide multiple point-in-time copies of the file data, with a retention period that governs how long each version is maintained. The ability to store extended metadata and NAS settings is key to DobiProtect’s capability to restore any file system data from object format to any file system technology.
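The interplay between versions and retention can be sketched in a few lines of Python. This is a simplified model under my own assumptions, not DobiProtect's logic: the `Version` record, the rule that a version expires once it is older than the retention period, and the function names `version_as_of` and `expired_versions` are all illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Version:
    version_id: str
    created: datetime


def version_as_of(versions, point_in_time):
    """Return the newest version created at or before point_in_time, or None if there is none."""
    candidates = [v for v in versions if v.created <= point_in_time]
    return max(candidates, key=lambda v: v.created, default=None)


def expired_versions(versions, retention, now):
    """Return the versions whose age exceeds the retention period (assumed expiry rule)."""
    return [v for v in versions if now - v.created > retention]


if __name__ == "__main__":
    history = [
        Version("v1", datetime(2021, 1, 1)),
        Version("v2", datetime(2021, 2, 1)),
        Version("v3", datetime(2021, 3, 1)),
    ]
    # Which copy would a restore "as of mid-February" use?
    print(version_as_of(history, datetime(2021, 2, 15)))
    # Which copies have aged out under a 30-day retention period on March 15?
    print(expired_versions(history, timedelta(days=30), now=datetime(2021, 3, 15)))
```

The retention period bounds how far back you can recover. If ransomware sits idle for longer than the retention window, every clean version may already have expired, so the window should be chosen with that dwell time in mind.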

As you review the data protection strategies and processes for your unstructured data, it’s important to add critical capabilities such as those provided by DobiProtect to your existing practices. Snapshots and replication are wonderful tools for quick, isolated recovery and for failover to a secondary system. But with ransomware now targeting snapshots and sitting idle for extended periods of time (turning replication into ransomware propagation), it’s critical to maintain several point-in-time copies, and to do so on cost-effective nearline object storage. DobiProtect can help you provide full end-to-end coverage of your unstructured datasets as an extra line of defense.

For more information, please visit the DobiProtect product page.