What is WORM?
WORM (Write Once Read Many) is the practice of saving data that cannot be deleted until some period of time has elapsed. It’s intended largely to help organizations meet regulatory and legal retention requirements on their datasets. Such as:
My experience with WORM.
The first WORM device I ever recall seeing in person was a SCSI-attached CD library, full of discs that were all shiny and gold, and that’s how I’ll forever think of WORM. Once data is laid down on a track, it’s read-only, forever. Not long after that I saw a presentation on Dell EMC™’s Centera® for the first time, which was described by my sales rep at the time as a black hole for your compliance data, where you can save data that you must keep, but almost never read back. It’s fairly likely that this is how many of you first encountered WORM as well. When I first started using Rainfinity FMA and Centera as a customer a few years later, we came to the conclusion that infinite retention was best, because nobody could make up their minds otherwise, and disk was cheap. In recent years I’ve seen the industry shift towards companies becoming more concerned with not just keeping data long enough, but also getting rid of it when it no longer must be kept. Datadobi of course has a unique connection there, as our four founders were part of Dell EMC’s Centera engineering division. Although we have tons of migration experience in the Centera to NAS and Centera to ECS space, this post focuses mainly on migrations of file servers (NAS) with WORM attributes.
How does WORM work on NAS?
Ok, now that we’ve established what WORM is and why you might be legally obligated to use it, what does that really mean in your data center? It means that there must be some mechanism for setting a retention date, and then some separate mechanism to commit the data (basically make it immutable once that date is set). The most common NAS protocols on the market, SMB (including CIFS) and NFS, have no specific mechanism to support WORM attributes. As a result, WORM attributes have been stored in a fairly common manner across most platforms:
- Use the access time (atime) of the file as the retention time. The only distinction is that these access times sit in the future, which would never normally be the case for any file, since access time usually records when a file was last read.
- To commit the file, you make it no longer writable. On NFS this effectively means a ‘chmod a-w’ (removing the write permission from all parties with access to the data). On SMB this means setting the ‘read-only’ file attribute.
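As a rough illustration of the two steps above, here is a minimal Python sketch. It assumes a POSIX-style client and uses a scratch file; the function names `set_retention` and `commit` are made up for this example. On an ordinary file system nothing enforces the retention, so this only shows how the attributes are written, not how any vendor implements WORM.

```python
import os
import stat
import tempfile
import time


def set_retention(path: str, retain_until: int) -> None:
    """Store the retention date (epoch seconds) in the file's access time."""
    mtime_ns = os.stat(path).st_mtime_ns  # leave the modification time untouched
    os.utime(path, ns=(retain_until * 10**9, mtime_ns))


def commit(path: str) -> None:
    """Remove write permission for owner, group and others (like chmod a-w)."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    os.chmod(path, mode & ~(stat.S_IWUSR | stat.S_IWGRP | stat.S_IWOTH))


def demo() -> tuple:
    """Run both steps on a scratch file; return (requested, stored, mode)."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "record.txt")
        with open(path, "w") as fh:
            fh.write("compliance record\n")
        retain_until = int(time.time()) + 7 * 365 * 24 * 3600  # roughly 7 years
        set_retention(path, retain_until)
        commit(path)
        st = os.stat(path)  # stat does not read the file, so atime is unchanged
        return retain_until, int(st.st_atime), stat.S_IMODE(st.st_mode)
```

Calling `demo()` returns the requested retention date, the access time actually stored, and the final permission bits, so you can see that the atime now lies in the future and no write bits remain.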
Storage vendors like to use terms to give you a sense of security when storing data like this. NetApp® has SnapLock®, Dell EMC VNX® & Celerra® have FLR (File Level Retention), Dell EMC Isilon® has SmartLock®. Most also have compliance or governance terms thrown around, which imply hardware enforced retention.
In all NAS implementations that I’ve seen thus far (quite a few), this is basically how it’s done. The directory, file system, or volume that is to contain this WORM data must be created in a special manner before any data is placed within it, so that the system knows that once data is committed, it cannot be modified or deleted until the retention date is met. The only exception to this is a so-called privileged delete, which is not something that a user could ever do, only a storage administrator, and only on a system that is not in a governance or compliance mode.
If you think about it for a while, you might come up with an idea: I can just change the time on the storage system to the future, and then delete the data today (just trick it). On compliance WORM systems this is not possible either, because those systems usually have a separate clock, called a compliance clock, that only counts up.
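To make the compliance clock idea concrete, here is a small Python sketch. It is only an illustration of the principle: the class name `ComplianceClock` and the function `may_delete` are invented for this example, the state is kept in memory, and real storage systems persist and protect their clock in vendor-specific ways.

```python
import time
from typing import Callable


class ComplianceClock:
    """A clock that only moves forward, independent of the wall clock."""

    def __init__(self, start: float,
                 tick_source: Callable[[], float] = time.monotonic) -> None:
        self._tick_source = tick_source  # monotonic source, not the system date
        self._start = start              # compliance time when the clock was created
        self._origin = tick_source()

    def now(self) -> float:
        # Elapsed time is clamped at zero, so now() can never be earlier than start.
        elapsed = max(0.0, self._tick_source() - self._origin)
        return self._start + elapsed


def may_delete(retain_until: float, clock: ComplianceClock) -> bool:
    """Deletion is allowed only once the compliance clock reaches the retention date."""
    return clock.now() >= retain_until
```

Because `may_delete` never consults the system date, setting the wall clock forward has no effect on the answer in this sketch.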
Migration of WORM data?
So you have WORM data on some old systems and you need to move it to some new system. How do you do it?
The magic is all in the order of operations.
I’ll start with an analogy.
I saw this random Facebook post (one of those viral ones that everybody shares) the other day and it basically stated a simple math question and asked people for their answer. It looked something like this:
1 + 4(2+3)=?
For those of us who remember some middle school math, you’ll get 21. But why? Because you have to do things in the proper sequence to get the correct result.
If you did this problem wrong, you might get 25 (by adding 1 + 4 before multiplying), or perhaps something else.
This same premise holds true for migrating WORM data because you must:
- Copy the data and ensure absolute integrity.
- Set the retention date on the target to match the date on the source.
- Read back that retention date on the target to ensure it was set correctly.
- Only once you’re sure the data and timestamps are correct, set the proper permissions.
- Commit the file so that it’s immutable.
- Provide a chain-of-custody to be able to prove that all the data made it from the source to the target correctly with the same attributes.
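The ordering in the list above can be sketched in a few lines of Python for a single file. This is a simplified illustration on a local file system, not the implementation of any product: `migrate_worm_file` and the record keys are hypothetical names, and timestamps are compared at whole-second granularity because file systems differ in precision.

```python
import hashlib
import os
import shutil
import stat


def md5_of(path: str) -> str:
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def migrate_worm_file(source: str, target: str) -> dict:
    """Copy one file, then apply WORM attributes in the required order."""
    src = os.stat(source)  # capture atime before reading the source

    # 1. Copy the data and verify integrity.
    shutil.copyfile(source, target)
    source_md5, target_md5 = md5_of(source), md5_of(target)
    if source_md5 != target_md5:
        raise ValueError(f"hash mismatch for {target}")

    # 2. Set the retention date (atime) on the target to match the source.
    os.utime(target, ns=(src.st_atime_ns, src.st_mtime_ns))

    # 3. Read the retention date back and confirm it.
    retain_until = os.stat(target).st_atime_ns // 10**9
    if retain_until != src.st_atime_ns // 10**9:
        raise ValueError(f"retention date mismatch for {target}")

    # 4. Only now set the permissions (copied from the source here).
    mode = stat.S_IMODE(src.st_mode)
    os.chmod(target, mode)

    # 5. Commit: strip every write bit so the file becomes read-only.
    os.chmod(target, mode & ~0o222)

    # 6. Return a record that can feed a chain-of-custody report.
    return {
        "source_path": source,
        "target_path": target,
        "source_md5": source_md5,
        "target_md5": target_md5,
        "retain_until": retain_until,
        "committed": not (os.stat(target).st_mode & 0o222),
    }
```

Note that the commit comes last: if the write bits were removed before the retention date was set and checked, a mistake in the date could no longer be corrected on a real WORM target.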
DobiMigrate can do this, and here’s how, using the same steps as above:
- DobiMigrate computes an MD5 hash of the source file and compares it with the MD5 hash of the target file after it’s copied. If the hashes don’t match, it’s not a valid copy. Incremental copies are performed automatically (usually every hour) up until you’re ready to perform a cutover. Once the scheduled cutover window begins, DobiMigrate makes the source shares or exports read-only automatically (on many systems). DobiMigrate then performs a final incremental copy and creates the target shares and exports to match the source as closely as possible.
- After the cutover, for a WORM migration, DobiMigrate adds a final step of ‘Copy WORM’. During this step the MD5 hashes that have been calculated are compared and, if they match, DobiMigrate sets the access time of the file to a future value matching the one held on the source (this is the retention date, or how long the data has to be kept around).
- It then reads back that date to ensure that it’s accurate.
- Now that the data and the timestamps are correct, DobiMigrate opens the file handle once and sets the proper permissions.
- With the file handle still open, DobiMigrate will make the file immutable and then close the file.
- Lastly, a final report is provided with the cutover details, along with a CSV report listing all the data on the source, the matching copy on the target, and all the proper attributes, to give a chain-of-custody level of detail.
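To give a feel for what a chain-of-custody CSV might contain, here is a small Python sketch. The column names and the `write_custody_report` function are assumptions made for this example; they are not DobiMigrate’s actual report format.

```python
import csv
from datetime import datetime, timezone
from typing import Iterable, TextIO

# Hypothetical column layout for this illustration.
FIELDS = ["source_path", "target_path", "source_md5", "target_md5",
          "hash_match", "retain_until", "committed"]


def write_custody_report(records: Iterable[dict], out: TextIO) -> int:
    """Write one CSV row per migrated file and return the number of rows."""
    writer = csv.DictWriter(out, fieldnames=FIELDS)
    writer.writeheader()
    count = 0
    for rec in records:
        retain = datetime.fromtimestamp(rec["retain_until"], tz=timezone.utc)
        writer.writerow({
            "source_path": rec["source_path"],
            "target_path": rec["target_path"],
            "source_md5": rec["source_md5"],
            "target_md5": rec["target_md5"],
            # Positive proof: the hashes were compared and agree.
            "hash_match": rec["source_md5"] == rec["target_md5"],
            "retain_until": retain.isoformat(),
            "committed": rec["committed"],
        })
        count += 1
    return count
```

Each record here is a plain dictionary with the file paths, both hashes, the retention date in epoch seconds, and a flag saying whether the file was committed.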
What platforms do you support?
See, that’s the great thing about being protocol based; the easy answer here is that anything that speaks SMB/CIFS or NFSv3 is supported. We do of course have API integration with most of the big NAS platforms on the market, and more are being added as we speak. This API integration is a nice-to-have, however, and not a necessity. To date, in the WORM space we’ve tested compliance migrations with Dell EMC VNX (FLR), Dell EMC Isilon (SmartLock, incl. compliance mode), and NetApp SnapLock. That doesn’t mean, however, that other platforms won’t work.
I have a guy that knows [tool_name_here].
Why can’t I just use that?
I’ve been asked this many times by customers and partners alike:
The short answer.
The importance of unstructured data today to a business, especially WORM data, cannot be overestimated. There is a reason that you must keep this data around and use technology to enforce and prove that you are keeping it. Old CLI-based tools are of no practical use for a WORM migration in this regard. Are they going to give you the validity checking, the ability to set the atime and commit separately, the easy error checking (like when unexpected min/max/default retentions have been set [and shouldn’t be])? No, but DobiMigrate can.
The long answer.
Host-based tools, like RoboCopy, DellEMCopy, or rsync:
RoboCopy, Xcopy, RichCopy, DellEMCopy, and rsync have historically been the usual go-to file migration tools in a storage administrator’s toolbox. Fundamentally there is a lot wrong with them. Most of the data moves across just fine, but there is no guarantee. And if we say that 99.9% of the data made it, is that enough? That’s losing 1 file in a thousand. How about 99.99%? That’s still losing 1 file in ten thousand. Data corruption is fairly common in file migrations, and unless you have validation of the data that you’ve moved, rather than verification by exception (an error or failure in a log file), all you’ve proved is what didn’t make it, not what did. Perhaps ignorance is bliss, but that’s not what I would want to say to a regulatory compliance officer.
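The difference between validation and verification by exception can be sketched in Python. This is a simplified, single-threaded illustration over two local directory trees, and `validate_tree` is a name invented for this example: every file ends up in exactly one bucket, so the result states what did arrive intact, not just what raised an error.

```python
import hashlib
import os


def md5_of(path: str) -> str:
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_tree(source_root: str, target_root: str) -> dict:
    """Hash every source file and its target copy; classify each one."""
    result = {"validated": [], "mismatched": [], "missing": []}
    for dirpath, _dirs, files in os.walk(source_root):
        for name in files:
            src = os.path.join(dirpath, name)
            rel = os.path.relpath(src, source_root)
            dst = os.path.join(target_root, rel)
            if not os.path.isfile(dst):
                result["missing"].append(rel)      # never arrived
            elif md5_of(src) != md5_of(dst):
                result["mismatched"].append(rel)   # arrived, but corrupted
            else:
                result["validated"].append(rel)    # proven identical
    for bucket in result.values():
        bucket.sort()
    return result
```

With ten million files, a 99.99% success rate still leaves about a thousand files in the `missing` or `mismatched` buckets, which is exactly the list a compliance officer will ask about.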
With any scripted tool, you also introduce the opportunity for significant human error.
If you take 10 sysadmins and give them the same tool, same source and same target, you’ll likely get 10 different scripts based on their own past experience and, let’s be honest, a few last-minute Google searches. If that idea doesn’t make you nervous, it should, especially when it comes to regulated data sets like PCI (credit-card processing), PHI (Protected Health Information), etc.
For the uninitiated, NDMP is the protocol used on most NAS systems to perform backups. Because a backup is just another copy of the data somewhere else, several attempts have been made to use this protocol to perform migrations. I once explained to an account team that using an NDMP-based copy mechanism to do a compliance migration was like assembling a china cabinet using a sledgehammer. While in theory it might work, the odds that you break something or mess up are really high, and with compliance data sometimes there is very little you can do to fix it.
NDMP-based copies have a few primary challenges:
- You have no ability to control the order of operations, which as I stipulated above is critical in a WORM migration.
- You consume NDMP threads on the source, which may mean that normal backups cannot be taken, or take twice as long to complete, during the migration. This should scare you. Missing backups is not acceptable in most industries today, ever. And the argument of “but we were doing a migration” isn’t going to get very far when data is lost and unrecoverable.
- There is no ability to throttle or slow the impact.
Change your perspective and reset your expectations.
(Use DobiMigrate instead)
So here is where I get biased (important points, but the same ones I’d give you in a sales pitch, to be fair).
- DobiMigrate is faster than anything else on the market.
- Performance Testing done by Dell EMC: (disclaimer, I did most of these tests in my previous role at Dell EMC) https://community.emc.com/community/products/isilon/blog/2016/04/27/accelerating-your-journey-to-the-data-lake-with-dobiminer-from-datadobi
- An independent performance test conducted by PassMark: https://www.passmark.com/ftp/Datadobi_N2N_Migration_Benchmark_Testing_April_2016_Edition_1.pdf
- Your source and target are usually heavily mismatched: the target in a refresh is brand new, with more SSDs and faster CPUs, while the source is old, slow and in production. Throttling the load on that source so that you can limit the impact during business hours is critical. DobiMigrate can do this for you, very easily.
- Proving what data did make it, including MD5 hashes, rather than inferring what didn’t from hundreds of log files, is of immense value. No other tool can do it as fast as DobiMigrate can.
- Scripted migration methodologies can work when you have the best-trained personnel with tons of time to dedicate to a project, but they should not be used any longer. The value of an organization’s unstructured data is too high, and in these days of do more with less, so is the value of your staff’s time; juggling those conflicting priorities introduces risk to the business.
- Performing a file migration is about more than just moving data, it’s about:
- Moving permissions (but being flexible, changing them if needed, or removing orphaned Security Identifiers (SIDs))
- Creating Shares and Exports to match the source
- Timing cutover events accurately to help plan outage windows.
- Being there to support you when things go sideways because of a strange configuration issue or unique dataset (everybody seems to configure their NAS devices just a bit differently).
- Emailing you status reports, so that the tool does the work for you, not you working to manage a tool with a hundred CLI switches.
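The throttling point in the list above is easy to picture with a short Python sketch. It is a generic bandwidth cap, not how DobiMigrate implements throttling: `throttled_copy` and its parameters are invented for this example, and the `clock` and `sleep` arguments exist only so the behaviour can be exercised without real waiting.

```python
import time
from typing import BinaryIO, Callable


def throttled_copy(src: BinaryIO, dst: BinaryIO, max_bytes_per_sec: float,
                   chunk_size: int = 64 * 1024,
                   clock: Callable[[], float] = time.monotonic,
                   sleep: Callable[[float], None] = time.sleep) -> int:
    """Copy src to dst, pausing as needed to stay under max_bytes_per_sec."""
    start = clock()
    copied = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            return copied
        dst.write(chunk)
        copied += len(chunk)
        # Earliest moment at which `copied` bytes fit within the rate limit.
        earliest = start + copied / max_bytes_per_sec
        delay = earliest - clock()
        if delay > 0:
            sleep(delay)  # give the production source some breathing room
```

A scheduler could call this with a low limit during business hours and a higher one overnight, which is the kind of control a slow, in-production source needs.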
See it for yourself.
But rather than bore you with step-by-step screenshots, this is a case where a video is far more appropriate:
For more information.
Additional Sources for WORM Regulation information:
|Data Retention Regulations that pertain to Medicare and Medicaid||https://www.cms.gov/Outreach-and-Education/Medicare-Learning-Network-MLN/MLNMattersArticles/downloads/SE1022.pdf|
|Search for Data Retention Regulation information for additional countries||http://us.practicallaw.com/2-502-1510|
|IronMountain European Document Retention Guide (wonderful document, however requires registration to download)||http://www.ironmountain.co.uk/Knowledge-Center/Reference-Library/View-by-Document-Type/Best-Practices/E/European-Retention-Guide.aspx|