The Impact of Timestamps
After initial data seeding, Datadobi’s DobiMiner Suite (DobiMigrate, DobiReplicate, and DobiSync) compares timestamps on file system objects residing on both the source and destination platforms. These comparisons determine whether a full data copy or a metadata-only copy is required when differential operations synchronize content between the two platforms. Intelligent use of these timestamps prevents unnecessary copy activity between the platforms.
Various operating systems capture and track different types of timestamp information. Generally, these differences correspond to the timestamps maintained in Linux/Unix (*nix) file systems and those maintained within the Microsoft Windows NTFS file system.
Linux/Unix Timestamps
Linux/Unix timestamps come in three variants, sometimes referred to as MAC times:
- (M) Modified/last written (mtime)
- (A) Last Accessed (atime)
- (C) Changed (ctime)
The difference between ctime and mtime is that mtime records the last time the file contents were updated, while ctime records the last time the file system object’s metadata (such as permissions, ownership, etc.) was updated.
Linux/Unix Summary
- mtime: Changes when file contents are modified
- atime: Changes when the file contents are read, for example when the file is opened and read or copied. Mount options such as noatime or relatime can suppress or limit these updates.
- ctime: Changes when metadata (such as permissions or ownership) is modified. Essentially, anything that changes the inode triggers an update to ctime. Note that ctime is also updated when the file contents are modified/updated.
To view the times associated with files/directories, use the stat command (or stat -x on some platforms). For example:
[root@myhost ~]# stat install.log.syslog
  File: `install.log.syslog'
  Size: 3314       Blocks: 8          IO Block: 4096   regular file
Device: 801h/2049d Inode: 524292      Links: 1
Access: (0644/-rw-r--r--)  Uid: (    0/    root)   Gid: (    0/    root)
Access: 2015-01-06 17:32:44.464999864 +0000
Modify: 2015-01-06 17:33:26.817999795 +0000
Change: 2015-01-06 17:34:04.650999734 +0000
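The mtime and ctime rules summarized above can also be observed programmatically. The following is a minimal Python sketch, assuming a Linux or similar POSIX system where os.stat reports ctime as the inode change time. The function names mac_times and observe_changes are illustrative only, and the short sleeps are there so that successive timestamps can differ.

```python
import os
import tempfile
import time


def mac_times(path):
    """Return (mtime, atime, ctime) in nanoseconds for path."""
    st = os.stat(path)
    return st.st_mtime_ns, st.st_atime_ns, st.st_ctime_ns


def observe_changes():
    """Report which timestamps move after a chmod and after a write."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "example.txt")
        with open(path, "w") as f:
            f.write("initial contents\n")
        m0, _, c0 = mac_times(path)

        time.sleep(0.05)  # let the clock advance past the timestamp granularity
        os.chmod(path, 0o600)  # metadata-only change
        m1, _, c1 = mac_times(path)

        time.sleep(0.05)
        with open(path, "a") as f:  # content change
            f.write("more contents\n")
        m2, _, c2 = mac_times(path)

    return {
        "chmod_changed_mtime": m1 != m0,
        "chmod_changed_ctime": c1 != c0,
        "write_changed_mtime": m2 != m1,
        "write_changed_ctime": c2 != c1,
    }


if __name__ == "__main__":
    for name, value in observe_changes().items():
        print(f"{name}: {value}")
```

On a typical Linux file system we would expect the chmod to move only ctime and the write to move both mtime and ctime, matching the summary above. On Windows, Python's st_ctime has historically reported the creation time instead, so this sketch does not carry over unchanged.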
Microsoft Windows NTFS Timestamps
NTFS timestamps can be broken down into 8 variants, but the main times, stored in the $STANDARD_INFORMATION attribute, are the MACE times.
For every file on an NTFS volume, there are the following dates:
- (M) File Modified/Last Write Time (POSIX equivalent is mtime)
- (A) File Accessed/Last Access Time (POSIX equivalent is atime)
- (C) File Created/Creation Time (birth time is available in ext4 but not directly exposed)
- (E) MFT Entry Modified (POSIX equivalent is ctime; not directly exposed in Windows)
NTFS Summary
- Last Write Time: File contents have been modified/updated. Changes to a file’s data trigger an update to this timestamp. Changing permissions or file attributes such as read-only, hidden, archive, etc. does not trigger an update to this timestamp.
- Last Access Time: This is the date/time the file was last accessed. An access can be a move, an open, or any other simple access. It can also be updated by antivirus scanners or Windows system processes.
- Creation Time: This is the date/time the file was originally created on the volume.
- MFT Entry Modified: This timestamp doesn’t appear in Windows Explorer or the command line interface; it requires forensic tools to view/examine. This timestamp indicates when the MFT entry, which points to the file of interest, last changed. File content updates as well as metadata updates (filename, permissions, ownership, etc.) trigger updates to this timestamp.
The MACE timestamp values (MFT Entry Modified excluded) can be viewed in Windows Explorer or via the command line/PowerShell. For example:
PS C:\Somedir> Get-ChildItem c:\somedir\*.txt | select name, *time
Name            CreationTime          LastAccessTime        LastWriteTime
----            ------------          --------------        -------------
icacls.txt      10/3/2016 2:16:24 PM  10/3/2016 2:16:24 PM  10/3/2016 2:16:24 PM
sidlist-c$.txt  9/2/2016 1:15:07 PM   9/2/2016 1:15:07 PM   9/2/2016 1:15:24 PM
takeown.txt     9/2/2016 2:30:11 PM   9/2/2016 2:30:11 PM   9/2/2016 2:30:11 PM
DobiMiner Suite Software Behavior
The following table lists the actions executed when comparing timestamp values between source and destination. Because these decisions depend on comparing timestamps recorded by different systems, it is important that NTP be configured on all servers so that their clocks agree.
Type | Source | Comparison | Destination | Action |
*nix | mtime | < | mtime | copy data |
*nix | mtime | > | mtime | copy data |
*nix | mtime | = | mtime | no action |
*nix | ctime | < | ctime | no action [1] |
*nix | ctime | > | ctime | copy metadata |
*nix | ctime | = | ctime | no action |
NTFS | File Modified | < | File Modified | copy data |
NTFS | File Modified | > | File Modified | copy data |
NTFS | File Modified | = | File Modified | no action |
NTFS | MFT Last Written | > | MFT Last Written | copy metadata [2] |
NTFS | MFT Last Written | = | MFT Last Written | no action |
NTFS | MFT Last Written | < | MFT Last Written | no action |
[1] No action is taken when source ctime < destination ctime because ctime cannot be set explicitly on a file.
[2] MFT Last Written (the MFT Entry Modified timestamp described above) is updated when a permission, ownership, or attribute change is made.
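The table can be read as a small decision function. The sketch below is only our encoding of the rows above in Python, not the product's actual implementation. The names Action and decide_action are hypothetical, and checking the data timestamp before the metadata timestamp is an assumption, since the table does not state how the rows combine.

```python
from enum import Enum


class Action(Enum):
    COPY_DATA = "copy data"
    COPY_METADATA = "copy metadata"
    NO_ACTION = "no action"


def decide_action(src_data_time, dst_data_time, src_meta_time, dst_meta_time):
    """Pick an action from the comparison table.

    data_time is mtime (*nix) or File Modified (NTFS);
    meta_time is ctime (*nix) or MFT Last Written (NTFS).
    """
    # Any difference in the data timestamp, in either direction, means copy data.
    # Assumption: this check takes precedence over the metadata comparison.
    if src_data_time != dst_data_time:
        return Action.COPY_DATA
    # Metadata is copied only when the source's metadata timestamp is newer.
    if src_meta_time > dst_meta_time:
        return Action.COPY_METADATA
    # Equal or older source metadata timestamp: nothing to do.
    return Action.NO_ACTION


if __name__ == "__main__":
    print(decide_action(100, 100, 60, 50).value)
```

For example, with equal data timestamps and a newer metadata timestamp on the source, decide_action(100, 100, 60, 50) returns Action.COPY_METADATA.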
Other Copy Behavior
When new files are detected on the source system (i.e., there is no corresponding entry on the destination), the new files are copied to the destination along with their associated metadata.
When existing files are deleted from the source system, the delete operation is propagated to the destination system so that previously copied files are removed. Note that during rollback migrations deletes are not propagated back to the target system (i.e., the original source system); only new/modified files detected post-switchover are copied back.
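The new-file and delete handling described above can be sketched as a comparison of two listings. This is an illustrative Python sketch under our own assumptions, not how the software is implemented: plan_sync and its propagate_deletes flag are hypothetical names, and each listing is simplified to a dict mapping a relative path to its data timestamp.

```python
def plan_sync(source, destination, propagate_deletes=True):
    """Compare two {relative_path: data_timestamp} listings.

    Returns the paths to copy as new, to re-copy as changed, and to delete
    on the destination. Set propagate_deletes=False to mimic a rollback,
    where deletes are not propagated.
    """
    # Present on the source only: new files to copy with their metadata.
    new = sorted(p for p in source if p not in destination)
    # Present on both sides with differing data timestamps: copy data.
    changed = sorted(
        p for p in source if p in destination and source[p] != destination[p]
    )
    # Present on the destination only: removed from the source.
    deleted = (
        sorted(p for p in destination if p not in source)
        if propagate_deletes
        else []
    )
    return {"copy_new": new, "copy_changed": changed, "delete": deleted}


if __name__ == "__main__":
    src = {"a.txt": 10, "b.txt": 20, "c.txt": 30}
    dst = {"a.txt": 10, "b.txt": 15, "d.txt": 5}
    print(plan_sync(src, dst))
    print(plan_sync(src, dst, propagate_deletes=False))
```

In the sample listings, c.txt exists only on the source and is planned as a new copy, b.txt differs and is re-copied, and d.txt is deleted from the destination unless propagate_deletes is False.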
An Example of How Timestamps Influence Copy Operations
We created a source-to-destination pair from a NetApp (vnetapp1) to an Isilon (Isilon1). A single test dataset stored in the NetApp QTree named “DeleteMe-QTreeMixedFileSize” served as the source; it is simply test data we work with internally. The First Scan and First Copy phases were executed, and then the source share was mounted on a Windows host so that modifications could be made between Steady State (i.e., incremental) copies. Two modifications were made: 1) a metadata-only modification involving a permissions change on two files and 2) a data/content change made to a single text file.
In the image below you can see the results of the initial data seeding. The First Scan and First Copy operations completed with roughly 197,000 files and 40,000 directories. The Steady State copy to come is what we’re interested in, however.
Metadata Change (Permissions)
To make the metadata change we mounted vnetapp1\DeleteMe-QTreeMixedFileSize on a Windows host. We selected the file AapAapAapAap26822.txt and modified its permissions by removing “Full Control” from the group “Everyone”, then saved the change. The same was done for the file named VisNootMaanJan76473.txt (these file names come from our test data generator, which is why they are so highly randomized).
The image below shows the timestamps as reported by NTFS after the permissions modification was made. Only LastAccessTime has been updated.
Windows does not directly expose the ‘MFT Entry Modified’ timestamp. This is the timestamp which would indicate the most recent change in permissions. It is typically only viewable through forensic tools or tools with low level NTFS access. Datadobi’s software suite can access this data, however, to determine that only metadata changes have been made and that a full data copy is not required.
Content Change
Using the same Windows host and mounted share, we selected file AaapAaapAaapAaap7015.txt, made a small content change, and saved the file.
When we look at the timestamps as reported by NTFS after the modification was made, we see that LastAccessTime and LastWriteTime are both updated. This is as expected because a) we accessed the file and b) we updated and then saved the new content.
Incremental Copy following the metadata and content updates
At this point we have made two types of changes – metadata changes in the form of permissions changes to two files and a file content change to one file. Ideally, we would like the software to have the intelligence to determine that in some cases only a lightweight metadata copy is required as opposed to a full copy of the file. When we run an incremental copy in DobiMigrate we’ll see exactly this behavior (shown in the image below) – two metadata copies and one file copy operation. DobiReplicate and DobiSync behave in exactly the same manner.
If we look further into the details of the operations we can see at the file level we have exactly the type of copy operations expected. We see two metadata updates due to the permission changes made to AapAapAapAap26822.txt and VisNootMaanJaan76473.txt. The file copy relates to the data/content change made to file AapAapAapAap7015.txt. NOTE: The directory commit operation is required due to updates being made to content in the directory.
Summary
Datadobi’s DobiMiner Suite of data mobility products (DobiMigrate, DobiReplicate, and DobiSync) makes intelligent use of timestamps to determine what type of copy operations are required. The decisions made by the software save network bandwidth (important if migrating/replicating over a WAN), source system resources, and the all-important variable of time, which is critical during switchover events.