Why Simply Backing Up Microsoft 365 Is Not Data Retention
Moving data and IT solutions to the cloud is changing the way organizations manage their data and think about their processes for complying with data discovery, production, and retention regulations. In the past, organizations managed every level of the infrastructure and every process for protecting data and content from threats to integrity and availability. That required much tighter integration between managing live system storage, archival storage, and backup purging on one hand and data discovery, production, retention, and defensible destruction on the other.
Essentially, because of a myriad of technology-driven requirements and not business-driven need, responsibility for all these functions fell to IT. Over time, many of the concepts around these terms and processes have become blurred, causing different usages by different speakers in different spheres of responsibility.
Many customers are used to relying on their archival backups, stored off-site or otherwise redundantly, to retain data for regulatory compliance. For a variety of reasons, this makes access for discovery, document production, and disposal of content at the granular item level difficult. The approach was widely adopted for many reasons, primarily cost, though complexity and the immaturity of the technology and the IT industry also played a role. But once an organization has shifted to the cloud, it becomes imperative to reevaluate the processes, roles, and responsibilities for maintaining backups and ensuring compliant data retention.
Backup should never have been the primary retention method in use. Backup’s primary function has always been to be the “copy of last resort that always exists” rather than your “officially retained record.” Unfortunately, due to the constraints mentioned above, backed up content search and retrieval had to be built into the backup or archival system, as this was the only copy of the content available for the retention period.
Once content had moved out of “live storage,” it became available for eDiscovery processes and document production only after someone worked with IT to load the tape or re-mount the database. Making the situation even worse, backup and records retention are in many ways directly at odds, since a large part of records management and retention execution is proper and timely data destruction.
The records management team may be operating under the assumption their destruction of data is complete, while IT still has copies of backups on file that may (or may not, who knows?) contain data with critical destruction requirements.
This process complexity, imposed by outside considerations (storage cost, off-site retrieval, etc.), is no longer required. With cloud vendors at every level (from platforms such as Microsoft 365 to SaaS cloud backup solutions such as AvePoint Cloud Backup) offering very affordable storage options and licensing plans, storage cost is no longer a constraint that should drive the processes that preserve the integrity, availability, and proper retention of content.
To repeat that bluntly: in Microsoft 365 it is not necessary to move content from its original location if it is being retained for regulatory reasons. It does not materially affect the licensing costs of your cloud subscription or your compliance with regulatory requirements.
Organizations can be confused about what they are responsible for and what Microsoft is responsible for within their Microsoft 365 environment. Essentially, Microsoft provides disaster recovery for catastrophic events, such as a natural disaster, and recovery from very small, short-term mistakes. Conversely, customers are responsible for protecting their content over long periods of time (months and even years) and maintaining compliance with all their data retention regulations.
This is allowing organizations to eliminate the responsibility for these administrative tasks from the IT personnel that managed the hardware. Now, records managers can have the system access to audit and report on the records and their disposition. Legal can now be assigned eDiscovery access to the Compliance Center for discovery activities and case management. Privacy Officers can use the reports provided by sensitivity labels and AIP to ensure organizations are compliant and risk is properly managed. In this new world, IT can get back to what it does best: moving the organization forward using technology.
Why Backup Isn’t Retention and Retention Isn’t Backup
Simply put, retention and backup mean different things to different people within your organization depending on their sphere of responsibility. In IT, “backup” means making sure the content can be recovered and made available to users in case the need arises. “Retention” on the other hand, to the IT Guy just means “how long before the backed-up content can be deleted.”
But to a Lawyer, Records Manager, or Compliance Auditor “retention” means something different: the content must be available for discovery and legal document production while being able to defend its provenance, chain of custody, and its deletion or destruction. When talking to these different audiences, remember that they may be using the same words but not understanding the nuances involved in the terms each group is used to using.
When considering changing an internal process to be implemented into an IT system it is important to remember what the goal of the process or policy is: data retention is not put in place to make sure the content can be restored if it’s needed. That’s the backup’s job as the copy of last resort.
The goal of retention, when managing storage space and disaster recovery aren’t part of the job, is to make certain the Legal Department can discover the document and defend its provenance when and if required. In fact, using your “backup” of your Microsoft 365 content as your primary data retention compliance method makes eDiscovery and data production less efficient. It means your eDiscovery process will have to interrogate two systems to find information. This introduces a discovery variable that could expose the organization to risk if discovery within the backup system misses a critical piece of content for production.
Now that disaster recovery and data availability are no longer the primary responsibility of an organization’s IT department, people outside IT are beginning to get access to what were previously “administrative” systems, so they can perform job functions that were IT’s concern before the organization moved to the cloud.
One of the best examples of this kind of access is the “Compliance Center” in Microsoft 365’s Administrative Portal. The assignment of an eDiscovery role to an organization’s legal team can allow them to search through all content in the organization’s Microsoft 365 tenant, manage data discovery and production cases, control who can access which data for discovery and produce reports of data that has been discovered in the subject matter case. All of these are important aspects of the “back-end” of data retention.
Microsoft 365 Backup Makes Retained Document Production Efficient
Backing up your Microsoft 365 tenant isn’t unrelated to meeting your data retention regulatory requirements, but it’s not the solution for doing so. The job of a good cloud backup solution is making certain a copy of data is (preferably) easily accessible for recovery.
A good cloud backup solution will include many features that make retention (and legal document production) quick and easy such as:
- Automatic detection of new content containers to include in backups
- Granular restore to the individual data unit level (document, list item, e-mail, etc.)
- Full and Incremental backup and restore
- In-place and out-of-place restore
- Export files
- Backups are encrypted in storage
- Automatic purge of backups after longest default retention period ends
- Ability to find and remove item-level backed up data as needed
- Delegation to allow document production without admin credentials
- End-user self-service restore
- Comprehensive backup of the entire tenant – All Microsoft 365 workloads and all information kinds
All of this makes retaining and producing data easier. But ‘backup’ isn’t enough to make you compliant with all aspects of retention regulations, legal requirements to produce documentation, and information risk management. Microsoft 365 offers data managers a better tool than mere ‘backup’ to ensure data retention compliance: Retention Policies.
Why You Should Use Retention Policies
Retention policies enable organizations to:
- Decide proactively whether to retain content, delete content, or retain and then delete the content when needed.
- Apply a policy to all content or just content meeting certain conditions, such as items with specific keywords or specific types of sensitive information.
- Apply a single policy to the entire organization or specific locations or users.
- Maintain discoverability of content for lawyers and auditors, while protecting it from change or access by other users.
When data is subject to a retention policy, people can continue to edit and work with the data because the content is retained in place in its original location. The retention policy ensures the content is managed in the background until the timeframe for action has been reached. For example, if an organization has a retention policy for “destroy after 7 years,” this means the content will remain in place and accessible until the 7-year timeframe is reached. At this point the data will be destroyed.
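The “destroy after 7 years” example comes down to simple date arithmetic. The sketch below is illustrative only: the function names are hypothetical, it is not a Microsoft 365 API, and it assumes the retention period is counted from a single start date such as the creation date.

```python
from datetime import date


def disposal_date(start: date, retain_years: int) -> date:
    """Return the date on which content becomes eligible for destruction."""
    try:
        return start.replace(year=start.year + retain_years)
    except ValueError:
        # start was 29 February and the target year is not a leap year
        return start.replace(year=start.year + retain_years, day=28)


def is_retained(start: date, retain_years: int, today: date) -> bool:
    """True while the content is still inside its retention period."""
    return today < disposal_date(start, retain_years)
```

For a document created on 15 March 2020 under a 7-year policy, `disposal_date(date(2020, 3, 15), 7)` returns 15 March 2027; until then the content stays in place and remains editable.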
“Retention Policies” are different from “Retention Label Policies.” Both do the same thing, but a retention policy is applied automatically, whereas a retention label policy applies only to content tagged with the associated retention label. This tagging can be performed by an automated process, though Microsoft’s vision for retention labels is that end users and content creators will apply them manually. (Ask your local records manager how well that’s working out.)
Retention Label Policies only take effect when a user or process applies the label to content. Publishing a Retention Label Policy to a container or workspace merely makes its labels available for users to apply to content. It is also important to remember that Retention Label Policies do not move a copy of the content to the ‘Preservation Hold’ library until the content under policy is next changed.
Retention Policies are available for all Microsoft 365 workspaces, though each has its own peculiarities and quirks to be aware of. Some common things to remember:
- All content retained under a retention (or other) policy is discoverable via the Compliance Center eDiscovery console, regardless of license, workspace, or visibility to end users
- Any policy-based hold, whether retention, label, eDiscovery, or other, will prevent the content from being moved to the second stage recycle bin (i.e., all holds must be removed before content will be deleted)
- Content under retention is not removed from the second stage recycle bin until after the retention period ends, plus the default second stage recycle bin retention time period
- Content must be owned by a user with an appropriate Microsoft 365 license to have a retention policy applied to it
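The rule that every hold must be released before content can leave for the second stage recycle bin can be pictured with a small model. This is a sketch for reasoning about the rule, not how Microsoft 365 implements it; the `Item` class and the hold names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Item:
    """A piece of content and the set of holds currently placed on it."""
    name: str
    holds: set = field(default_factory=set)


def release_hold(item: Item, hold: str) -> None:
    """Remove one hold; other holds on the item are unaffected."""
    item.holds.discard(hold)


def can_move_to_second_stage(item: Item) -> bool:
    """Content may move on only when no hold of any kind remains."""
    return not item.holds
```

An item under both a retention policy and an eDiscovery hold stays put when the retention policy is released; it becomes eligible only after the eDiscovery hold is released as well.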
There can be a lot to absorb and understand when first encountering Microsoft 365 retention policies. Each workspace has its own nuances and defaults for retaining data placed under a retention policy. Some Microsoft 365 customers express concern about being surprised by additional licensing costs to retain data, discover it, and restore it. Other customers are worried they will be charged for the space consumed by data they have to retain but that isn’t in use. It is not even uncommon to hear customers complain that retention policies in Microsoft 365 don’t actually work, and that the content isn’t really retained, since it’s impossible to restore.
When Microsoft 365 retention policies are understood and properly implemented, all these concerns are alleviated.
How Microsoft 365 Retains Content
Microsoft 365 stores content in a variety of places when it’s being retained, depending on the type of retention being performed (or, as the Records Manager would say, what kind of hold is placed on the content). Copied directly from Microsoft’s retention policy documentation:
- For SharePoint and OneDrive sites: The copy is retained in the Preservation Hold library. Be aware that the Preservation Hold library consumes storage quota for the site.
- For email and public folders: The copy is retained in the Recoverable Items folder.
- For Teams content: The copy is retained in the Exchange Recoverable Items folder.
- For Microsoft 365 groups (formerly Office 365 groups):
  - The group mailbox is retained in the Exchange Recoverable Items folder.
  - Any site content is retained in the Preservation Hold library.
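The list above is effectively a lookup table from workload to retention location. A minimal sketch follows; the dictionary keys and function names are made up for illustration, and the values are taken from the list.

```python
# Where the retained copy lives for each workload (values from the list above).
RETENTION_LOCATION = {
    "sharepoint": "Preservation Hold library",
    "onedrive": "Preservation Hold library",
    "email": "Recoverable Items folder",
    "public_folders": "Recoverable Items folder",
    "teams": "Exchange Recoverable Items folder",
    "group_mailbox": "Exchange Recoverable Items folder",
    "group_site": "Preservation Hold library",
}


def retention_location(workload: str) -> str:
    """Return where retained copies are kept for the given workload."""
    return RETENTION_LOCATION[workload]


def consumes_site_quota(workload: str) -> bool:
    """The list above notes that the Preservation Hold library counts against site quota."""
    return RETENTION_LOCATION[workload] == "Preservation Hold library"
```

A table like this is handy when estimating storage impact: only the workloads that map to the Preservation Hold library draw on a site’s storage quota.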
Exchange Online Retention
In Exchange Online, when you delete a user (through either the Exchange or Microsoft 365 admin portal) whose mailbox has a retention policy assigned to it, the mailbox is converted to an ‘inactive mailbox’ for the duration of any retention period. (The image below uses the default of 14 days.) Inactive mailboxes are discoverable; are subject to retention, eDiscovery, and other holds; and do not consume Exchange licenses or storage quota.
While retention policies in Exchange make it easy to discover content, restoring content for production will require using PowerShell to restore the content to a mailbox that IS consuming both a license and quota. The entire mailbox must be restored, and the content to be produced can then be located and extracted.
SharePoint and OneDrive Retention
As a collaborative environment SharePoint content isn’t associated with a single user. While this complicates the retention picture, Microsoft has an elegant solution: when a retention policy is applied to a SharePoint or OneDrive site, it enforces retention of all changes to the documents within. This is not the same as versioning; versioning replaces the old copy of a document with the new version in the same place.
When a SharePoint site has a retention policy applied to it, SharePoint Online will create a ‘Preservation Hold’ library (which is hidden for non-admins) and copy all retained content to the Preservation Hold library as it changes. Every change to the document is retained, even if versioning is not enabled. Retention Policies applied to versioned content create a copy of the versions in the Preservation Hold library. When content is deleted from the SharePoint site, the copy is retained in the Preservation Hold library until the end of the retention period, at which time it is moved to the site’s second stage recycle bin.
Three important things to note here:
- While the name of the library is ‘Preservation Hold,’ applying a retention policy to a site does not place a ‘preservation hold’ on any content.
- All content placed under retention, whether through an eDiscovery hold, a legal hold, or a retention policy, is retained in the Preservation Hold library within the site.
- While all content retained in the Preservation Hold library is discoverable, retained versions of content are not (i.e., the old versions are not discovered, though they are available to be restored; only the latest version is discoverable).
OneDrive Retention When Deleting an Account
When a user with a OneDrive under a retention hold is deleted from Microsoft 365, the manager of that user (as defined in Azure AD) or the Secondary Owner of the OneDrive site will automatically be granted Full Control access to the deleted user’s OneDrive site until the end of the retention period. During this time, it will be discoverable and accessible by the designated user; sharing links that have previously been generated to the content will also continue to function. Once the retention period ends, the OneDrive site will be moved to the Site Collection recycle bin, for the default period of 93 days, during which it is not discoverable.
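The lifecycle described above can be sketched as a small timeline function. The state names are hypothetical, the 93-day figure is the default quoted above, and the sketch assumes the site is purged once the recycle bin period runs out.

```python
from datetime import date, timedelta

RECYCLE_BIN_DAYS = 93  # default recycle bin period quoted above


def onedrive_state(retention_end: date, today: date) -> str:
    """Classify a deleted user's OneDrive site on a given day."""
    if today < retention_end:
        # Discoverable, and accessible to the manager or secondary owner.
        return "retained"
    if today < retention_end + timedelta(days=RECYCLE_BIN_DAYS):
        # In the site collection recycle bin and no longer discoverable.
        return "recycle_bin"
    # Assumption: the site is gone once the recycle bin period ends.
    return "purged"
```

The practical point is the middle state: for those 93 days the content still exists but will not appear in eDiscovery searches, so any production needs should be settled before the retention period ends.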
Groups and Teams Retention
Retention for Groups and Teams is more complex, but the good news is that Microsoft has managed the complexity for customers. Retentions placed on Teams and Groups create a ‘SubstrateHolds’ folder in every member’s mailbox. In this folder, Teams maintains the chat and conversation content in each user’s mailbox independently from retention policies applied to mailboxes. This is an important detail to remember: Teams and Groups retention is different than Exchange retention and Exchange retention policies do not apply to chats and conversations stored in user mailboxes in the hidden SubstrateHolds folder.
The wrinkle is that Teams and Groups retention policies do not affect SharePoint and OneDrive content – specifically files. To retain files within a Team or a Group, you must separately apply a SharePoint or OneDrive retention policy to the content.
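One way to catch this gap is to compare the content you need to retain against what each kind of policy covers. The sketch below is a planning aid with made-up policy and content names; it queries nothing in Microsoft 365.

```python
# What each kind of retention policy covers, per the description above.
COVERAGE = {
    "teams": {"chats and conversations"},
    "sharepoint_onedrive": {"files"},
    "exchange": {"mailbox items"},
}


def uncovered(applied_policies: set, needed_content: set) -> set:
    """Return the content types that no applied policy retains."""
    covered = set()
    for policy in applied_policies:
        covered |= COVERAGE[policy]
    return needed_content - covered
```

For a Team that must retain both chats and files, `uncovered({"teams"}, {"chats and conversations", "files"})` returns `{"files"}`, which signals that a SharePoint or OneDrive retention policy is still missing.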