BLOBs – Backup and Failover Strategies

    Having been in the backup and recovery space for SharePoint since 2001, and the Storage Optimization (BLOB Management) space since SharePoint Portal Server 2003, we’re frequently asked about Disaster Recovery strategies for SharePoint environments with storage optimization. To start, it’s absolutely not as simple as taking separate backups and relying on your database teams and file-share teams to handle their respective data. To ensure an effective data protection strategy, both sides need to play together.

    If anyone says that we can back up the moving parts of SharePoint separately (taking a snapshot of the file shares hosting your BLOB content, backing up your SQL databases individually using any number of methods, and protecting your applications as separate and unrelated sets of data), let’s be clear about the further implications of this statement:

    – My backup SLA requires hourly backups against corruption. Do I need to back up the content, or just the metadata of my documents?
    – If SharePoint goes down and I rebuild SharePoint at my DR site, do I need to recover my BLOBs, or will restoring the stubs be enough?
    – If I’m setting up a replication plan for a staging environment, do I need to mirror only my databases, or the file shares too?

    Storage Optimization works by separating the BLOB content from its associated metadata (the stub, the pointer, or simply the part that connects my files to the SharePoint ecosystem). With it in place, the answers to all of the above questions appear much more complicated (at first). The root of the issue is that when you use separate technologies for file shares and databases, there is no way to back up or mirror both BLOBs and stubs synchronously (unless you intend to lock or take down SharePoint from editing during every backup window). This disparity is really apparent when, for example, you use SQL mirroring to provide nearly synchronous disaster recovery protection for your SQL databases, but replicate the file shares on a possibly longer (think 15 minute) schedule using hardware-based technology. That window could be even longer with software-based replication.

    Can we eliminate that out-of-sync window? Quantum physics says no. Can we get close? Sure, if we have the immense bandwidth and hardware to do it. But in the wake of the events of this past decade, how many DR or failover sites are close enough geographically to make this possible? So here are the three basic cases:

    We’re perfect! We use DFS or other virtual file share systems to ensure the destination has the data before we accept any BLOB through SharePoint. No challenges for disaster recovery here!

    We prepared! We restored the BLOB contents before restoring the last good backup of our databases. Yes, our BLOB store will contain items created since the last good database backup, but DocAve uses garbage collection as part of its solution to remove these orphaned BLOBs. Anything created in the out-of-sync window described above is not protected, however, because the restored databases hold no stubs for it.

    We overshot! We just failed over to our standby database, which now contains pointers to BLOBs that haven’t been mirrored yet. And it gets worse: because those pointers represent the latest content, it’s the content a user is most likely to click on and notice is missing!

    So to clarify the point here: we need to guarantee consistency, and the only way to do that is to take a backup of our stubs and BLOBs at the same time. DocAve Granular Backup and Restore is the only way to answer this. And this isn’t just the case for AvePoint Storage Optimization solutions: custom RBS solutions can test this compatibility with our products! DocAve can take a backup and perform item-level recovery without pausing or locking a site, representing true 1-to-1 fidelity.
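
    The three cases above can be told apart by comparing the BLOB references held in the recovered stubs against what is actually present in the BLOB store. The following is a minimal sketch of that comparison, not DocAve’s implementation: the names `ConsistencyReport` and `check_consistency` are hypothetical, and it assumes you can already list the BLOB IDs referenced by the database and the BLOB IDs present on the file share.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsistencyReport:
    """Hypothetical result type; not part of any AvePoint or SharePoint API."""
    orphaned_blobs: frozenset  # present in the BLOB store, referenced by no stub
    orphaned_stubs: frozenset  # referenced by a stub, missing from the BLOB store

    @property
    def case(self) -> str:
        if self.orphaned_stubs:
            # Stubs point at content that was never mirrored.
            return "overshot"
        if self.orphaned_blobs:
            # Extra BLOBs only; candidates for garbage collection.
            return "prepared"
        return "perfect"


def check_consistency(stub_refs, stored_blobs) -> ConsistencyReport:
    """Compare BLOB IDs referenced by stubs with BLOB IDs found in the store."""
    stub_refs = frozenset(stub_refs)
    stored_blobs = frozenset(stored_blobs)
    return ConsistencyReport(
        orphaned_blobs=stored_blobs - stub_refs,
        orphaned_stubs=stub_refs - stored_blobs,
    )


if __name__ == "__main__":
    # The standby database references "c", but the file share never received it.
    report = check_consistency({"a", "b", "c"}, {"a", "b"})
    print(report.case, sorted(report.orphaned_stubs))
```

    The check treats orphaned stubs as the more serious finding, since they are what a user will actually notice; orphaned BLOBs only cost storage until they are cleaned up.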

    If we’re talking about disaster recovery, it doesn’t matter which hardware vendor is backing up the file share: you need to recover the BLOBs first, and then recover an earlier database backup. In this case you’ll have orphaned BLOBs, but we handle the garbage collection, with no impact on your SLA.
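
    The “BLOBs first, then an earlier database backup” rule can be sketched as a simple selection over backup timestamps. This is an illustration under stated assumptions, not a description of any product: `pick_database_backup` is a hypothetical helper, each timestamp is assumed to mark a point-in-time snapshot, and BLOBs are assumed not to be garbage collected between the database backup and the BLOB backup.

```python
from datetime import datetime
from typing import Iterable, Optional


def pick_database_backup(
    blob_backup_time: datetime,
    db_backup_times: Iterable[datetime],
) -> Optional[datetime]:
    """Return the newest database backup taken at or before the BLOB backup.

    Every stub in such a backup was written no later than the BLOB snapshot,
    so (under the assumptions above) its BLOB is already in the restored
    store. Returns None if no database backup is old enough.
    """
    candidates = [t for t in db_backup_times if t <= blob_backup_time]
    return max(candidates) if candidates else None


if __name__ == "__main__":
    blob_backup = datetime(2011, 5, 1, 10, 15)
    db_backups = [
        datetime(2011, 5, 1, 9, 0),
        datetime(2011, 5, 1, 10, 0),
        datetime(2011, 5, 1, 11, 0),  # newer than the BLOB backup: would leave orphaned stubs
    ]
    print(pick_database_backup(blob_backup, db_backups))
```

    Restoring the chosen backup leaves at most orphaned BLOBs (those written after it was taken), which is the situation garbage collection is meant to clean up.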

    If we’re talking about high availability, it’s extremely likely you will have orphaned stubs, no matter how good your replication technology is; those orphans typically represent the most recent content updates in SharePoint. AvePoint has developed tools to handle this clean-up process prior to going live, which our engineers would be more than happy to walk you through. Remember, if you’re using EBS or RBS, your SharePoint SLAs are still just as important as they always have been; they just got bigger.

    [Image: Untitled-2.png]

    Stay tuned for future publications from AvePoint, where we walk you through a few customer testimonials of performing disaster recovery on a global scale, including the rebuilding of SharePoint farms or the re-attaching of externalized content to new environments.

    2 COMMENTS

    1. John,

      This was a great article, and the picture really helped drive your point home. Looking forward to the next one.

      -Alex
