Diligent CTO demystifies data deduplication

Diligent's CTO, Neville Yates, discusses deduplication technologies

Who are the vendors that do inline deduplication besides Diligent?

I believe Data Domain is the only other vendor doing inline processing. What's very interesting is that the results from early beta tests support our claim that post-processing is slow when you have a large repository of data to deal with. A large repository, especially when it's hash-based, causes the knowledge base, the index and the catalog to be incredibly active. When I say large, I mean anything in the range of 20, 30 or 40TB.
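To make the point about index activity concrete, here is a minimal sketch of a generic hash-based deduplicating store. It is an illustration only, not any vendor's implementation: the class name `HashDedupStore`, the fixed `CHUNK_SIZE` and the in-memory dictionary index are all assumptions made for the example. The thing to notice is that every incoming chunk triggers one index lookup, so the index gets busier as the repository and the nightly payload grow.

```python
import hashlib

CHUNK_SIZE = 4096  # hypothetical fixed chunk size, in bytes


class HashDedupStore:
    """Toy in-memory hash-based deduplicating store (illustration only)."""

    def __init__(self):
        self.index = {}   # chunk digest -> stored chunk bytes
        self.lookups = 0  # number of index lookups performed so far

    def write(self, data: bytes) -> list:
        """Deduplicate data as it arrives and return its recipe (list of digests)."""
        recipe = []
        for offset in range(0, len(data), CHUNK_SIZE):
            chunk = data[offset:offset + CHUNK_SIZE]
            digest = hashlib.sha256(chunk).hexdigest()
            self.lookups += 1              # one index lookup per incoming chunk
            if digest not in self.index:   # store only chunks not seen before
                self.index[digest] = chunk
            recipe.append(digest)
        return recipe


store = HashDedupStore()
backup_monday = b"A" * CHUNK_SIZE * 3 + b"B" * CHUNK_SIZE
backup_tuesday = b"A" * CHUNK_SIZE * 3 + b"C" * CHUNK_SIZE
recipe_mon = store.write(backup_monday)
recipe_tue = store.write(backup_tuesday)
print(len(store.index), "unique chunks stored after", store.lookups, "index lookups")
```

In this toy run, eight chunks arrive but only three distinct chunks are kept. A real system would hold the index on disk and in cache rather than in a Python dictionary, which is where the lookup load becomes a practical concern.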

If you are using disk as your endpoint instead of tape, is it better to choose a system that does post-processing or inline processing, or does it make a difference?

The decision point is going to be based on the magnitude of the workload. If you have a small workload and are only backing up 1TB a night, then there are many different offerings that might suffice. Other attributes to weigh are scalability and flexibility in configuration and expansion. When you are looking at large quantities of data, you really need to be concerned about the configurations necessary to support the payload when it gets to 10 to 20TB a night. If you are dealing with payloads that large, you are likely to find yourself buying more hardware to support a post-processing deployment.

If my goal is to send data off to tape from its staging area on disk, do I need to un-de-duplicate that data before sending it off to tape?

Yes, you should. The point of putting data on tape is most likely to send it off-site, and your use profile in all probability dictates that you need native access to that data, meaning that NetBackup, TSM or Legato can use those tapes directly. If you de-dupe the data and then put it on tape, it sits on the tape in a proprietary format that must be un-de-duplicated before any application can use it.
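The un-de-duplication step can be sketched as follows. This is a minimal, generic illustration under stated assumptions, not a description of any product: the function names `dedupe` and `rehydrate`, the tiny `CHUNK_SIZE` and the dictionary chunk store are all hypothetical. It shows that the deduplicated form is only a list of references into a private chunk store, and that the original byte stream has to be rebuilt from those references before it is written out in native form.

```python
import hashlib

CHUNK_SIZE = 4  # tiny hypothetical chunk size so the example is easy to follow


def dedupe(data: bytes, chunk_store: dict) -> list:
    """Split data into chunks, keep each distinct chunk once, return a recipe."""
    recipe = []
    for offset in range(0, len(data), CHUNK_SIZE):
        chunk = data[offset:offset + CHUNK_SIZE]
        digest = hashlib.sha256(chunk).hexdigest()
        chunk_store.setdefault(digest, chunk)  # store the chunk only if it is new
        recipe.append(digest)
    return recipe


def rehydrate(recipe: list, chunk_store: dict) -> bytes:
    """Rebuild the original byte stream from a recipe and the chunk store."""
    return b"".join(chunk_store[digest] for digest in recipe)


chunk_store = {}
original = b"AAAABBBBAAAACCCC"
recipe = dedupe(original, chunk_store)

# The recipe alone is meaningless to a backup application; the stream
# has to be rebuilt before it is written to tape in native form.
native_stream = rehydrate(recipe, chunk_store)
print(len(chunk_store), "chunks stored;", len(native_stream), "bytes rehydrated for tape")
```

The recipe is useless without the chunk store it points into, which is why a tape holding deduplicated data cannot be read directly by a backup application.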

Are there opportunities for deduplication in areas other than virtual tape libraries?

Deduplication works with any target. Diligent will be introducing file system deduplication with a Network File System interface, extending our deduplication engine to the network-attached storage topology. We are also developing an image interface in support of NetBackup. The technology is not bound to a VTL.
