How Egnyte Implements Hybrid Object Stores Using Public Clouds to Enhance Customer Experience
Until very recently, it was challenging for public storage clouds to beat the economics of internally deployed storage for large-scale, multiple-petabyte crunching applications. At Egnyte, we have insulated our customers from these external forces by implementing a novel object store.
One of the interesting features of our object store is the ability to selectively proxy objects to and from another object store, while absorbing latency using a bank of fast caches. In this way, Egnyte adds an application layer with additional functionality, such as mobile access, namespace unification and collaboration, that’s currently missing from public storage clouds, while still leveraging complementary trends in public clouds.
This enables our customers who have relationships with public storage cloud vendors, such as Microsoft, Google or Amazon, to store objects in their buckets with all the goodness of Egnyte’s hybrid cloud model. Some of the benefits include:
- Egnyte provides a rich end user interface with granular permissions, authentication and authorization
- The Egnyte Local Cloud can be leveraged to provide offline access
- The Egnyte mobile interface provides data access to roaming users
- Public cloud geo location features enable copies to be stored closer to the customer
- Customers can configure their public cloud policies for off-site copies based on corporate compliance rules
- Egnyte’s encryption at rest secures data in the public cloud and ensures that a compromised account will not result in any data exposure
Proxying to a public storage cloud posed some interesting problems. We needed to replicate data to public clouds in near real-time while providing the same availability and accessibility to data once uploaded to Egnyte. Some of the challenges included:
- Upload/download: Provide at least the same upload/download throughput that customers would get with our own object store
- Durability: Provide the same data guarantee as our own object store
- Fast replication: Replicate data to the public cloud as fast as possible
- Minimize public cloud API calls: Since most public clouds charge for API calls, we must make as few calls to the public cloud as possible
- Public cloud data organization: Add enough metadata to objects in the public cloud to correlate them back to Egnyte Cloud File System and users’ local file paths
- Secure data transfers: Data must be replicated over secure links and encrypted before replication
- Public cloud outage isolation: Customers should be protected from public cloud outages
We explored different design strategies to get the data quickly to the public cloud. Synchronous commit to public clouds was ruled out to protect our customers from public cloud outages and unexpected network issues. After running several tests, we created the following near real-time asynchronous model:
In this model, we commit the uploaded files from a customer to a high-speed, protected bank of cache servers. When a user uploads a file, it flows through Egnyte and onwards to the public cloud as follows:
- The file is uploaded to a protected and redundant high-speed caching area of the Egnyte file system. From a user perspective, the transaction is complete and the file will be available for download any time the user requests it.
- A replication request is sent to the replicator via the replicator queue.
- The replicator picks up the file from the cache, commits it to the public cloud and updates the public cloud object with essential metadata.
- The cache copy is left behind for a predefined time period to allow for subsequent download requests without hitting the public cloud.
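The four steps above can be sketched in a few lines of Python. This is only an illustration of the asynchronous pattern, not Egnyte's actual implementation: `InMemoryStore`, `Replicator`, the `file_path` metadata field and the `failed` list are hypothetical names, and both the cache bank and the public cloud bucket are simulated with in-memory dictionaries.

```python
import queue
import threading


class InMemoryStore:
    """Hypothetical stand-in for the cache bank or a public cloud bucket."""

    def __init__(self):
        self.objects = {}
        self.metadata = {}

    def put(self, key, data, metadata=None):
        self.objects[key] = data
        self.metadata[key] = dict(metadata or {})

    def get(self, key):
        return self.objects[key]


class Replicator:
    """Commits uploads to the cache first, then replicates in the background."""

    def __init__(self, cache, cloud):
        self.cache = cache
        self.cloud = cloud
        self.requests = queue.Queue()  # the replicator queue
        self.failed = []               # requests that could not be replicated
        worker = threading.Thread(target=self._run, daemon=True)
        worker.start()

    def upload(self, key, data, file_path):
        # Step 1: commit to the cache. The caller's transaction is complete here.
        self.cache.put(key, data)
        # Step 2: ask the replicator to copy the object to the public cloud.
        self.requests.put((key, file_path))

    def _run(self):
        while True:
            key, file_path = self.requests.get()
            try:
                # Step 3: read from the cache, commit to the cloud with metadata.
                data = self.cache.get(key)
                self.cloud.put(key, data, metadata={"file_path": file_path})
            except Exception:
                # A cloud outage must not affect uploads; a real system would retry.
                self.failed.append((key, file_path))
            finally:
                self.requests.task_done()
            # Step 4: the cache copy is deliberately left in place.


cache, cloud = InMemoryStore(), InMemoryStore()
replicator = Replicator(cache, cloud)
replicator.upload("obj-1", b"hello", "/Shared/report.txt")
replicator.requests.join()  # wait for replication, for demonstration only
```

Because `upload` returns as soon as the cache write succeeds, a slow or unavailable public cloud delays only the background worker, which is the reason for preferring this model over a synchronous commit.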
When a public cloud user downloads a file, we first check for it in the cache. If it is not there, we download it from the public cloud and cache it locally for a predefined time period. This keeps API calls to the public cloud to a minimum and gives fast access to data.
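The download path is a read-through cache with a time-based expiry. A minimal sketch, assuming a hypothetical `fetch_from_cloud` callable that stands in for the public cloud download and a hypothetical `ttl_seconds` setting for the predefined retention period:

```python
import time


class ReadThroughCache:
    """Serves downloads from a local cache, falling back to the public cloud."""

    def __init__(self, fetch_from_cloud, ttl_seconds, clock=time.monotonic):
        self.fetch_from_cloud = fetch_from_cloud  # hypothetical cloud download call
        self.ttl_seconds = ttl_seconds            # how long a cached copy is kept
        self.clock = clock                        # injectable so expiry can be tested
        self.entries = {}                         # key -> (expires_at, data)

    def download(self, key):
        now = self.clock()
        entry = self.entries.get(key)
        if entry is not None and entry[0] > now:
            # Cache hit: no public cloud API call is made.
            return entry[1]
        # Cache miss or expired copy: fetch once, then keep it locally.
        data = self.fetch_from_cloud(key)
        self.entries[key] = (now + self.ttl_seconds, data)
        return data
```

With this shape, repeated downloads of the same object within the retention period cost a single public cloud call, which is how the cache keeps both latency and API charges down.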
With this enhancement, Egnyte now offers a unique combination of a cloud file system along with extreme control over enterprise data. This addresses a major point of contention with respect to cloud adoption. Enterprises can now pick the final resting place for their data and set up replication and security policies that suit their needs.
As more enterprises move their data to the cloud, they will want features such as geographic location, offsite copies for disaster recovery, and a public cloud provider that can provide fine-grained control over the final resting place for their data.
Stay tuned for more Egnyte enhancements in this area, including follow-me objects, which enable objects to follow the users across the world in a geo-balanced manner.