Advantages, disadvantages, and requirements for erasure coding
Before deciding whether to use replication or erasure coding to protect object data from loss, you should understand the advantages, disadvantages, and requirements of erasure coding.
Advantages of erasure coding
When compared to replication, erasure coding offers improved reliability, availability, and storage efficiency.
- Reliability: Measured by the number of simultaneous failures that can be sustained without data loss.
  - Replication: Multiple identical object copies are stored on different nodes and across sites.
  - Erasure coding: An object is encoded into data and parity fragments and distributed across many nodes and sites. This dispersal provides both site and node failure protection.
- Availability: The ability to retrieve objects if Storage Nodes fail or become inaccessible. When compared to replication, erasure coding provides increased availability at comparable storage costs.
- Storage efficiency: For similar availability and reliability, erasure-coded objects use less disk space than replicated objects. For example, a 10 MB object that is replicated to two sites consumes 20 MB of disk space (two copies), while the same object erasure-coded across three sites with a 6+3 erasure-coding scheme consumes only 15 MB of disk space. Disk space for erasure-coded objects is calculated as the object size plus the storage overhead. The storage overhead percentage is the number of parity fragments divided by the number of data fragments (for 6+3, 3/6 = 50%).
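The disk-space arithmetic in the storage efficiency example can be sketched in a few lines of Python. This is only an illustration of the formula stated above; the function names are hypothetical and are not part of any StorageGRID interface.

```python
def replicated_size_mb(object_size_mb: float, copies: int) -> float:
    """Disk space consumed when `copies` full copies of the object are stored."""
    return object_size_mb * copies


def erasure_coded_size_mb(object_size_mb: float,
                          data_fragments: int,
                          parity_fragments: int) -> float:
    """Object size plus storage overhead, where overhead = parity / data."""
    overhead = parity_fragments / data_fragments  # 3 / 6 = 0.5 for a 6+3 scheme
    return object_size_mb * (1 + overhead)


# The 10 MB object from the example above.
print(replicated_size_mb(10, copies=2))                                   # two copies
print(erasure_coded_size_mb(10, data_fragments=6, parity_fragments=3))    # 6+3 scheme
```

For the 10 MB object, the two calls give 20 MB for two replicated copies and 15 MB for the 6+3 scheme, matching the figures in the text.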
Disadvantages of erasure coding
When compared to replication, erasure coding has the following disadvantages:
- An increased number of Storage Nodes and sites is recommended, depending on the erasure-coding scheme. In contrast, if you replicate object data, you need only one Storage Node for each copy. See Erasure-coding schemes for storage pools containing three or more sites and Erasure-coding schemes for one-site storage pools.
- Increased cost and complexity of storage expansions. To expand a deployment that uses replication, you add storage capacity in every location where object copies are made. To expand a deployment that uses erasure coding, you must consider both the erasure-coding scheme in use and how full existing Storage Nodes are. For example, if you wait until existing nodes are 100% full, you must add at least k+m Storage Nodes, but if you expand when existing nodes are 70% full, you can add two nodes per site and still maximize usable storage capacity. For more information, see Add storage capacity for erasure-coded objects.
- Increased retrieval latencies when you use erasure coding across geographically distributed sites. The fragments of an object that is erasure-coded and distributed across remote sites take longer to retrieve over WAN connections than an object that is replicated and available locally (at the same site to which the client connects).
- Higher WAN traffic for retrievals and repairs when you use erasure coding across geographically distributed sites, especially for frequently retrieved objects.
- When you use erasure coding across sites, the maximum object throughput declines sharply as network latency between sites increases. This decrease is the result of a decrease in TCP network throughput, which affects how quickly the StorageGRID system can store and retrieve object fragments.
- Higher usage of compute resources.
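The link between inter-site latency and TCP throughput noted in the list above can be illustrated with the general rule that a single TCP connection cannot send more than one window of data per round trip. The sketch below applies that rule only; the 64 KiB window is a hypothetical value chosen for illustration and does not describe how StorageGRID configures its connections.

```python
def max_tcp_throughput_mbps(window_bytes: int, rtt_ms: float) -> float:
    """Upper bound for one TCP connection: at most one window per round trip."""
    rtt_seconds = rtt_ms / 1000.0
    bits_per_second = window_bytes * 8 / rtt_seconds
    return bits_per_second / 1_000_000


WINDOW_BYTES = 64 * 1024  # hypothetical fixed window size, for illustration only

for rtt_ms in (1, 10, 50, 100):
    bound = max_tcp_throughput_mbps(WINDOW_BYTES, rtt_ms)
    print(f"{rtt_ms:>4} ms RTT -> at most {bound:8.2f} Mbit/s per connection")
```

Under this simplified model the bound is inversely proportional to round-trip time, so a tenfold increase in latency reduces the per-connection ceiling by a factor of ten.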
When to use erasure coding
Use erasure coding for the following requirements:
- Objects greater than 1 MB in size, for which erasure coding is best suited. Don't use erasure coding for objects smaller than 200 KB to avoid the overhead of managing very small erasure-coded fragments.
- Long-term or cold storage for infrequently retrieved content.
- High data availability and reliability.
- Protection against complete site and node failures.
- Storage efficiency.
- Single-site deployments that require efficient data protection with only a single erasure-coded copy rather than multiple replicated copies.
- Multiple-site deployments where the inter-site latency is less than 100 ms.
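The size and latency thresholds in the list above can be expressed as a small check. This is a minimal sketch, not a StorageGRID feature: the function name, the three result labels, and the use of decimal units (1 KB = 1,000 bytes) are all assumptions made for illustration, and the other criteria in the list are not modeled.

```python
from typing import Optional

KB = 1_000        # assumption: decimal units
MB = 1_000_000


def erasure_coding_fit(object_size_bytes: int,
                       inter_site_latency_ms: Optional[float] = None) -> str:
    """Classify an object against the size and latency guidance above.

    Pass inter_site_latency_ms=None for a single-site deployment.
    """
    if object_size_bytes < 200 * KB:
        return "avoid"  # very small fragments add management overhead
    latency_ok = inter_site_latency_ms is None or inter_site_latency_ms < 100
    if object_size_bytes > 1 * MB and latency_ok:
        return "well suited"
    return "marginal"  # outside the recommended range but not ruled out


print(erasure_coding_fit(100 * KB))        # small object
print(erasure_coding_fit(5 * MB, 20))      # large object, low-latency sites
print(erasure_coding_fit(5 * MB, 150))     # large object, high-latency sites
```

The "marginal" label covers cases the guidance does not address directly, such as objects between 200 KB and 1 MB, or multi-site deployments with latency of 100 ms or more.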