Advantages, disadvantages, and limitations of the data-protection options
Understanding the advantages and disadvantages of each of the three options for protecting data at ingest (Balanced, Strict, or Dual commit) can help you decide which one to select for an ILM rule.
Advantages of the Balanced and Strict options
When compared to Dual commit, which creates interim copies during ingest, the two synchronous placement options can provide the following advantages:
-
Better data security: Object data is immediately protected as specified in the ILM rule's placement instructions, which can be configured to protect against a wide variety of failure conditions, including the failure of more than one storage location. Dual commit can only protect against the loss of a single local copy.
-
More efficient grid operation: Each object is processed only once, as it is ingested. Because the StorageGRID system does not need to track or delete interim copies, there is less processing load and less database space is consumed.
-
(Balanced) Recommended: The Balanced option provides optimal ILM efficiency. Using the Balanced option is recommended unless Strict ingest behavior is required or the grid meets all of the criteria for using Dual commit.
-
(Strict) Certainty about object locations: The Strict option guarantees that objects are immediately stored according to the placement instructions in the ILM rule.
Disadvantages of the Balanced and Strict options
When compared to Dual commit, the Balanced and Strict options have some disadvantages:
-
Longer client ingests: Client ingest latencies might be longer. When you use the Balanced and Strict options, an “ingest successful” message is not returned to the client until all erasure-coded fragments or replicated copies are created and stored. However, object data will most likely reach its final placement much faster.
-
(Strict) Higher rates of ingest failure: With the Strict option, ingest fails whenever StorageGRID cannot immediately make all copies specified in the ILM rule. You might see high rates of ingest failure if a required storage location is temporarily offline or if network issues cause delays in copying objects between sites.
-
(Strict) S3 multipart upload placements might not be as expected in some circumstances: With Strict, you expect objects either to be placed as described by the ILM rule or for ingest to fail. However, with an S3 multipart upload, ILM is evaluated for each part of the object as it is ingested, and for the object as a whole when the multipart upload completes. In the following circumstances, this might result in placements that are different from what you expect:
-
If ILM changes while an S3 multipart upload is in progress: Because each part is placed according to the rule that is active when the part is ingested, some parts of the object might not meet current ILM requirements when the multipart upload completes. In these cases, ingest of the object does not fail. Instead, any part that is not placed correctly is queued for ILM re-evaluation, and is moved to the correct location later.
-
When ILM rules filter on size: When evaluating ILM for a part, StorageGRID filters on the size of the part, not the size of the object. This means that parts of an object can be stored in locations that do not meet ILM requirements for the object as a whole. For example, if a rule specifies that all objects 10 GB or larger are stored at DC1 while all smaller objects are stored at DC2, at ingest each 1 GB part of a 10-part multipart upload is stored at DC2. When ILM is evaluated for the object, all parts of the object are moved to DC1.
-
(Strict) Ingest does not fail when object tags or metadata are updated and newly required placements cannot be made: With Strict, you expect objects either to be placed as described by the ILM rule or for ingest to fail. However, when you update metadata or tags for an object that is already stored in the grid, the object is not re-ingested. This means that any changes to object placement that are triggered by the update are not made immediately. Placement changes are made when ILM is re-evaluated by normal background ILM processes. If required placement changes cannot be made (for example, because a newly required location is unavailable), the updated object retains its current placement until the placement changes are possible.
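The size-filter case described above for S3 multipart uploads can be illustrated with a small model. This is a minimal sketch, not StorageGRID code: the `SizeRule` class, the `multipart_placements` function, and the use of decimal gigabytes are all assumptions made for illustration. It only shows why each part is evaluated against its own size at ingest while the completed object is evaluated against the total size.

```python
from dataclasses import dataclass
from typing import List, Tuple

GB = 10**9  # assumption: decimal gigabytes, for illustration only


@dataclass(frozen=True)
class SizeRule:
    """Hypothetical rule: objects at or above the threshold go to one site, smaller ones to another."""
    threshold_bytes: int
    large_site: str
    small_site: str

    def placement(self, size_bytes: int) -> str:
        # The filter sees only the size it is given (a part or a whole object).
        return self.large_site if size_bytes >= self.threshold_bytes else self.small_site


def multipart_placements(rule: SizeRule, part_sizes: List[int]) -> Tuple[List[str], str]:
    """Return (placement of each part at ingest, placement after whole-object evaluation)."""
    at_ingest = [rule.placement(size) for size in part_sizes]
    after_completion = rule.placement(sum(part_sizes))
    return at_ingest, after_completion


# The example from the text: ten 1 GB parts, threshold of 10 GB.
rule = SizeRule(threshold_bytes=10 * GB, large_site="DC1", small_site="DC2")
at_ingest, final = multipart_placements(rule, [1 * GB] * 10)
print(at_ingest)  # every part is filtered as a 1 GB item
print(final)      # the completed object is filtered as a 10 GB item
```

In this model every part lands at DC2 during ingest, and the whole-object evaluation selects DC1, which matches the move described in the example.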
Limitations on object placements with the Balanced or Strict options
The Balanced or Strict options cannot be used for ILM rules that have any of these placement instructions:
-
Placement in a Cloud Storage Pool at day 0.
-
Placement in an Archive Node at day 0.
-
Placements in a Cloud Storage Pool or an Archive Node when the rule has a User Defined Creation Time as its Reference Time.
These restrictions exist because StorageGRID cannot synchronously make copies to a Cloud Storage Pool or an Archive Node, and a User Defined Creation Time could resolve to the present, which would require such a copy to be made at ingest.
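The three restrictions above can be expressed as a simple check. The following is a hedged sketch only: the `Rule` and `Placement` classes, the field names, and the string values such as `"cloud_storage_pool"` are hypothetical and do not correspond to the StorageGRID API or its rule schema.

```python
from dataclasses import dataclass
from typing import List

SYNC_BEHAVIORS = {"Balanced", "Strict"}
# Targets to which copies cannot be made synchronously (hypothetical identifiers).
ASYNC_ONLY_TARGETS = {"cloud_storage_pool", "archive_node"}
USER_DEFINED = "User Defined Creation Time"


@dataclass
class Placement:
    target_type: str  # hypothetical: "storage_pool", "cloud_storage_pool", or "archive_node"
    start_day: int    # day on which the placement starts, relative to the reference time


@dataclass
class Rule:
    ingest_behavior: str  # "Balanced", "Strict", or "Dual commit"
    reference_time: str   # for example "Ingest Time" or "User Defined Creation Time"
    placements: List[Placement]


def sync_ingest_violations(rule: Rule) -> List[str]:
    """List the reasons a rule cannot use Balanced or Strict; empty if none apply."""
    if rule.ingest_behavior not in SYNC_BEHAVIORS:
        return []  # the restrictions apply only to the synchronous options
    problems = []
    for p in rule.placements:
        if p.target_type not in ASYNC_ONLY_TARGETS:
            continue
        if p.start_day == 0:
            problems.append(f"{p.target_type} placement at day 0")
        elif rule.reference_time == USER_DEFINED:
            problems.append(f"{p.target_type} placement with {USER_DEFINED}")
    return problems


rule = Rule(
    ingest_behavior="Strict",
    reference_time="Ingest Time",
    placements=[Placement("storage_pool", 0), Placement("cloud_storage_pool", 0)],
)
print(sync_ingest_violations(rule))
```

A rule with the same placements but the Dual commit behavior returns an empty list in this sketch, because the restrictions apply only to Balanced and Strict.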
How ILM rules and consistency controls interact to affect data protection
Both your ILM rule and your choice of consistency control affect how objects are protected. These settings can interact.
For example, the ingest behavior selected for an ILM rule affects the initial placement of object copies, while the consistency control used when an object is stored affects the initial placement of object metadata. Because StorageGRID requires access to both an object's metadata and its data to fulfill client requests, selecting matching levels of protection for the consistency level and ingest behavior can provide better initial data protection and more predictable system responses.
Here is a brief summary of the consistency controls that are available in StorageGRID:
-
all: All nodes receive object metadata immediately, or the request fails.
-
strong-global: Object metadata is immediately distributed to all sites. Guarantees read-after-write consistency for all client requests across all sites.
-
strong-site: Object metadata is immediately distributed to other nodes at the site. Guarantees read-after-write consistency for all client requests within a site.
-
read-after-new-write: Provides read-after-write consistency for new objects and eventual consistency for object updates. Offers high availability and data protection guarantees.
-
available (eventual consistency for HEAD operations): Behaves the same as the “read-after-new-write” consistency level, but only provides eventual consistency for HEAD operations.
Example of how the consistency control and ILM rule can interact
Suppose you have a two-site grid with the following ILM rule and the following consistency level setting:
-
ILM rule: Create two object copies, one at the local site and one at a remote site. The Strict ingest behavior is selected.
-
Consistency level: “strong-global” (Object metadata is immediately distributed to all sites.)
When a client stores an object to the grid, StorageGRID makes both object copies and distributes metadata to both sites before returning success to the client.
The object is fully protected against loss by the time the “ingest successful” message is returned. For example, if the local site is lost shortly after ingest, copies of both the object data and the object metadata still exist at the remote site. The object is fully retrievable.
If you instead used the same ILM rule and the “strong-site” consistency level, the client might receive a success message after object data is replicated to the remote site but before object metadata is distributed there. In this case, the level of protection of object metadata does not match the level of protection for object data. If the local site is lost shortly after ingest, object metadata is lost. The object cannot be retrieved.
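The two scenarios above can be compared with a toy model of the two-site grid. This sketch is an illustration under stated assumptions, not a description of StorageGRID internals: the function names are hypothetical, and the model takes the worst case, in which metadata has reached only the sites that the consistency level guarantees when success is returned.

```python
from typing import Set, Tuple


def sites_after_ingest(local: str, remote: str, consistency: str) -> Tuple[Set[str], Set[str]]:
    """Return (data sites, metadata sites) guaranteed when success is returned.

    Assumes the example rule: Strict ingest, one copy at the local site and one at the remote site.
    """
    data_sites = {local, remote}
    if consistency == "strong-global":
        metadata_sites = {local, remote}  # metadata distributed to all sites
    elif consistency == "strong-site":
        metadata_sites = {local}          # worst case: metadata only at the local site so far
    else:
        raise ValueError(f"consistency level not modeled in this sketch: {consistency}")
    return data_sites, metadata_sites


def retrievable_after_loss(data_sites: Set[str], metadata_sites: Set[str], lost_site: str) -> bool:
    """An object is retrievable only if both its data and its metadata survive the loss."""
    return bool(data_sites - {lost_site}) and bool(metadata_sites - {lost_site})


for level in ("strong-global", "strong-site"):
    data, meta = sites_after_ingest("DC1", "DC2", level)
    print(level, retrievable_after_loss(data, meta, lost_site="DC1"))
```

With the local site DC1 lost, the model reports the object as retrievable under strong-global and not retrievable under strong-site, because the only guaranteed metadata copy was at the lost site.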
The inter-relationship between consistency levels and ILM rules can be complex. Contact NetApp if you require assistance.