Advantages, disadvantages, and limitations of the ingest options
Understanding the advantages and disadvantages of each of the three options for protecting data at ingest (Balanced, Strict, or Dual commit) can help you decide which one to select for an ILM rule.
For an overview of ingest options, see Ingest options.
Advantages of the Balanced and Strict options
When compared to Dual commit, which creates interim copies during ingest, the two synchronous placement options can provide the following advantages:
- Better data security: Object data is immediately protected as specified in the ILM rule's placement instructions, which can be configured to protect against a wide variety of failure conditions, including the failure of more than one storage location. Dual commit can only protect against the loss of a single local copy.
- More efficient grid operation: Each object is processed only once, as it is ingested. Because the StorageGRID system does not need to track or delete interim copies, there is less processing load and less database space is consumed.
- (Balanced) Recommended: The Balanced option provides optimal ILM efficiency. Using the Balanced option is recommended unless Strict ingest behavior is required or the grid meets all of the criteria for using Dual commit.
- (Strict) Certainty about object locations: The Strict option guarantees that objects are immediately stored according to the placement instructions in the ILM rule.
Disadvantages of the Balanced and Strict options
When compared to Dual commit, the Balanced and Strict options have some disadvantages:
- Longer client ingests: Client ingest latencies might be longer. When you use the Balanced or Strict options, an "ingest successful" message is not returned to the client until all erasure-coded fragments or replicated copies are created and stored. However, object data will most likely reach its final placement much faster.
- (Strict) Higher rates of ingest failure: With the Strict option, ingest fails whenever StorageGRID can't immediately make all copies specified in the ILM rule. You might see high rates of ingest failure if a required storage location is temporarily offline or if network issues cause delays in copying objects between sites.
- (Strict) S3 multipart upload placements might not be as expected in some circumstances: With Strict, you expect objects either to be placed as described by the ILM rule or for ingest to fail. However, with an S3 multipart upload, ILM is evaluated for each part of the object as it is ingested, and for the object as a whole when the multipart upload completes. In the following circumstances this might result in placements that are different than you expect:
  - If ILM changes while an S3 multipart upload is in progress: Because each part is placed according to the rule that is active when the part is ingested, some parts of the object might not meet current ILM requirements when the multipart upload completes. In these cases, ingest of the object does not fail. Instead, any part that is not placed correctly is queued for ILM re-evaluation and is moved to the correct location later.
  - When ILM rules filter on size: When evaluating ILM for a part, StorageGRID filters on the size of the part, not the size of the object. This means that parts of an object can be stored in locations that don't meet ILM requirements for the object as a whole. For example, if a rule specifies that all objects 10 GB or larger are stored at DC1 while all smaller objects are stored at DC2, at ingest each 1 GB part of a 10-part multipart upload is stored at DC2. When ILM is evaluated for the object, all parts of the object are moved to DC1.
- (Strict) Ingest does not fail when object tags or metadata are updated and newly required placements cannot be made: With Strict, you expect objects either to be placed as described by the ILM rule or for ingest to fail. However, when you update metadata or tags for an object that is already stored in the grid, the object is not re-ingested. This means that any changes to object placement that are triggered by the update aren't made immediately. Placement changes are made when ILM is re-evaluated by normal background ILM processes. If required placement changes can't be made (for example, because a newly required location is unavailable), the updated object retains its current placement until the placement changes are possible.
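The size-filter case for S3 multipart uploads can be illustrated with a small simulation. This is only a sketch of the behavior described above, not StorageGRID code: the function names `placement_for_size` and `simulate_multipart` are hypothetical, and the rule is the 10 GB example from the text (DC1 for objects 10 GB or larger, DC2 for smaller objects).

```python
GB = 10**9  # decimal gigabyte, used only for this illustration


def placement_for_size(size_bytes: int) -> str:
    """Hypothetical size-filter rule from the example in the text:
    anything 10 GB or larger goes to DC1, anything smaller goes to DC2."""
    return "DC1" if size_bytes >= 10 * GB else "DC2"


def simulate_multipart(part_sizes: list) -> dict:
    """Model the two ILM evaluations described for a multipart upload."""
    # At ingest, each part is evaluated on its own size, not the object's size.
    initial = [placement_for_size(size) for size in part_sizes]
    # When the upload completes, the object is evaluated as a whole.
    final = placement_for_size(sum(part_sizes))
    # Parts whose initial site differs from the final site must be moved later.
    parts_to_move = [i for i, site in enumerate(initial) if site != final]
    return {"initial": initial, "final": final, "parts_to_move": parts_to_move}


if __name__ == "__main__":
    result = simulate_multipart([1 * GB] * 10)
    print(result["initial"])        # every 1 GB part lands at DC2 first
    print(result["final"])          # the 10 GB object belongs at DC1
    print(result["parts_to_move"])  # all ten parts are moved afterwards
```

In this model, ingest never fails even though no part initially sits where the whole-object rule requires, which is the mismatch with Strict expectations that the text describes.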
Limitations on object placements with the Balanced and Strict options
The Balanced or Strict options can't be used for ILM rules that have any of these placement instructions:
- Placement in a Cloud Storage Pool at day 0.
- Placements in a Cloud Storage Pool when the rule has a User defined creation time as its Reference time.
These restrictions exist because StorageGRID can't synchronously make copies to a Cloud Storage Pool, and a User defined creation time could resolve to the present.
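The two restrictions can be expressed as a simple validation check. The following is a minimal sketch under stated assumptions: the `Placement` and `IlmRule` classes, their field names, and the function `synchronous_ingest_violations` are hypothetical and are not part of any StorageGRID API. Only the two restrictions themselves come from the text.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Placement:
    """Hypothetical model of one placement instruction in an ILM rule."""
    target_is_cloud_storage_pool: bool
    start_day: int  # day on which the placement takes effect


@dataclass
class IlmRule:
    """Hypothetical model of an ILM rule (not a StorageGRID data structure)."""
    ingest_behavior: str  # "Balanced", "Strict", or "Dual commit"
    reference_time: str   # for example "Ingest time" or "User defined creation time"
    placements: List[Placement]


def synchronous_ingest_violations(rule: IlmRule) -> List[str]:
    """Return the reasons why Balanced or Strict can't be used for this rule."""
    if rule.ingest_behavior not in ("Balanced", "Strict"):
        return []  # the restrictions apply only to the synchronous options
    problems = []
    for placement in rule.placements:
        if not placement.target_is_cloud_storage_pool:
            continue
        if placement.start_day == 0:
            problems.append("Cloud Storage Pool placement at day 0")
        if rule.reference_time == "User defined creation time":
            problems.append(
                "Cloud Storage Pool placement with a User defined creation time"
            )
    return problems


if __name__ == "__main__":
    rule = IlmRule("Strict", "Ingest time", [Placement(True, 0)])
    for problem in synchronous_ingest_violations(rule):
        print(problem)
```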
How ILM rules and consistency interact to affect data protection
Both your ILM rule and your choice of consistency affect how objects are protected. These settings can interact.
For example, the ingest behavior selected for an ILM rule affects the initial placement of object copies, while the consistency used when an object is stored affects the initial placement of object metadata. Because StorageGRID requires access to both an object's data and metadata to fulfill client requests, selecting matching levels of protection for the consistency and ingest behavior can provide better initial data protection and more predictable system responses.
Here is a brief summary of the consistency values that are available in StorageGRID:
- All: All nodes receive object metadata immediately or the request will fail.
- Strong-global: Object metadata is immediately distributed to all sites. Guarantees read-after-write consistency for all client requests across all sites.
- Strong-site: Object metadata is immediately distributed to other nodes at the site. Guarantees read-after-write consistency for all client requests within a site.
- Read-after-new-write: Provides read-after-write consistency for new objects and eventual consistency for object updates. Offers high availability and data protection guarantees. Recommended for most cases.
- Available: Provides eventual consistency for both new objects and object updates. For S3 buckets, use only as required (for example, for a bucket that contains log values that are rarely read, or for HEAD or GET operations on keys that don't exist). Not supported for S3 FabricPool buckets.
Before selecting a consistency value, read the full description of consistency. You should understand the benefits and limitations before changing the default value.
Example of how consistency and ILM rules can interact
Suppose you have a two-site grid with the following ILM rule and the following consistency:
- ILM rule: Create two object copies, one at the local site and one at a remote site. Use Strict ingest behavior.
- Consistency: Strong-global (object metadata is immediately distributed to all sites).
When a client stores an object to the grid, StorageGRID makes both object copies and distributes metadata to both sites before returning success to the client.
The object is fully protected against loss by the time the client receives the "ingest successful" message. For example, if the local site is lost shortly after ingest, copies of both the object data and the object metadata still exist at the remote site. The object is fully retrievable.
If you instead used the same ILM rule with the Strong-site consistency, the client might receive a success message after object data is replicated to the remote site but before object metadata is distributed there. In this case, the level of protection for object metadata does not match the level of protection for object data. If the local site is lost shortly after ingest, the object metadata is lost and the object can't be retrieved.
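The two-site example can be reduced to a small model of what is guaranteed at the moment the client receives success. This is a simplified sketch, not a description of StorageGRID internals: the site names, the function names, and the assumption that Strong-site guarantees metadata only at the local site at that moment are illustrative. It ignores the later background distribution of metadata.

```python
SITES = {"local", "remote"}  # hypothetical names for the two sites


def metadata_sites_at_success(consistency: str) -> set:
    """Sites guaranteed to hold object metadata when success is returned."""
    if consistency == "Strong-global":
        return set(SITES)  # metadata is distributed to all sites first
    if consistency == "Strong-site":
        return {"local"}   # only the ingest site is guaranteed at this point
    raise ValueError(f"consistency not modeled in this sketch: {consistency}")


def retrievable_after_site_loss(data_sites: set, consistency: str, lost_site: str) -> bool:
    """An object is retrievable only if both its data and its metadata
    survive at some remaining site."""
    surviving_data = data_sites - {lost_site}
    surviving_metadata = metadata_sites_at_success(consistency) - {lost_site}
    return bool(surviving_data) and bool(surviving_metadata)


if __name__ == "__main__":
    # Strict ILM rule from the example: one copy at each site.
    data_sites = {"local", "remote"}
    for consistency in ("Strong-global", "Strong-site"):
        ok = retrievable_after_site_loss(data_sites, consistency, "local")
        print(consistency, "retrievable after local site loss:", ok)
```

The model makes the mismatch explicit: the data placement is identical in both cases, and only the metadata guarantee decides whether the object survives the loss of the local site.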
The inter-relationship between consistency and ILM rules can be complex. Contact NetApp if you need assistance.