Lost and missing object data

The system attempts to retrieve object data for several reasons, including read requests from a client application, background verification of replicated object data, ILM re-evaluations, and the restoration of object data during the recovery of a Storage Node.

The StorageGRID Webscale system uses the object location information in an object’s metadata to determine where to retrieve the object from. If a copy of the object is not found at the expected location, the system attempts to retrieve another copy from elsewhere in the system, provided that the ILM policy was configured to make two or more copies of the object. If this retrieval succeeds, the StorageGRID Webscale system replaces the missing copy.

Whether an alarm is triggered depends on the type of the missing copy (erasure coded or replicated) and on whether the system can retrieve another copy and replace the missing one. If another copy cannot be retrieved, the object is considered lost and a LOST (Lost Objects) alarm is triggered. Investigate all LOST (Lost Objects) alarms immediately to determine the root cause of the loss and whether the object might still exist on a Storage Node or Archive Node that is offline or otherwise currently unavailable.

For erasure coded copies, if a copy cannot be retrieved from the expected location, the Corrupt Copies Detected (ECOR) attribute is incremented by one before an attempt is made to retrieve a copy from another location. If no other copy is found, the LOST (Lost Objects) alarm is triggered, as it is for replicated object data.
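
The retrieval behavior described above can be sketched as a small model. This is an illustration only, not StorageGRID code: the GridState class, the retrieve function, and the location names are all hypothetical, and the sketch assumes that the first location in the list is the expected one and that the list is not empty.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class GridState:
    """Hypothetical in-memory stand-in for the grid (not a StorageGRID API)."""
    # Location name -> IDs of the objects actually present at that location.
    stored: Dict[str, Set[str]] = field(default_factory=dict)
    ecor: int = 0             # models the Corrupt Copies Detected (ECOR) attribute
    lost_alarm: bool = False  # models the LOST (Lost Objects) alarm


def retrieve(grid: GridState, object_id: str, locations: List[str],
             erasure_coded: bool = False) -> Optional[str]:
    """Return the location that served the object, or None if it is lost.

    `locations` models the location information in the object's metadata,
    with the expected location first. It must contain at least one entry.
    """
    expected, *others = locations
    if object_id in grid.stored.get(expected, set()):
        return expected

    # The copy is missing from the expected location. For erasure coded
    # copies, ECOR is incremented before any other location is tried.
    if erasure_coded:
        grid.ecor += 1

    for location in others:
        if object_id in grid.stored.get(location, set()):
            # Retrieval succeeded: replace the missing copy.
            grid.stored.setdefault(expected, set()).add(object_id)
            return location

    # No other copy could be retrieved: the object is considered lost.
    grid.lost_alarm = True
    return None
```

In this model, an object with a second copy is served from the other location and the missing copy is restored without raising the alarm, while an object whose only copy is missing sets lost_alarm.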

If object data that has no other copies is lost, it cannot be recovered. The only action to take is to reset the Lost Object attribute under the DDS or the LDR service and clear the LOST (Lost Objects) alarm. Clearing the alarm prevents known LOST alarm instances from masking any new LOST alarm instances.