Investigating lost objects

When a LOST (Lost Objects) alarm is triggered, investigate immediately. Collect information about the affected objects and call technical support.

Before you begin

  • You must be signed in to the Grid Manager using a supported browser.
  • You must have specific access permissions. For details, see information about controlling system access with administration user accounts and groups.

About this task

A LOST (Lost Objects) alarm indicates that StorageGRID Webscale believes that there are no copies of an object in the grid. The data might have been permanently lost and might not be retrievable.

Investigate Lost Objects alarms immediately. You may need to take action to prevent further data loss. In some cases, you may be able to restore a lost object if you take prompt action.

The Lost Objects attribute may be seen on either of the following pages:
  • Select Support > Grid Topology. Then select site > Storage Node > LDR > Data Store > Overview > Main.
  • Select Support > Grid Topology. Then select site > Storage Node > DDS > Data Store > Overview > Main.

This procedure shows the Lost Objects attribute on the LDR > Data Store page.

Steps

  1. Select Support > Grid Topology.
  2. Select site > Storage Node > LDR > Data Store > Overview > Main.
  3. Review the Lost Objects attribute to see how many lost objects have been identified.

    [Screenshot: Overview: DDS: Data Store page]
  4. Use the audit log to determine the identifier (CBID) of the object that triggered the LOST (Lost Objects) alarm:
    1. From the service laptop, log in to the Admin Node as admin and su to root using the password listed in the Passwords.txt file.
    2. Change to the directory where the audit logs are located. Enter: cd /var/local/audit/export/
    3. Use grep to extract the Object Lost (OLST) audit messages. Enter: grep OLST audit_file_name
    4. Note the CBID value included in the message. For example:
      Admin: # grep OLST audit.log
      2012-01-14T11:03:27.362483 [AUDT:[CBID(UI64):0x498D8A1F681F05B3][UUID(CSTR):"6213A021-91FC-49C0-AF44-EC6BF377D264"]
      [NOID(UI32):12088241][VOLI(UI64):2][RSLT(FC32):NONE][AVER(UI32):10][ATYP(FC32):OLST][ATIM(UI64):1350613602969243]
      [ATID(UI64):16956755694216746320][ANID(UI32):13959984][AMID(FC32):BCMS][ASQN(UI64):62]
      [ASES(UI64):1350580983645305]]
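When the audit log contains many OLST messages, the CBID values can be collected with a short script instead of being read off by eye. The following is a minimal Python sketch, not part of the product tooling. It assumes that each audit message occupies a single line in the log file (the example above is wrapped for display) and that the fields are formatted exactly as shown in the example. The function name olst_cbids is hypothetical.

```python
import re

# Matches the CBID field of an audit message, e.g. [CBID(UI64):0x498D8A1F681F05B3]
CBID_FIELD = re.compile(r"\[CBID\(UI64\):(0x[0-9A-Fa-f]+)\]")
# Field that marks an Object Lost message, as it appears in the example above
OLST_FIELD = "[ATYP(FC32):OLST]"


def olst_cbids(lines):
    """Return the CBIDs of OLST messages, in first-seen order, without duplicates.

    Assumption: each element of `lines` holds one complete audit message.
    """
    cbids = []
    for line in lines:
        if OLST_FIELD not in line:
            continue  # not an Object Lost message
        match = CBID_FIELD.search(line)
        if match and match.group(1) not in cbids:
            cbids.append(match.group(1))
    return cbids


# The example message from this procedure, joined onto one line
SAMPLE = (
    '2012-01-14T11:03:27.362483 [AUDT:[CBID(UI64):0x498D8A1F681F05B3]'
    '[UUID(CSTR):"6213A021-91FC-49C0-AF44-EC6BF377D264"]'
    '[NOID(UI32):12088241][VOLI(UI64):2][RSLT(FC32):NONE][AVER(UI32):10]'
    '[ATYP(FC32):OLST][ATIM(UI64):1350613602969243]'
    '[ATID(UI64):16956755694216746320][ANID(UI32):13959984][AMID(FC32):BCMS]'
    '[ASQN(UI64):62][ASES(UI64):1350580983645305]]'
)

print(olst_cbids([SAMPLE]))  # prints ['0x498D8A1F681F05B3']
```

To run the sketch against a real log, pass an open file object in place of the sample list, for example olst_cbids(open("audit.log")). Duplicate CBIDs are dropped because each object needs to be looked up only once in the next step.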
  5. Use the ObjectByCBID command to find the object by its identifier (CBID), and then determine if data is at risk.
    1. Telnet to localhost 1402 to access the LDR console.
    2. Enter: /proc/OBRP/ObjectByCBID -h hexadecimal_CBID_value
      In the following example, the object with CBID 0xFE1C42ABD3CD2AC0 has a UUID, but it has no locations listed.
      ade 21511404: / > /proc/OBRP/ObjectByCBID -h 0xFE1C42ABD3CD2AC0
       
      {
          "OID": "00006FFD00198494009DC7E0C02DEA4CC7BCFB513B11B81B8A",
          "TYPE(Object Type)": "Data object",
          "CHND(Content handle)": "9DC7E0C0-2DEA-4CC7-BCFB-513B11B81B8A",
          "NAME": "lost/testau.dat",
          "CBID": "0xFE1C42ABD3CD2AC0",
          "PHND(Parent handle, UUID)": "402BC3FE-1BB4-11E7-8FCB-18EB00C226D9",
          "PPTH(Parent path)": "LOST",
          "META": {
              "BASE(Protocol metadata)": {
                  "ISIA(Source client ip address)": "10.55.72.90",
                  "PHTP(HTTP protocol handler version)": "1",
                  "PAWS(S3 protocol version)": "1",
                  "ACCT(S3 account ID)": "10699577065449838288",
                  "*ctp(HTTP content MIME type)": "application/octet-stream"
              },
              "AWS3": {
                  "USDM(User-defined metadata)": "{\"s3b-last-modified\":[\"20161117T230402Z\"]}"
              },
              "BYCB(System metadata)": {
                  "SHSH(Supplementary Plaintext hash)": "MD5D 0xC9B110581DAC712BFAE0D1D8EF36CB7E",
                  "CSIZ(Plaintext object size)": "8204",
                  "BSIZ(Content block size)": "8886",
                  "CVER(Content block version)": "196612",
                  "CFLG(Content block flags)": "256",
                  "CTME(Object store begin timestamp)": "2017-04-10T20:01:58.399632",
                  "CTYP(Compression algorithm type)": "NONE",
                  "CHSH(Object hash)": "SHA1 0x7973967630676847CEB60C4C0D9384075F81A3C6",
                  "MTME(Object store modified timestamp)": "2017-04-10T20:01:58.406157"
              },
              "CMSM": {
                  "OWNR(ILM owner node ID)": "13895688",
                  "LATM(Object last access time)": "2017-04-10T20:01:58.399632"
              }
          }
      }
      
    3. Review the output of /proc/OBRP/ObjectByCBID, and take the appropriate action:
      Metadata: No object found ("ERROR":""), or an object was found with no UUID metadata.
      Conclusion: If the object is not found, the message "ERROR":"" is returned. If the object is not found, or if there is no UUID metadata, it is safe to ignore the alarm. The lack of an object or the absence of a UUID indicates that the object was intentionally deleted.

      Metadata: UUID is present and Locations > 0.
      Conclusion: If there is a UUID and there are locations listed in the output, the Lost Objects alarm was a false positive. There are other object locations in the grid. You can reset the Lost Objects alarm.

      Metadata: UUID is present and Locations = 0.
      Conclusion: If there is a UUID but there are no locations listed in the output, the object is potentially missing.

      If the ILM policy does not include an ILM rule with only one active content placement instruction, contact technical support. You could also try to find and restore the object yourself.

      Support might ask you to determine whether a storage recovery procedure is in progress, that is, whether a repair-data command has been issued on any Storage Node and the recovery is still in progress. See "Restoring object data to a storage volume" in the recovery and maintenance instructions.
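
The Metadata and Conclusion guidance for the ObjectByCBID output reduces to a three-way decision. The following Python sketch restates that decision for reference only. The function name and its parameters are hypothetical, and the caller is assumed to have already read from the console output whether an object was found, whether it has a UUID, and how many locations are listed.

```python
def assess_lost_object(found, has_uuid, location_count):
    """Map ObjectByCBID findings to the conclusion given in this procedure.

    found          -- False if the console returned "ERROR":""
    has_uuid       -- True if the output includes UUID metadata
    location_count -- number of object locations listed in the output
    """
    if not found or not has_uuid:
        # Object missing or no UUID: the object was intentionally deleted
        return "ignore: object was intentionally deleted"
    if location_count > 0:
        # Other copies exist in the grid
        return "false positive: reset the Lost Objects alarm"
    # UUID present but no locations
    return "potentially missing: contact technical support or try to restore"


print(assess_lost_object(found=True, has_uuid=True, location_count=0))
```

The example call corresponds to the object shown earlier in this step, which has a UUID but no locations listed, so it falls into the potentially missing case.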