Audit message flow and retention

As StorageGRID Webscale services perform activities and process events, they generate audit messages that record this activity.

Audit messages are processed by the Audit Management System (AMS) service, which is hosted by the Admin Node, and they are stored in the form of text log files.

Audit message flow

Audit messages are generated internally by each service. All services generate audit messages during normal system operation. These messages are sent to all connected AMS services for processing and storage, so that each AMS service maintains a complete record of system activity.

Some services can be designated as audit message relay services. They act as collection points, so that not every service has to send its audit messages to all connected AMS services. As shown in the audit message flow diagram, each relay service must send messages to all AMS service destinations, whereas other services can send messages to just one relay service.

Diagram that summarizes audit message flow through relays

Relay services are designated at the time the topology of the StorageGRID Webscale deployment is configured. In a StorageGRID Webscale system, the ADC service is designated as the audit message relay.

Message retention

After an audit message is generated, it is stored on the grid node of the originating service until it has been committed to all connected AMS services or to a designated audit relay service. The relays in turn store the message until it is committed at all AMS services. This process includes a confirmation (positive acknowledgment) to ensure that no messages are lost.

Diagram that summarizes audit message receipt at the AMS

Messages arrive at the AMS service and are stored in a queue pending a confirmed write to the audit log file (audit.log). Confirmation of the arrival of messages is sent to the originating service (or audit relay) to permit the originator to delete its copy of the message.

A message can be removed from the queue only after it has been committed to storage at the AMS service. During peak activity, audit messages can arrive faster than they can be relayed to the audit repository on the AMS service or committed to storage in the audit log file. This causes a temporary backlog that clears when system activity declines. The local message buffers at the audit relay service (ADC) and at the AMS service each have an associated alarm (AMQS) that is triggered if the backlog becomes unusually large.
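The retention rule described above, in which an originating service (or relay) keeps its copy of a message until every destination has confirmed it, can be illustrated with a minimal Python sketch. The class name, method names, and destination labels are hypothetical and chosen only for illustration; this is not the StorageGRID Webscale implementation.

```python
class Originator:
    """Hypothetical model: hold each message until all destinations confirm it."""

    def __init__(self, destinations):
        self.destinations = set(destinations)
        # message id -> destinations that have not yet confirmed the message
        self.pending = {}

    def send(self, msg_id):
        # Keep a local copy; every destination still owes a confirmation.
        self.pending[msg_id] = set(self.destinations)

    def acknowledge(self, msg_id, destination):
        waiting = self.pending.get(msg_id)
        if waiting is None:
            return  # unknown or already released; ignore repeated confirmations
        waiting.discard(destination)
        if not waiting:
            # Every destination has confirmed, so the local copy can be deleted.
            del self.pending[msg_id]


origin = Originator(["ams-1", "ams-2"])
origin.send(101)
origin.acknowledge(101, "ams-1")
still_held = 101 in origin.pending      # one confirmation is not enough
origin.acknowledge(101, "ams-2")
released = 101 not in origin.pending    # released after the last confirmation
```

In this model, the number of entries in pending corresponds to the backlog: it grows when confirmations lag behind new messages and shrinks again as they arrive.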

Once a day, the active audit log is saved to a file named for the date on which it is saved (in the format YYYY-MM-DD.txt), and a new audit log file is started. If more than one audit log is created in a single day, each log is saved to a file named for the date on which it is saved, with a number appended (in the format YYYY-MM-DD.txt.#): for example, 2010-04-23.txt.1. Audit messages generated later on the same day are written to a new audit log, which is saved with the same date and with the appended number incremented by one: for example, 2010-04-23.txt.2.
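The naming scheme can be sketched in Python as follows. The function name saved_log_name is hypothetical, and the sketch assumes one reading of the rule: the first log saved on a given day takes the plain YYYY-MM-DD.txt name, and later logs saved that day take the suffixes .1, .2, and so on.

```python
from datetime import date


def saved_log_name(day, existing_names):
    """Return the name under which the next audit log for `day` would be saved.

    Assumption: the first log of the day is YYYY-MM-DD.txt, and each further
    log saved on the same day gets the next unused numeric suffix.
    """
    base = f"{day.isoformat()}.txt"
    if base not in existing_names:
        return base
    n = 1
    while f"{base}.{n}" in existing_names:
        n += 1
    return f"{base}.{n}"


names = set()
for _ in range(3):
    names.add(saved_log_name(date(2010, 4, 23), names))
# names now holds 2010-04-23.txt, 2010-04-23.txt.1 and 2010-04-23.txt.2
```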

Audit logs are compressed after one day and are renamed YYYY-MM-DD.txt.gz (the original date is preserved). Audit log files are saved to the Admin Node’s /var/local/audit/export directory. Over time, these files consume the storage allocated for audit logs on the Admin Node. A script monitors audit log space consumption and deletes log files as necessary to free space in the /var/local/audit/export directory. Audit logs are deleted based on the date they were created, with the oldest deleted first. You can monitor the script's actions in the manage-audit.log file.
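The oldest-first deletion policy can be illustrated with a short Python sketch. It is not the monitoring script itself: the function names, the byte threshold, and the sample sizes are invented for illustration, and the sketch assumes that the date at the start of each file name reflects when the log was created.

```python
from datetime import date


def log_date(name):
    # Assumption: file names begin with the date in YYYY-MM-DD form.
    return date.fromisoformat(name[:10])


def pick_for_deletion(files, bytes_to_free):
    """Choose log files to delete, oldest first, until enough space is freed.

    files: dict mapping file name to size in bytes (hypothetical input shape).
    """
    chosen, freed = [], 0
    for name in sorted(files, key=log_date):
        if freed >= bytes_to_free:
            break
        chosen.append(name)
        freed += files[name]
    return chosen


files = {
    "2010-04-25.txt.gz": 300,
    "2010-04-23.txt.gz": 500,
    "2010-04-24.txt.gz": 400,
}
to_delete = pick_for_deletion(files, 600)
# to_delete is ["2010-04-23.txt.gz", "2010-04-24.txt.gz"]
```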

Duplicate messages

Audit messages are queued for storage by the AMS service. If system communications are interrupted (for example, because of service failures or network interruptions), the write status of some audit messages might be in doubt. The StorageGRID Webscale system takes a conservative approach in this case: all queued audit messages are resubmitted to the AMS service. This can result in duplicate messages in the audit log.

If duplicate messages are a cause for concern (for example, if the audit log is used for billing applications), you must detect and discard duplicate audit messages manually. To detect duplicate audit messages, you use the audit sequence count number (ASQN). Duplicate messages have the same ASQN.
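As a minimal sketch of this manual clean-up, the following Python keeps the first message seen for each ASQN and discards later ones. It rests on two assumptions that should be checked against your own audit logs: that each message is one line containing a field of the form [ASQN(UI64):n], and that distinct messages in the input never share an ASQN. The sample lines are invented for illustration.

```python
import re

# Assumed field layout; adjust the pattern if your audit log differs.
ASQN_FIELD = re.compile(r"\[ASQN\(UI64\):(\d+)\]")


def drop_duplicates(lines):
    """Return the lines with repeated ASQN values removed (first one wins)."""
    seen = set()
    unique = []
    for line in lines:
        match = ASQN_FIELD.search(line)
        if match is None:
            unique.append(line)  # no ASQN found: keep the line rather than guess
            continue
        asqn = int(match.group(1))
        if asqn not in seen:
            seen.add(asqn)
            unique.append(line)
    return unique


sample = [
    "[AUDT:[ATYP(FC32):SYSU][ASQN(UI64):41]]",
    "[AUDT:[ATYP(FC32):SYSU][ASQN(UI64):42]]",
    "[AUDT:[ATYP(FC32):SYSU][ASQN(UI64):41]]",  # resubmitted copy
]
unique = drop_duplicates(sample)
```

Keeping the first occurrence preserves the original order of the log, which matters if downstream tools such as billing applications read messages sequentially.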