
What is a Storage Node?

Contributors netapp-madkat netapp-perveilerk netapp-lhalbert

Storage Nodes manage and store object data and metadata. Storage Nodes include the services and processes required to store, move, verify, and retrieve object data and metadata on disk.

Each site in your StorageGRID system must have at least three Storage Nodes.

Types of Storage Nodes

All Storage Nodes installed before StorageGRID 11.8 store both objects and the metadata for those objects. Starting in StorageGRID 11.8, you can choose the Storage Node type for new software-based Storage Nodes:

Object and metadata Storage Nodes

By default, all new Storage Nodes installed in StorageGRID 11.8 will store both objects and metadata.

Metadata-only Storage Nodes (software-based nodes only)

You can specify that a new software-based Storage Node be used to store only metadata. You can also add a metadata-only software-based Storage Node to your StorageGRID system during StorageGRID system expansion.

Note: You can select the Storage Node type only when you initially install the software-based node or when you install it during a StorageGRID system expansion. You can't change the type after the node installation is complete.

Installing a metadata-only node is typically not required. However, using a Storage Node exclusively for metadata can make sense if your grid stores a very large number of small objects. Installing dedicated metadata capacity provides a better balance between the space needed for a very large number of small objects and the space needed for the metadata for all those objects.

When installing a grid with software-based metadata-only nodes, the grid must also contain a minimum number of nodes for object storage:

  • For a single-site grid, at least two Storage Nodes must be configured for objects and metadata.

  • For a multi-site grid, at least one Storage Node per site must be configured for objects and metadata.
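The two minimums above can be expressed as a small pre-installation check. The following is a minimal sketch, not a StorageGRID tool: the function name, the node-type labels, and the `(site, node_type)` input format are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical node-type labels; StorageGRID's own identifiers may differ.
OBJECT_AND_METADATA = "object-and-metadata"
METADATA_ONLY = "metadata-only"


def check_minimum_object_nodes(nodes):
    """Return a list of problems for a planned grid.

    `nodes` is a list of (site, node_type) pairs. The rules apply only
    when the grid contains at least one metadata-only node.
    """
    kinds = [kind for _, kind in nodes]
    if METADATA_ONLY not in kinds:
        return []

    sites = sorted({site for site, _ in nodes})
    # Count the nodes per site that store both objects and metadata.
    combined = Counter(site for site, kind in nodes if kind == OBJECT_AND_METADATA)

    # Single-site grid: at least two such nodes. Multi-site grid: at least one per site.
    required = 2 if len(sites) == 1 else 1
    return [
        f"site {site}: needs at least {required} object-and-metadata "
        f"Storage Node(s), found {combined[site]}"
        for site in sites
        if combined[site] < required
    ]
```

An empty result means the planned layout satisfies both rules. The check does not cover the separate requirement of at least three Storage Nodes per site.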

All pages that list the Storage Node type show a metadata-only indication for each software-based metadata-only node.

Primary services for Storage Nodes

The following table shows the primary services for Storage Nodes; however, this table does not list all node services.

Note: Some services, such as the ADC service and the RSM service, typically exist only on three Storage Nodes at each site.

Service Key function

Account (acct)

Manages tenant accounts.

Administrative Domain Controller (ADC)

Maintains topology and grid-wide configuration.

Details

The Administrative Domain Controller (ADC) service authenticates grid nodes and their connections with each other. The ADC service is hosted on a minimum of three Storage Nodes at a site.

The ADC service maintains topology information including the location and availability of services. When a grid node requires information from another grid node or an action to be performed by another grid node, it contacts an ADC service to find the best grid node to process its request. In addition, the ADC service retains a copy of the StorageGRID deployment's configuration bundles, allowing any grid node to retrieve current configuration information.

To facilitate distributed and islanded operations, each ADC service synchronizes certificates, configuration bundles, and information about services and topology with the other ADC services in the StorageGRID system.

In general, all grid nodes maintain a connection to at least one ADC service. This ensures that grid nodes are always accessing the latest information. When grid nodes connect, they cache other grid nodes' certificates, enabling systems to continue functioning with known grid nodes even when an ADC service is unavailable. New grid nodes can only establish connections by using an ADC service.

The connection of each grid node lets the ADC service gather topology information. This grid node information includes the CPU load, available disk space (if it has storage), supported services, and the grid node's site ID. Other services ask the ADC service for topology information through topology queries. The ADC service responds to each query with the latest information received from the StorageGRID system.
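To make the idea of a topology query concrete, here is a rough sketch in Python. It is not the ADC service's actual interface: the `NodeInfo` record, the `topology_query` function, and the "lowest CPU load first" ranking are assumptions made for illustration. The source says only that the ADC service helps find the best grid node, without defining how "best" is determined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical record of the topology fields listed above."""
    name: str
    site_id: int
    cpu_load: float        # assumed to be a fraction between 0.0 and 1.0
    free_disk_bytes: int   # 0 for grid nodes without storage
    services: frozenset    # names of the services the node supports


def topology_query(nodes, service, site_id=None):
    """Return nodes that offer `service`, least-loaded first.

    Ranking by CPU load is an illustrative assumption, not the
    documented selection rule.
    """
    matches = [
        node for node in nodes
        if service in node.services
        and (site_id is None or node.site_id == site_id)
    ]
    # Break ties on the node name so the result is deterministic.
    return sorted(matches, key=lambda node: (node.cpu_load, node.name))
```

A caller would take the first entry of the returned list as its preferred target and fall back to later entries if that node is unreachable.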

Cassandra

Stores and protects object metadata.

Cassandra Reaper

Performs automatic repairs of object metadata.

Chunk

Manages erasure-coded data and parity fragments.

Data Mover (dmv)

Moves data to Cloud Storage Pools.

Distributed Data Store (DDS)

Monitors object metadata storage.

Details

Each Storage Node includes the Distributed Data Store (DDS) service. This service interfaces with the Cassandra database to perform background tasks on the object metadata stored in the StorageGRID system.

The DDS service tracks the total number of objects ingested into the StorageGRID system as well as the total number of objects ingested through each of the system's supported interfaces (S3 or Swift).

Identity (idnt)

Federates user identities from LDAP and Active Directory.

Local Distribution Router (LDR)

Processes object storage protocol requests and manages object data on disk.

Details

Each Storage Node includes the Local Distribution Router (LDR) service. This service handles content transport functions, including data storage, routing, and request handling. The LDR service does most of the StorageGRID system's hard work by handling data transfer loads and data traffic functions.

The LDR service handles the following tasks:

  • Queries

  • Information lifecycle management (ILM) activity

  • Object deletion

  • Object data storage

  • Object data transfers from another LDR service (Storage Node)

  • Data storage management

  • Protocol interfaces (S3 and Swift)

The LDR service also maps each S3 and Swift object to its unique UUID.

Object stores

The underlying data storage of an LDR service is divided into a fixed number of object stores (also known as storage volumes). Each object store is a separate mount point.

The object stores in a Storage Node are identified by a hexadecimal number from 0000 to 002F, which is known as the volume ID. Space is reserved in the first object store (volume 0) for object metadata in a Cassandra database; any remaining space on that volume is used for object data. All other object stores are used exclusively for object data, which includes replicated copies and erasure-coded fragments.
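The volume ID range works out to 48 possible object stores, because hexadecimal 002F is decimal 47 and numbering starts at 0. The short sketch below shows the conversion; the helper name `volume_id` is hypothetical and is not part of any StorageGRID tool.

```python
MAX_VOLUME_INDEX = 0x002F                  # 47 in decimal
MAX_OBJECT_STORES = MAX_VOLUME_INDEX + 1   # 48 object stores, numbered from 0


def volume_id(index):
    """Format an object store index as a four-digit hexadecimal volume ID."""
    if not 0 <= index <= MAX_VOLUME_INDEX:
        raise ValueError(f"index must be between 0 and {MAX_VOLUME_INDEX}")
    return f"{index:04X}"


print(volume_id(0))    # 0000 (the volume that also reserves space for metadata)
print(volume_id(10))   # 000A
print(volume_id(47))   # 002F
```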

To ensure even space usage for replicated copies, the object data for a given object is stored in a single object store, which is selected based on available storage space. When an object store fills to capacity, the remaining object stores continue to store objects until there is no more room on the Storage Node.
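The placement behavior can be pictured with a simplified model. The sketch below assumes a "most free space wins" policy. The source states only that selection is based on available space, so the exact rule, the function name `pick_object_store`, and the dictionary input are illustrative assumptions.

```python
def pick_object_store(free_bytes_by_volume, object_size):
    """Pick the volume ID of the object store with the most free space.

    `free_bytes_by_volume` maps volume IDs (such as "0001") to free bytes.
    Returns None when no object store can hold the object, which models
    a Storage Node that has run out of room.
    """
    candidates = {
        volume: free
        for volume, free in free_bytes_by_volume.items()
        if free >= object_size
    }
    if not candidates:
        return None
    # Iterating in sorted order makes ties resolve to the lowest volume ID.
    return max(sorted(candidates), key=candidates.get)
```

In this model, a full object store drops out of the candidate set, and the remaining stores keep accepting objects until none of them has enough space.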

Metadata protection

StorageGRID stores object metadata in a Cassandra database, which interfaces with the LDR service.

To ensure redundancy and thus protection against loss, three copies of object metadata are maintained at each site. This replication is non-configurable and performed automatically. For details, see Manage object metadata storage.

Replicated State Machine (RSM)

Ensures that S3 platform services requests are sent to their respective endpoints.

Server Status Monitor (SSM)

Monitors the operating system and underlying hardware.