What is a Storage Node?
Storage Nodes manage and store object data and metadata. Storage Nodes include the services and processes required to store, move, verify, and retrieve object data and metadata on disk.
Each site in your StorageGRID system must have at least three Storage Nodes.
Types of Storage Nodes
During installation, you can select the type of Storage Node you want to install. These types are available for software-based Storage Nodes and for appliance-based Storage Nodes that support the feature:
- Combined data and metadata Storage Node
- Metadata-only Storage Node
- Data-only Storage Node
You can select the Storage Node type in these situations:
- When initially installing a Storage Node
- When you add a Storage Node during StorageGRID system expansion
You can't change the type after the Storage Node installation is complete.
- Data and metadata Storage Node (combined)
  By default, all new Storage Nodes store both object data and metadata. This type of Storage Node is called a combined Storage Node.
- Metadata-only Storage Node
  Using a Storage Node exclusively for metadata can make sense if your grid stores a very large number of small objects. Installing dedicated metadata capacity provides a better balance between the space needed for a very large number of small objects and the space needed for the metadata for those objects. Additionally, metadata-only Storage Nodes hosted on high-performance appliances can increase performance.
  When installing metadata-only nodes, the grid must also contain a minimum number of nodes for data storage:
  - For a single-site grid, configure at least two combined or data-only Storage Nodes.
  - For a multi-site grid, configure at least one combined or data-only Storage Node per site.
  Although metadata-only Storage Nodes contain the LDR service and can process S3 client requests, StorageGRID performance might not increase.
- Data-only Storage Node
  Using a Storage Node exclusively for data can make sense if your Storage Nodes have differing performance characteristics. For example, to potentially increase performance, you could have data-only, high-capacity spinning-disk Storage Nodes accompanied by metadata-only high-performance Storage Nodes.
  When installing data-only nodes, the grid must contain the following:
  - A minimum of two combined or data-only Storage Nodes per grid
  - At least one combined or data-only Storage Node per site
  - A minimum of three combined or metadata-only Storage Nodes per site
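The minimum node counts in this section can be expressed as a small planning check. The following Python sketch is illustrative only: the `StorageNode` class, the type labels, and the `check_layout` function are hypothetical names and are not part of StorageGRID or its APIs. It assumes the rules apply exactly as listed above, including the requirement of at least three Storage Nodes per site.

```python
from collections import Counter
from dataclasses import dataclass

# Hypothetical labels for the three Storage Node types (not StorageGRID identifiers).
COMBINED = "combined"
METADATA_ONLY = "metadata-only"
DATA_ONLY = "data-only"


@dataclass(frozen=True)
class StorageNode:
    name: str
    site: str
    node_type: str


def check_layout(nodes):
    """Return a list of rule violations for a planned grid (empty if none)."""
    # Count node types per site.
    per_site = {}
    for node in nodes:
        per_site.setdefault(node.site, Counter())[node.node_type] += 1

    has_metadata_only = any(n.node_type == METADATA_ONLY for n in nodes)
    has_data_only = any(n.node_type == DATA_ONLY for n in nodes)
    single_site = len(per_site) == 1
    grid_data_capable = sum(n.node_type in (COMBINED, DATA_ONLY) for n in nodes)

    problems = []

    # Grid-wide minimum: applies when data-only nodes are present, or when
    # metadata-only nodes are present in a single-site grid.
    if (has_data_only or (has_metadata_only and single_site)) and grid_data_capable < 2:
        problems.append("grid: needs at least two combined or data-only Storage Nodes")

    for site, counts in sorted(per_site.items()):
        data_capable = counts[COMBINED] + counts[DATA_ONLY]
        metadata_capable = counts[COMBINED] + counts[METADATA_ONLY]

        # Every site needs at least three Storage Nodes.
        if sum(counts.values()) < 3:
            problems.append(f"{site}: needs at least three Storage Nodes")

        # Per-site data capacity: applies with data-only nodes, or with
        # metadata-only nodes in a multi-site grid.
        needs_site_data = has_data_only or (has_metadata_only and not single_site)
        if needs_site_data and data_capable < 1:
            problems.append(f"{site}: needs at least one combined or data-only Storage Node")

        # Per-site metadata capacity: applies when data-only nodes are present.
        if has_data_only and metadata_capable < 3:
            problems.append(f"{site}: needs at least three combined or metadata-only Storage Nodes")

    return problems


# Example plan: one site with three metadata-only and two data-only nodes.
plan = [StorageNode(f"meta{i}", "site-a", METADATA_ONLY) for i in range(3)]
plan += [StorageNode(f"data{i}", "site-a", DATA_ONLY) for i in range(2)]
print(check_layout(plan))
```

A check like this is only a planning aid. The installer and the product documentation remain the authority on what a given StorageGRID release accepts.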
Primary services for Storage Nodes
The following table shows the primary services for Storage Nodes; however, this table does not list all node services.
Some services, such as the ADC service and the RSM service, typically exist only on three Storage Nodes at each site.
Service | Key function |
---|---|
Account (acct) | Manages tenant accounts. |
Administrative Domain Controller (ADC) | Maintains topology and grid-wide configuration. Note: Data-only Storage Nodes don't host the ADC service. Details: The Administrative Domain Controller (ADC) service authenticates grid nodes and their connections with each other. The ADC service is hosted on a minimum of three Storage Nodes at a site. The ADC service maintains topology information including the location and availability of services. When a grid node requires information from another grid node or an action to be performed by another grid node, it contacts an ADC service to find the best grid node to process its request. In addition, the ADC service retains a copy of the StorageGRID deployment's configuration bundles, allowing any grid node to retrieve current configuration information. To facilitate distributed and islanded operations, each ADC service synchronizes certificates, configuration bundles, and information about services and topology with the other ADC services in the StorageGRID system. In general, all grid nodes maintain a connection to at least one ADC service. This ensures that grid nodes are always accessing the latest information. When grid nodes connect, they cache other grid nodes' certificates, enabling systems to continue functioning with known grid nodes even when an ADC service is unavailable. New grid nodes can only establish connections by using an ADC service. The connection of each grid node lets the ADC service gather topology information. This grid node information includes the CPU load, available disk space (if it has storage), supported services, and the grid node's site ID. Other services ask the ADC service for topology information through topology queries. The ADC service responds to each query with the latest information received from the StorageGRID system. |
Cassandra | Stores and protects object metadata. Note: Data-only Storage Nodes don't host the Cassandra service. |
Cassandra Reaper | Performs automatic repairs of object metadata. Note: Data-only Storage Nodes don't host the Cassandra Reaper service. |
Chunk | Manages erasure-coded data and parity fragments. |
Data Mover (dmv) | Moves data to Cloud Storage Pools. |
Distributed Data Store (DDS) | Monitors object metadata storage. Details: Each Storage Node includes the Distributed Data Store (DDS) service. This service interfaces with the Cassandra database to perform background tasks on the object metadata stored in the StorageGRID system. The DDS service tracks the total number of objects ingested into the StorageGRID system as well as the total number of objects ingested through each of the system's supported interfaces (S3). |
Identity (idnt) | Federates user identities from LDAP and Active Directory. |
Local Distribution Router (LDR) | Processes object storage protocol requests and manages object data on disk. Details: Each combined, data-only, and metadata-only Storage Node includes the Local Distribution Router (LDR) service. This service handles content transport functions, including data storage, routing, and request handling. The LDR service does most of the StorageGRID system's hard work by handling data transfer loads and data traffic functions. The LDR service also maps each S3 object to its unique UUID. |
Replicated State Machine (RSM) | Ensures that S3 platform services requests are sent to their respective endpoints. |
Server Status Monitor (SSM) | Monitors the operating system and underlying hardware. |
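The ADC description in the table says that grid nodes contact an ADC service to find the best grid node for a request, and that the ADC service tracks each node's CPU load, available disk space, supported services, and site ID. The following Python sketch shows how a topology query over that information could look. It is a conceptual illustration only: `NodeInfo`, `topology_query`, and the selection rule (prefer the caller's site, then the lowest CPU load) are assumptions made for this example and do not describe StorageGRID's internal algorithm or any real API.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeInfo:
    """Hypothetical record of what an ADC service knows about one grid node."""
    name: str
    site_id: int
    cpu_load: float                        # assumed scale: 0.0 (idle) to 1.0 (busy)
    services: frozenset                    # service names the node supports
    free_disk_bytes: Optional[int] = None  # None for nodes without storage


def topology_query(nodes, service, caller_site_id, min_free_bytes=0):
    """Pick a node that offers `service`, or return None if there is none.

    Assumed ranking: nodes at the caller's site first, then lowest CPU
    load, with the node name as a deterministic tie-breaker.
    """
    candidates = [
        n for n in nodes
        if service in n.services and (n.free_disk_bytes or 0) >= min_free_bytes
    ]
    if not candidates:
        return None
    return min(
        candidates,
        key=lambda n: (n.site_id != caller_site_id, n.cpu_load, n.name),
    )


# Example topology with made-up values.
known_nodes = [
    NodeInfo("sn1", 1, 0.80, frozenset({"LDR", "SSM"}), 4 * 10**12),
    NodeInfo("sn2", 1, 0.20, frozenset({"LDR", "SSM"}), 2 * 10**12),
    NodeInfo("sn3", 2, 0.10, frozenset({"LDR", "SSM"}), 6 * 10**12),
    NodeInfo("gw1", 1, 0.05, frozenset({"SSM"})),
]
best = topology_query(known_nodes, "LDR", caller_site_id=1)
print(best.name if best else "no node available")
```

Ranking the caller's own site first is a design choice of this sketch, meant to reflect the islanded operation mentioned in the table. A real system might weigh these attributes differently.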