Stop and start the cluster

01/31/2023 Contributors

Gracefully stopping and starting the HA cluster.

Overview

This section describes how to gracefully shutdown and restart the BeeGFS cluster. Example scenarios where this may be required include electrical maintenance or migrating between datacenters or racks.

Steps

If for any reason you need to stop the entire BeeGFS cluster and shutdown all services run:

pcs cluster stop --all

It is also possible to stop the cluster on individual nodes (which will automatically failover services to another node), though it is recommended to first put the node in standby (see the failover section):

pcs cluster stop <HOSTNAME>

To start cluster services and resources on all nodes run:

pcs cluster start --all

Or start services on a specific node with:

pcs cluster start <HOSTNAME>

At this point run pcs status and verify the cluster and BeeGFS services start on all nodes, and services are running on the nodes you expect.

Depending on the size of the cluster it can take sometime (seconds to minutes) for the entire cluster to stop, or show started in pcs status. If pcs cluster <COMMAND> hangs for more than five minutes, before running "Ctrl+C" to cancel the command, login to each node of the cluster and use pcs status to see if cluster services (Corosync/Pacemaker) are still running on that node. From any node where the cluster is still active you can check what resources are blocking the cluster. Manually address the issue and the command should either complete or can be rerun to stop any remaining services.

Stop and start the cluster

Creating your file...

Overview

Steps