Troubleshoot issues

Contributors: netapp-soumikd

Issue: Redis Pods get stuck in a CrashLoopBackOff state

Description
In a high availability configuration, the AKS cluster does not return to a working state if all the nodes of the cluster go down. After you restart all the nodes, you might find all the Redis Pods in a CrashLoopBackOff state.
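
To confirm the symptom before applying the fix, a small check along the following lines might help. This is only a sketch: the function name `crashlooping_redis_pods` is hypothetical, and it assumes that `kubectl` is available in the shell you are using (for example, inside the `cloudmanager_snapcenter` container) and that the Redis Pods are named `sc-dependencies-redis-node-<n>`, as in the commands in this section.

```shell
#!/bin/sh
# Hypothetical helper: print the Redis Pods in the snapcenter namespace
# whose STATUS column (third column of "kubectl get pods") is CrashLoopBackOff.
crashlooping_redis_pods() {
    kubectl get pods -n snapcenter --no-headers \
        | awk '$1 ~ /^sc-dependencies-redis-node-/ && $3 == "CrashLoopBackOff" {print $1}'
}
```

Calling `crashlooping_redis_pods` prints one Pod name per line; empty output suggests that no Redis Pod is currently in that state.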

Solution
You should run the following commands to restore the system.

  1. Log in to the Connector.

  2. Delete all the Redis Pods.

    • docker exec -it cloudmanager_snapcenter sh

    • kubectl scale --replicas=0 sts sc-dependencies-redis-node -n snapcenter

  3. Verify that all the Redis Pods are deleted.
    kubectl get pods -n snapcenter

  4. If the Redis Pods are not deleted, run the following commands:

    • kubectl delete pod sc-dependencies-redis-node-0 -n snapcenter

    • kubectl delete pod sc-dependencies-redis-node-1 -n snapcenter

    • kubectl delete pod sc-dependencies-redis-node-2 -n snapcenter

  5. After all the Redis Pods are deleted, run:
    kubectl scale --replicas=3 sts sc-dependencies-redis-node -n snapcenter

  6. Verify that all the Redis Pods are up and running again.
    kubectl get pods -n snapcenter
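
The steps above can be sketched as a single shell function to run inside the `cloudmanager_snapcenter` container. This is an illustrative sketch rather than a supported script: the function names `redis_pods` and `restart_redis` and the `WAIT_SECONDS` variable are hypothetical, and the 30-second default pause is an assumption, not a documented value.

```shell
#!/bin/sh
NS=snapcenter
STS=sc-dependencies-redis-node

# Print the names of Redis Pods that still exist in the namespace.
redis_pods() {
    kubectl get pods -n "$NS" --no-headers 2>/dev/null \
        | awk -v prefix="$STS-" 'index($1, prefix) == 1 {print $1}'
}

restart_redis() {
    # Step 2: scale the StatefulSet down to zero replicas.
    kubectl scale --replicas=0 sts "$STS" -n "$NS" || return 1

    # Steps 3-4: pause (assumed default: 30 seconds), then delete any
    # Redis Pod that is still listed.
    sleep "${WAIT_SECONDS:-30}"
    for pod in $(redis_pods); do
        kubectl delete pod "$pod" -n "$NS"
    done

    # Step 5: scale back up to three replicas.
    kubectl scale --replicas=3 sts "$STS" -n "$NS" || return 1

    # Step 6: list the Pods so that you can confirm they are running.
    kubectl get pods -n "$NS"
}
```

After defining the functions, run `restart_redis` and check the final Pod listing. The Pods can take some time to reach the Running state, so you might need to repeat `kubectl get pods -n snapcenter`.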

Issue: Jobs are failing after restarting the cluster nodes

Description
In a high availability configuration, the AKS cluster does not return to a working state if all the nodes of the cluster go down. After you restart all the nodes, you might see jobs failing, with granular tasks either greyed out or timed out.

Solution
You should run the following commands:

  1. Log in to the Connector.

  2. Save the RabbitMQ statefulset (sts) deployment.

    • docker exec -it cloudmanager_snapcenter sh

    • kubectl get sts rabbitmq -o yaml -n snapcenter > rabbitmq_sts.yaml

  3. Identify the persistent volumes (PVs) attached to RabbitMQ pods.
    kubectl get pv | grep rabbitmq

  4. Delete the persistent volume claims (PVCs) attached to RabbitMQ pods.
    kubectl get pvc -n snapcenter | grep rabbitmq | awk '{print $1}' | xargs kubectl delete pvc -n snapcenter

  5. Delete each of the PVs that you identified in step 3.
    kubectl delete pv 'pvname'

  6. Create the RabbitMQ sts from the file that you saved in step 2.
    kubectl create -f rabbitmq_sts.yaml -n snapcenter
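
The steps above can be sketched as a shell function to run inside the `cloudmanager_snapcenter` container. Treat this as a minimal sketch, not a supported script: the function names `rabbitmq_pvs`, `rabbitmq_pvcs`, and `reset_rabbitmq_storage` are hypothetical. It records the PV names before it deletes the PVCs, so that the list from step 3 is still available in step 5. Because it deletes storage, review it carefully before running it.

```shell
#!/bin/sh
NS=snapcenter

# Step 3: print the names of PVs whose listing mentions rabbitmq.
rabbitmq_pvs() {
    kubectl get pv --no-headers | awk '/rabbitmq/ {print $1}'
}

# Print the names of PVCs in the namespace whose listing mentions rabbitmq.
rabbitmq_pvcs() {
    kubectl get pvc -n "$NS" --no-headers | awk '/rabbitmq/ {print $1}'
}

reset_rabbitmq_storage() {
    # Step 2: save the StatefulSet definition to the current directory.
    kubectl get sts rabbitmq -o yaml -n "$NS" > rabbitmq_sts.yaml || return 1

    # Step 3: record the PV names before the PVCs are removed.
    pvs=$(rabbitmq_pvs)

    # Step 4: delete the RabbitMQ PVCs.
    for pvc in $(rabbitmq_pvcs); do
        kubectl delete pvc "$pvc" -n "$NS"
    done

    # Step 5: delete each PV recorded in step 3.
    for pv in $pvs; do
        kubectl delete pv "$pv"
    done

    # Step 6: create the StatefulSet from the saved file.
    kubectl create -f rabbitmq_sts.yaml -n "$NS"
}
```

If a PV uses the Delete reclaim policy, it may already be removed once its PVC is deleted, so a NotFound message from the PV loop is not necessarily a problem.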

Issue: Backup operation fails during tenant database creation

Description
If an on-demand or scheduled backup is initiated while a tenant database is being created, the backup operation fails.

Solution
Creating a tenant database is a maintenance operation on the SAP HANA system.

You should put the SAP HANA system in maintenance mode using SnapCenter Service before creating the tenant database. While the SAP HANA system is in maintenance mode, no operations can be initiated.

After creating the tenant database, you should return the SAP HANA system to production mode.