Known issues
Known issues identify problems that might prevent you from using this release of the product successfully.
The following known issues affect the current release:
- During app restore from backup Trident creates a larger PV than the original PV
- App clones fail when using Service Account level OCP Security Context Constraints (SCC)
- Reusing buckets between instances of Astra Control Center causes failures
- Selecting a bucket provider type with credentials for another type causes data protection failures
- Backups and snapshots might not be retained during removal of an Astra Control Center instance
- 500 internal service error when attempting Trident app data management
- Custom app execution hook scripts time out and cause post-snapshot scripts not to execute
- Can't determine ASUP tar bundle status in scaled environment
- Snapshots eventually begin to fail when using external-snapshotter version 4.2.0
- Simultaneous app restore operations in the same namespace can fail
- Uninstall of Astra Control Center fails to clean up Traefik CRDs
App with user-defined label goes into "removed" state
If you define a custom app using a Kubernetes label that does not exist on any resources, Astra Control Center creates the app, manages it, and then immediately places it in a removed state. To avoid this, add the Kubernetes label to the pods and resources after the app is managed by Astra Control Center.
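For example, a minimal sketch of applying the label after the app is managed; the namespace and the label key/value are placeholders for the values you used when defining the app:
# Placeholders: substitute your namespace and the label you used when defining the app.
kubectl label pods --all -n <namespace> app.kubernetes.io/name=my-app
kubectl label deployments,statefulsets,services --all -n <namespace> app.kubernetes.io/name=my-app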
Unable to stop running app backup
There is no way to stop a running backup. If you need to delete the backup, wait until it has completed and then use the instructions in Delete backups. To delete a failed backup, use the Astra Control API.
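As a rough illustration only, a failed backup can be deleted with a REST call to the Astra Control API; the endpoint path, IDs, and token below are placeholders and assumptions, so confirm the exact resource path in the Astra Control API reference:
# Assumed endpoint layout; verify against the Astra Control API reference before use.
curl -X DELETE \
  -H "Authorization: Bearer <api-token>" \
  "https://<astra-control-center>/accounts/<account-id>/k8s/v1/apps/<app-id>/appBackups/<backup-id>"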
During app restore from backup Trident creates a larger PV than the original PV
If you resize a persistent volume after creating a backup and then restore from that backup, the persistent volume size will match the new size of the PV instead of using the size of the backup.
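To see what size would be restored, you can compare each PVC's current requested size against the size recorded when the backup was taken; a minimal check (the namespace is a placeholder):
# Placeholder namespace; shows the requested and actual size of each PVC.
kubectl get pvc -n <namespace> -o custom-columns=NAME:.metadata.name,REQUESTED:.spec.resources.requests.storage,ACTUAL:.status.capacity.storage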
Clone performance impacted by large persistent volumes
Clones of very large and consumed persistent volumes might be intermittently slow, dependent on cluster access to the object store. If the clone is hung and no data has been copied for more than 30 minutes, Astra Control terminates the clone action.
App clones fail using a specific version of PostgreSQL
App clones within the same cluster consistently fail with the Bitnami PostgreSQL 11.5.0 chart. To clone successfully, use an earlier or later version of the chart.
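For example, a sketch of pinning the chart to a different version at install time; the release name, namespace, and chart version are placeholders:
# Placeholders: release name, namespace, and any supported chart version other than 11.5.0.
helm repo add bitnami https://charts.bitnami.com/bitnami
helm install my-postgres bitnami/postgresql --version <chart-version> -n <namespace>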
App clones fail when using Service Account level OCP Security Context Constraints (SCC)
An application clone might fail if the original security context constraints are configured at the service account level within the namespace on the OCP cluster. When the application clone fails, it appears in the Managed Applications area in Astra Control Center with a status of Removed. See the knowledgebase article for more information.
App clones fail after an application is deployed with a set storage class
After an application is deployed with a storage class explicitly set (for example, helm install … --set global.storageClass=netapp-cvs-perf-extreme), subsequent attempts to clone the application require that the target cluster have the originally specified storage class.
Cloning an application with an explicitly set storage class to a cluster that does not have the same storage class will fail. There are no recovery steps in this scenario.
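Before cloning, you can confirm that the target cluster offers the same storage class; a minimal check:
# Run against the target cluster to confirm the originally specified storage class exists.
kubectl get storageclass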
Reusing buckets between instances of Astra Control Center causes failures
If you try to reuse a bucket used by another or previous installation of Astra Control Center, backup and restore operations will fail. You must use a different bucket or completely clean out the previously used bucket. You can't share buckets between instances of Astra Control Center.
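If you must reuse a bucket, a hedged sketch of emptying it first with the AWS CLI against an S3-compatible endpoint; the bucket name and endpoint URL are placeholders, and you should confirm nothing else depends on the contents before deleting them:
# Placeholders: bucket name and S3-compatible object store endpoint URL.
aws s3 rm s3://<bucket-name> --recursive --endpoint-url https://<object-store-endpoint>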
Selecting a bucket provider type with credentials for another type causes data protection failures
When you add a bucket, select the correct bucket provider and enter the right credentials for that provider. For example, the UI accepts NetApp ONTAP S3 as the type and accepts StorageGRID credentials; however, this will cause all future app backups and restores using this bucket to fail.
Backups and snapshots might not be retained during removal of an Astra Control Center instance
If you have an evaluation license and are not sending ASUPs, be sure to store your account ID to avoid data loss in the event of an Astra Control Center failure.
Clone operation can't use other buckets besides the default
During an app backup or app restore, you can optionally specify a bucket ID. An app clone operation, however, always uses the default bucket that has been defined. There is no option to change buckets for a clone. If you want control over which bucket is used, you can either change the bucket default or do a backup followed by a restore separately.
Managing a cluster with Astra Control Center fails when default kubeconfig file contains more than one context
You cannot use a kubeconfig with more than one cluster and context in it. See the knowledgebase article for more information.
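As a workaround, you can extract a single context into a dedicated kubeconfig file and use that file when adding the cluster; the context name and output file name are placeholders:
# Placeholders: context name and output file.
kubectl config view --minify --flatten --context=<context-name> > <cluster>-kubeconfig.yaml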
500 internal service error when attempting Trident app data management
If Trident on an app cluster goes offline (and is brought back online) and 500 internal service errors are encountered when attempting app data management, restart all of the Kubernetes nodes in the app cluster to restore functionality.
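After the nodes come back, you can confirm that Trident is running again before retrying app data management; a minimal check, assuming Trident is installed in the default trident namespace:
# Assumes the default trident namespace; adjust if Trident was installed elsewhere.
kubectl get pods -n trident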
Custom app execution hook scripts time out and cause post-snapshot scripts not to execute
If an execution hook takes longer than 25 minutes to run, the hook will fail, creating an event log entry with a return code of "N/A". Any affected snapshot will time out and be marked as failed, with a resulting event log entry noting the timeout.
Because execution hooks often reduce or completely disable the functionality of the application they are running against, you should always try to minimize the time your custom execution hooks take to run.
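As a generic illustration (not Astra Control's hook calling convention), a hook script can bound its own runtime so it finishes well under the 25-minute limit; the psql quiesce command and the 120-second bound are assumptions for a hypothetical PostgreSQL app:
#!/bin/bash
# Generic example only: bound the quiesce work so the hook cannot run past its own limit.
# The psql command and timeout value are placeholders for your application's quiesce step.
timeout 120 psql -U postgres -c "CHECKPOINT;" || exit 1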
Can't determine ASUP tar bundle status in scaled environment
During ASUP collection, the status of the bundle in the UI is reported as either collecting or done. Collection can take up to an hour for large environments. During ASUP download, the network file transfer speed for the bundle might be insufficient, and the download might time out after 15 minutes without any indication in the UI. Download issues depend on the size of the ASUP, the scaled cluster size, and whether collection time goes beyond the seven-day limit.
Snapshots eventually begin to fail when using external-snapshotter version 4.2.0
When you use Kubernetes snapshot-controller (also known as external-snapshotter) version 4.2.0 with Kubernetes 1.20 or 1.21, snapshots can eventually begin to fail. To prevent this, use a different supported version of external-snapshotter, such as version 4.2.1, with Kubernetes versions 1.20 or 1.21.
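To check which external-snapshotter version is deployed, you can inspect the snapshot-controller container image; the namespace and pod name vary by distribution, so the values below are placeholders:
# Placeholder pod and namespace; shows the container images used by the snapshot-controller.
kubectl get pods -A | grep snapshot-controller
kubectl get pod <snapshot-controller-pod> -n <namespace> -o jsonpath='{.spec.containers[*].image}'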
Simultaneous app restore operations in the same namespace can fail
If you try to restore one or more individually managed apps within a namespace simultaneously, the restore operations can fail after a long period of time. As a workaround, restore each app one at a time.
Upgrade fails if source version uses a container image registry that does not require authentication and target version uses a container image registry that requires authentication
If you upgrade an Astra Control Center system that uses a registry that doesn't require authentication to a newer version that uses a registry that requires authentication, the upgrade fails. As a workaround, perform the following steps:
- Log in to a host that has network access to the Astra Control Center cluster.
- Ensure that the host has the following configuration:
  - kubectl version 1.19 or later is installed
  - The KUBECONFIG environment variable is set to the kubeconfig file for the Astra Control Center cluster
- Execute the following script:
# Remove placeholder imagePullSecrets entries from the Astra Control Center statefulsets.
namespace="<netapp-acc>"
statefulsets=("polaris-vault" "polaris-mongodb" "influxdb2" "nats" "loki")
for ss in ${statefulsets[@]}; do
  existing=$(kubectl get -n ${namespace} statefulsets.apps ${ss} -o jsonpath='{.spec.template.spec.imagePullSecrets}')
  if [ "${existing}" = "[{}]" ] || [ "${existing}" = "[{},{},{}]" ]; then
    kubectl patch -n ${namespace} statefulsets.apps ${ss} --type merge --patch '{"spec": {"template": {"spec": {"imagePullSecrets": []}}}}'
  else
    echo "${ss} not patched"
  fi
done
You should see output similar to the following:
statefulset.apps/polaris-vault patched
statefulset.apps/polaris-mongodb patched
statefulset.apps/influxdb2 patched
statefulset.apps/nats patched
statefulset.apps/loki patched
- Proceed with the upgrade using the Astra Control Center upgrade instructions.
Uninstall of Astra Control Center fails to clean up the monitoring-operator pod on the managed cluster
If you did not unmanage your clusters before you uninstalled Astra Control Center, you can manually delete the resources in the netapp-monitoring namespace, and then the namespace itself, with the following commands:
- Delete the acc-monitoring agent:
oc delete agents acc-monitoring -n netapp-monitoring
Result:
agent.monitoring.netapp.com "acc-monitoring" deleted
- Delete the namespace:
oc delete ns netapp-monitoring
Result:
namespace "netapp-monitoring" deleted
- Confirm resources removed:
oc get pods -n netapp-monitoring
Result:
No resources found in netapp-monitoring namespace.
- Confirm monitoring agent removed:
oc get crd | grep agent
Sample result:
agents.monitoring.netapp.com 2021-07-21T06:08:13Z
- Delete custom resource definition (CRD) information:
oc delete crds agents.monitoring.netapp.com
Result:
customresourcedefinition.apiextensions.k8s.io "agents.monitoring.netapp.com" deleted
Uninstall of Astra Control Center fails to clean up Traefik CRDs
You can manually delete the Traefik CRDs. CRDs are global resources, and deleting them might impact other applications on the cluster.
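Before deleting them, you can check whether any other applications on the cluster still define Traefik custom resources; a minimal sketch using two of the CRDs listed in the procedure below:
# Lists Traefik custom resources across all namespaces; any hits may belong to other applications.
kubectl get ingressroutes.traefik.containo.us,middlewares.traefik.containo.us --all-namespaces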
- List Traefik CRDs installed on the cluster:
kubectl get crds | grep -E 'traefik'
Response:
ingressroutes.traefik.containo.us        2021-06-23T23:29:11Z
ingressroutetcps.traefik.containo.us     2021-06-23T23:29:11Z
ingressrouteudps.traefik.containo.us     2021-06-23T23:29:12Z
middlewares.traefik.containo.us          2021-06-23T23:29:12Z
middlewaretcps.traefik.containo.us       2021-06-23T23:29:12Z
serverstransports.traefik.containo.us    2021-06-23T23:29:13Z
tlsoptions.traefik.containo.us           2021-06-23T23:29:13Z
tlsstores.traefik.containo.us            2021-06-23T23:29:14Z
traefikservices.traefik.containo.us      2021-06-23T23:29:15Z
- Delete the CRDs:
kubectl delete crd ingressroutes.traefik.containo.us ingressroutetcps.traefik.containo.us ingressrouteudps.traefik.containo.us middlewares.traefik.containo.us serverstransports.traefik.containo.us tlsoptions.traefik.containo.us tlsstores.traefik.containo.us traefikservices.traefik.containo.us middlewaretcps.traefik.containo.us