Troubleshooting sign-on errors

If you experience an error when you are signing in to a StorageGRID Admin Node, your system might have an issue with the identity federation configuration, a networking or hardware problem, an issue with Admin Node services, or an issue with the Cassandra database on connected Storage Nodes.

About this task

Use these troubleshooting guidelines if you see any of the following error messages when attempting to sign in to an Admin Node:
  • Your credentials for this account were invalid. Please try again.
  • Waiting for services to start...
  • Internal server error. The server encountered an error and could not complete your request. Please try again. If the problem persists, contact Technical Support.
  • Unable to communicate with server. Reloading page...

Procedure

  1. Wait 10 minutes, and try signing in again.
    If the error is not resolved automatically, go to the next step.
  2. If your StorageGRID system has more than one Admin Node, try signing in to the Grid Manager from another Admin Node.
    • If you are able to sign in, you can use the Dashboard, Nodes, Alerts, and Support > Grid Topology options to help determine the cause of the error.
    • If you have only one Admin Node or you still cannot sign in, go to the next step.
  3. Determine if the node's hardware is offline.
  4. If single sign-on (SSO) is enabled for your StorageGRID system, refer to the steps for configuring single sign-on, in the instructions for administering StorageGRID.
    You might need to temporarily disable and re-enable SSO for a single Admin Node to resolve any issues.
    Note: If SSO is enabled, you cannot sign in using a restricted port. You must use port 443.
  5. Determine if the account you are using belongs to a federated user.
    If the federated user account is not working, try signing in to the Grid Manager as a local user, such as root.
    • If the local user can sign in:
      1. Review any displayed alarms.
      2. Select Configuration > Identity Federation.
      3. Click Test Connection to validate your connection settings for the LDAP server.
      4. If the test fails, resolve any configuration errors.
    • If the local user cannot sign in and you are confident that the credentials are correct, go to the next step.
  6. Use Secure Shell (ssh) to log in to the Admin Node:
    1. Enter the following command: ssh admin@Admin_Node_IP
    2. Enter the password listed in the Passwords.txt file.
    3. Enter the following command to switch to root: su -
    4. Enter the password listed in the Passwords.txt file.
      When you are logged in as root, the prompt changes from $ to #.
  7. View the status of all services running on the grid node: storagegrid-status
    Make sure the nms, mi, nginx, and mgmt api services are all running.
    The output is updated immediately if the status of a service changes.
    # storagegrid-status
    Host Name                      99-211
    IP Address                     10.96.99.211
    Operating System Kernel        4.19.0         Verified
    Operating System Environment   Debian 10.1    Verified
    StorageGRID Webscale Release   11.4.0         Verified
    Networking                                    Verified
    Storage Subsystem                             Verified
    Database Engine                5.5.9999+default Running
    Network Monitoring             11.4.0         Running
    Time Synchronization           1:4.2.8p10+dfsg Running
    ams                            11.4.0         Running
    cmn                            11.4.0         Running
    nms                            11.4.0         Running
    ssm                            11.4.0         Running
    mi                             11.4.0         Running
    dynip                          11.4.0         Running
    nginx                          1.10.3         Running
    tomcat                         9.0.27         Running
    grafana                        6.4.3          Running
    mgmt api                       11.4.0         Running
    prometheus                     11.4.0         Running
    persistence                    11.4.0         Running
    ade exporter                   11.4.0         Running
    alertmanager                   11.4.0         Running
    attrDownPurge                  11.4.0         Running
    attrDownSamp1                  11.4.0         Running
    attrDownSamp2                  11.4.0         Running
    node exporter                  0.17.0+ds      Running
    sg snmp agent                  11.4.0         Running
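The check on the required services can be sketched as a small shell function. This is only an illustration, not a StorageGRID tool: the function name check_services is hypothetical, and it assumes you have saved a copy of the storagegrid-status text (for example, by copying it from the terminal) in the column layout shown above, where the last column is the service state.

```shell
#!/bin/sh
# Hypothetical helper (not part of StorageGRID): reads saved
# storagegrid-status text on standard input and reports, for each
# service name passed as an argument, whether its last column is "Running".
check_services() {
    status_text=$(cat)
    rc=0
    for svc in "$@"; do
        if printf '%s\n' "$status_text" | awk -v svc="$svc" '
            {
                line = $0
                sub(/^[ \t]+/, "", line)              # ignore leading indentation
                next_char = substr(line, length(svc) + 1, 1)
                # The line must start with the exact service name, followed
                # by whitespace, and end with the state "Running".
                if (index(line, svc) == 1 && next_char ~ /^[ \t]$/ && $NF == "Running")
                    found = 1
            }
            END { exit(found ? 0 : 1) }'
        then
            echo "OK: $svc"
        else
            echo "NOT RUNNING: $svc"
            rc=1
        fi
    done
    return "$rc"
}
```

Assuming the saved text is in a file named status.txt (a hypothetical name), the Admin Node check would be: check_services nms mi nginx "mgmt api" < status.txt. The function returns a nonzero exit status if any listed service is missing or not in the Running state.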
  8. Confirm that the Apache web server is running: # service apache2 status
  9. Use Lumberjack to collect logs: # /usr/local/sbin/lumberjack.rb
    If the failed authentication happened in the past, you can use the --start and --end Lumberjack script options to specify the appropriate time range. Use lumberjack -h for details on these options.

    The output to the terminal indicates where the log archive has been copied.

  10. Review the following logs:
    • /var/local/log/bycast.log
    • /var/local/log/bycast-err.log
    • /var/local/log/nms.log
    • **/*commands.txt
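As a first pass over these logs, you could count how many lines in each file mention a keyword. The following is a minimal sketch: the function name count_log_matches is hypothetical, and searching for the word "error" is an assumption, because the log message format is not described here.

```shell
#!/bin/sh
# Hypothetical helper: for each file given, print how many lines match
# a case-insensitive pattern. Files that cannot be read are reported
# rather than treated as fatal.
count_log_matches() {
    pattern=$1
    shift
    for f in "$@"; do
        if [ -r "$f" ]; then
            # grep -c exits nonzero when the count is 0, so ignore its status.
            n=$(grep -c -i -e "$pattern" "$f" || true)
            echo "$f: $n"
        else
            echo "$f: not readable"
        fi
    done
}
```

For example: count_log_matches error /var/local/log/bycast.log /var/local/log/bycast-err.log /var/local/log/nms.log. A high count only tells you where to start reading. You still need to review the matching entries themselves.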
  11. If you could not identify any issues with the Admin Node, issue either of the following commands to determine the IP addresses of the three Storage Nodes that run the ADC service at your site. Typically, these are the first three Storage Nodes that were installed at the site.
    # cat /etc/hosts
    # vi /var/local/gpt-data/specs/grid.xml
    Admin Nodes use the ADC service during the authentication process.
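If the hosts file is long, a small filter can narrow it to entries whose names contain a given substring. This sketch relies only on the standard hosts-file layout (an IP address followed by one or more names). The function name hosts_matching is hypothetical, and how Storage Nodes are named at your site is an assumption you must supply. The filter does not identify which nodes run the ADC service.

```shell
#!/bin/sh
# Hypothetical helper: print "IP name" for hosts-file entries that have
# a name containing the given substring. Blank lines and comments are skipped.
hosts_matching() {
    needle=$1
    file=${2:-/etc/hosts}
    awk -v needle="$needle" '
        /^[ \t]*(#|$)/ { next }                      # comment or blank line
        {
            for (i = 2; i <= NF; i++) {
                if (substr($i, 1, 1) == "#") break   # trailing comment starts
                if (index($i, needle) > 0) { print $1, $i; break }
            }
        }' "$file"
}
```

For example, if your Storage Node names contain a site-specific string such as "sn" (a hypothetical convention), hosts_matching sn lists the matching addresses from /etc/hosts. Confirm the ADC nodes against grid.xml before logging in to them.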
  12. From the Admin Node, log in to each of the ADC Storage Nodes, using the IP addresses you identified.
    1. Enter the following command: ssh admin@grid_node_IP
    2. Enter the password listed in the Passwords.txt file.
    3. Enter the following command to switch to root: su -
    4. Enter the password listed in the Passwords.txt file.
      When you are logged in as root, the prompt changes from $ to #.
  13. View the status of all services running on the grid node: storagegrid-status
    Make sure the idnt, acct, nginx, and cassandra services are all running.
  14. Repeat steps 9 and 10 to review the logs on the Storage Nodes.
  15. If you are unable to resolve the issue, contact technical support.
    Provide the logs you collected to technical support.