Analyze error logs in Workload Factory

Contributors netapp-rlithman

Use the smart error log analyzer to automatically interpret Microsoft SQL Server error logs so that you can quickly identify and resolve issues. The agentic AI-based analysis requires Amazon Bedrock integration.

About this task

Error log analysis and remediation help maintain the health and performance of SQL Server instances. Interpreting SQL Server error logs effectively requires careful analysis and expertise. Manual monitoring, error detection, and root cause analysis are time-intensive and prone to errors. These challenges can delay issue resolution, increase downtime, and cause operational inefficiencies. The smart error log analyzer addresses these challenges with the following key benefits:

  • Smart grouping: Intelligently consolidates errors by uniqueness, severity, and category, and simplifies the troubleshooting process for faster, more effective resolutions.

  • AI-driven investigation: Leverages AI to proactively analyze errors, providing clear, actionable insights to accelerate issue identification without requiring deep expertise.

  • Error enrichment: Enhances error logs with external references, offering contextual clarity to improve understanding and decision-making.

  • Best-practice remediation: Delivers tailored remediation recommendations for SQL Server workloads running on FSx for ONTAP, empowering users of all skill levels to resolve issues confidently.
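To make the idea of grouping concrete, here is a minimal Python sketch that consolidates SQL Server error header lines by error number and severity. It is only an illustration of the concept, not how the smart error log analyzer is implemented. The `group_errors` function, the `ERROR_LINE` pattern, and the sample log lines are all hypothetical and were made up for this example.

```python
import re
from collections import defaultdict

# Matches the header line SQL Server writes for an error, for example:
# "2024-05-01 10:15:02.11 Logon       Error: 18456, Severity: 14, State: 8."
ERROR_LINE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<source>\S+)\s+"
    r"Error: (?P<number>\d+), Severity: (?P<severity>\d+), State: (?P<state>\d+)\."
)


def group_errors(lines):
    """Group error header lines by (error number, severity).

    Returns a dict mapping each key to the list of timestamps at which
    that error occurred. Lines that are not error headers are ignored.
    """
    groups = defaultdict(list)
    for line in lines:
        match = ERROR_LINE.match(line)
        if match:
            key = (int(match["number"]), int(match["severity"]))
            groups[key].append(match["timestamp"])
    return dict(groups)


# Hypothetical sample log lines, made up for this illustration.
sample_log = [
    "2024-05-01 10:15:02.11 Logon       Error: 18456, Severity: 14, State: 8.",
    "2024-05-01 10:15:02.11 Logon       Login failed for user 'app'.",
    "2024-05-01 10:20:45.37 Logon       Error: 18456, Severity: 14, State: 8.",
    "2024-05-01 11:02:13.90 spid7s      Error: 9002, Severity: 17, State: 2.",
]

grouped = group_errors(sample_log)

# Show the most severe groups first.
for (number, severity), timestamps in sorted(
    grouped.items(), key=lambda item: -item[0][1]
):
    print(f"Error {number} (severity {severity}): {len(timestamps)} occurrence(s)")
```

Grouping on the error number and severity collapses repeated occurrences of the same problem into a single entry, which is the kind of consolidation the smart grouping benefit describes.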

Whenever you use the error log analyzer, you maintain full control over your environment while benefiting from advanced AI analysis.

To use the error log analyzer, you need to activate Amazon Bedrock, select the model Workload Factory uses, create a private endpoint to connect to Amazon Bedrock, add permissions, and create an enterprise license.

Data privacy and security

The feature ensures data privacy and security with the following measures:

Data sovereignty

Log data and aggregations stay within your AWS account. Communication with Amazon Bedrock goes through a private VPC endpoint, so your data is never exposed to the public internet.

No AI training

Customer data is not used to train or improve models. Amazon Bedrock processes logs in real time but does not train on your data. Results are stored in your environment for reference only. For more details, refer to the Amazon Bedrock data protection documentation.

Before you begin

To use the error log analyzer, you must meet the following prerequisites:

  • You must have AWS account credentials and read/write mode permissions to create a new database host in Workload Factory.

  • You must register a SQL Server instance in Workload Factory.

  • You must also meet the following prerequisites. The console prompts you to complete them as part of the steps to analyze error logs.

    • Amazon Bedrock activation

      Amazon Bedrock is required so that the AI agent running on the SQL node from Workload Factory can seamlessly connect with Bedrock and fetch AI-based insights for the identified error logs.

    • Networking

      The Amazon Bedrock VPC endpoint ensures private communication between your SQL node and the Amazon Bedrock APIs and eliminates public internet exposure. Ensure that the Amazon Bedrock VPC endpoint is associated with the SQL Server node's subnet (example: vpce-050cb2f33a1380ffd).

    • AWS IAM permissions

      The following permissions are required for the EC2 instance profile role associated with the SQL node and for the AWS credentials associated with Workload Factory.

      • EC2 instance profile role with "bedrock:InvokeModel" permission

        This permission enables the EC2 instance on the corresponding SQL node to invoke Bedrock models for proactive error investigation and remediation guidance. This profile also ensures secure AI access for tailored insights.

      • AWS credentials associated with Workload Factory: "bedrock:GetFoundationModelAvailability" and "bedrock:ListInferenceProfiles" permissions

        These permissions verify model availability and configuration in the region of the SQL node, and ensure reliable, region-specific performance.
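The permissions listed above could be expressed as IAM policy documents along the following lines. This is a minimal sketch rather than an official policy: the `policy_document` helper is hypothetical, and the `"Resource": "*"` value is an assumption made for brevity that you may want to scope more narrowly in your own account.

```python
import json


def policy_document(actions, resource="*"):
    """Build a minimal IAM policy document that allows the given actions.

    The default resource "*" is an assumption for this sketch.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {"Effect": "Allow", "Action": list(actions), "Resource": resource}
        ],
    }


# Policy for the EC2 instance profile role associated with the SQL node.
instance_profile_policy = policy_document(["bedrock:InvokeModel"])

# Policy for the AWS credentials associated with Workload Factory.
workload_factory_policy = policy_document(
    ["bedrock:GetFoundationModelAvailability", "bedrock:ListInferenceProfiles"]
)

print(json.dumps(instance_profile_policy, indent=2))
print(json.dumps(workload_factory_policy, indent=2))
```

The printed JSON can serve as a starting point when you attach the permissions to the instance profile role and to the Workload Factory credentials.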

Analyze error logs

Use the Workload Factory console to analyze SQL Server error logs.

Steps
  1. Log in using one of the console experiences.

  2. In the Databases tile, select Go to Databases inventory.

  3. From the Databases menu, select Inventory.

  4. In the Inventory, select Microsoft SQL Server as the database engine type.

  5. From the Instances tab, locate the specific SQL Server instance you want to analyze and then select Investigate errors from the menu.

  6. From the Error investigation tab, complete the following prerequisites as described in the console:

    • Amazon Bedrock

    • Networking: Private endpoint for Amazon Bedrock

    • Permissions for EC2 instance profile role

    • Credentials associated with Workload Database Management (wlmdb)

  7. When prerequisites are met, select Investigate now to use the error log analyzer to gain insights into your SQL Server error logs.

    After the scan, errors are displayed in the console, providing a comprehensive view of the issues detected by the smart error log analyzer.

  8. Use filters to refine the displayed errors based on criteria such as severity, time frame, and error code.

  9. Review the detailed error information, including the original error message, the AI-based explanation, and the suggested remediation steps.