Skip to main content
NetApp Solutions
简体中文版经机器翻译而成,仅供参考。如与英语版出现任何冲突,应以英语版为准。

TR-4包:《AWS EC2中采用PacMaker集群和FSx ONTAP的Oracle HA》

贡献者

NetApp公司Allen Cao、Niyaz Mohamed

本解决方案概述了如何在AWS EC2中通过Redhat Enterprise Linux (RHEL)和Amazon FSx ONTAP上的PacMaker集群通过NFS协议为数据库存储HA启用Oracle高可用性(HA)、并提供了相关详细信息。

目的

许多努力在公有云中自行管理和运行Oracle的客户需要克服一些挑战。其中一个挑战就是为Oracle数据库启用高可用性。过去、Oracle客户依靠称为"Real Application Cluster"或RAC的Oracle数据库功能在多个集群节点上提供主动-主动事务支持。一个故障节点不会拖延应用程序处理。遗憾的是、Oracle RAC实施在许多常见公有云(例如AWS EC2)中不易获得或不受支持。通过利用RHEL和Amazon FSx ONTAP中的内置PacMaker集群(PCS)、客户可以获得一种可行的替代方案、无需支付Oracle RAC许可证成本、即可在计算和存储上进行主动-被动集群、以支持AWS云中的任务关键型Oracle数据库工作负载。

本文档详细介绍了如何在RHEL上设置Pacemker集群、使用NFS协议在EC2和Amazon FSx ONTAP上部署Oracle数据库、在Pacemker中配置Oracle资源以实现HA、以及在最常遇到的HA情形下通过验证结束演示。该解决方案还提供了有关使用NetApp SnapCenter UI工具快速备份、还原和克隆Oracle数据库的信息。

此解决方案 可解决以下使用情形:

  • 在RHEL中设置和配置起搏器HA集群。

  • 在AWS EC2和Amazon FSx ONTAP中部署Oracle数据库HA。

audience

此解决方案 适用于以下人员:

  • 希望在AWS EC2和Amazon FSx ONTAP中部署Oracle的数据库BA。

  • 希望在AWS EC2和Amazon FSx ONTAP中测试Oracle工作负载的数据库解决方案架构师。

  • 希望在AWS EC2和Amazon FSx ONTAP中部署和管理Oracle数据库的存储管理员。

  • 希望在AWS EC2和Amazon FSx ONTAP中建立Oracle数据库的应用程序所有者。

解决方案 测试和验证环境

此解决方案的测试和验证是在实验室环境中执行的、可能与最终部署环境不匹配。请参见一节 部署注意事项的关键因素 有关详细信息 …​

架构

此图详细展示了AWS EC2中采用PacMaker集群和FSx ONTAP的Oracle HA。

硬件和软件组件

* 硬件 *

Amazon FSx ONTAP存储

AWS提供的当前版本

us-east-1中的单可用性(AZ)、1024 GiB容量、128 MB/秒吞吐量

数据库服务器的EC2实例

t2.xlarge/4vCPU/16G

两个EC2 T2大型EC2实例、一个用作主数据库服务器、另一个用作备用数据库服务器

适用于AnsStorage控制器的VM

4个vCPU、16 GiB RAM

一个Linux VM、用于在NFS上运行自动化AWS EC2/FSx配置和Oracle部署

软件

RedHat Linux

RHEL Linux 8.6 (LVM)- x64 Gen2

已部署RedHat订阅以进行测试

Oracle 数据库

版本19.18

已应用RU修补程序p34765931_190000_Linux-x86-64.zip

Oracle OPatch

版本12.2.0.1.36

最新修补程序p6880880_190000_Linux-x86-64.zip

起搏器

0.10.18版

RedHat推出的适用于RHEL 8.0的高可用性附加软件

NFS

版本 3.0

已启用Oracle DNFS

Ansible

核心2.16.2.

Python 3.6.8

AWS EC2/FSx实验室环境中的Oracle数据库主动/被动配置

* 服务器 *

* 数据库 *

DB存储

主节点:orapm01/ip-172.30.15.111

NTAP)(NTAP_PDB1、NTAP_PDB2、NTAP_PDB3)

/u01、/u02、/u03 NFS挂载到Amazon FSx ONTAP卷上

备用节点:orapm02/ip-172.30.15.5

故障转移时的NTAP)(NTAP_PDB1、NTAP_PDB2、NTAP_PDB3)

/u01、/u02、/u03故障转移时NFS挂载

部署注意事项的关键因素

  • *Amazon FSx ONTAP HA.*默认情况下、Amazon FSx ONTAP配置在单个或多个可用性区域的存储控制器HA对中。它以主动/被动方式为任务关键型数据库工作负载提供存储冗余。存储故障转移对最终用户是透明的。发生存储故障转移时、不需要用户干预。

  • *PCS资源组和资源排序。*一个资源组允许在同一集群节点上运行多个具有依赖关系的资源。资源顺序会反向执行资源启动顺序和关闭顺序。

  • *首选节点。*PacMaker集群专门部署在主动/被动集群中(PacMaker不是要求)、并与FSx ONTAP集群同步。如果活动EC2实例可用且存在位置限制、则此实例将配置为Oracle资源的首选节点。

  • *备用节点上的隔离延迟。*在双节点PCS集群中、仲裁会人为设置为1。如果集群节点之间出现通信问题、则任一节点都可能尝试隔离另一节点、从而可能导致数据损坏。在备用节点上设置延迟可缓解此问题、并允许主节点在隔离备用节点期间继续提供服务。

  • *部署多可用性分区的注意事项。*该解决方案在一个可用性区域中进行部署和验证。对于多可用性分区部署、需要额外的AWS网络资源才能在可用性分区之间移动PC浮动IP。

  • *Oracle数据库存储布局。*在此解决方案演示中、我们将为测试数据库NTAN配置三个数据库卷、以托管Oracle二进制文件、数据和日志。卷会通过NFS以/u01 -二进制文件、/u02 -数据和/u03 -日志的形式挂载在Oracle数据库服务器上。在/u02和/u03挂载点上配置双控制文件、以实现冗余。

  • *DNFS配置。*通过使用DNFS (自Oracle 11g起提供)、在DB VM上运行的Oracle数据库可以比本机NFS客户端驱动更多的I/O。默认情况下、Oracle自动化部署会在NFSv3上配置DNFS。

  • 数据库备份。 NetApp提供了一个SnapCenter软件套件、可通过用户友好的用户界面进行数据库备份、还原和克隆。NetApp建议实施此类管理工具、以实现快速(不到一分钟)的快照备份、快速(几分钟)的数据库还原和数据库克隆。

解决方案 部署

以下各节介绍了在采用PacMaker集群和Amazon FSx ONTAP的AWS EC2中部署和配置Oracle数据库HA以实现数据库存储保护的分步过程。

部署的前提条件

Details

部署需要满足以下前提条件。

  1. 已设置AWS帐户、并已在您的AWS帐户中创建必要的VPC和网段。

  2. 将Linux VM配置为安装了最新版本的Ansv近 和Git的Ansv可 控制器节点。有关详细信息、请参见以下链接: "NetApp解决方案 自动化入门" 在第-节中
    Setup the Ansible Control Node for CLI deployments on RHEL / CentOS
    Setup the Ansible Control Node for CLI deployments on Ubuntu / Debian

    在Ans得 控制器和EC2实例数据库VM之间启用ssh公共/专用密钥身份验证。

配置EC2实例和Amazon FSx ONTAP存储集群

Details

虽然可以从AWS控制台手动配置EC2实例和Amazon FSx ONTAP、但建议使用基于NetApp Terraform的自动化工具包来自动配置EC2实例和FSx ONTAP存储集群。以下是详细过程。

  1. 从AWS CloudShell或Ans得 控制器VM克隆一份适用于EC2和FSx ONTAP的自动化工具包副本。

    git clone https://bitbucket.ngage.netapp.com/scm/ns-bb/na_aws_fsx_ec2_deploy.git
    备注 如果此工具包不是从AWS CloudShell执行的、则需要使用AWS用户帐户访问/机密密钥对对对对您的AWS帐户进行AWS命令行界面身份验证。
  2. 查看工具包中的readme.md文件。根据需要修改所需AWS资源的main.tf和关联参数文件。

    An example of main.tf:
    
    resource "aws_instance" "orapm01" {
      ami                           = var.ami
      instance_type                 = var.instance_type
      subnet_id                     = var.subnet_id
      key_name                      = var.ssh_key_name
    
      root_block_device {
        volume_type                 = "gp3"
        volume_size                 = var.root_volume_size
      }
    
      tags = {
        Name                        = var.ec2_tag1
      }
    }
    
    resource "aws_instance" "orapm02" {
      ami                           = var.ami
      instance_type                 = var.instance_type
      subnet_id                     = var.subnet_id
      key_name                      = var.ssh_key_name
    
      root_block_device {
        volume_type                 = "gp3"
        volume_size                 = var.root_volume_size
      }
    
      tags = {
        Name                        = var.ec2_tag2
      }
    }
    
    resource "aws_fsx_ontap_file_system" "fsx_01" {
      storage_capacity              = var.fs_capacity
      subnet_ids                    = var.subnet_ids
      preferred_subnet_id           = var.preferred_subnet_id
      throughput_capacity           = var.fs_throughput
      fsx_admin_password            = var.fsxadmin_password
      deployment_type               = var.deployment_type
    
      disk_iops_configuration {
        iops                        = var.iops
        mode                        = var.iops_mode
      }
    
      tags                          = {
        Name                        = var.fsx_tag
      }
    }
    
    resource "aws_fsx_ontap_storage_virtual_machine" "svm_01" {
      file_system_id                = aws_fsx_ontap_file_system.fsx_01.id
      name                          = var.svm_name
      svm_admin_password            = var.vsadmin_password
    }
  3. 验证并执行Terraform计划。成功执行将在目标AWS帐户中创建两个EC2实例和一个FSx ONTAP存储集群。自动化输出将显示EC2实例IP地址和FSx ONTAP集群端点。

    terraform plan -out=main.plan
    terraform apply main.plan

至此、为Oracle完成了EC2实例和FSx ONTAP配置。

起搏器集群设置

Details

适用于RHEL的高可用性附加组件是一个集群模式系统、可为Oracle数据库服务等关键生产服务提供可靠性、可扩展性和可用性。在此使用情形演示中、我们会设置并配置一个双节点PacMaker集群、以便在主动/被动集群方案中支持Oracle数据库的高可用性。  

以EC2-user身份登录到EC2实例、在 `both`EC2实例上完成以下任务:

  1. 删除AWS Red Hat Update Infrastructure (RHUI)客户端。

    sudo -i yum -y remove rh-amazon-rhui-client*
  2. 向Red Hat注册EC2实例VM。

    sudo subscription-manager register --username xxxxxxxx --password 'xxxxxxxx' --auto-attach
  3. 启用RHEL高可用性rpm。

    sudo subscription-manager config --rhsm.manage_repos=1
    sudo subscription-manager repos --enable=rhel-8-for-x86_64-highavailability-rpms
  4. 安装起搏器和防护剂。

    sudo yum update -y
    sudo yum install pcs pacemaker fence-agents-aws
  5. 在所有集群节点上为hacluser创建密码。对所有节点使用相同密码。

    sudo passwd hacluster
  6. 启动pcs服务并使其在启动时启动。

    sudo systemctl start pcsd.service
    sudo systemctl enable pcsd.service
  7. 验证PCSD服务。

    sudo systemctl status pcsd
    [ec2-user@ip-172-30-15-5 ~]$ sudo systemctl status pcsd
    ● pcsd.service - PCS GUI and remote configuration interface
       Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
       Active: active (running) since Tue 2024-09-10 18:50:22 UTC; 33s ago
         Docs: man:pcsd(8)
               man:pcs(8)
     Main PID: 65302 (pcsd)
        Tasks: 1 (limit: 100849)
       Memory: 24.0M
       CGroup: /system.slice/pcsd.service
               └─65302 /usr/libexec/platform-python -Es /usr/sbin/pcsd
    
    Sep 10 18:50:21 ip-172-30-15-5.ec2.internal systemd[1]: Starting PCS GUI and remote configuration interface...
    Sep 10 18:50:22 ip-172-30-15-5.ec2.internal systemd[1]: Started PCS GUI and remote configuration interface.
  8. 将集群节点添加到主机文件。

    sudo vi /etc/hosts
    [ec2-user@ip-172-30-15-5 ~]$ cat /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    
    # cluster nodes
    172.30.15.111   ip-172-30-15-111.ec2.internal
    172.30.15.5     ip-172-30-15-5.ec2.internal
  9. 安装和配置awscli以连接到AWS帐户。

    sudo yum install awscli
    sudo aws configure
    [ec2-user@ip-172-30-15-111 ]# sudo aws configure
    AWS Access Key ID [None]: XXXXXXXXXXXXXXXXX
    AWS Secret Access Key [None]: XXXXXXXXXXXXXXXX
    Default region name [None]: us-east-1
    Default output format [None]: json
  10. 安装资源代理包(如果尚未安装)。

    sudo yum install resource-agents

在 `only one`集群节点上、完成以下任务以创建pcs集群。

  1. 对pcs用户haclCluster进行身份验证。

    sudo pcs host auth ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal
    [ec2-user@ip-172-30-15-111 ~]$ sudo pcs host auth ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal
    Username: hacluster
    Password:
    ip-172-30-15-111.ec2.internal: Authorized
    ip-172-30-15-5.ec2.internal: Authorized
  2. 创建pcs集群。

    sudo pcs cluster setup ora_ec2nfsx ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal
    [ec2-user@ip-172-30-15-111 ~]$ sudo pcs cluster setup ora_ec2nfsx ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal
    No addresses specified for host 'ip-172-30-15-5.ec2.internal', using 'ip-172-30-15-5.ec2.internal'
    No addresses specified for host 'ip-172-30-15-111.ec2.internal', using 'ip-172-30-15-111.ec2.internal'
    Destroying cluster on hosts: 'ip-172-30-15-111.ec2.internal', 'ip-172-30-15-5.ec2.internal'...
    ip-172-30-15-5.ec2.internal: Successfully destroyed cluster
    ip-172-30-15-111.ec2.internal: Successfully destroyed cluster
    Requesting remove 'pcsd settings' from 'ip-172-30-15-111.ec2.internal', 'ip-172-30-15-5.ec2.internal'
    ip-172-30-15-111.ec2.internal: successful removal of the file 'pcsd settings'
    ip-172-30-15-5.ec2.internal: successful removal of the file 'pcsd settings'
    Sending 'corosync authkey', 'pacemaker authkey' to 'ip-172-30-15-111.ec2.internal', 'ip-172-30-15-5.ec2.internal'
    ip-172-30-15-111.ec2.internal: successful distribution of the file 'corosync authkey'
    ip-172-30-15-111.ec2.internal: successful distribution of the file 'pacemaker authkey'
    ip-172-30-15-5.ec2.internal: successful distribution of the file 'corosync authkey'
    ip-172-30-15-5.ec2.internal: successful distribution of the file 'pacemaker authkey'
    Sending 'corosync.conf' to 'ip-172-30-15-111.ec2.internal', 'ip-172-30-15-5.ec2.internal'
    ip-172-30-15-111.ec2.internal: successful distribution of the file 'corosync.conf'
    ip-172-30-15-5.ec2.internal: successful distribution of the file 'corosync.conf'
    Cluster has been successfully set up.
  3. 启用集群。

    sudo pcs cluster enable --all
    [ec2-user@ip-172-30-15-111 ~]$ sudo pcs cluster enable --all
    ip-172-30-15-5.ec2.internal: Cluster Enabled
    ip-172-30-15-111.ec2.internal: Cluster Enabled
  4. 启动并验证集群。

    sudo pcs cluster start --all
    sudo pcs status
    [ec2-user@ip-172-30-15-111 ~]$ sudo pcs status
    Cluster name: ora_ec2nfsx
    
    WARNINGS:
    No stonith devices and stonith-enabled is not false
    
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Wed Sep 11 15:43:23 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Wed Sep 11 15:43:06 2024 by hacluster via hacluster on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 0 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    
    Full List of Resources:
      * No resources
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

这样便完成了PacMaker集群设置和初始配置。

p搏 器集群隔离配置

Details

生产集群必须配置起搏器隔离配置。它可确保自动隔离AWS EC2集群上发生故障的节点、从而防止该节点占用集群的资源、损害集群的功能或损坏共享数据。本节演示了如何使用fence_AWS隔离代理配置集群隔离。

  1. 以root用户身份输入以下AWS元数据查询、以获取每个EC2实例节点的实例ID。

    echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    [root@ip-172-30-15-111 ec2-user]# echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    i-0d8e7a0028371636f
    
    or just get instance-id from AWS EC2 console
  2. 输入以下命令以配置隔离设备。使用PCMK_HOST_MAP命令将RHEL主机名映射到实例ID。使用您先前用于AWS身份验证的AWS用户帐户的AWS访问密钥和AWS机密访问密钥。

    sudo pcs stonith \
    create clusterfence fence_aws access_key=XXXXXXXXXXXXXXXXX secret_key=XXXXXXXXXXXXXXXXXX \
    region=us-east-1 pcmk_host_map="ip-172-30-15-111.ec2.internal:i-0d8e7a0028371636f;ip-172-30-15-5.ec2.internal:i-0bc54b315afb20a2e" \
    power_timeout=240 pcmk_reboot_timeout=480 pcmk_reboot_retries=4
  3. 验证隔离配置。

    pcs status
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Wed Sep 11 21:17:18 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Wed Sep 11 21:16:40 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 1 resource instance configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
  4. 在集群级别将stosth-action设置为off、而不是重新启动。

    pcs property set stonith-action=off
    [root@ip-172-30-15-111 ec2-user]# pcs property config
    Cluster Properties:
     cluster-infrastructure: corosync
     cluster-name: ora_ec2nfsx
     dc-version: 2.1.7-5.1.el8_10-0f7f88312
     have-watchdog: false
     last-lrm-refresh: 1726257586
     stonith-action: off
    备注 将st年 操作设置为off时、隔离的集群节点最初将关闭。在stinith power_timeout中定义的时间段(240秒)之后、隔离的节点将重新启动并重新加入集群。
  5. 将备用节点的隔离延迟设置为10秒。

    pcs stonith update clusterfence pcmk_delay_base="ip-172-30-15-111.ec2.internal:0;ip-172-30-15-5.ec2.internal:10s"
    [root@ip-172-30-15-111 ec2-user]# pcs stonith config
    Resource: clusterfence (class=stonith type=fence_aws)
      Attributes: clusterfence-instance_attributes
        access_key=XXXXXXXXXXXXXXXX
        pcmk_delay_base=ip-172-30-15-111.ec2.internal:0;ip-172-30-15-5.ec2.internal:10s
        pcmk_host_map=ip-172-30-15-111.ec2.internal:i-0d8e7a0028371636f;ip-172-30-15-5.ec2.internal:i-0bc54b315afb20a2e
        pcmk_reboot_retries=4
        pcmk_reboot_timeout=480
        power_timeout=240
        region=us-east-1
        secret_key=XXXXXXXXXXXXXXXX
      Operations:
        monitor: clusterfence-monitor-interval-60s
          interval=60s
备注 执行 `pcs stonith refresh`命令以刷新已停止的storith防护代理或清除失败的storith资源操作。

在PCS集群中部署Oracle数据库

Details

我们建议您利用NetApp提供的《Andsute操作手册》在PCS集群上使用预定义的参数执行数据库安装和配置任务。对于这种自动化Oracle部署、在执行操作手册之前、需要用户输入三个用户定义的参数文件。

  • 主机—定义运行自动化操作手册的目标。

  • vars/vars.yml—用于定义应用于所有目标的变量的全局变量文件。

  • host_vars/host_name.yml—用于定义仅适用于指定目标的变量的本地变量文件。在我们的使用情形中、这些是Oracle数据库服务器。

除了这些用户定义的变量文件之外、还有多个默认变量文件包含默认参数、除非必要、否则不需要更改这些参数。下面显示了在PCS集群配置中的AWS EC2和FSx ONTAP中自动部署Oracle的详细信息。

  1. 从NetApp控制器管理员用户主目录中、克隆适用于NFS的Oracle部署自动化工具包的副本。

    git clone https://bitbucket.ngage.netapp.com/scm/ns-bb/na_oracle_deploy_nfs.git
    备注 只要它们之间存在网络连接、则可将Ans得 控制器与数据库EC2实例位于同一个VPC中、也可以位于内部环境中。
  2. 在hosts参数文件中填写用户定义的参数。以下是典型主机文件配置的示例。

    [admin@ansiblectl na_oracle_deploy_nfs]$ cat hosts
    #Oracle hosts
    [oracle]
    orapm01 ansible_host=172.30.15.111 ansible_ssh_private_key_file=ec2-user.pem
    orapm02 ansible_host=172.30.15.5 ansible_ssh_private_key_file=ec2-user.pem
  3. 在vars/vars.yml参数文件中填写用户定义的参数。以下是典型的vars.yml文件配置示例。

    [admin@ansiblectl na_oracle_deploy_nfs]$ cat vars/vars.yml
    ######################################################################
    ###### Oracle 19c deployment user configuration variables       ######
    ###### Consolidate all variables from ONTAP, linux and oracle   ######
    ######################################################################
    
    ###########################################
    ### ONTAP env specific config variables ###
    ###########################################
    
    # Prerequisite to create three volumes in NetApp ONTAP storage from System Manager or cloud dashboard with following naming convention:
    # db_hostname_u01 - Oracle binary
    # db_hostname_u02 - Oracle data
    # db_hostname_u03 - Oracle redo
    # It is important to strictly follow the name convention or the automation will fail.
    
    
    ###########################################
    ### Linux env specific config variables ###
    ###########################################
    
    redhat_sub_username: xxxxxxxx
    redhat_sub_password: "xxxxxxxx"
    
    
    ####################################################
    ### DB env specific install and config variables ###
    ####################################################
    
    # Database domain name
    db_domain: ec2.internal
    
    # Set initial password for all required Oracle passwords. Change them after installation.
    initial_pwd_all: "xxxxxxxx"
  4. 在host_vars/host_name.yml参数文件中填写用户定义的参数。以下是典型的host_vars/host_name.yml文件配置示例。

    [admin@ansiblectl na_oracle_deploy_nfs]$ cat host_vars/orapm01.yml
    # User configurable Oracle host specific parameters
    
    # Database SID. By default, a container DB is created with 3 PDBs within the CDB
    oracle_sid: NTAP
    
    # CDB is created with SGA at 75% of memory_limit, MB. Consider how many databases to be hosted on the node and
    # how much ram to be allocated to each DB. The grand total of SGA should not exceed 75% available RAM on node.
    memory_limit: 8192
    
    # Local NFS lif ip address to access database volumes
    nfs_lif: 172.30.15.95
    备注 可以从上一节中自动EC2和FSx ONTAP部署的FSx ONTAP集群端点输出中检索到NFS_luf地址。
  5. 从AWS FSx控制台创建数据库卷。确保使用PCS主节点主机名(orapm01)作为卷的前缀、如下所示。

    此图提供了从AWS FSx控制台配置Amazon FSx ONTAP卷的过程 此图提供了从AWS FSx控制台配置Amazon FSx ONTAP卷的过程 此图提供了从AWS FSx控制台配置Amazon FSx ONTAP卷的过程 此图提供了从AWS FSx控制台配置Amazon FSx ONTAP卷的过程 此图提供了从AWS FSx控制台配置Amazon FSx ONTAP卷的过程

  6. Stage following Oracle 19c installation files on PCS Primary node EC2 instance./tmp/archive directory with 777 permission (PCS主节点EC2实例ip-172-30-15-111.ec2.internal /tmp/archive目录上具有777权限)。

    installer_archives:
      - "LINUX.X64_193000_db_home.zip"
      - "p34765931_190000_Linux-x86-64.zip"
      - "p6880880_190000_Linux-x86-64.zip"
  7. 执行Linux配置操作手册 all nodes

    ansible-playbook -i hosts 2-linux_config.yml -u ec2-user -e @vars/vars.yml
    [admin@ansiblectl na_oracle_deploy_nfs]$ ansible-playbook -i hosts 2-linux_config.yml -u ec2-user -e @vars/vars.yml
    
    PLAY [Linux Setup and Storage Config for Oracle] ****************************************************************************************************************************************************************************************************************************************************************************
    
    TASK [Gathering Facts] ******************************************************************************************************************************************************************************************************************************************************************************************************
    ok: [orapm01]
    ok: [orapm02]
    
    TASK [linux : Configure RedHat 7 for Oracle DB installation] ****************************************************************************************************************************************************************************************************************************************************************
    skipping: [orapm01]
    skipping: [orapm02]
    
    TASK [linux : Configure RedHat 8 for Oracle DB installation] ****************************************************************************************************************************************************************************************************************************************************************
    included: /home/admin/na_oracle_deploy_nfs/roles/linux/tasks/rhel8_config.yml for orapm01, orapm02
    
    TASK [linux : Register subscriptions for RedHat Server] *********************************************************************************************************************************************************************************************************************************************************************
    ok: [orapm01]
    ok: [orapm02]
    .
    .
    .
  8. 执行Oracle配置操作手册 only on primary node(在主机文件中注释掉备用节点)。

    ansible-playbook -i hosts 4-oracle_config.yml -u ec2-user -e @vars/vars.yml --skip-tags "enable_db_start_shut"
    [admin@ansiblectl na_oracle_deploy_nfs]$ ansible-playbook -i hosts 4-oracle_config.yml -u ec2-user -e @vars/vars.yml --skip-tags "enable_db_start_shut"
    
    PLAY [Oracle installation and configuration] ********************************************************************************************************************************************************************************************************************************************************************************
    
    TASK [Gathering Facts] ******************************************************************************************************************************************************************************************************************************************************************************************************
    ok: [orapm01]
    
    TASK [oracle : Oracle software only install] ********************************************************************************************************************************************************************************************************************************************************************************
    included: /home/admin/na_oracle_deploy_nfs/roles/oracle/tasks/oracle_install.yml for orapm01
    
    TASK [oracle : Create mount points for NFS file systems / Mount NFS file systems on Oracle hosts] ***************************************************************************************************************************************************************************************************************************
    included: /home/admin/na_oracle_deploy_nfs/roles/oracle/tasks/oracle_mount_points.yml for orapm01
    
    TASK [oracle : Create mount points for NFS file systems] ********************************************************************************************************************************************************************************************************************************************************************
    changed: [orapm01] => (item=/u01)
    changed: [orapm01] => (item=/u02)
    changed: [orapm01] => (item=/u03)
    .
    .
    .
  9. 部署数据库后、请在主节点上的/etc/fstab中注释掉/u01、/u02、/u03挂载、因为挂载点仅由PC管理。

    sudo vi /etc/fstab
    [root@ip-172-30-15-111 ec2-user]# cat /etc/fstab
    UUID=eaa1f38e-de0f-4ed5-a5b5-2fa9db43bb38       /       xfs     defaults        0       0
    /mnt/swapfile swap swap defaults 0 0
    #172.30.15.95:/orapm01_u01 /u01 nfs rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536 0 0
    #172.30.15.95:/orapm01_u02 /u02 nfs rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536 0 0
    #172.30.15.95:/orapm01_u03 /u03 nfs rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536 0 0
  10. 将/etc/oratab /etc/oraInst.loc、/HOME/oracle/.bash_profile复制到备用节点。确保保持正确的文件所有权和权限。

  11. 关闭主节点上的数据库、侦听器和umount /u01、/u02、/u03。

    [root@ip-172-30-15-111 ec2-user]# su - oracle
    Last login: Wed Sep 18 16:51:02 UTC 2024
    [oracle@ip-172-30-15-111 ~]$ sqlplus / as sysdba
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Wed Sep 18 16:51:16 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> shutdown immediate;
    
    SQL> exit
    Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    [oracle@ip-172-30-15-111 ~]$ lsnrctl stop listener.ntap
    
    [oracle@ip-172-30-15-111 ~]$ exit
    logout
    [root@ip-172-30-15-111 ec2-user]# umount /u01
    [root@ip-172-30-15-111 ec2-user]# umount /u02
    [root@ip-172-30-15-111 ec2-user]# umount /u03
  12. 在备用节点IP-172-30-15-5上创建挂载点。

    mkdir /u01
    mkdir /u02
    mkdir /u03
  13. 在备用节点IP-172-30-15-5上挂载FSx ONTAP数据库卷。

    mount -t nfs 172.30.15.95:/orapm01_u01 /u01 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount -t nfs 172.30.15.95:/orapm01_u02 /u02 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount -t nfs 172.30.15.95:/orapm01_u03 /u03 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    [root@ip-172-30-15-5 ec2-user]# df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                   7.7G     0  7.7G   0% /dev
    tmpfs                      7.7G   33M  7.7G   1% /dev/shm
    tmpfs                      7.7G   17M  7.7G   1% /run
    tmpfs                      7.7G     0  7.7G   0% /sys/fs/cgroup
    /dev/xvda2                  50G   21G   30G  41% /
    tmpfs                      1.6G     0  1.6G   0% /run/user/1000
    172.30.15.95:/orapm01_u01   48T   47T  844G  99% /u01
    172.30.15.95:/orapm01_u02  285T  285T  844G 100% /u02
    172.30.15.95:/orapm01_u03  190T  190T  844G 100% /u03
  14. 已更改为Oracle用户、请重新链接二进制文件。

    [root@ip-172-30-15-5 ec2-user]# su - oracle
    Last login: Thu Sep 12 18:09:03 UTC 2024 on pts/0
    [oracle@ip-172-30-15-5 ~]$ env | grep ORA
    ORACLE_SID=NTAP
    ORACLE_HOME=/u01/app/oracle/product/19.0.0/NTAP
    [oracle@ip-172-30-15-5 ~]$ cd $ORACLE_HOME/bin
    [oracle@ip-172-30-15-5 bin]$ ./relink
    writing relink log to: /u01/app/oracle/product/19.0.0/NTAP/install/relinkActions2024-09-12_06-21-40PM.log
  15. 将dnfs lib复制回ODM文件夹。重新链接可能会丢失dfns库文件。

    [oracle@ip-172-30-15-5 odm]$ cd /u01/app/oracle/product/19.0.0/NTAP/rdbms/lib/odm
    [oracle@ip-172-30-15-5 odm]$ cp ../../../lib/libnfsodm19.so .
  16. 启动数据库以在备用节点IP-172-30-15-5上验证。

    [oracle@ip-172-30-15-5 odm]$ sqlplus / as sysdba
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Thu Sep 12 18:30:04 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Connected to an idle instance.
    
    SQL> startup;
    ORACLE instance started.
    
    Total System Global Area 6442449688 bytes
    Fixed Size                  9177880 bytes
    Variable Size            1090519040 bytes
    Database Buffers         5335154688 bytes
    Redo Buffers                7598080 bytes
    Database mounted.
    Database opened.
    SQL> select name, open_mode from v$database;
    
    NAME      OPEN_MODE
    --------- --------------------
    NTAP      READ WRITE
    
    SQL> show pdbs
    
        CON_ID CON_NAME                       OPEN MODE  RESTRICTED
    ---------- ------------------------------ ---------- ----------
             2 PDB$SEED                       READ ONLY  NO
             3 NTAP_PDB1                      READ WRITE NO
             4 NTAP_PDB2                      READ WRITE NO
             5 NTAP_PDB3                      READ WRITE NO
  17. 关闭数据库并将数据库故障恢复到主节点IP-172-30-15-111。

    SQL> shutdown immediate;
    Database closed.
    Database dismounted.
    ORACLE instance shut down.
    SQL> exit
    
    [root@ip-172-30-15-5 ec2-user]# df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                   7.7G     0  7.7G   0% /dev
    tmpfs                      7.7G   33M  7.7G   1% /dev/shm
    tmpfs                      7.7G   17M  7.7G   1% /run
    tmpfs                      7.7G     0  7.7G   0% /sys/fs/cgroup
    /dev/xvda2                  50G   21G   30G  41% /
    tmpfs                      1.6G     0  1.6G   0% /run/user/1000
    172.30.15.95:/orapm01_u01   48T   47T  844G  99% /u01
    172.30.15.95:/orapm01_u02  285T  285T  844G 100% /u02
    172.30.15.95:/orapm01_u03  190T  190T  844G 100% /u03
    
    [root@ip-172-30-15-5 ec2-user]# umount /u01
    [root@ip-172-30-15-5 ec2-user]# umount /u02
    [root@ip-172-30-15-5 ec2-user]# umount /u03
    
    [root@ip-172-30-15-111 ec2-user]# mount -t nfs 172.30.15.95:/orapm01_u01 /u01 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount: (hint) your fstab has been modified, but systemd still uses
           the old version; use 'systemctl daemon-reload' to reload.
    [root@ip-172-30-15-111 ec2-user]# mount -t nfs 172.30.15.95:/orapm01_u02 /u02 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount: (hint) your fstab has been modified, but systemd still uses
           the old version; use 'systemctl daemon-reload' to reload.
    [root@ip-172-30-15-111 ec2-user]# mount -t nfs 172.30.15.95:/orapm01_u03 /u03 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount: (hint) your fstab has been modified, but systemd still uses
           the old version; use 'systemctl daemon-reload' to reload.
    [root@ip-172-30-15-111 ec2-user]# df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                   7.7G     0  7.7G   0% /dev
    tmpfs                      7.8G   48M  7.7G   1% /dev/shm
    tmpfs                      7.8G   33M  7.7G   1% /run
    tmpfs                      7.8G     0  7.8G   0% /sys/fs/cgroup
    /dev/xvda2                  50G   29G   22G  58% /
    tmpfs                      1.6G     0  1.6G   0% /run/user/1000
    172.30.15.95:/orapm01_u01   48T   47T  844G  99% /u01
    172.30.15.95:/orapm01_u02  285T  285T  844G 100% /u02
    172.30.15.95:/orapm01_u03  190T  190T  844G 100% /u03
    [root@ip-172-30-15-111 ec2-user]# su - oracle
    Last login: Thu Sep 12 18:13:34 UTC 2024 on pts/1
    [oracle@ip-172-30-15-111 ~]$ sqlplus / as sysdba
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Thu Sep 12 18:38:46 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Connected to an idle instance.
    
    SQL> startup;
    ORACLE instance started.
    
    Total System Global Area 6442449688 bytes
    Fixed Size                  9177880 bytes
    Variable Size            1090519040 bytes
    Database Buffers         5335154688 bytes
    Redo Buffers                7598080 bytes
    Database mounted.
    Database opened.
    SQL> exit
    Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    [oracle@ip-172-30-15-111 ~]$ lsnrctl start listener.ntap
    
    LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 12-SEP-2024 18:39:17
    
    Copyright (c) 1991, 2022, Oracle.  All rights reserved.
    
    Starting /u01/app/oracle/product/19.0.0/NTAP/bin/tnslsnr: please wait...
    
    TNSLSNR for Linux: Version 19.0.0.0.0 - Production
    System parameter file is /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    Log messages written to /u01/app/oracle/diag/tnslsnr/ip-172-30-15-111/listener.ntap/alert/log.xml
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=ip-172-30-15-111.ec2.internal)(PORT=1521)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ip-172-30-15-111.ec2.internal)(PORT=1521)))
    STATUS of the LISTENER
    ------------------------
    Alias                     listener.ntap
    Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
    Start Date                12-SEP-2024 18:39:17
    Uptime                    0 days 0 hr. 0 min. 0 sec
    Trace Level               off
    Security                  ON: Local OS Authentication
    SNMP                      OFF
    Listener Parameter File   /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    Listener Log File         /u01/app/oracle/diag/tnslsnr/ip-172-30-15-111/listener.ntap/alert/log.xml
    Listening Endpoints Summary...
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=ip-172-30-15-111.ec2.internal)(PORT=1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
    The listener supports no services
    The command completed successfully

配置用于PC管理的Oracle资源

Details

配置PacMaker集群的目标是、设置一个主动/被动高可用性解决方案、以便在发生故障时、以最少的用户干预在AWS EC2和FSx ONTAP环境中运行Oracle。下面演示了用于PC管理的Oracle资源配置。

  1. 以主EC2实例IP-172-30-15-111的root用户身份、使用VPC CIDR块中未使用的专用IP地址创建一个二级专用IP地址作为浮动IP。在此过程中、创建二级专用IP地址所属的Oracle资源组。

    pcs resource create privip ocf:heartbeat:awsvip secondary_private_ip=172.30.15.33 --group oracle
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 16:25:35 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 16:25:23 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 2 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-5.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    备注 如果主节点恰好是在备用集群节点上创建的、请将其移至主节点、如下所示。
  2. 在集群节点之间移动资源。

    pcs resource move privip ip-172-30-15-111.ec2.internal
    [root@ip-172-30-15-111 ec2-user]# pcs resource move privip ip-172-30-15-111.ec2.internal
    Warning: A move constraint has been created and the resource 'privip' may or may not move depending on other configuration
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    
    WARNINGS:
    Following resources have been moved and their move constraints are still in place: 'privip'
    Run 'pcs constraint location' or 'pcs resource clear <resource id>' to view or remove the constraints, respectively
    
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 16:26:38 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 16:26:27 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 2 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal (Monitoring)
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
  3. 为Oracle创建虚拟IP (VIP)。虚拟IP将根据需要在主节点和备用节点之间浮动。

    pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.30.15.33 cidr_netmask=25 nic=eth0 op monitor interval=10s --group oracle
    [root@ip-172-30-15-111 ec2-user]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.30.15.33 cidr_netmask=25 nic=eth0 op monitor interval=10s --group oracle
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    
    WARNINGS:
    Following resources have been moved and their move constraints are still in place: 'privip'
    Run 'pcs constraint location' or 'pcs resource clear <resource id>' to view or remove the constraints, respectively
    
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 16:27:34 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 16:27:24 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 3 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
  4. 以Oracle用户身份更新listener.ora和tnsnames.ora文件以指向VIP地址。重新启动侦听器。根据需要退回数据库、以便数据库向侦听器注册。

    vi $ORACLE_HOME/network/admin/listener.ora
    vi $ORACLE_HOME/network/admin/tnsnames.ora
    [oracle@ip-172-30-15-111 admin]$ cat listener.ora
    # listener.ora Network Configuration File: /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    # Generated by Oracle configuration tools.
    
    LISTENER.NTAP =
      (DESCRIPTION_LIST =
        (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = 172.30.15.33)(PORT = 1521))
          (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
        )
      )
    
    [oracle@ip-172-30-15-111 admin]$ cat tnsnames.ora
    # tnsnames.ora Network Configuration File: /u01/app/oracle/product/19.0.0/NTAP/network/admin/tnsnames.ora
    # Generated by Oracle configuration tools.
    
    NTAP =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 172.30.15.33)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = NTAP.ec2.internal)
        )
      )
    
    LISTENER_NTAP =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.30.15.33)(PORT = 1521))
    
    
    [oracle@ip-172-30-15-111 admin]$ lsnrctl status listener.ntap
    
    LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 13-SEP-2024 18:28:17
    
    Copyright (c) 1991, 2022, Oracle.  All rights reserved.
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.30.15.33)(PORT=1521)))
    STATUS of the LISTENER
    ------------------------
    Alias                     listener.ntap
    Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
    Start Date                13-SEP-2024 18:15:51
    Uptime                    0 days 0 hr. 12 min. 25 sec
    Trace Level               off
    Security                  ON: Local OS Authentication
    SNMP                      OFF
    Listener Parameter File   /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    Listener Log File         /u01/app/oracle/diag/tnslsnr/ip-172-30-15-111/listener.ntap/alert/log.xml
    Listening Endpoints Summary...
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.30.15.33)(PORT=1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=ip-172-30-15-111.ec2.internal)(PORT=5500))(Security=(my_wallet_directory=/u01/app/oracle/product/19.0.0/NTAP/admin/NTAP/xdb_wallet))(Presentation=HTTP)(Session=RAW))
    Services Summary...
    Service "21f0b5cc1fa290e2e0636f0f1eacfd43.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "21f0b74445329119e0636f0f1eacec03.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "21f0b83929709164e0636f0f1eacacc3.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "NTAP.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "NTAPXDB.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb1.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb2.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb3.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    The command completed successfully
    
    **Oracle listener now listens on vip for database connection**
  5. 将/u01、/u02、/u03挂载点添加到Oracle资源组。

    pcs resource create u01 ocf:heartbeat:Filesystem device='172.30.15.95:/orapm01_u01' directory='/u01' fstype='nfs' options='rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536' --group oracle
    pcs resource create u02 ocf:heartbeat:Filesystem device='172.30.15.95:/orapm01_u02' directory='/u02' fstype='nfs' options='rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536' --group oracle
    pcs resource create u03 ocf:heartbeat:Filesystem device='172.30.15.95:/orapm01_u03' directory='/u03' fstype='nfs' options='rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536' --group oracle
  6. 在Oracle数据库中创建PCS监控用户ID。

    [root@ip-172-30-15-111 ec2-user]# su - oracle
    Last login: Fri Sep 13 18:12:24 UTC 2024 on pts/0
    [oracle@ip-172-30-15-111 ~]$ sqlplus / as sysdba
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 19:08:41 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> CREATE USER c##ocfmon IDENTIFIED BY "XXXXXXXX";
    
    User created.
    
    SQL> grant connect to c##ocfmon;
    
    Grant succeeded.
    
    SQL> exit
    Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
  7. 将数据库添加到Oracle资源组。

    pcs resource create ntap ocf:heartbeat:oracle sid='NTAP' home='/u01/app/oracle/product/19.0.0/NTAP' user='oracle' monuser='C##OCFMON' monpassword='XXXXXXXX' monprofile='DEFAULT' --group oracle
  8. 将数据库侦听器添加到Oracle资源组。

    pcs resource create listener ocf:heartbeat:oralsnr sid='NTAP' listener='listener.ntap' --group=oracle
  9. 将Oracle资源组中的所有资源位置约束更新为主节点作为首选节点。

    pcs constraint location privip prefers ip-172-30-15-111.ec2.internal
    pcs constraint location vip prefers ip-172-30-15-111.ec2.internal
    pcs constraint location u01 prefers ip-172-30-15-111.ec2.internal
    pcs constraint location u02 prefers ip-172-30-15-111.ec2.internal
    pcs constraint location u03 prefers ip-172-30-15-111.ec2.internal
    pcs constraint location ntap prefers ip-172-30-15-111.ec2.internal
    pcs constraint location listener prefers ip-172-30-15-111.ec2.internal
    [root@ip-172-30-15-111 ec2-user]# pcs constraint config
    Location Constraints:
      Resource: listener
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: ntap
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: privip
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: u01
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: u02
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: u03
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: vip
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
    Ordering Constraints:
    Colocation Constraints:
    Ticket Constraints:
  10. 验证Oracle资源配置。

    pcs status
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 19:25:32 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 19:23:40 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u03       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * ntap      (ocf::heartbeat:oracle):         Started ip-172-30-15-111.ec2.internal
        * listener  (ocf::heartbeat:oralsnr):        Started ip-172-30-15-111.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

部署后HA验证

Details

部署完成后、运行一些测试和验证非常重要、以确保PCS Oracle数据库故障转移集群配置正确并按预期运行。测试验证包括受管故障转移和模拟的意外资源故障、以及通过集群保护机制进行的恢复。

  1. 通过手动触发备用节点的隔离来验证节点隔离、并观察备用节点是否已在超时后脱机并重新启动。

    pcs stonith fence <standbynodename>
    [root@ip-172-30-15-111 ec2-user]# pcs stonith fence ip-172-30-15-5.ec2.internal
    Node: ip-172-30-15-5.ec2.internal fenced
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 21:58:45 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 21:55:12 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-111.ec2.internal ]
      * OFFLINE: [ ip-172-30-15-5.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u03       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * ntap      (ocf::heartbeat:oracle):         Started ip-172-30-15-111.ec2.internal
        * listener  (ocf::heartbeat:oralsnr):        Started ip-172-30-15-111.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
  2. 通过终止侦听器进程模拟数据库侦听器故障,并观察PC监控侦听器故障并在几秒钟内重新启动它。

    [root@ip-172-30-15-111 ec2-user]# ps -ef | grep lsnr
    oracle    154895       1  0 18:15 ?        00:00:00 /u01/app/oracle/product/19.0.0/NTAP/bin/tnslsnr listener.ntap -inherit
    root      217779  120186  0 19:36 pts/0    00:00:00 grep --color=auto lsnr
    [root@ip-172-30-15-111 ec2-user]# kill -9 154895
    
    [root@ip-172-30-15-111 ec2-user]# su - oracle
    Last login: Thu Sep 19 14:58:54 UTC 2024
    [oracle@ip-172-30-15-111 ~]$ lsnrctl status listener.ntap
    
    LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 13-SEP-2024 19:36:51
    
    Copyright (c) 1991, 2022, Oracle.  All rights reserved.
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.30.15.33)(PORT=1521)))
    TNS-12541: TNS:no listener
     TNS-12560: TNS:protocol adapter error
      TNS-00511: No listener
       Linux Error: 111: Connection refused
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1521)))
    TNS-12541: TNS:no listener
     TNS-12560: TNS:protocol adapter error
      TNS-00511: No listener
       Linux Error: 111: Connection refused
    
    [oracle@ip-172-30-15-111 ~]$ lsnrctl status listener.ntap
    
    LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 19-SEP-2024 15:00:10
    
    Copyright (c) 1991, 2022, Oracle.  All rights reserved.
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.30.15.33)(PORT=1521)))
    STATUS of the LISTENER
    ------------------------
    Alias                     listener.ntap
    Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
    Start Date                16-SEP-2024 14:00:14
    Uptime                    3 days 0 hr. 59 min. 56 sec
    Trace Level               off
    Security                  ON: Local OS Authentication
    SNMP                      OFF
    Listener Parameter File   /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    Listener Log File         /u01/app/oracle/diag/tnslsnr/ip-172-30-15-111/listener.ntap/alert/log.xml
    Listening Endpoints Summary...
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.30.15.33)(PORT=1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=ip-172-30-15-111.ec2.internal)(PORT=5500))(Security=(my_wallet_directory=/u01/app/oracle/product/19.0.0/NTAP/admin/NTAP/xdb_wallet))(Presentation=HTTP)(Session=RAW))
    Services Summary...
    Service "21f0b5cc1fa290e2e0636f0f1eacfd43.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "21f0b74445329119e0636f0f1eacec03.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "21f0b83929709164e0636f0f1eacacc3.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "NTAP.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "NTAPXDB.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb1.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb2.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb3.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    The command completed successfully
  3. 通过中止pmon进程模拟数据库故障、并观察PC监控数据库系统故障并在几秒钟内重新启动它来模拟数据库故障。

    **Make a remote connection to ntap database**
    
    [oracle@ora_01 ~]$ sqlplus system@//172.30.15.33:1521/NTAP.ec2.internal
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 15:42:42 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Enter password:
    Last Successful login time: Thu Sep 12 2024 13:37:28 -04:00
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> select instance_name, host_name from v$instance;
    
    INSTANCE_NAME
    ----------------
    HOST_NAME
    ----------------------------------------------------------------
    NTAP
    ip-172-30-15-111.ec2.internal
    
    
    SQL>
    
    **Kill ntap pmon process to simulate a failure**
    
    [root@ip-172-30-15-111 ec2-user]# ps -ef | grep pmon
    oracle    159247       1  0 18:27 ?        00:00:00 ora_pmon_NTAP
    root      230595  120186  0 19:44 pts/0    00:00:00 grep --color=auto pmon
    [root@ip-172-30-15-111 ec2-user]# kill -9 159247
    
    **Observe the DB failure**
    
    SQL> /
    select instance_name, host_name from v$instance
    *
    ERROR at line 1:
    ORA-03113: end-of-file on communication channel
    Process ID: 227424
    Session ID: 396 Serial number: 4913
    
    
    SQL> exit
    Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    **Reconnect to DB after reboot**
    
    [oracle@ora_01 ~]$ sqlplus system@//172.30.15.33:1521/NTAP.ec2.internal
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 15:47:24 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Enter password:
    Last Successful login time: Fri Sep 13 2024 15:42:47 -04:00
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> select instance_name, host_name from v$instance;
    
    INSTANCE_NAME
    ----------------
    HOST_NAME
    ----------------------------------------------------------------
    NTAP
    ip-172-30-15-111.ec2.internal
    
    
    SQL>
  4. 通过将主节点置于备用模式以将Oracle资源故障转移到备用节点、验证受管数据库从主节点故障转移到备用节点的情况。

    pcs node standby <nodename>
    **Stopping Oracle resources on primary node in reverse order**
    
    [root@ip-172-30-15-111 ec2-user]# pcs node standby ip-172-30-15-111.ec2.internal
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 20:01:16 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 20:01:08 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Node ip-172-30-15-111.ec2.internal: standby (with active resources)
      * Online: [ ip-172-30-15-5.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-5.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Stopping ip-172-30-15-111.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Stopped
        * u03       (ocf::heartbeat:Filesystem):     Stopped
        * ntap      (ocf::heartbeat:oracle):         Stopped
        * listener  (ocf::heartbeat:oralsnr):        Stopped
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    
    **Starting Oracle resources on standby node in sequencial order**
    
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 20:01:34 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 20:01:08 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Node ip-172-30-15-111.ec2.internal: standby
      * Online: [ ip-172-30-15-5.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-5.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-5.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-5.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-5.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-5.ec2.internal
        * u03       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-5.ec2.internal
        * ntap      (ocf::heartbeat:oracle):         Starting ip-172-30-15-5.ec2.internal
        * listener  (ocf::heartbeat:oralsnr):        Stopped
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    
    **NFS mount points mounted on standby node**
    
    [root@ip-172-30-15-5 ec2-user]# df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                   7.7G     0  7.7G   0% /dev
    tmpfs                      7.7G   33M  7.7G   1% /dev/shm
    tmpfs                      7.7G   17M  7.7G   1% /run
    tmpfs                      7.7G     0  7.7G   0% /sys/fs/cgroup
    /dev/xvda2                  50G   21G   30G  41% /
    tmpfs                      1.6G     0  1.6G   0% /run/user/1000
    172.30.15.95:/orapm01_u01   48T   47T  840G  99% /u01
    172.30.15.95:/orapm01_u02  285T  285T  840G 100% /u02
    172.30.15.95:/orapm01_u03  190T  190T  840G 100% /u03
    tmpfs                      1.6G     0  1.6G   0% /run/user/54321
    
    **Database opened on standby node**
    
    [oracle@ora_01 ~]$ sqlplus system@//172.30.15.33:1521/NTAP.ec2.internal
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 16:34:08 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Enter password:
    Last Successful login time: Fri Sep 13 2024 15:47:28 -04:00
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> select name, open_mode from v$database;
    
    NAME      OPEN_MODE
    --------- --------------------
    NTAP      READ WRITE
    
    SQL> select instance_name, host_name from v$instance;
    
    INSTANCE_NAME
    ----------------
    HOST_NAME
    ----------------------------------------------------------------
    NTAP
    ip-172-30-15-5.ec2.internal
    
    
    SQL>
  5. 验证非备用主节点是否已将受管数据库从备用故障恢复到主节点、并观察Oracle资源是否会因首选节点设置而自动进行故障恢复。

    pcs node unstandby <nodename>
    **Stopping Oracle resources on standby node for failback to primary**
    
    [root@ip-172-30-15-111 ec2-user]# pcs node unstandby ip-172-30-15-111.ec2.internal
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 20:41:30 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 20:41:18 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-5.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Stopping ip-172-30-15-5.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Stopped
        * u01       (ocf::heartbeat:Filesystem):     Stopped
        * u02       (ocf::heartbeat:Filesystem):     Stopped
        * u03       (ocf::heartbeat:Filesystem):     Stopped
        * ntap      (ocf::heartbeat:oracle):         Stopped
        * listener  (ocf::heartbeat:oralsnr):        Stopped
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    
    **Starting Oracle resources on primary node for failback**
    
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 20:41:45 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 20:41:18 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-5.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u03       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * ntap      (ocf::heartbeat:oracle):         Starting ip-172-30-15-111.ec2.internal
        * listener  (ocf::heartbeat:oralsnr):        Stopped
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    
    **Database now accepts connection on primary node**
    
    [oracle@ora_01 ~]$ sqlplus system@//172.30.15.33:1521/NTAP.ec2.internal
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 16:46:07 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Enter password:
    Last Successful login time: Fri Sep 13 2024 16:34:12 -04:00
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> select instance_name, host_name from v$instance;
    
    INSTANCE_NAME
    ----------------
    HOST_NAME
    ----------------------------------------------------------------
    NTAP
    ip-172-30-15-111.ec2.internal
    
    
    SQL>

至此、在使用PacMaker集群和Amazon FSx ONTAP作为数据库存储后端的AWS EC2中完成了Oracle HA验证和解决方案演示。

使用SnapCenter进行Oracle备份、还原和克隆

Details

NetApp建议使用SnapCenter UI工具来管理AWS EC2和Amazon FSx ONTAP中部署的Oracle数据库。"借助子系统装载的FSx ONTAP、在基于AWS的VMware Cloud中简化自我管理Oracle" `Oracle backup, restore, and clone with SnapCenter`有关设置SnapCenter以及执行数据库备份、还原和克隆工作流的详细信息、请参见TR-4979。