TR-4998: Oracle HA in AWS EC2 with Pacemaker Clustering and FSx ONTAP

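Contributors: netapp-revathid, kevin-hoke

Allen Cao, Niyaz Mohamed, NetApp

This solution provides an overview and details for enabling Oracle high availability (HA) in AWS EC2 using Pacemaker clustering on Red Hat Enterprise Linux (RHEL), with Amazon FSx ONTAP providing HA database storage over the NFS protocol.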

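Purpose

Many customers who self-manage and run Oracle in the public cloud face some challenges. One of those challenges is enabling high availability for the Oracle database. Traditionally, Oracle customers rely on an Oracle database feature called Real Application Clusters (RAC) for active-active transaction support on multiple cluster nodes, so that the failure of one node does not stop application processing. Unfortunately, Oracle RAC is not readily available or supported in many popular public clouds such as AWS EC2. By leveraging the Pacemaker clustering (PCS) built into RHEL together with Amazon FSx ONTAP, customers can achieve a viable alternative, without the Oracle RAC license cost, for active-passive clustering on both compute and storage to support mission-critical Oracle database workloads in the AWS cloud.

This documentation demonstrates the details of setting up Pacemaker clustering on RHEL, deploying an Oracle database on EC2 and Amazon FSx ONTAP with the NFS protocol, configuring Oracle resources in Pacemaker for HA, and wrapping up the demonstration with validation under the most frequently encountered HA scenarios. The solution also provides information on fast Oracle database backup, restore, and clone with the NetApp SnapCenter UI tool.

This solution addresses the following use cases:

  • Pacemaker HA clustering setup and configuration in RHEL.

  • Oracle database HA deployment in AWS EC2 and Amazon FSx ONTAP.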

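Audience

This solution is intended for the following people:

  • A DBA who would like to deploy Oracle in AWS EC2 and Amazon FSx ONTAP.

  • A database solution architect who would like to test Oracle workloads in AWS EC2 and Amazon FSx ONTAP.

  • A storage administrator who would like to deploy and manage an Oracle database in AWS EC2 and Amazon FSx ONTAP.

  • An application owner who would like to stand up an Oracle database in AWS EC2 and Amazon FSx ONTAP.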

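Solution test and validation environment

The testing and validation of this solution was performed in a lab environment that might not match the final deployment environment. See the section Key factors for deployment consideration for more information.

Architecture

Figure: Oracle HA in AWS EC2 with Pacemaker clustering and FSx ONTAP.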

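Hardware and software components

Hardware

Amazon FSx ONTAP storage | Current version offered by AWS | Single AZ in us-east-1, 1024 GiB capacity, 128 MB/s throughput
EC2 instances for DB servers | t2.xlarge/4vCPU/16G | Two EC2 T2 xlarge instances, one as the primary DB server and the other as the standby DB server
VM for Ansible controller | 4 vCPUs, 16GiB RAM | One Linux VM to run automated AWS EC2/FSx provisioning and Oracle deployment on NFS

Software

RedHat Linux | RHEL Linux 8.6 (LVM) - x64 Gen2 | Deployed RedHat subscription for testing
Oracle Database | Version 19.18 | Applied RU patch p34765931_190000_Linux-x86-64.zip
Oracle OPatch | Version 12.2.0.1.36 | Latest patch p6880880_190000_Linux-x86-64.zip
Pacemaker | Version 0.10.18 | High Availability Add-On for RHEL 8.0 by RedHat
NFS | Version 3.0 | Oracle dNFS enabled
Ansible | core 2.16.2 | Python 3.6.8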

Oracle database active/passive configuration in the AWS EC2/FSx lab environment

Server | Database | DB storage
Primary node: orapm01/ip-172.30.15.111 | NTAP (NTAP_PDB1, NTAP_PDB2, NTAP_PDB3) | /u01, /u02, /u03 NFS mounts on Amazon FSx ONTAP volumes
Standby node: orapm02/ip-172.30.15.5 | NTAP (NTAP_PDB1, NTAP_PDB2, NTAP_PDB3) upon failover | /u01, /u02, /u03 NFS mounts upon failover

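Key factors for deployment consideration

  • Amazon FSx ONTAP HA. Amazon FSx ONTAP is provisioned by default as an HA pair of storage controllers in a single or multiple availability zones. It provides storage redundancy in an active/passive fashion for mission-critical database workloads. Storage failover is transparent to end users, and no user intervention is required when a storage failover occurs.

  • PCS resource group and resource ordering. A resource group allows multiple resources with dependencies to run on the same cluster node. Resource ordering enforces the resource startup order and the reverse order for shutdown.

  • Preferred node. The Pacemaker cluster is purposely deployed in active/passive clustering (not a requirement of Pacemaker) and is kept in sync with the FSx ONTAP clustering. By means of a location constraint, the active EC2 instance is configured as the preferred node for Oracle resources whenever it is available.

  • Fence delay on standby node. In a two-node PCS cluster, quorum is artificially set to 1. If communication between the cluster nodes fails, either node may try to fence the other, which can potentially cause data corruption. Setting a delay on the standby node mitigates the issue and allows the primary node to continue providing services while the standby node is fenced.

  • Multi-AZ deployment consideration. The solution is deployed and validated in a single availability zone. For a multi-AZ deployment, additional AWS networking resources are needed to move the PCS floating IP between availability zones.

  • Oracle database storage layout. In this solution demonstration, we provision three database volumes for the test database NTAP to host the Oracle binary, data, and log. The volumes are mounted on the Oracle DB server via NFS as /u01 - binary, /u02 - data, and /u03 - log. Dual control files are configured on the /u02 and /u03 mount points for redundancy.

  • dNFS configuration. By using dNFS (available since Oracle 11g), an Oracle database running on a DB VM can drive more I/O than the native NFS client. Automated Oracle deployment configures dNFS on NFSv3 by default.

  • Database backup. NetApp provides the SnapCenter software suite for database backup, restore, and cloning with a user-friendly UI. NetApp recommends implementing such a management tool to achieve fast (under a minute) snapshot backup, quick (minutes) database restore, and database clone.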

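Solution deployment

The following sections provide step-by-step procedures for deploying and configuring Oracle database HA in AWS EC2 with Pacemaker clustering, with Amazon FSx ONTAP providing database storage protection.

Deployment prerequisites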

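Deployment requires the following prerequisites.

  1. An AWS account has been set up, and the necessary VPC and network segments have been created within your AWS account.

  2. Provision a Linux VM as the Ansible controller node with the latest versions of Ansible and Git installed. For details, see the section `Setup the Ansible Control Node for CLI deployments on RHEL / CentOS` or `Setup the Ansible Control Node for CLI deployments on Ubuntu / Debian` in "Getting Started with NetApp solution automation".

    Enable ssh public/private key authentication between the Ansible controller and the EC2 instance DB VMs.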

Provision EC2 instances and the Amazon FSx ONTAP storage cluster

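Although EC2 instances and Amazon FSx ONTAP can be provisioned manually from the AWS console, we recommend using the NetApp Terraform-based automation toolkit to automate the provisioning of the EC2 instances and the FSx ONTAP storage cluster. The detailed procedure follows.

  1. From AWS CloudShell or the Ansible controller VM, clone a copy of the automation toolkit for EC2 and FSx ONTAP.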

    git clone https://bitbucket.ngage.netapp.com/scm/ns-bb/na_aws_fsx_ec2_deploy.git
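    Note: If the toolkit is not run from AWS CloudShell, AWS CLI authentication with your AWS account is required, using an AWS user account access key/secret key pair.
  2. Review the READme.md file included in the toolkit. Revise main.tf and the associated parameter files as necessary for the required AWS resources.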

    An example of main.tf:
    
    resource "aws_instance" "orapm01" {
      ami                           = var.ami
      instance_type                 = var.instance_type
      subnet_id                     = var.subnet_id
      key_name                      = var.ssh_key_name
    
      root_block_device {
        volume_type                 = "gp3"
        volume_size                 = var.root_volume_size
      }
    
      tags = {
        Name                        = var.ec2_tag1
      }
    }
    
    resource "aws_instance" "orapm02" {
      ami                           = var.ami
      instance_type                 = var.instance_type
      subnet_id                     = var.subnet_id
      key_name                      = var.ssh_key_name
    
      root_block_device {
        volume_type                 = "gp3"
        volume_size                 = var.root_volume_size
      }
    
      tags = {
        Name                        = var.ec2_tag2
      }
    }
    
    resource "aws_fsx_ontap_file_system" "fsx_01" {
      storage_capacity              = var.fs_capacity
      subnet_ids                    = var.subnet_ids
      preferred_subnet_id           = var.preferred_subnet_id
      throughput_capacity           = var.fs_throughput
      fsx_admin_password            = var.fsxadmin_password
      deployment_type               = var.deployment_type
    
      disk_iops_configuration {
        iops                        = var.iops
        mode                        = var.iops_mode
      }
    
      tags                          = {
        Name                        = var.fsx_tag
      }
    }
    
    resource "aws_fsx_ontap_storage_virtual_machine" "svm_01" {
      file_system_id                = aws_fsx_ontap_file_system.fsx_01.id
      name                          = var.svm_name
      svm_admin_password            = var.vsadmin_password
    }
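  3. Validate and execute the Terraform plan. A successful execution creates two EC2 instances and an FSx ONTAP storage cluster in the target AWS account. The automation output displays the EC2 instance IP addresses and the FSx ONTAP cluster endpoints.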

    terraform plan -out=main.plan
    terraform apply main.plan

This completes the EC2 instance and FSx ONTAP provisioning for Oracle.

Pacemaker cluster setup

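The High Availability Add-On for RHEL is a clustered system that provides reliability, scalability, and availability to critical production services such as Oracle database services. In this use case demonstration, a two-node Pacemaker cluster is set up and configured to support high availability of an Oracle database in an active/passive clustering scenario.

Log in to the EC2 instances as ec2-user and complete the following tasks on `both` EC2 instances:

  1. Remove the AWS Red Hat Update Infrastructure (RHUI) client.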

    sudo -i yum -y remove rh-amazon-rhui-client*
  2. Register the EC2 instance VMs with Red Hat.

    sudo subscription-manager register --username xxxxxxxx --password 'xxxxxxxx' --auto-attach
  3. Enable the RHEL high availability rpms.

    sudo subscription-manager config --rhsm.manage_repos=1
    sudo subscription-manager repos --enable=rhel-8-for-x86_64-highavailability-rpms
  4. Install pacemaker and the fence agent.

    sudo yum update -y
    sudo yum install pcs pacemaker fence-agents-aws
  5. Create a password for the hacluster user on all cluster nodes. Use the same password on all nodes.

    sudo passwd hacluster
  6. Start the pcs service and enable it to start at boot.

    sudo systemctl start pcsd.service
    sudo systemctl enable pcsd.service
  7. Validate the pcsd service.

    sudo systemctl status pcsd
    [ec2-user@ip-172-30-15-5 ~]$ sudo systemctl status pcsd
    ● pcsd.service - PCS GUI and remote configuration interface
       Loaded: loaded (/usr/lib/systemd/system/pcsd.service; enabled; vendor preset: disabled)
       Active: active (running) since Tue 2024-09-10 18:50:22 UTC; 33s ago
         Docs: man:pcsd(8)
               man:pcs(8)
     Main PID: 65302 (pcsd)
        Tasks: 1 (limit: 100849)
       Memory: 24.0M
       CGroup: /system.slice/pcsd.service
               └─65302 /usr/libexec/platform-python -Es /usr/sbin/pcsd
    
    Sep 10 18:50:21 ip-172-30-15-5.ec2.internal systemd[1]: Starting PCS GUI and remote configuration interface...
    Sep 10 18:50:22 ip-172-30-15-5.ec2.internal systemd[1]: Started PCS GUI and remote configuration interface.
  8. Add the cluster nodes to the hosts file.

    sudo vi /etc/hosts
    [ec2-user@ip-172-30-15-5 ~]$ cat /etc/hosts
    127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
    ::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
    
    # cluster nodes
    172.30.15.111   ip-172-30-15-111.ec2.internal
    172.30.15.5     ip-172-30-15-5.ec2.internal
  9. Install and configure awscli for connectivity to the AWS account.

    sudo yum install awscli
    sudo aws configure
    [ec2-user@ip-172-30-15-111 ]# sudo aws configure
    AWS Access Key ID [None]: XXXXXXXXXXXXXXXXX
    AWS Secret Access Key [None]: XXXXXXXXXXXXXXXX
    Default region name [None]: us-east-1
    Default output format [None]: json
  10. Install the resource-agents package if it is not already installed.

    sudo yum install resource-agents

On `only one` cluster node, complete the following tasks to create the pcs cluster.

  1. Authenticate the pcs user hacluster.

    sudo pcs host auth ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal
    [ec2-user@ip-172-30-15-111 ~]$ sudo pcs host auth ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal
    Username: hacluster
    Password:
    ip-172-30-15-111.ec2.internal: Authorized
    ip-172-30-15-5.ec2.internal: Authorized
  2. Create the pcs cluster.

    sudo pcs cluster setup ora_ec2nfsx ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal
    [ec2-user@ip-172-30-15-111 ~]$ sudo pcs cluster setup ora_ec2nfsx ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal
    No addresses specified for host 'ip-172-30-15-5.ec2.internal', using 'ip-172-30-15-5.ec2.internal'
    No addresses specified for host 'ip-172-30-15-111.ec2.internal', using 'ip-172-30-15-111.ec2.internal'
    Destroying cluster on hosts: 'ip-172-30-15-111.ec2.internal', 'ip-172-30-15-5.ec2.internal'...
    ip-172-30-15-5.ec2.internal: Successfully destroyed cluster
    ip-172-30-15-111.ec2.internal: Successfully destroyed cluster
    Requesting remove 'pcsd settings' from 'ip-172-30-15-111.ec2.internal', 'ip-172-30-15-5.ec2.internal'
    ip-172-30-15-111.ec2.internal: successful removal of the file 'pcsd settings'
    ip-172-30-15-5.ec2.internal: successful removal of the file 'pcsd settings'
    Sending 'corosync authkey', 'pacemaker authkey' to 'ip-172-30-15-111.ec2.internal', 'ip-172-30-15-5.ec2.internal'
    ip-172-30-15-111.ec2.internal: successful distribution of the file 'corosync authkey'
    ip-172-30-15-111.ec2.internal: successful distribution of the file 'pacemaker authkey'
    ip-172-30-15-5.ec2.internal: successful distribution of the file 'corosync authkey'
    ip-172-30-15-5.ec2.internal: successful distribution of the file 'pacemaker authkey'
    Sending 'corosync.conf' to 'ip-172-30-15-111.ec2.internal', 'ip-172-30-15-5.ec2.internal'
    ip-172-30-15-111.ec2.internal: successful distribution of the file 'corosync.conf'
    ip-172-30-15-5.ec2.internal: successful distribution of the file 'corosync.conf'
    Cluster has been successfully set up.
  3. Enable the cluster.

    sudo pcs cluster enable --all
    [ec2-user@ip-172-30-15-111 ~]$ sudo pcs cluster enable --all
    ip-172-30-15-5.ec2.internal: Cluster Enabled
    ip-172-30-15-111.ec2.internal: Cluster Enabled
  4. Start and validate the cluster.

    sudo pcs cluster start --all
    sudo pcs status
    [ec2-user@ip-172-30-15-111 ~]$ sudo pcs status
    Cluster name: ora_ec2nfsx
    
    WARNINGS:
    No stonith devices and stonith-enabled is not false
    
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Wed Sep 11 15:43:23 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Wed Sep 11 15:43:06 2024 by hacluster via hacluster on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 0 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    
    Full List of Resources:
      * No resources
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled

This completes the Pacemaker cluster setup and initial configuration.
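
The /etc/hosts entries added for the cluster nodes follow a regular pattern, so they can be generated rather than typed by hand. The following is a minimal sketch that assumes the private DNS naming seen in this lab (ip-A-B-C-D.ec2.internal, as used in us-east-1); the function name hosts_entry is hypothetical, and other regions or custom DHCP option sets use a different domain suffix.

```shell
#!/bin/sh
# Hypothetical helper: print one /etc/hosts line for an EC2 private IP.
# Assumes the lab's private DNS naming, ip-A-B-C-D.ec2.internal.
hosts_entry() {
    addr="$1"
    # Replace the dots in the IP address with dashes to build the host name.
    name="ip-$(printf '%s' "$addr" | tr '.' '-').ec2.internal"
    printf '%s\t%s\n' "$addr" "$name"
}

# Print entries for the two lab nodes; review before appending to /etc/hosts.
for node_ip in 172.30.15.111 172.30.15.5; do
    hosts_entry "$node_ip"
done
```

The script only prints the lines. Appending them to /etc/hosts is left as a manual step so that the entries can be checked against the actual host names reported by each node.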

Pacemaker cluster fencing configuration

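Pacemaker fencing configuration is mandatory for a production cluster. It ensures that a malfunctioning node in the AWS EC2 cluster is automatically isolated, which prevents the node from consuming the cluster's resources, compromising the cluster's functionality, or corrupting shared data. This section demonstrates the configuration of cluster fencing using the fence_aws fencing agent.

  1. As the root user, enter the following AWS metadata query to get the instance ID of each EC2 instance node.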

    echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    [root@ip-172-30-15-111 ec2-user]# echo $(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    i-0d8e7a0028371636f
    
    or just get instance-id from AWS EC2 console
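  2. Enter the following command to configure the fencing device. Use the pcmk_host_map option to map the RHEL host names to the instance IDs. Use the AWS access key and AWS secret access key of the AWS user account that you used earlier for AWS authentication.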

    sudo pcs stonith \
    create clusterfence fence_aws access_key=XXXXXXXXXXXXXXXXX secret_key=XXXXXXXXXXXXXXXXXX \
    region=us-east-1 pcmk_host_map="ip-172-30-15-111.ec2.internal:i-0d8e7a0028371636f;ip-172-30-15-5.ec2.internal:i-0bc54b315afb20a2e" \
    power_timeout=240 pcmk_reboot_timeout=480 pcmk_reboot_retries=4
  3. Validate the fencing configuration.

    pcs status
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Wed Sep 11 21:17:18 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Wed Sep 11 21:16:40 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 1 resource instance configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
  4. Set stonith-action to off instead of reboot at the cluster level.

    pcs property set stonith-action=off
    [root@ip-172-30-15-111 ec2-user]# pcs property config
    Cluster Properties:
     cluster-infrastructure: corosync
     cluster-name: ora_ec2nfsx
     dc-version: 2.1.7-5.1.el8_10-0f7f88312
     have-watchdog: false
     last-lrm-refresh: 1726257586
     stonith-action: off
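    Note: With stonith-action set to off, the fenced cluster node is shut down first. After the period defined by the stonith power_timeout (240 seconds), the fenced node is rebooted and rejoins the cluster.
  5. Set the fencing delay to 10 seconds for the standby node.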

    pcs stonith update clusterfence pcmk_delay_base="ip-172-30-15-111.ec2.internal:0;ip-172-30-15-5.ec2.internal:10s"
    [root@ip-172-30-15-111 ec2-user]# pcs stonith config
    Resource: clusterfence (class=stonith type=fence_aws)
      Attributes: clusterfence-instance_attributes
        access_key=XXXXXXXXXXXXXXXX
        pcmk_delay_base=ip-172-30-15-111.ec2.internal:0;ip-172-30-15-5.ec2.internal:10s
        pcmk_host_map=ip-172-30-15-111.ec2.internal:i-0d8e7a0028371636f;ip-172-30-15-5.ec2.internal:i-0bc54b315afb20a2e
        pcmk_reboot_retries=4
        pcmk_reboot_timeout=480
        power_timeout=240
        region=us-east-1
        secret_key=XXXXXXXXXXXXXXXX
      Operations:
        monitor: clusterfence-monitor-interval-60s
          interval=60s
Note: Run the `pcs stonith refresh` command to refresh a stopped stonith fence agent or to clear failed stonith resource actions.
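
The pcmk_host_map value used when creating the fencing device is a semicolon-separated list of hostname:instance-id pairs, which is easy to mistype. The following is a minimal sketch of assembling that string from individual pairs; the function name build_host_map is hypothetical, and the pairs are the lab values shown in this section.

```shell
#!/bin/sh
# Hypothetical helper: join "hostname:instance-id" pairs with ';' to form
# the value passed to pcmk_host_map.
build_host_map() {
    map=""
    for pair in "$@"; do
        if [ -z "$map" ]; then
            map="$pair"
        else
            map="$map;$pair"
        fi
    done
    printf '%s\n' "$map"
}

# Lab values from this section; substitute your own host names and IDs.
build_host_map \
    "ip-172-30-15-111.ec2.internal:i-0d8e7a0028371636f" \
    "ip-172-30-15-5.ec2.internal:i-0bc54b315afb20a2e"
```

The printed string can then be quoted and supplied as pcmk_host_map="..." in the pcs stonith create command. Quoting matters because the semicolon would otherwise be interpreted by the shell as a command separator.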

Deploy Oracle database in PCS cluster

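We recommend leveraging the NetApp-provided Ansible playbook to run the database installation and configuration tasks with predefined parameters on the PCS cluster. For this automated Oracle deployment, three user-defined parameter files need user input before playbook execution.

  • hosts - defines the targets that the automation playbook runs against.

  • vars/vars.yml - the global variable file that defines variables that apply to all targets.

  • host_vars/host_name.yml - the local variable file that defines variables that apply only to a named target. In our use case, these are the Oracle DB servers.

In addition to these user-defined variable files, there are several default variable files that contain default parameters that do not require change unless necessary. The following shows the details of the automated Oracle deployment in AWS EC2 and FSx ONTAP in a PCS clustering configuration.

  1. From the Ansible controller admin user home directory, clone a copy of the NetApp Oracle deployment automation toolkit for NFS.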

    git clone https://bitbucket.ngage.netapp.com/scm/ns-bb/na_oracle_deploy_nfs.git
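    Note: The Ansible controller can be located in the same VPC as the database EC2 instances or on premises, as long as there is network connectivity between them.
  2. Fill in the user-defined parameters in the hosts parameter file. The following is an example of a typical hosts file configuration.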

    [admin@ansiblectl na_oracle_deploy_nfs]$ cat hosts
    #Oracle hosts
    [oracle]
    orapm01 ansible_host=172.30.15.111 ansible_ssh_private_key_file=ec2-user.pem
    orapm02 ansible_host=172.30.15.5 ansible_ssh_private_key_file=ec2-user.pem
  3. Fill in the user-defined parameters in the vars/vars.yml parameter file. The following is an example of a typical vars.yml file configuration.

    [admin@ansiblectl na_oracle_deploy_nfs]$ cat vars/vars.yml
    ######################################################################
    ###### Oracle 19c deployment user configuration variables       ######
    ###### Consolidate all variables from ONTAP, linux and oracle   ######
    ######################################################################
    
    ###########################################
    ### ONTAP env specific config variables ###
    ###########################################
    
    # Prerequisite to create three volumes in NetApp ONTAP storage from System Manager or cloud dashboard with following naming convention:
    # db_hostname_u01 - Oracle binary
    # db_hostname_u02 - Oracle data
    # db_hostname_u03 - Oracle redo
    # It is important to strictly follow the name convention or the automation will fail.
    
    
    ###########################################
    ### Linux env specific config variables ###
    ###########################################
    
    redhat_sub_username: xxxxxxxx
    redhat_sub_password: "xxxxxxxx"
    
    
    ####################################################
    ### DB env specific install and config variables ###
    ####################################################
    
    # Database domain name
    db_domain: ec2.internal
    
    # Set initial password for all required Oracle passwords. Change them after installation.
    initial_pwd_all: "xxxxxxxx"
  4. Fill in the user-defined parameters in the host_vars/host_name.yml parameter file. The following is an example of a typical host_vars/host_name.yml file configuration.

    [admin@ansiblectl na_oracle_deploy_nfs]$ cat host_vars/orapm01.yml
    # User configurable Oracle host specific parameters
    
    # Database SID. By default, a container DB is created with 3 PDBs within the CDB
    oracle_sid: NTAP
    
    # CDB is created with SGA at 75% of memory_limit, MB. Consider how many databases to be hosted on the node and
    # how much ram to be allocated to each DB. The grand total of SGA should not exceed 75% available RAM on node.
    memory_limit: 8192
    
    # Local NFS lif ip address to access database volumes
    nfs_lif: 172.30.15.95
    Note: The nfs_lif address can be retrieved from the FSx ONTAP cluster endpoints output of the automated EC2 and FSx ONTAP deployment in the previous section.
  5. Create the database volumes from the AWS FSx console. Make sure to use the PCS primary node host name (orapm01) as the prefix for the volumes, as demonstrated below.

    Figure: Amazon FSx ONTAP volume provisioning from the AWS FSx console.

  6. Stage the following Oracle 19c installation files on the PCS primary node EC2 instance ip-172-30-15-111.ec2.internal in the /tmp/archive directory with 777 permission.

    installer_archives:
      - "LINUX.X64_193000_db_home.zip"
      - "p34765931_190000_Linux-x86-64.zip"
      - "p6880880_190000_Linux-x86-64.zip"
  7. Run the playbook for Linux configuration on all nodes.

    ansible-playbook -i hosts 2-linux_config.yml -u ec2-user -e @vars/vars.yml
    [admin@ansiblectl na_oracle_deploy_nfs]$ ansible-playbook -i hosts 2-linux_config.yml -u ec2-user -e @vars/vars.yml
    
    PLAY [Linux Setup and Storage Config for Oracle] ****************************************************************************************************************************************************************************************************************************************************************************
    
    TASK [Gathering Facts] ******************************************************************************************************************************************************************************************************************************************************************************************************
    ok: [orapm01]
    ok: [orapm02]
    
    TASK [linux : Configure RedHat 7 for Oracle DB installation] ****************************************************************************************************************************************************************************************************************************************************************
    skipping: [orapm01]
    skipping: [orapm02]
    
    TASK [linux : Configure RedHat 8 for Oracle DB installation] ****************************************************************************************************************************************************************************************************************************************************************
    included: /home/admin/na_oracle_deploy_nfs/roles/linux/tasks/rhel8_config.yml for orapm01, orapm02
    
    TASK [linux : Register subscriptions for RedHat Server] *********************************************************************************************************************************************************************************************************************************************************************
    ok: [orapm01]
    ok: [orapm02]
    .
    .
    .
  8. Run the playbook for Oracle configuration only on the primary node (comment out the standby node in the hosts file).

    ansible-playbook -i hosts 4-oracle_config.yml -u ec2-user -e @vars/vars.yml --skip-tags "enable_db_start_shut"
    [admin@ansiblectl na_oracle_deploy_nfs]$ ansible-playbook -i hosts 4-oracle_config.yml -u ec2-user -e @vars/vars.yml --skip-tags "enable_db_start_shut"
    
    PLAY [Oracle installation and configuration] ********************************************************************************************************************************************************************************************************************************************************************************
    
    TASK [Gathering Facts] ******************************************************************************************************************************************************************************************************************************************************************************************************
    ok: [orapm01]
    
    TASK [oracle : Oracle software only install] ********************************************************************************************************************************************************************************************************************************************************************************
    included: /home/admin/na_oracle_deploy_nfs/roles/oracle/tasks/oracle_install.yml for orapm01
    
    TASK [oracle : Create mount points for NFS file systems / Mount NFS file systems on Oracle hosts] ***************************************************************************************************************************************************************************************************************************
    included: /home/admin/na_oracle_deploy_nfs/roles/oracle/tasks/oracle_mount_points.yml for orapm01
    
    TASK [oracle : Create mount points for NFS file systems] ********************************************************************************************************************************************************************************************************************************************************************
    changed: [orapm01] => (item=/u01)
    changed: [orapm01] => (item=/u02)
    changed: [orapm01] => (item=/u03)
    .
    .
    .
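  9. After the database is deployed, comment out the /u01, /u02, and /u03 mounts in /etc/fstab on the primary node, because the mount points will be managed by PCS only.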

    sudo vi /etc/fstab
    [root@ip-172-30-15-111 ec2-user]# cat /etc/fstab
    UUID=eaa1f38e-de0f-4ed5-a5b5-2fa9db43bb38       /       xfs     defaults        0       0
    /mnt/swapfile swap swap defaults 0 0
    #172.30.15.95:/orapm01_u01 /u01 nfs rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536 0 0
    #172.30.15.95:/orapm01_u02 /u02 nfs rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536 0 0
    #172.30.15.95:/orapm01_u03 /u03 nfs rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536 0 0
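  10. Copy /etc/oratab, /etc/oraInst.loc, and /home/oracle/.bash_profile to the standby node. Make sure to maintain proper file ownership and permissions.

  11. Shut down the database and the listener, and unmount /u01, /u02, and /u03 on the primary node.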

    [root@ip-172-30-15-111 ec2-user]# su - oracle
    Last login: Wed Sep 18 16:51:02 UTC 2024
    [oracle@ip-172-30-15-111 ~]$ sqlplus / as sysdba
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Wed Sep 18 16:51:16 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> shutdown immediate;
    
    SQL> exit
    Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    [oracle@ip-172-30-15-111 ~]$ lsnrctl stop listener.ntap
    
    [oracle@ip-172-30-15-111 ~]$ exit
    logout
    [root@ip-172-30-15-111 ec2-user]# umount /u01
    [root@ip-172-30-15-111 ec2-user]# umount /u02
    [root@ip-172-30-15-111 ec2-user]# umount /u03
  12. Create the mount points on the standby node ip-172-30-15-5.

    mkdir /u01
    mkdir /u02
    mkdir /u03
  13. Mount the FSx ONTAP database volumes on the standby node ip-172-30-15-5.

    mount -t nfs 172.30.15.95:/orapm01_u01 /u01 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount -t nfs 172.30.15.95:/orapm01_u02 /u02 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount -t nfs 172.30.15.95:/orapm01_u03 /u03 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    [root@ip-172-30-15-5 ec2-user]# df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                   7.7G     0  7.7G   0% /dev
    tmpfs                      7.7G   33M  7.7G   1% /dev/shm
    tmpfs                      7.7G   17M  7.7G   1% /run
    tmpfs                      7.7G     0  7.7G   0% /sys/fs/cgroup
    /dev/xvda2                  50G   21G   30G  41% /
    tmpfs                      1.6G     0  1.6G   0% /run/user/1000
    172.30.15.95:/orapm01_u01   48T   47T  844G  99% /u01
    172.30.15.95:/orapm01_u02  285T  285T  844G 100% /u02
    172.30.15.95:/orapm01_u03  190T  190T  844G 100% /u03
  14. Change to the oracle user and relink the binary.

    [root@ip-172-30-15-5 ec2-user]# su - oracle
    Last login: Thu Sep 12 18:09:03 UTC 2024 on pts/0
    [oracle@ip-172-30-15-5 ~]$ env | grep ORA
    ORACLE_SID=NTAP
    ORACLE_HOME=/u01/app/oracle/product/19.0.0/NTAP
    [oracle@ip-172-30-15-5 ~]$ cd $ORACLE_HOME/bin
    [oracle@ip-172-30-15-5 bin]$ ./relink
    writing relink log to: /u01/app/oracle/product/19.0.0/NTAP/install/relinkActions2024-09-12_06-21-40PM.log
  15. Copy the dNFS library back to the odm folder. Relinking can lose the dNFS library file.

    [oracle@ip-172-30-15-5 odm]$ cd /u01/app/oracle/product/19.0.0/NTAP/rdbms/lib/odm
    [oracle@ip-172-30-15-5 odm]$ cp ../../../lib/libnfsodm19.so .
  16. Start up the database on the standby node ip-172-30-15-5 to validate.

    [oracle@ip-172-30-15-5 odm]$ sqlplus / as sysdba
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Thu Sep 12 18:30:04 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Connected to an idle instance.
    
    SQL> startup;
    ORACLE instance started.
    
    Total System Global Area 6442449688 bytes
    Fixed Size                  9177880 bytes
    Variable Size            1090519040 bytes
    Database Buffers         5335154688 bytes
    Redo Buffers                7598080 bytes
    Database mounted.
    Database opened.
    SQL> select name, open_mode from v$database;
    
    NAME      OPEN_MODE
    --------- --------------------
    NTAP      READ WRITE
    
    SQL> show pdbs
    
        CON_ID CON_NAME                       OPEN MODE  RESTRICTED
    ---------- ------------------------------ ---------- ----------
             2 PDB$SEED                       READ ONLY  NO
             3 NTAP_PDB1                      READ WRITE NO
             4 NTAP_PDB2                      READ WRITE NO
             5 NTAP_PDB3                      READ WRITE NO
  17. Shut down the database and fail the database back to the primary node ip-172-30-15-111.

    SQL> shutdown immediate;
    Database closed.
    Database dismounted.
    ORACLE instance shut down.
    SQL> exit
    
    [root@ip-172-30-15-5 ec2-user]# df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                   7.7G     0  7.7G   0% /dev
    tmpfs                      7.7G   33M  7.7G   1% /dev/shm
    tmpfs                      7.7G   17M  7.7G   1% /run
    tmpfs                      7.7G     0  7.7G   0% /sys/fs/cgroup
    /dev/xvda2                  50G   21G   30G  41% /
    tmpfs                      1.6G     0  1.6G   0% /run/user/1000
    172.30.15.95:/orapm01_u01   48T   47T  844G  99% /u01
    172.30.15.95:/orapm01_u02  285T  285T  844G 100% /u02
    172.30.15.95:/orapm01_u03  190T  190T  844G 100% /u03
    
    [root@ip-172-30-15-5 ec2-user]# umount /u01
    [root@ip-172-30-15-5 ec2-user]# umount /u02
    [root@ip-172-30-15-5 ec2-user]# umount /u03
    
    [root@ip-172-30-15-111 ec2-user]# mount -t nfs 172.30.15.95:/orapm01_u01 /u01 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount: (hint) your fstab has been modified, but systemd still uses
           the old version; use 'systemctl daemon-reload' to reload.
    [root@ip-172-30-15-111 ec2-user]# mount -t nfs 172.30.15.95:/orapm01_u02 /u02 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount: (hint) your fstab has been modified, but systemd still uses
           the old version; use 'systemctl daemon-reload' to reload.
    [root@ip-172-30-15-111 ec2-user]# mount -t nfs 172.30.15.95:/orapm01_u03 /u03 -o rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536
    mount: (hint) your fstab has been modified, but systemd still uses
           the old version; use 'systemctl daemon-reload' to reload.
    [root@ip-172-30-15-111 ec2-user]# df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                   7.7G     0  7.7G   0% /dev
    tmpfs                      7.8G   48M  7.7G   1% /dev/shm
    tmpfs                      7.8G   33M  7.7G   1% /run
    tmpfs                      7.8G     0  7.8G   0% /sys/fs/cgroup
    /dev/xvda2                  50G   29G   22G  58% /
    tmpfs                      1.6G     0  1.6G   0% /run/user/1000
    172.30.15.95:/orapm01_u01   48T   47T  844G  99% /u01
    172.30.15.95:/orapm01_u02  285T  285T  844G 100% /u02
    172.30.15.95:/orapm01_u03  190T  190T  844G 100% /u03
    [root@ip-172-30-15-111 ec2-user]# su - oracle
    Last login: Thu Sep 12 18:13:34 UTC 2024 on pts/1
    [oracle@ip-172-30-15-111 ~]$ sqlplus / as sysdba
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Thu Sep 12 18:38:46 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Connected to an idle instance.
    
    SQL> startup;
    ORACLE instance started.
    
    Total System Global Area 6442449688 bytes
    Fixed Size                  9177880 bytes
    Variable Size            1090519040 bytes
    Database Buffers         5335154688 bytes
    Redo Buffers                7598080 bytes
    Database mounted.
    Database opened.
    SQL> exit
    Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    [oracle@ip-172-30-15-111 ~]$ lsnrctl start listener.ntap
    
    LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 12-SEP-2024 18:39:17
    
    Copyright (c) 1991, 2022, Oracle.  All rights reserved.
    
    Starting /u01/app/oracle/product/19.0.0/NTAP/bin/tnslsnr: please wait...
    
    TNSLSNR for Linux: Version 19.0.0.0.0 - Production
    System parameter file is /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    Log messages written to /u01/app/oracle/diag/tnslsnr/ip-172-30-15-111/listener.ntap/alert/log.xml
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=ip-172-30-15-111.ec2.internal)(PORT=1521)))
    Listening on: (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=ip-172-30-15-111.ec2.internal)(PORT=1521)))
    STATUS of the LISTENER
    ------------------------
    Alias                     listener.ntap
    Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
    Start Date                12-SEP-2024 18:39:17
    Uptime                    0 days 0 hr. 0 min. 0 sec
    Trace Level               off
    Security                  ON: Local OS Authentication
    SNMP                      OFF
    Listener Parameter File   /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    Listener Log File         /u01/app/oracle/diag/tnslsnr/ip-172-30-15-111/listener.ntap/alert/log.xml
    Listening Endpoints Summary...
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=ip-172-30-15-111.ec2.internal)(PORT=1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
    The listener supports no services
    The command completed successfully

Configure Oracle resources for PCS management


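The goal of configuring Pacemaker clustering is to set up an active/passive high availability solution for running Oracle in an AWS EC2 and FSx ONTAP environment, with minimal user intervention in the event of a failure. The following steps demonstrate Oracle resource configuration for PCS management.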
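
Before the floating IP is created in the steps that follow, it can help to confirm that the candidate address sits inside the intended subnet. The sketch below is a minimal POSIX shell check and is not part of the validated procedure. The helper names `ip_to_int` and `ip_in_cidr` are hypothetical, and the `172.30.15.0/25` block is an assumption inferred from the `cidr_netmask=25` setting and the node addresses used in this lab. It does not tell you whether the address is already in use.

```shell
#!/bin/sh
# Hypothetical helper: convert a dotted-quad IPv4 address to an integer.
# The body runs in a subshell so the IFS change does not leak out.
ip_to_int() (
  IFS=.
  set -- $1
  echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
)

# Hypothetical helper: succeed (exit 0) when the address in $1 lies
# inside the CIDR block in $2, for example 172.30.15.0/25.
ip_in_cidr() {
  ip_int=$(ip_to_int "$1")
  net_int=$(ip_to_int "${2%/*}")
  prefix=${2#*/}
  # Build the netmask from the prefix length (4294967295 is 0xFFFFFFFF).
  mask=$(( (4294967295 << (32 - $prefix)) & 4294967295 ))
  [ $(( $ip_int & $mask )) -eq $(( $net_int & $mask )) ]
}

# Assumed subnet for this lab; replace with the real subnet CIDR.
if ip_in_cidr 172.30.15.33 172.30.15.0/25; then
  echo "172.30.15.33 is inside 172.30.15.0/25"
fi
```

In this lab the check passes for 172.30.15.33, the address used for both the `privip` and `vip` resources.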

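  1. As the root user on the primary EC2 instance ip-172-30-15-111, create a secondary private IP address to serve as the floating IP, using an unused private IP address in the VPC CIDR block. In the process, create the oracle resource group to which the secondary private IP address belongs.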

    pcs resource create privip ocf:heartbeat:awsvip secondary_private_ip=172.30.15.33 --group oracle
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 16:25:35 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 16:25:23 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 2 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-5.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    Note: If privip happens to be created on the standby cluster node, move it to the primary node as shown below.
  2. Move the resource between cluster nodes.

    pcs resource move privip ip-172-30-15-111.ec2.internal
    [root@ip-172-30-15-111 ec2-user]# pcs resource move privip ip-172-30-15-111.ec2.internal
    Warning: A move constraint has been created and the resource 'privip' may or may not move depending on other configuration
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    
    WARNINGS:
    Following resources have been moved and their move constraints are still in place: 'privip'
    Run 'pcs constraint location' or 'pcs resource clear <resource id>' to view or remove the constraints, respectively
    
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 16:26:38 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 16:26:27 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 2 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal (Monitoring)
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
  3. Create a virtual IP (vip) for Oracle. The virtual IP floats between the primary and standby nodes as needed.

    pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.30.15.33 cidr_netmask=25 nic=eth0 op monitor interval=10s --group oracle
    [root@ip-172-30-15-111 ec2-user]# pcs resource create vip ocf:heartbeat:IPaddr2 ip=172.30.15.33 cidr_netmask=25 nic=eth0 op monitor interval=10s --group oracle
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    
    WARNINGS:
    Following resources have been moved and their move constraints are still in place: 'privip'
    Run 'pcs constraint location' or 'pcs resource clear <resource id>' to view or remove the constraints, respectively
    
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 16:27:34 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 16:27:24 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 3 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
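  4. As the oracle user, update the listener.ora and tnsnames.ora files to point to the vip address. Restart the listener. If needed, bounce the database so that the DB registers with the listener.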

    vi $ORACLE_HOME/network/admin/listener.ora
    vi $ORACLE_HOME/network/admin/tnsnames.ora
    [oracle@ip-172-30-15-111 admin]$ cat listener.ora
    # listener.ora Network Configuration File: /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    # Generated by Oracle configuration tools.
    
    LISTENER.NTAP =
      (DESCRIPTION_LIST =
        (DESCRIPTION =
          (ADDRESS = (PROTOCOL = TCP)(HOST = 172.30.15.33)(PORT = 1521))
          (ADDRESS = (PROTOCOL = IPC)(KEY = EXTPROC1521))
        )
      )
    
    [oracle@ip-172-30-15-111 admin]$ cat tnsnames.ora
    # tnsnames.ora Network Configuration File: /u01/app/oracle/product/19.0.0/NTAP/network/admin/tnsnames.ora
    # Generated by Oracle configuration tools.
    
    NTAP =
      (DESCRIPTION =
        (ADDRESS = (PROTOCOL = TCP)(HOST = 172.30.15.33)(PORT = 1521))
        (CONNECT_DATA =
          (SERVER = DEDICATED)
          (SERVICE_NAME = NTAP.ec2.internal)
        )
      )
    
    LISTENER_NTAP =
      (ADDRESS = (PROTOCOL = TCP)(HOST = 172.30.15.33)(PORT = 1521))
    
    
    [oracle@ip-172-30-15-111 admin]$ lsnrctl status listener.ntap
    
    LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 13-SEP-2024 18:28:17
    
    Copyright (c) 1991, 2022, Oracle.  All rights reserved.
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.30.15.33)(PORT=1521)))
    STATUS of the LISTENER
    ------------------------
    Alias                     listener.ntap
    Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
    Start Date                13-SEP-2024 18:15:51
    Uptime                    0 days 0 hr. 12 min. 25 sec
    Trace Level               off
    Security                  ON: Local OS Authentication
    SNMP                      OFF
    Listener Parameter File   /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    Listener Log File         /u01/app/oracle/diag/tnslsnr/ip-172-30-15-111/listener.ntap/alert/log.xml
    Listening Endpoints Summary...
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.30.15.33)(PORT=1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=ip-172-30-15-111.ec2.internal)(PORT=5500))(Security=(my_wallet_directory=/u01/app/oracle/product/19.0.0/NTAP/admin/NTAP/xdb_wallet))(Presentation=HTTP)(Session=RAW))
    Services Summary...
    Service "21f0b5cc1fa290e2e0636f0f1eacfd43.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "21f0b74445329119e0636f0f1eacec03.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "21f0b83929709164e0636f0f1eacacc3.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "NTAP.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "NTAPXDB.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb1.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb2.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb3.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    The command completed successfully
    
    **Oracle listener now listens on vip for database connection**
  5. Add the /u01, /u02, and /u03 mount points to the oracle resource group.

    pcs resource create u01 ocf:heartbeat:Filesystem device='172.30.15.95:/orapm01_u01' directory='/u01' fstype='nfs' options='rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536' --group oracle
    pcs resource create u02 ocf:heartbeat:Filesystem device='172.30.15.95:/orapm01_u02' directory='/u02' fstype='nfs' options='rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536' --group oracle
    pcs resource create u03 ocf:heartbeat:Filesystem device='172.30.15.95:/orapm01_u03' directory='/u03' fstype='nfs' options='rw,bg,hard,vers=3,proto=tcp,timeo=600,rsize=65536,wsize=65536' --group oracle
  6. Create a PCS monitoring user ID in the Oracle DB.

    [root@ip-172-30-15-111 ec2-user]# su - oracle
    Last login: Fri Sep 13 18:12:24 UTC 2024 on pts/0
    [oracle@ip-172-30-15-111 ~]$ sqlplus / as sysdba
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 19:08:41 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> CREATE USER c##ocfmon IDENTIFIED BY "XXXXXXXX";
    
    User created.
    
    SQL> grant connect to c##ocfmon;
    
    Grant succeeded.
    
    SQL> exit
    Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
  7. Add the database to the oracle resource group.

    pcs resource create ntap ocf:heartbeat:oracle sid='NTAP' home='/u01/app/oracle/product/19.0.0/NTAP' user='oracle' monuser='C##OCFMON' monpassword='XXXXXXXX' monprofile='DEFAULT' --group oracle
  8. Add the database listener to the oracle resource group.

    pcs resource create listener ocf:heartbeat:oralsnr sid='NTAP' listener='listener.ntap' --group=oracle
  9. Update the location constraints of all resources in the oracle resource group so that the primary node is the preferred node.

    pcs constraint location privip prefers ip-172-30-15-111.ec2.internal
    pcs constraint location vip prefers ip-172-30-15-111.ec2.internal
    pcs constraint location u01 prefers ip-172-30-15-111.ec2.internal
    pcs constraint location u02 prefers ip-172-30-15-111.ec2.internal
    pcs constraint location u03 prefers ip-172-30-15-111.ec2.internal
    pcs constraint location ntap prefers ip-172-30-15-111.ec2.internal
    pcs constraint location listener prefers ip-172-30-15-111.ec2.internal
    [root@ip-172-30-15-111 ec2-user]# pcs constraint config
    Location Constraints:
      Resource: listener
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: ntap
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: privip
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: u01
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: u02
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: u03
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
      Resource: vip
        Enabled on:
          Node: ip-172-30-15-111.ec2.internal (score:INFINITY)
    Ordering Constraints:
    Colocation Constraints:
    Ticket Constraints:
  10. Validate the Oracle resource configuration.

    pcs status
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 19:25:32 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 19:23:40 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u03       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * ntap      (ocf::heartbeat:oracle):         Started ip-172-30-15-111.ec2.internal
        * listener  (ocf::heartbeat:oralsnr):        Started ip-172-30-15-111.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
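
The seven location constraints in step 9 follow one pattern, so they can be generated in a loop. The following is a minimal sketch rather than the documented procedure. `print_location_constraints` is a hypothetical helper that only prints the `pcs constraint location ... prefers ...` commands for review and does not run them. The node and resource names are the ones used in this lab.

```shell
#!/bin/sh
# Hypothetical helper: print (do not run) one location constraint
# command per resource. $1 is the preferred node; the remaining
# arguments are resource names.
print_location_constraints() {
  node=$1
  shift
  for res in "$@"; do
    echo "pcs constraint location $res prefers $node"
  done
}

# Resource names and primary node as used in this lab.
print_location_constraints ip-172-30-15-111.ec2.internal \
  privip vip u01 u02 u03 ntap listener
```

Once the printed commands look right, they could be piped to `sh` as root on a cluster node. Printing first keeps the change reviewable.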

Post-deployment HA validation


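After deployment, it is essential to run tests and validations to confirm that the PCS Oracle database failover cluster is configured correctly and functions as expected. The validation covers managed failover as well as simulated unexpected resource failures and their recovery by the cluster protection mechanism.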
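
Several of the checks that follow involve waiting for Pacemaker to bring a resource back after a simulated failure. A small polling helper can make that wait explicit. This is a sketch under assumptions: `wait_until` is a hypothetical function, and the 60-second timeout and 5-second interval in the usage comment are illustrative values, not figures measured in this validation.

```shell
#!/bin/sh
# Hypothetical helper: wait_until TIMEOUT INTERVAL COMMAND...
# Reruns COMMAND every INTERVAL seconds until it succeeds.
# Returns 0 on success, or 1 once the accumulated wait would
# exceed TIMEOUT seconds.
wait_until() {
  timeout=$1
  interval=$2
  shift 2
  elapsed=0
  while ! "$@" >/dev/null 2>&1; do
    elapsed=$(( $elapsed + $interval ))
    if [ "$elapsed" -gt "$timeout" ]; then
      return 1
    fi
    sleep "$interval"
  done
  return 0
}

# Example usage (illustrative values): wait up to 60 seconds for the
# listener process to reappear after it has been killed.
# wait_until 60 5 sh -c 'ps -ef | grep -q "[t]nslsnr listener.ntap"'
```

The `[t]` in the grep pattern keeps the check from matching its own command line, so it only succeeds when a real `tnslsnr` process is present.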

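  1. Validate node fencing by manually triggering fencing of the standby node, and observe that the standby node is taken offline and rebooted after a timeout.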

    pcs stonith fence <standbynodename>
    [root@ip-172-30-15-111 ec2-user]# pcs stonith fence ip-172-30-15-5.ec2.internal
    Node: ip-172-30-15-5.ec2.internal fenced
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 21:58:45 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 21:55:12 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-111.ec2.internal ]
      * OFFLINE: [ ip-172-30-15-5.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-111.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u03       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * ntap      (ocf::heartbeat:oracle):         Started ip-172-30-15-111.ec2.internal
        * listener  (ocf::heartbeat:oralsnr):        Started ip-172-30-15-111.ec2.internal
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
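  2. Simulate a database listener failure by killing the listener process, and observe that PCS detects the listener failure and restarts it within seconds.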

    [root@ip-172-30-15-111 ec2-user]# ps -ef | grep lsnr
    oracle    154895       1  0 18:15 ?        00:00:00 /u01/app/oracle/product/19.0.0/NTAP/bin/tnslsnr listener.ntap -inherit
    root      217779  120186  0 19:36 pts/0    00:00:00 grep --color=auto lsnr
    [root@ip-172-30-15-111 ec2-user]# kill -9 154895
    
    [root@ip-172-30-15-111 ec2-user]# su - oracle
    Last login: Thu Sep 19 14:58:54 UTC 2024
    [oracle@ip-172-30-15-111 ~]$ lsnrctl status listener.ntap
    
    LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 13-SEP-2024 19:36:51
    
    Copyright (c) 1991, 2022, Oracle.  All rights reserved.
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.30.15.33)(PORT=1521)))
    TNS-12541: TNS:no listener
     TNS-12560: TNS:protocol adapter error
      TNS-00511: No listener
       Linux Error: 111: Connection refused
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=IPC)(KEY=EXTPROC1521)))
    TNS-12541: TNS:no listener
     TNS-12560: TNS:protocol adapter error
      TNS-00511: No listener
       Linux Error: 111: Connection refused
    
    [oracle@ip-172-30-15-111 ~]$ lsnrctl status listener.ntap
    
    LSNRCTL for Linux: Version 19.0.0.0.0 - Production on 19-SEP-2024 15:00:10
    
    Copyright (c) 1991, 2022, Oracle.  All rights reserved.
    
    Connecting to (DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=172.30.15.33)(PORT=1521)))
    STATUS of the LISTENER
    ------------------------
    Alias                     listener.ntap
    Version                   TNSLSNR for Linux: Version 19.0.0.0.0 - Production
    Start Date                16-SEP-2024 14:00:14
    Uptime                    3 days 0 hr. 59 min. 56 sec
    Trace Level               off
    Security                  ON: Local OS Authentication
    SNMP                      OFF
    Listener Parameter File   /u01/app/oracle/product/19.0.0/NTAP/network/admin/listener.ora
    Listener Log File         /u01/app/oracle/diag/tnslsnr/ip-172-30-15-111/listener.ntap/alert/log.xml
    Listening Endpoints Summary...
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcp)(HOST=172.30.15.33)(PORT=1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=ipc)(KEY=EXTPROC1521)))
      (DESCRIPTION=(ADDRESS=(PROTOCOL=tcps)(HOST=ip-172-30-15-111.ec2.internal)(PORT=5500))(Security=(my_wallet_directory=/u01/app/oracle/product/19.0.0/NTAP/admin/NTAP/xdb_wallet))(Presentation=HTTP)(Session=RAW))
    Services Summary...
    Service "21f0b5cc1fa290e2e0636f0f1eacfd43.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "21f0b74445329119e0636f0f1eacec03.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "21f0b83929709164e0636f0f1eacacc3.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "NTAP.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "NTAPXDB.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb1.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb2.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    Service "ntap_pdb3.ec2.internal" has 1 instance(s).
      Instance "NTAP", status READY, has 1 handler(s) for this service...
    The command completed successfully
  3. Simulate a database failure by killing the pmon process, and observe that PCS detects the database failure and restarts it within seconds.

    **Make a remote connection to ntap database**
    
    [oracle@ora_01 ~]$ sqlplus system@//172.30.15.33:1521/NTAP.ec2.internal
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 15:42:42 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Enter password:
    Last Successful login time: Thu Sep 12 2024 13:37:28 -04:00
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> select instance_name, host_name from v$instance;
    
    INSTANCE_NAME
    ----------------
    HOST_NAME
    ----------------------------------------------------------------
    NTAP
    ip-172-30-15-111.ec2.internal
    
    
    SQL>
    
    **Kill ntap pmon process to simulate a failure**
    
    [root@ip-172-30-15-111 ec2-user]# ps -ef | grep pmon
    oracle    159247       1  0 18:27 ?        00:00:00 ora_pmon_NTAP
    root      230595  120186  0 19:44 pts/0    00:00:00 grep --color=auto pmon
    [root@ip-172-30-15-111 ec2-user]# kill -9 159247
    
    **Observe the DB failure**
    
    SQL> /
    select instance_name, host_name from v$instance
    *
    ERROR at line 1:
    ORA-03113: end-of-file on communication channel
    Process ID: 227424
    Session ID: 396 Serial number: 4913
    
    
    SQL> exit
    Disconnected from Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    **Reconnect to DB after PCS restarts the database**
    
    [oracle@ora_01 ~]$ sqlplus system@//172.30.15.33:1521/NTAP.ec2.internal
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 15:47:24 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Enter password:
    Last Successful login time: Fri Sep 13 2024 15:42:47 -04:00
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> select instance_name, host_name from v$instance;
    
    INSTANCE_NAME
    ----------------
    HOST_NAME
    ----------------------------------------------------------------
    NTAP
    ip-172-30-15-111.ec2.internal
    
    
    SQL>
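  4. Validate managed database failover from the primary to the standby node by putting the primary node in standby mode, which fails the Oracle resources over to the standby node.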

    pcs node standby <nodename>
    **Stopping Oracle resources on primary node in reverse order**
    
    [root@ip-172-30-15-111 ec2-user]# pcs node standby ip-172-30-15-111.ec2.internal
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 20:01:16 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 20:01:08 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Node ip-172-30-15-111.ec2.internal: standby (with active resources)
      * Online: [ ip-172-30-15-5.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-5.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Stopping ip-172-30-15-111.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Stopped
        * u03       (ocf::heartbeat:Filesystem):     Stopped
        * ntap      (ocf::heartbeat:oracle):         Stopped
        * listener  (ocf::heartbeat:oralsnr):        Stopped
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    
    **Starting Oracle resources on standby node in sequential order**
    
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 20:01:34 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 20:01:08 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Node ip-172-30-15-111.ec2.internal: standby
      * Online: [ ip-172-30-15-5.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-5.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-5.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-5.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-5.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-5.ec2.internal
        * u03       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-5.ec2.internal
        * ntap      (ocf::heartbeat:oracle):         Starting ip-172-30-15-5.ec2.internal
        * listener  (ocf::heartbeat:oralsnr):        Stopped
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    
    **NFS mount points mounted on standby node**
    
    [root@ip-172-30-15-5 ec2-user]# df -h
    Filesystem                 Size  Used Avail Use% Mounted on
    devtmpfs                   7.7G     0  7.7G   0% /dev
    tmpfs                      7.7G   33M  7.7G   1% /dev/shm
    tmpfs                      7.7G   17M  7.7G   1% /run
    tmpfs                      7.7G     0  7.7G   0% /sys/fs/cgroup
    /dev/xvda2                  50G   21G   30G  41% /
    tmpfs                      1.6G     0  1.6G   0% /run/user/1000
    172.30.15.95:/orapm01_u01   48T   47T  840G  99% /u01
    172.30.15.95:/orapm01_u02  285T  285T  840G 100% /u02
    172.30.15.95:/orapm01_u03  190T  190T  840G 100% /u03
    tmpfs                      1.6G     0  1.6G   0% /run/user/54321
    
    **Database opened on standby node**
    
    [oracle@ora_01 ~]$ sqlplus system@//172.30.15.33:1521/NTAP.ec2.internal
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 16:34:08 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Enter password:
    Last Successful login time: Fri Sep 13 2024 15:47:28 -04:00
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> select name, open_mode from v$database;
    
    NAME      OPEN_MODE
    --------- --------------------
    NTAP      READ WRITE
    
    SQL> select instance_name, host_name from v$instance;
    
    INSTANCE_NAME
    ----------------
    HOST_NAME
    ----------------------------------------------------------------
    NTAP
    ip-172-30-15-5.ec2.internal
    
    
    SQL>
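  5. Validate managed database failback from the standby to the primary node by taking the primary node out of standby mode, and observe that the Oracle resources fail back automatically because of the preferred-node setting.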

    pcs node unstandby <nodename>
    **Stopping Oracle resources on standby node for failback to primary**
    
    [root@ip-172-30-15-111 ec2-user]# pcs node unstandby ip-172-30-15-111.ec2.internal
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 20:41:30 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 20:41:18 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-5.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Stopping ip-172-30-15-5.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Stopped
        * u01       (ocf::heartbeat:Filesystem):     Stopped
        * u02       (ocf::heartbeat:Filesystem):     Stopped
        * u03       (ocf::heartbeat:Filesystem):     Stopped
        * ntap      (ocf::heartbeat:oracle):         Stopped
        * listener  (ocf::heartbeat:oralsnr):        Stopped
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    
    **Starting Oracle resources on primary node for failback**
    
    [root@ip-172-30-15-111 ec2-user]# pcs status
    Cluster name: ora_ec2nfsx
    Cluster Summary:
      * Stack: corosync (Pacemaker is running)
      * Current DC: ip-172-30-15-111.ec2.internal (version 2.1.7-5.1.el8_10-0f7f88312) - partition with quorum
      * Last updated: Fri Sep 13 20:41:45 2024 on ip-172-30-15-111.ec2.internal
      * Last change:  Fri Sep 13 20:41:18 2024 by root via root on ip-172-30-15-111.ec2.internal
      * 2 nodes configured
      * 8 resource instances configured
    
    Node List:
      * Online: [ ip-172-30-15-5.ec2.internal ip-172-30-15-111.ec2.internal ]
    
    Full List of Resources:
      * clusterfence        (stonith:fence_aws):     Started ip-172-30-15-5.ec2.internal
      * Resource Group: oracle:
        * privip    (ocf::heartbeat:awsvip):         Started ip-172-30-15-111.ec2.internal
        * vip       (ocf::heartbeat:IPaddr2):        Started ip-172-30-15-111.ec2.internal
        * u01       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u02       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * u03       (ocf::heartbeat:Filesystem):     Started ip-172-30-15-111.ec2.internal
        * ntap      (ocf::heartbeat:oracle):         Starting ip-172-30-15-111.ec2.internal
        * listener  (ocf::heartbeat:oralsnr):        Stopped
    
    Daemon Status:
      corosync: active/enabled
      pacemaker: active/enabled
      pcsd: active/enabled
    
    **Database now accepts connection on primary node**
    
    [oracle@ora_01 ~]$ sqlplus system@//172.30.15.33:1521/NTAP.ec2.internal
    
    SQL*Plus: Release 19.0.0.0.0 - Production on Fri Sep 13 16:46:07 2024
    Version 19.18.0.0.0
    
    Copyright (c) 1982, 2022, Oracle.  All rights reserved.
    
    Enter password:
    Last Successful login time: Fri Sep 13 2024 16:34:12 -04:00
    
    Connected to:
    Oracle Database 19c Enterprise Edition Release 19.0.0.0.0 - Production
    Version 19.18.0.0.0
    
    SQL> select instance_name, host_name from v$instance;
    
    INSTANCE_NAME
    ----------------
    HOST_NAME
    ----------------------------------------------------------------
    NTAP
    ip-172-30-15-111.ec2.internal
    
    
    SQL>
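
After a failover or failback, the `pcs status` listing can be checked mechanically instead of by eye. The sketch below is a hypothetical helper, `check_resources_on_node`, that reads `pcs status` text on standard input and reports any `ocf` resource that is not `Started` on the expected node. It assumes the resource line layout shown in the outputs above (name, agent, state, node) and may need adjusting for other pcs versions.

```shell
#!/bin/sh
# Hypothetical helper: read `pcs status` output on stdin and print each
# ocf resource that is not "Started" on the node given in $1.
# Exits 0 when every ocf resource is started there, 1 otherwise.
check_resources_on_node() {
  awk -v node="$1" '
    /\(ocf:/ {
      # Fields: "*", resource name, "(agent):", state, optional node
      if ($4 != "Started" || $5 != node) {
        printf "%s %s%s\n", $2, $4, ($5 == "" ? "" : " on " $5)
        bad = 1
      }
    }
    END { exit (bad ? 1 : 0) }
  '
}

# Example usage on a cluster node (not run here):
# pcs status | check_resources_on_node ip-172-30-15-111.ec2.internal \
#   && echo "all Oracle resources are started on the primary node"
```

The fencing resource is listed as `stonith:` rather than `ocf:`, so this check deliberately ignores where `clusterfence` is running.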

This completes the Oracle HA validation and solution demonstration in AWS EC2 with Pacemaker clustering and Amazon FSx ONTAP as the database storage backend.

Oracle backup, restore, and clone with SnapCenter


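NetApp recommends the SnapCenter UI tool for managing Oracle databases deployed in AWS EC2 and Amazon FSx ONTAP. Refer to TR-4979 "Simplified, Self-managed Oracle in VMware Cloud on AWS with guest-mounted FSx ONTAP", section `Oracle backup, restore, and clone with SnapCenter`, for details on setting up SnapCenter and running the database backup, restore, and clone workflows.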

Where to find additional information

To learn more about the information described in this document, review the following documents and/or websites: