Deployment steps
This section covers deployment steps for NetApp XCP for data transfer.
Test bed details
The following table provides the details of the test bed that was used for this deployment and performance validation.
Solution components | Details |
---|---|
XCP version 1.7 |
|
NetApp AFF storage array HA pair for the source volume |
|
NetApp AFF storage array HA pair for destination volume |
|
Fujitsu PRIMERGY RX2540 server |
Each equipped with: |
Networking |
10GbE |
Deployment steps - NAS
To deploy NetApp XCP for data transfer, first install and activate the XCP software on the destination location. You can review the details in the NetApp XCP User Guide. To do so, complete the following steps:
-
Meet the prerequisites as detailed in the section “Prerequisites for XCP.”
-
Download the XCP software from the NetApp XCP (Downloads) page.
-
Copy the downloaded XCP tar files to the XCP server.
# scp Documents/OneDrive\ -\ NetApp\ Inc/XCP/software/1.6.1/NETAPP_XCP_1.6.1.tgz mailto:root@10.63.150.53:/usr/src
-
Untar the tarfile.
[root@mastr-53 src]# tar -zxvf NETAPP_XCP_1.6.1.tgz
-
Download the license from https://xcp.netapp.com/license/xcp.xwic and copy to the XCP server.
-
Activate the license.
[root@mastr-53 linux]# ./xcp activate [root@mastr-53 src]# cp license /opt/NetApp/xFiles/xcp/license [root@mastr-53 src]# cd /usr/src/xcp/linux/ [root@mastr-53 linux]# ./xcp activate
-
Find the source NFS port and destination NFS server. The default port is 2049.
[root@mastr-53 ~]# rpcinfo -p 10.63.150.213 [root@mastr-53 ~]# rpcinfo -p 10.63.150.63
-
Check the NFS connection. Check the NFS server (for both source and destination) by using telnet to the NFS server port.
[root@mastr-53 ~]# telnet 10.63.150.127 2049 [root@mastr-53 ~]# telnet 10.63.150.63 2049
-
Configure the catalog.
-
Create an NFS volume and export NFS for the XCP catalog. You can also leverage the operating system NFS export for XCP catalog.
A800-Node1-2::> volume create -vserver Hadoop_SVM -volume xcpcatalog -aggregate aggr_Hadoop_1 -size 50GB -state online -junction-path /xcpcatalog -policy default -unix-permissions ---rwxr-xr-x -type RW -snapshot-policy default -foreground true A800-Node1-2::> volume mount -vserver Hadoop_SVM -volume xcpcatalog_vol -junction-path /xcpcatalog
-
Check the NFS export.
[root@mastr-53 ~]# showmount -e 10.63.150.63 | grep xcpca /xcpcatalog (everyone)
-
Update
xcp.ini
.[root@mastr-53 ~]# cat /opt/NetApp/xFiles/xcp/xcp.ini # Sample xcp config [xcp] catalog = 10.63.150.64:/xcpcatalog [root@mastr-53 ~]#
-
-
Find the source NAS exports by using
xcp show
. Look for:== NFS Exports == == Attributes of NFS Exports ==
[root@mastr-53 linux]# ./xcp show 10.63.150.127 == NFS Exports == <check here> == Attributes of NFS Exports == <check here>
-
(Optional) Scan the source NAS data.
[root@mastr-53 linux]# ./xcp scan -newid xcpscantest4 -stats 10.63.150.127:/xcpsrc_vol
Scanning the source NAS data helps you understand the data layout and find any potential issues for migration. The XCP scanning operation time is proportional to the number of files and the directory depth. You can skip this step if you are familiar with your NAS data.
-
Check the report created by
xcp scan
. Search mainly for unreadable folders and unreadable files.[root@mastr-53 linux]# mount 10.63.150.64:/xcpcatalog /xcpcatalog base) nkarthik-mac-0:~ karthikeyannagalingam$ scp -r root@10.63.150.53:/xcpcatalog/catalog/indexes/xcpscantest4 Documents/OneDrive\ -\ NetApp\ Inc/XCP/customers/reports/
-
(Optional) Change the inode. View the number of inodes and modify the number based on the number of files to migrate or copy for both catalog and destination volumes (if required).
A800-Node1-2::> volume show -volume xcpcatalog -fields files,files-used A800-Node1-2::> volume show -volume xcpdest -fields files,files-used A800-Node1-2::> volume modify -volume xcpcatalog -vserver A800-Node1_vs1 -files 2000000 Volume modify successful on volume xcpcatalog of Vserver A800-Node1_vs1. A800-Node1-2::> volume show -volume xcpcatalog -fields files,files-used
-
Scan the destination volume.
[root@mastr-53 linux]# ./xcp scan -stats 10.63.150.63:/xcpdest
-
Check the source and destination volume space.
[root@mastr-53 ~]# df -h /xcpsrc_vol [root@mastr-53 ~]# df -h /xcpdest/
-
Copy the data from source to destination by using
xcp copy
and check the summary.[root@mastr-53 linux]# ./xcp copy -newid create_Sep091599198212 10.63.150.127:/xcpsrc_vol 10.63.150.63:/xcpdest <command inprogress results removed> Xcp command : xcp copy -newid create_Sep091599198212 -parallel 23 10.63.150.127:/xcpsrc_vol 10.63.150.63:/xcpdest Stats : 9.07M scanned, 9.07M copied, 118 linked, 9.07M indexed, 173 giants Speed : 1.57 TiB in (412 MiB/s), 1.50 TiB out (392 MiB/s) Total Time : 1h6m. STATUS : PASSED [root@mastr-53 linux]#
By default, XCP creates seven parallel processes to copy the data. This can be tuned. NetApp recommends that the source volume be read only. In real time, the source volume is a live, active file system. The xcp copy
operation might fail because NetApp XCP does not support a live source that is continuously changed by an application.For Linux, XCP requires an Index ID because XCP Linux performs cataloging.
-
(Optional) Check the inodes on the destination NetApp volume.
A800-Node1-2::> volume show -volume xcpdest -fields files,files-used vserver volume files files-used -------------- ------- -------- ---------- A800-Node1_vs1 xcpdest 21251126 15039685 A800-Node1-2::>
-
Perform the incremental update by using
xcp sync
.[root@mastr-53 linux]# ./xcp sync -id create_Sep091599198212 Xcp command : xcp sync -id create_Sep091599198212 Stats : 9.07M reviewed, 9.07M checked at source, no changes, 9.07M reindexed Speed : 1.73 GiB in (8.40 MiB/s), 1.98 GiB out (9.59 MiB/s) Total Time : 3m31s. STATUS : PASSED
For this document, to simulate real-time, the one million files in the source data were renamed, and then the updated files were copied to the destination by using
xcp sync
. For Windows, XCP needs both source and destination paths. -
Validate data transfer. You can validate that the source and destination have the same data by using
xcp verify
.Xcp command : xcp verify 10.63.150.127:/xcpsrc_vol 10.63.150.63:/xcpdest Stats : 9.07M scanned, 9.07M indexed, 173 giants, 100% found (6.01M have data), 6.01M compared, 100% verified (data, attrs, mods) Speed : 3.13 TiB in (509 MiB/s), 11.1 GiB out (1.76 MiB/s) Total Time : 1h47m. STATUS : PASSED
XCP documentation provides multiple options (with examples) for the scan
, copy
, sync
, and verify
operations. For more information, see the NetApp XCP User Guide.
Windows customers should copy the data by using access control lists (ACLs). NetApp recommends using the command xcp copy -acl -fallbackuser\<username> -fallbackgroup\<username or groupname> <source> <destination> . To maximum performance, considering the source volume that has SMB data with ACL and the data accessible by both NFS and SMB, the target must be an NTFS volume. Using XCP (NFS version), copy the data from the Linux server and execute the XCP (SMB version) sync with the -acl and -nodata options from the Windows server to copy the ACLs from source data to the target SMB data.
|
For detailed steps, see Configuring 'Manage Auditing and Security Log' Policy.
Deployment steps - HDFS/MapRFS data migration
In this section, we discuss the new XCP feature called Hadoop Filesystem Data Transfer to NAS, which migrates data from HDFS/MapRFS to NFS and vice versa.
Prerequisites
For the MapRFS/HDFS feature, you must perform the following procedure in a non-root user environment. Normally the non-root user is hdfs, mapr, or a user who has permission to make changes in the HDFS and MapRFS filesystem.
-
Set the CLASSPATH, HADOOP_HOME, NHDFS_LIBJVM_PATH, LB_LIBRARY_PATH, and NHDFS_LIBHDFS_PATH variables in the CLI or the .bashrc file of the user along with the
xcp
command.-
NHDFS_LIBHDFS_PATH points to the libhdfs.so file. This file provides HDFS APIs to interact and manipulate the HDFS/MapRFS files and filesystem as a part of the Hadoop distribution.
-
NHDFS_LIBJVM_PATH points to the libjvm.so file. This is a shared JAVA virtual machine library in the jre location.
-
CLASSPATH points to all jars files using (Hadoop classpath –glob) values.
-
LD_LIBRARY_PATH points to the Hadoop native library folder location.
See the following sample based on a Cloudera cluster.
export CLASSPATH=$(hadoop classpath --glob) export LD_LIBRARY_PATH=/usr/java/jdk1.8.0_181-cloudera/jre/lib/amd64/server/ export HADOOP_HOME=/opt/cloudera/parcels/CDH-6.3.4-1.cdh6.3.4.p0.6751098/ #export HADOOP_HOME=/opt/cloudera/parcels/CDH/ export NHDFS_LIBJVM_PATH=/usr/java/jdk1.8.0_181-cloudera/jre/lib/amd64/server/libjvm.so export NHDFS_LIBHDFS_PATH=$HADOOP_HOME/lib64/libhdfs.so
In this release, we support XCP scan, copy, and verify operations and data migration from HDFS to NFS. You can transfer data from a data lake cluster single worker node and multiple worker nodes. In the 1.8 release, root and non-root users can perform data migration.
-
Deployment steps - Non-root user migrates HDFS/MaprFS data to NetApp NFS
-
Follow the same steps mentioned from 1-9 steps from steps for deployment section.
-
In the following example, the user migrates data from HDFS to NFS.
-
Create a folder and files (using
hadoop fs -copyFromLocal
) in HDFS.[root@n138 ~]# su - tester -c 'hadoop fs -mkdir /tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src' [root@n138 ~]# su - tester -c 'hadoop fs -ls -d /tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src' drwxr-xr-x - tester supergroup 0 2021-11-16 16:52 /tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src [root@n138 ~]# su - tester -c "echo 'testfile hdfs' > /tmp/a_hdfs.txt" [root@n138 ~]# su - tester -c "echo 'testfile hdfs 2' > /tmp/b_hdfs.txt" [root@n138 ~]# ls -ltrah /tmp/*_hdfs.txt -rw-rw-r-- 1 tester tester 14 Nov 16 17:00 /tmp/a_hdfs.txt -rw-rw-r-- 1 tester tester 16 Nov 16 17:00 /tmp/b_hdfs.txt [root@n138 ~]# su - tester -c 'hadoop fs -copyFromLocal /tmp/*_hdfs.txt hdfs:///tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src' [root@n138 ~]#
-
Check permissions in the HDFS folder.
[root@n138 ~]# su - tester -c 'hadoop fs -ls hdfs:///tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src' Found 2 items -rw-r--r-- 3 tester supergroup 14 2021-11-16 17:01 hdfs:///tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src/a_hdfs.txt -rw-r--r-- 3 tester supergroup 16 2021-11-16 17:01 hdfs:///tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src/b_hdfs.txt
-
Create a folder in NFS and check permissions.
[root@n138 ~]# su - tester -c 'mkdir /xcpsrc_vol/mohankarthiknfs_dest' [root@n138 ~]# su - tester -c 'ls -l /xcpsrc_vol/mohankarthiknfs_dest' total 0 [root@n138 ~]# su - tester -c 'ls -d /xcpsrc_vol/mohankarthiknfs_dest' /xcpsrc_vol/mohankarthiknfs_dest [root@n138 ~]# su - tester -c 'ls -ld /xcpsrc_vol/mohankarthiknfs_dest' drwxrwxr-x 2 tester tester 4096 Nov 16 14:32 /xcpsrc_vol/mohankarthiknfs_dest [root@n138 ~]#
-
Copy the files from HDFS to NFS using XCP, and check permissions.
[root@n138 ~]# su - tester -c '/usr/src/hdfs_nightly/xcp/linux/xcp copy -chown hdfs:///tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src/ 10.63.150.126:/xcpsrc_vol/mohankarthiknfs_dest' XCP Nightly_dev; (c) 2021 NetApp, Inc.; Licensed to Karthikeyan Nagalingam [NetApp Inc] until Wed Feb 9 13:38:12 2022 xcp: WARNING: No index name has been specified, creating one with name: autoname_copy_2021-11-16_17.04.03.652673 Xcp command : xcp copy -chown hdfs:///tmp/testerfolder_src/util-linux-2.23.2/mohankarthikhdfs_src/ 10.63.150.126:/xcpsrc_vol/mohankarthiknfs_dest Stats : 3 scanned, 2 copied, 3 indexed Speed : 3.44 KiB in (650/s), 80.2 KiB out (14.8 KiB/s) Total Time : 5s. STATUS : PASSED [root@n138 ~]# su - tester -c 'ls -l /xcpsrc_vol/mohankarthiknfs_dest' total 0 -rw-r--r-- 1 tester supergroup 14 Nov 16 17:01 a_hdfs.txt -rw-r--r-- 1 tester supergroup 16 Nov 16 17:01 b_hdfs.txt [root@n138 ~]# su - tester -c 'ls -ld /xcpsrc_vol/mohankarthiknfs_dest' drwxr-xr-x 2 tester supergroup 4096 Nov 16 17:01 /xcpsrc_vol/mohankarthiknfs_dest [root@n138 ~]#
-