DEV Community

Cong Li
Cong Li

Posted on

Introduction to GBase 8a MPP Cluster Data Migration

GBase 8a MPP Cluster provides multiple data migration tools that are easy to use, catering to different scenario needs. Today, we will introduce a few methods for data migration.

1. Database Object Structure Export

Overview

The gcdump tool allows you to export the structure of database objects, including:

  • Table structures
  • Stored procedures
  • Custom functions

Description

The gcdump tool is located in the $GCLUSTER_HOME/bin directory. The parameter gbase_show_ident_case_sensitive controls whether column names are case-sensitive. By default, the case matches the source table's structure.

Syntax:

gcdump [OPTIONS] database [tables]
gcdump [OPTIONS] --databases [OPTIONS] DB1 [DB2 DB3...]
gcdump [OPTIONS] --all-databases [OPTIONS]
Enter fullscreen mode Exit fullscreen mode

Parameter Explanation

Parameter Name Description
-u Cluster login username
-p Cluster login password
-R Export stored procedures and functions
-B Export multiple databases
-W Specify VC name

Example:

$ $GCLUSTER_BASE/server/bin/gcdump -uroot -p****** -B -R ssbm > /home/gbase/ssbm.sql
Enter fullscreen mode Exit fullscreen mode

2. Data Loading

Overview

GBase 8a MPP Cluster provides a SQL interface for loading data, supporting:

  • Local file loading
  • Data fetching from a general data server
  • Multiple protocols such as FTP, HTTP, HDFS, SFTP
  • Parallel loading to maximize performance
  • Loading of plain text, gzip, snappy, and lzo compressed files
  • Real-time loading status and information query
  • Error tracing for identifying problematic data in the source file
  • Scalable loading performance as the cluster expands

Syntax:

LOAD DATA INFILE 'file_list' INTO TABLE [vcname.] [dbname.]tbl_name [options]
Enter fullscreen mode Exit fullscreen mode

Explanation of file_list

  • Cluster Local Data Source Loading:

    1. Supports loading from local files on multiple data nodes. Use the format file://host+abs_path, separating multiple files with commas.
    2. Supports concurrent loading from files on each node. Use file://+abs_path, separating multiple files with commas.
  • Loading from General File Servers:

    1. Set up ftp/http/hdfs/sftp services on the file server and place the data files in the configured path.
    2. Use a URL to specify the file path on the server, using commas to separate multiple files or directories.

Examples

LOAD DATA INFILE 'ftp://gbase:gbase@127.0.0.1/data/a.tbl' INTO TABLE test.t DATA_FORMAT 3;
LOAD DATA INFILE 'http://127.0.0.1/data/b.tbl.gz' INTO TABLE test.t DATA_FORMAT 3;
LOAD DATA INFILE 'hdp://gbase@127.0.0.1:50070/data/a.tbl.snappy' INTO TABLE test.t DATA_FORMAT 3;
LOAD DATA INFILE 'ftp://192.168.0.1/pub/lineitem.tbl,http://192.168.0.2/lineitem.tbl' INTO TABLE test.lineitem FIELDS TERMINATED BY '|' ENCLOSED BY '"' LINES TERMINATED BY '\n';
Enter fullscreen mode Exit fullscreen mode

Monitoring Load Status

gbase> use information_schema;
gbase> select * from load_status;
gbase> show load logs 100 limit 1,5;
gbase> show gcluster load logs 101;
gbase> select * from information_schema.cluster_load_result;
Enter fullscreen mode Exit fullscreen mode

3. Data Export

Overview

GBase 8a MPP Cluster provides data export functionality through the SQL syntax SELECT ... INTO OUTFILE .... It supports:

  • Exporting data to the cluster's server, an FTP/SFTP server, or a Hadoop cluster, in text or compressed formats (gzip/snappy/lzo)
  • Remote export of data from the cluster to the client's machine in text format

Syntax

SELECT ... INTO OUTFILE 'file_name' [OPTION] FROM ...;
SELECT... FROM... INTO OUTFILE 'file_name' [OPTION];
rmt:SELECT... FROM... INTO OUTFILE 'file_path' [OUTFILE_OPTION];
Enter fullscreen mode Exit fullscreen mode

Example

gbase> select * from aa into outfile '/home/davies/out.txt' fields escaped by '' terminated by '|' double_enclosed by '"' null_value 'null';
Enter fullscreen mode Exit fullscreen mode

4. Data Migration

4.1 orato8a Data Extraction Tool

Overview

orato8a is a dedicated tool for quickly and efficiently extracting data from Oracle databases, saving the data to files or directly migrating it to GBase 8a MPP Cluster. It supports both query-based export and full table export.

Deployment

orato8a is a standalone tool that must be deployed on a machine that can access Oracle, such as a machine with an Oracle client or the Oracle server itself.

Syntax

./orato8a parameter_1 parameter_2 ... parameter_n
Enter fullscreen mode Exit fullscreen mode

Example

$ ./orato8a --user='ct1/ct1ct1@orcl' --query="select LO_ORDERKEY, LO_LINENUMBER FROM lineorder_test" --file='/opt/orato8a_output/lineorder.txt ' --field=";" --format=3
Enter fullscreen mode Exit fullscreen mode

4.2 db2to8a Data Extraction Tool

Overview

db2to8a is a dedicated tool for quickly and efficiently extracting data from DB2 databases and saving it to a specified file.

Deployment

db2to8a must be deployed on a machine that can access DB2, such as a DB2 client or the DB2 server itself.

Syntax

./db2to8a parameter_1 parameter_2 ... parameter_n
Enter fullscreen mode Exit fullscreen mode

Example

$ ./db2to8a -D'test' -u'db2inst1' -p'db2inst1' -q"select * from t" -f'data1.txt' -m'3' -e'|' -l'\n' -s'h'
Enter fullscreen mode Exit fullscreen mode

That's all for today. Thank you for reading!

Top comments (0)