GBASE Database

Summary of Common Issues with GBase 8c Primary-Standby Cluster

GBase 8c is a multi-model database and the third generation of GBASE's intelligent distributed database products. It supports multiple storage modes and deployment forms and offers high performance, high availability, high intelligence, high security, and strong compatibility. This article describes installation failures caused by operating system versions, environment problems, and similar issues. It aims to help users locate problems quickly, analyze their causes, and find solutions so that they can use the GBase 8c database proficiently.

0. Fault Classification

  • Basic environment not ready
  • System version/environment issues

1. Basic Environment Not Ready

Before installation, check the operating system information, whether the dependency packages are installed, whether the hostname has taken effect, whether the firewall is disabled, whether the root passwords are consistent across machines, whether the time is synchronized, and so on. The specific steps are as follows:

1.1 Check Operating System Information

1) Check whether the ID value from /etc/os-release is listed in the support_system_info.json file

# cat /etc/os-release                      # General command for Linux operating systems
NAME="Kylin Linux Advanced Server"
VERSION="V10 (Tercel)"
ID="kylin"
VERSION_ID="V10"
PRETTY_NAME="Kylin Linux Advanced Server V10 (Tercel)"
ANSI_COLOR="0;31"

# cat support_system_info.json
{
 "SUSE": "suse",
 "REDHAT": "redhat",
 "CENTOS": "centos",
 "EULEROS": "euleros",
 "KYLIN": "kylin",
 "KYLINSEC": "kylinsecos",
 "OPENEULER": "openeuler",
 "FUSIONOS": "fusionos",
 "ASIANUX": "asianux",
 "DEBIAN": "debian",
 "UBUNTU": "ubuntu",
 "NFS": "nfs",
 "UOS": "uos",
 "BCLINUX": "bclinux",
 "NEOKYLIN": "neokylin",
 "OPENKYLIN": "openkylin"
}
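
The comparison in step 1) can be scripted. The following is a minimal sketch, assuming support_system_info.json keeps the flat "KEY": "value" layout shown above. The function name os_id_supported is hypothetical and is not part of the GBase 8c tooling.

```shell
# os_id_supported OS_RELEASE_FILE JSON_FILE
# Succeeds if the ID from the os-release file appears as a value in the JSON file.
# Hypothetical helper; assumes the flat "KEY": "value" layout shown above.
os_id_supported() {
    # Read ID in a subshell so the os-release variables do not leak.
    os_id=$( . "$1" && printf '%s' "$ID" )
    [ -n "$os_id" ] || return 1
    # Match the ID as a complete quoted value, e.g.  : "kylin"
    grep -Eq ":[[:space:]]*\"${os_id}\"" "$2"
}

# Usage (run from the directory that contains support_system_info.json):
#   os_id_supported /etc/os-release support_system_info.json && echo supported
```

Matching the closing quote keeps "kylin" from being confused with "kylinsecos".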

2) On a Kylin operating system, also check whether the current version is based on Ubuntu, using either of the following commands:

# cat /etc/.kyinfo 
# cat /etc/os-release 

In the returned version information, a dist_id value that starts with "Kylin-Desktop" indicates an Ubuntu-based operating system.

On such a system, the default shell must be switched from dash to bash. Use the following command:

# sudo dpkg-reconfigure dash 

When prompted, select No and press Enter. After you exit, the system shell switches to bash automatically.
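
To confirm that the switch took effect, check what /bin/sh resolves to. This is a minimal sketch: the helper name sh_is_bash is hypothetical, and it assumes readlink -f is available (as in GNU coreutils).

```shell
# sh_is_bash [PATH]
# Succeeds if PATH (default /bin/sh) resolves to a file named "bash".
sh_is_bash() {
    sh_target=$(readlink -f "${1:-/bin/sh}") || return 1
    [ "$(basename "$sh_target")" = "bash" ]
}

# Usage:
#   sh_is_bash && echo "/bin/sh -> bash" || echo "/bin/sh still points elsewhere"
```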

3) On Kylin operating systems, disable security authorization authentication:

# setstatus softmode -p

4) CentOS/RHEL 7.2+ environment

The systemd-logind service provides the RemoveIPC option, which removes the IPC objects that belong to a user once that user fully logs out. With the default setting (RemoveIPC=yes), a logout therefore crashes applications that are still using Shared Memory Segments (SHM) or Semaphores (SEM), which interrupts the GBase database process. To avoid such problems, follow these steps:

(1) Check if the RemoveIPC parameter value is yes

# loginctl show-session | grep RemoveIPC
# systemctl show systemd-logind | grep RemoveIPC

If it is yes, it needs to be modified; if it is no, there is no need to continue with the following steps.

(2) Modify the /etc/systemd/logind.conf configuration file

# vim /etc/systemd/logind.conf

Set the RemoveIPC parameter value to no, type ":wq" to save and exit.

Then modify the /usr/lib/systemd/system/systemd-logind.service configuration file in the same way:

# vim /usr/lib/systemd/system/systemd-logind.service

Set the RemoveIPC parameter value to no, type ":wq" to save and exit.

(3) Reload the configuration by executing the following commands:

# systemctl daemon-reload
# systemctl restart systemd-logind

(4) Check again to see if it takes effect.

# loginctl show-session | grep RemoveIPC
# systemctl show systemd-logind | grep RemoveIPC
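
The manual edit in step (2) can also be done non-interactively. The following is a minimal sketch for a key=value file such as /etc/systemd/logind.conf; the helper names removeipc_value and set_removeipc_no are hypothetical. Run it as root, then reload as in step (3).

```shell
# removeipc_value FILE: print the active (uncommented) RemoveIPC value, if any.
removeipc_value() {
    sed -n 's/^[[:space:]]*RemoveIPC=//p' "$1" | tail -n 1
}

# set_removeipc_no FILE: force RemoveIPC=no, replacing a commented or active entry.
set_removeipc_no() {
    ipc_conf=$1
    if grep -Eq '^[#[:space:]]*RemoveIPC=' "$ipc_conf"; then
        # Rewrite through a temporary copy, then overwrite in place so that
        # the original file keeps its owner and permissions.
        sed -E 's/^[#[:space:]]*RemoveIPC=.*/RemoveIPC=no/' "$ipc_conf" > "$ipc_conf.new" &&
            cat "$ipc_conf.new" > "$ipc_conf" && rm -f "$ipc_conf.new"
    else
        # No entry at all: append one (assumes the file ends in its [Login] section).
        printf 'RemoveIPC=no\n' >> "$ipc_conf"
    fi
}

# Usage (as root):
#   set_removeipc_no /etc/systemd/logind.conf
#   removeipc_value /etc/systemd/logind.conf
```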

1.2 Check CPU Architecture

Confirm whether the architecture is x86_64 or aarch64:

# lscpu |grep 'Architecture'
  Architecture:                    x86_64 

It needs to be consistent with the information in the downloaded package name, for example:

GBase8cV6_SXXXXX_centos7.8_x86_64.tar.gz     # x86 package
GBase8cV6_SXXXXX_centos7.8_aarch64.tar.gz    # aarch64 package
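
The architecture check can be automated before unpacking. This is a minimal sketch, assuming the package name ends in _<architecture>.tar.gz as in the examples above; the function name pkg_matches_arch is hypothetical.

```shell
# pkg_matches_arch PACKAGE [ARCH]
# Succeeds if the package file name ends in _<ARCH>.tar.gz.
# ARCH defaults to the output of `uname -m` (x86_64 or aarch64).
pkg_matches_arch() {
    pkg_arch=${2:-$(uname -m)}
    case "$1" in
        *"_${pkg_arch}.tar.gz") return 0 ;;
        *) return 1 ;;
    esac
}

# Usage:
#   pkg_matches_arch GBase8cV6_SXXXXX_centos7.8_x86_64.tar.gz \
#       || echo "package does not match $(uname -m)"
```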

1.3 Pre-installation Dependency Check

# rpm -qa|egrep "libaio-devel|flex|bison|ncurses-devel|glibc-devel|patch|redhat-lsb-core|readline-devel|libnsl|expect|patchelf|bzip2"

Installation command:

# yum install -y libaio-devel flex bison ncurses-devel glibc-devel patch redhat-lsb-core readline-devel libnsl expect patchelf  bzip2
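
The egrep pattern matches substrings, so "patch" is also satisfied by an installed "patchelf". To list exactly which packages are missing, compare whole package names instead. This is a minimal sketch; the function name missing_packages is hypothetical, and it reads the installed package names from standard input.

```shell
# Package names taken from the yum command above.
REQUIRED_PKGS="libaio-devel flex bison ncurses-devel glibc-devel patch redhat-lsb-core readline-devel libnsl expect patchelf bzip2"

# missing_packages: read installed package names (one per line) on stdin
# and print each required package that is not among them.
missing_packages() {
    installed_pkgs=$(cat)
    for pkg in $REQUIRED_PKGS; do
        # -F fixed string, -x whole line: "patch" must not match "patchelf".
        printf '%s\n' "$installed_pkgs" | grep -Fxq "$pkg" || printf '%s\n' "$pkg"
    done
}

# Usage:
#   rpm -qa --qf '%{NAME}\n' | missing_packages
```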

1.4 Confirm Hostname Consistency

Change the hostname if necessary. Even for a single-machine deployment, using the default localhost is not recommended.

Execute the command to view the hostname:

# cat /etc/hostname
# hostname

Modification method:

# hostnamectl set-hostname  gbasedb01
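
Both commands should report the same non-default name. A minimal sketch of that check follows; the function name hostname_ok is hypothetical.

```shell
# hostname_ok FILE_NAME RUNTIME_NAME
# Succeeds if both names are identical and the name is not the default localhost.
hostname_ok() {
    case "$2" in
        ""|localhost|localhost.*) return 1 ;;
    esac
    [ "$1" = "$2" ]
}

# Usage:
#   hostname_ok "$(cat /etc/hostname)" "$(hostname)" || echo "fix the hostname first"
```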

1.5 Create Database OS Management User

Take the gbase user as an example. Creating the user through a graphical interface is not recommended.

# groupadd gbase
# useradd -m -d /home/gbase gbase -g gbase
# passwd gbase

1.6 Disable the Firewall and SELinux

# systemctl stop firewalld.service
# systemctl disable firewalld.service
# sed -i.bak '/^SELINUX=/s#SELINUX=.*#SELINUX=disabled#' /etc/selinux/config
# setenforce  0

If the environment requires the firewall to stay enabled, refer to the following configuration:

# systemctl start firewalld
# firewall-cmd --zone=public --add-port=15400/tcp --permanent
# firewall-cmd --zone=public --add-port=15300/tcp --permanent
# firewall-cmd --zone=public --add-port=15301/tcp --permanent
# firewall-cmd --zone=public --add-port=15302/tcp --permanent
# firewall-cmd --zone=public --add-port=15405/tcp --permanent
# firewall-cmd --zone=public --add-port=15401/tcp --permanent
# firewall-cmd --zone=public --add-port=5000/tcp --permanent
# firewall-cmd --zone=public --add-port=5001/tcp --permanent
# firewall-cmd --zone=public --add-port=5002/tcp --permanent
# systemctl restart firewalld
# firewall-cmd --zone=public --query-port=15400/tcp
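
The repeated firewall-cmd lines can be generated from a port list. This is a minimal sketch that only prints the commands, so they can be reviewed before running. The port list is copied from the example above and may differ from the ports in your cluster XML; the names GBASE_PORTS and firewall_commands are hypothetical.

```shell
# Ports copied from the example above; adjust them to your cluster configuration.
GBASE_PORTS="15400 15300 15301 15302 15405 15401 5000 5001 5002"

# firewall_commands: print the commands that open every port permanently,
# restart firewalld so the permanent rules are loaded, then query each port.
firewall_commands() {
    for fw_port in $GBASE_PORTS; do
        printf 'firewall-cmd --zone=public --add-port=%s/tcp --permanent\n' "$fw_port"
    done
    printf 'systemctl restart firewalld\n'
    for fw_port in $GBASE_PORTS; do
        printf 'firewall-cmd --zone=public --query-port=%s/tcp\n' "$fw_port"
    done
}

# Usage: review the output first, then execute it as root.
#   firewall_commands
#   firewall_commands | sh
```

Querying after the restart matters because rules added with --permanent are not active until firewalld reloads them.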

1.7 Multi-node Environment Installation Requires Checking Time Consistency

# date

After correcting the system time, check the hardware clock and synchronize it from the system time:

# hwclock --show
# hwclock --systohc

Note: Inconsistent time across nodes causes the database startup to hang. When adjusting the time manually, set it forward to a future time rather than back.
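
Comparing date output by eye is error-prone on several nodes, so the difference can be computed from epoch seconds instead. This is a minimal sketch; the function name clock_skew and the host name gbasedb02 in the usage line are hypothetical.

```shell
# clock_skew EPOCH_A EPOCH_B: print the absolute difference in seconds.
clock_skew() {
    skew=$(( $1 - $2 ))
    if [ "$skew" -lt 0 ]; then
        skew=$(( 0 - skew ))
    fi
    printf '%s\n' "$skew"
}

# Usage (gbasedb02 is a placeholder for another cluster node):
#   clock_skew "$(date +%s)" "$(ssh gbasedb02 date +%s)"
```

The result includes the SSH round-trip time, so a difference of a second or two does not necessarily mean the clocks disagree.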

1.8 Check Maximum Open Files

The limit must be at least 640000.

# ulimit -n

Modification method:

# echo 'ulimit -n 1000000' >> ~/.bashrc
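
After logging in again, the limit can be checked against the stated minimum. A minimal sketch follows; the function name nofile_ok is hypothetical.

```shell
# nofile_ok LIMIT: succeeds if LIMIT is "unlimited" or at least 640000.
nofile_ok() {
    if [ "$1" = "unlimited" ]; then
        return 0
    fi
    [ "$1" -ge 640000 ]
}

# Usage:
#   nofile_ok "$(ulimit -n)" || echo "open-file limit too low: $(ulimit -n)"
```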

2. System Version/Environment Issues

2.1 Multi-node Installation, Root Password Inconsistency

Solution:

  1. Temporarily set the root passwords to the same value on all nodes.

  2. If the passwords cannot be changed, install by following the steps below.

① On each machine, place the installation package and the xml file in the same directory, then run gs_preinstall as root with the -L option so that it performs only the local installation:

./gs_preinstall -U gbase  -G gbase  -X /opt/cluster.xml -L

② Switch to the gbase user and establish SSH mutual trust for it between all nodes. Afterwards, run "ssh hostname" and "ssh ip" against the local machine and every other machine so that the host information is cached (for details, see 2.2 SSH-free Configuration Issue).

③ Execute the installation on any server

gs_install -X /opt/cluster.xml

2.2 SSH-free Configuration Issue

A complete reference for password-free SSH configuration follows (it also avoids SSH permission issues):

# su - gbase
$ rm -rf /home/gbase/.ssh
$ mkdir ~/.ssh
$ chmod 700 ~/.ssh
$ ssh-keygen -t rsa

$ ssh-copy-id gbase@172.16.xx.xxx
$ ssh-copy-id gbase@172.16.xx.xxx

$ echo 'StrictHostKeyChecking no' >> ~/.ssh/config
$ echo 'UserKnownHostsFile ~/.ssh/known_hosts' >> ~/.ssh/config
$ chmod 644 ~/.ssh/config

Verification test (both IP and hostname need to be tested)

$ ssh 172.16.xx.xxx  date
$ ssh 172.16.xx.xxx  date
$ ssh gbasedb01  date
$ ssh gbasedb01  date
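
With more than a few nodes, the verification is easier as a loop that reports only the failures. This is a minimal sketch; the names SSH_CMD and failed_hosts are hypothetical. It uses the standard ssh options BatchMode=yes (fail instead of prompting for a password) and ConnectTimeout.

```shell
# Command used to reach each host. BatchMode makes ssh fail rather than prompt.
SSH_CMD=${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=5}

# failed_hosts HOST...: print every host on which "date" cannot be run
# without a password prompt. Pass both IP addresses and hostnames.
failed_hosts() {
    for ssh_host in "$@"; do
        # $SSH_CMD is intentionally unquoted so that its options split into words.
        $SSH_CMD "$ssh_host" date >/dev/null 2>&1 </dev/null || printf '%s\n' "$ssh_host"
    done
}

# Usage (as the gbase user; no output means every host passed):
#   failed_hosts 172.16.xx.xxx gbasedb01
```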

If ssh ip works during verification but ssh hostname does not, ping the hostname to check whether it resolves to an IPv6 address. If it does, run the following command so that IPv4 addresses are preferred:

echo 'precedence ::ffff:0:0/96 100' | sudo tee -a /etc/gai.conf 
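
Because tee -a appends every time it runs, repeating the command leaves duplicate lines in /etc/gai.conf. A minimal sketch of an idempotent variant follows; the function name prefer_ipv4 is hypothetical.

```shell
# prefer_ipv4 [FILE]: append the IPv4-preference rule to FILE
# (default /etc/gai.conf) only if it is not already present.
prefer_ipv4() {
    gai_file=${1:-/etc/gai.conf}
    gai_rule='precedence ::ffff:0:0/96 100'
    # -F fixed string, -x whole line: skip the append when the rule exists.
    grep -Fxq "$gai_rule" "$gai_file" 2>/dev/null || printf '%s\n' "$gai_rule" >> "$gai_file"
}

# Usage (as root):
#   prefer_ipv4
```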

2.3 Environment Issues

(1) Error "xxxx list index out of range" during installation

Check the xml file during installation, focusing on whether the hostname and IP address are configured correctly.

(2) Error "ImportError: libffi.so.6: cannot open shared object file: No such file or directory" during installation.

This can happen when the server has an ARM CPU but files from a previously installed x86 package remain on it. In that case, delete the leftover files named in the error message:

ll /lib64/libffi.so.6*
sudo rm -rf /lib64/libffi.so.6*

(3) Error "cannot execute binary file" during installation.

This happens when the GBase 8c installation package does not match the CPU architecture. Replace it with the package built for the deployment environment. For details, see 1.2 Check CPU Architecture.

2.4 VIP Configuration Issue After Installation

1) VIP address does not float

Check whether sudo permissions are configured on all machines:

$ su - gbase
$ sudo -l

Check whether the VIP is configured on the standby machine:

$ sudo find /* -name 'cm_resource.json'

2) The entire cluster is unavailable when the primary node is down

Check if the VIP information is saved in the cm_server configuration

cm_ctl  list --param --server| grep "third_party_gateway_ip"

If the above query result is empty, configure as follows:

cm_ctl set --param --server -k "third_party_gateway_ip= gateway address"
cm_ctl set --param --server -k "cms_enable_db_crash_recovery=1"
cm_ctl set --param --server -k "cms_network_isolation_timeout=10"
cm_ctl set --param --server -k "cms_enable_failover_on2nodes=1"

This article aims to provide a comprehensive guide to troubleshooting common issues encountered during the installation and configuration of GBase 8c primary-standby clusters. By following the detailed steps and solutions provided, users can ensure a smooth and successful deployment of the GBase 8c database. For further assistance, refer to the official documentation or reach out to the support team.
