DEV Community

Cong Li
Cong Li

Posted on

Overview of GBase 8c Row Storage Engine Syntax

GBase 8c is a high-performance, highly available, and secure database management system widely used in big data processing and analysis. It supports SQL standards, allowing users to easily perform data operations and queries. GBase 8c supports various storage engines, including row storage, column storage, and in-memory storage. This article introduces the SQL syntax for typical OLTP scenarios using the row storage engine, helping users better understand and utilize the GBase database.

1. Row Storage Engine

GBase 8c's row storage supports astore (append-only update) and ustore (in-place update). The main difference between the two is that in ustore, the new and historical versions of the data are stored separately.

If not specified, the default is astore. The following examples use the default setting.

--- astore
drop table if exists test_astore_row;
drop table if exists test_astore_column;
drop foreign table if exists test_astore_mot;

--- Create astore row storage table
create table test_astore_row(col text) with (orientation=row);

--- Create astore column storage table
create table test_astore_column(col text) with (orientation=column);

--- Create astore memory table
create foreign table test_astore_mot(col int) server mot_server;

--- ustore
drop table if exists test_ustore;
create table test_ustore(col text) with (storage_type=ustore);
Enter fullscreen mode Exit fullscreen mode

2. Distributed Tables and Replication Tables

GBase 8c's data sharding strategy: When creating a table, use the distribute or replication keyword to set the table sharding strategy.

1) Create a hash partitioned table using distribute by hash. Data is evenly distributed to each data node using the hash algorithm.

drop table if exists t1_dis;
CREATE TABLE t1_dis(c1 int, c2 int) DISTRIBUTE BY hash(c1);
Enter fullscreen mode Exit fullscreen mode

2) Create a replication table using distribute by replication. Each data node stores a complete copy of the data.

drop table if exists t1_rep;
CREATE TABLE t1_rep(c1 int, c2 int) DISTRIBUTE BY replication;
Enter fullscreen mode Exit fullscreen mode

3. Various Indexes

GBase 8c commonly uses the following types of indexes:

  • Hash Index: Only handles simple equality queries. Use the syntax: create index ... using hash(column) to specify a hash index.
  • B-Tree Index: Often used for operations like greater than, less than, or equal to. Use the syntax: create index ... using btree(column) to specify a B-tree index.
  • GiST Index: For geographic data and images, when querying if a specific point exists in a location (i.e., determining if a location is "contained"). For spatial data, a GiST index can use an R-tree to support relative position operators (left, right, contains, etc.). For hierarchical data, R-trees support intersection or containment operators. Use the syntax: create index ... using gist(column) to specify a GiST index.
  • GIN Index: Suitable for searching values within multi-value types, such as arrays, full-text search, or tokens. (Supports intersection, containment, greater than, left, right, etc. searches based on different types.) Use the syntax: create index ... using gin(column) to specify a GIN index.

Hash Index Example

create table rw_split(col int, name text);
insert into rw_split select generate_series(1, 50000), md5(random()::text)::text;
create index rw_split_col_hash on rw_split using hash(col);
explain select * from rw_split where col = 2;
Enter fullscreen mode Exit fullscreen mode

B-Tree Index Example

drop index rw_split_col_hash;
create index rw_split_col_btree on rw_split using btree(col);
explain select * from rw_split where col = 2;
Enter fullscreen mode Exit fullscreen mode

GiST Index Example

drop table if exists t_gist;
create table t_gist(id int, p point);
insert into t_gist select generate_series(1, 10000), point(round((random() * 1000)::numeric, 2), round((random() * 1000)::numeric, 2));
select * from t_gist limit 2;
create index on t_gist using gist(p);
explain (analyze, buffers) select * from t_gist where circle '((100, 100) 1)' @> p order by p <-> '(100, 100)' limit 10; 
Enter fullscreen mode Exit fullscreen mode

The query above searches for points within a radius of 10 units around the point (100, 100).

GIN Index Example

drop table if exists t_gin;
create table t_gin(doc text, doc_tsv tsvector);
insert into t_gin(doc) values
('Can a sheet slitter slit sheets?'),
('How many sheets could a sheet slitter slit?'),
('I slit a sheet, a sheet I slit.'),
('Upon a slitted sheet I sit.'),
('Whoever slit the sheets is a good sheet slitter.'),
('I am a sheet slitter.'),
('I slit sheets.'),
('I am the sleekest sheet slitter that ever slit sheets.'),
('She slits the sheet she sits on.');
update t_gin set doc_tsv = to_tsvector(doc);
create index on t_gin using gin(doc_tsv);
--- To demonstrate the effect, manually disable sequential scan.
set enable_seqscan = off;
explain(costs off) select doc from t_gin where doc_tsv @@ to_tsquery('many & slitter');
Enter fullscreen mode Exit fullscreen mode

Expression Index Example

create table test_expression(col int, name varchar(64));
insert into test_expression select 1, 'ASDD';
insert into test_expression select 2, 'ASDD';
insert into test_expression select 3, 'AS';
insert into test_expression select 4, 'ASsda';
insert into test_expression select 5, 'ASdssa';
insert into test_expression select 6, 'Asds';
insert into test_expression select 7, 'Assa';
insert into test_expression select 8, 'as';

explain select * from test_expression where lower(name) = 'as';
create index on test_expression (name);
create index test_expression_lower on test_expression (lower(name));
explain select * from test_expression where lower(name) = 'as';
Enter fullscreen mode Exit fullscreen mode

Top comments (0)