MariaDB Xpand is distributed MySQL
#sql #database #mysql #distributed

In this series, I am testing the most basic SQL features with the EMP/DEPT schema, because you would be surprised how many NewSQL databases do not support them. This is my first test with MariaDB Xpand.

I created a 3-node cluster on SkySQL:
cloud.mariadb.com/skysql

I connected from a mysql client:

mysql --host xpand-db00008262.mdb0002418.db1.skysql.net --port 5001 --user DB00008262 --default-character-set=utf8 -A -D test -p
Enter password:
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 291
Server version: 5.0.45-Xpand-6.0.3.1

Copyright (c) 2000, 2022, Oracle and/or its affiliates.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql> ALTER USER 'DB00008262'@'%' IDENTIFIED BY "newpass";

MariaDB is based on MySQL, so I will use the same schema as in the first post of this series, with auto_increment because generated always as identity is not supported:
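For reference, here is a minimal sketch of the standard-SQL identity declaration that is not supported here, next to the MySQL-style form that is (illustrative table names, not part of the test):

```sql
-- Standard SQL identity column: not supported on Xpand, per the note above
CREATE TABLE emp_std (
  empno integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
  ename text NOT NULL
);

-- MySQL-style equivalent, which is what the schema below uses
CREATE TABLE emp_mysql (
  empno integer NOT NULL AUTO_INCREMENT PRIMARY KEY,
  ename text NOT NULL
);
```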

CREATE TABLE dept (
  deptno integer NOT NULL,
  dname text,
  loc text,
  description text,
  CONSTRAINT pk_dept PRIMARY KEY (deptno asc)
);

CREATE TABLE emp (
  empno integer NOT NULL auto_increment,
  ename text NOT NULL,
  job text,
  mgr integer,
  hiredate date,
  sal integer,
  comm integer,
  deptno integer NOT NULL,
  email text,
  other_info json,
  CONSTRAINT pk_emp PRIMARY KEY (empno),
  CONSTRAINT emp_email_uk UNIQUE (email),
  CONSTRAINT fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno),
  CONSTRAINT fk_mgr FOREIGN KEY (mgr) REFERENCES emp(empno)
);

I got the following error, caused by the email text column:

ERROR 1170 (HY000): [12299] Invalid blob/text index specification: BLOB/TEXT column "email" used in key specification without a key length

Compared with the previous MySQL-compatible databases I have tested, this is already good news: the error message is clear. Let's replace text with varchar(90) for this column:

CREATE TABLE emp (
  empno integer NOT NULL auto_increment,
  ename text NOT NULL,
  job text,
  mgr integer,
  hiredate date,
  sal integer,
  comm integer,
  deptno integer NOT NULL,
  email varchar(90),
  other_info json,
  CONSTRAINT pk_emp PRIMARY KEY (empno),
  CONSTRAINT emp_email_uk UNIQUE (email),
  CONSTRAINT fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno),
  CONSTRAINT fk_mgr FOREIGN KEY (mgr) REFERENCES emp(empno)
);
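As an alternative to changing the type, MySQL syntax also allows keeping TEXT and giving the unique key a length prefix. A sketch (I have not verified that Xpand accepts this):

```sql
-- Hypothetical variant: prefix-length key on a TEXT column (standard MySQL syntax)
CREATE TABLE emp_text (
  empno integer NOT NULL PRIMARY KEY,
  email text,
  CONSTRAINT emp_email_uk UNIQUE (email(90))
);
```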

This works, and I now insert the 4 departments and 14 employees:

INSERT INTO dept (deptno,  dname,        loc, description)
     VALUES    (10,     'ACCOUNTING', 'NEW YORK','preparation of financial statements, maintenance of general ledger, payment of bills, preparation of customer bills, payroll, and more.'),
            (20,     'RESEARCH',   'DALLAS','responsible for preparing the substance of a research report or security recommendation.'),
            (30,     'SALES',      'CHICAGO','division of a business that is responsible for selling products or services'),
            (40,     'OPERATIONS', 'BOSTON','administration of business practices to create the highest level of efficiency possible within an organization');

INSERT INTO emp (empno, ename,    job,        mgr,   hiredate,     sal, comm, deptno, email, other_info)
     VALUES   (7369, 'SMITH',  'CLERK',     7902, '1980-12-17',  800, NULL,   20,'SMITH@acme.com', '{"skills":["accounting"]}'),
            (7499, 'ALLEN',  'SALESMAN',  7698, '1981-02-20', 1600,  300,   30,'ALLEN@acme.com', null),
            (7521, 'WARD',   'SALESMAN',  7698, '1981-02-22', 1250,  500,   30,'WARD@compuserve.com', null),
            (7566, 'JONES',  'MANAGER',   7839, '1981-04-02', 2975, NULL,   20,'JONES@gmail.com', null),
            (7654, 'MARTIN', 'SALESMAN',  7698, '1981-09-28', 1250, 1400,   30,'MARTIN@acme.com', null),
            (7698, 'BLAKE',  'MANAGER',   7839, '1981-05-01', 2850, NULL,   30,'BLAKE@hotmail.com', null),
            (7782, 'CLARK',  'MANAGER',   7839, '1981-06-09', 2450, NULL,   10,'CLARK@acme.com', '{"skills":["C","C++","SQL"]}'),
            (7788, 'SCOTT',  'ANALYST',   7566, '1982-12-09', 3000, NULL,   20,'SCOTT@acme.com', '{"cat":"tiger"}'),
            (7839, 'KING',   'PRESIDENT', NULL, '1981-11-17', 5000, NULL,   10,'KING@aol.com', null),
            (7844, 'TURNER', 'SALESMAN',  7698, '1981-09-08', 1500,    0,   30,'TURNER@acme.com', null),
            (7876, 'ADAMS',  'CLERK',     7788, '1983-01-12', 1100, NULL,   20,'ADAMS@acme.org', null),
            (7900, 'JAMES',  'CLERK',     7698, '1981-12-03',  950, NULL,   30,'JAMES@acme.org', null),
            (7902, 'FORD',   'ANALYST',   7566, '1981-12-03', 3000, NULL,   20,'FORD@acme.com', '{"skills":["SQL","CQL"]}'),
            (7934, 'MILLER', 'CLERK',     7782, '1982-01-23', 1300, NULL,   10,'MILLER@acme.com', null);

ERROR 1452 (HY000): [7168] Foreign key constraint violation on insert: Cannot add or update a child row: foreign key "fk_mgr" violated by: (mgr)=(7902)

Given that a previous database I tested simply ignored foreign keys, this error is already good news. However, it seems the constraint is checked per row rather than at the end of the statement: the whole set of rows is consistent, but only when taken atomically (SMITH's manager, 7902 FORD, is inserted later in the same statement).

As a workaround, I will drop the foreign key:

mysql> alter table emp drop constraint fk_mgr;

ERROR 1 (HY000): [12291] Invalid foreign key specification encountered in DDL statement: Cannot drop index 'fk_mgr' ('mgr'): needed in a foreign key constraint 'fk_mgr' on 'emp'

OK, I did not know that drop constraint was not the right way to remove it here, but I can drop foreign key:

mysql> alter table emp drop foreign key fk_mgr;
Query OK, 0 rows affected (0.06 sec)

Then, the insert works:

Query OK, 14 rows affected (0.04 sec)

And I can enable the constraint again:

mysql> alter table emp add CONSTRAINT fk_mgr FOREIGN KEY (mgr) REFERENCES emp(empno);
Query OK, 0 rows affected (0.27 sec)
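In MySQL-compatible databases, another common workaround is to disable foreign key checks for the session instead of dropping and re-adding the constraint. I did not test whether Xpand honors this variable, so this is only a sketch:

```sql
-- Hypothetical on Xpand: skip FK validation for this session only
SET foreign_key_checks = 0;
-- ... run the multi-row INSERT with forward references to mgr ...
SET foreign_key_checks = 1;
```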

Testing foreign keys

If I delete a department that has child rows, an exception is raised:

mysql> delete from dept where deptno=10;
ERROR 1451 (HY000): [7169] Foreign key constraint violation on delete: Cannot delete or update a parent row: foreign key "fk_deptno" violated
mysql>

This works as expected, and the error message is clear.
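If cascading deletes were wanted instead of an error, the constraint could be declared with a referential action. This is standard MySQL syntax, not tested on Xpand:

```sql
-- Hypothetical variant: delete employees automatically with their department
-- (after dropping the existing fk_deptno first)
ALTER TABLE emp ADD CONSTRAINT fk_deptno
  FOREIGN KEY (deptno) REFERENCES dept(deptno)
  ON DELETE CASCADE;
```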

Now, testing concurrent transactions.
In session 1:

mysql> start transaction;
Query OK, 0 rows affected (0.02 sec)

mysql> insert into emp(deptno, ename) values (40, 'Franck');
Query OK, 1 row affected (0.03 sec)

In session 2:

mysql> delete from dept where deptno=40;

This waits, which is expected.
Session 1:

mysql> rollback;
Query OK, 0 rows affected (0.02 sec)

Session 2:

mysql> delete from dept where deptno=40;
Query OK, 1 row affected (45.43 sec)

OK, this works as expected. I guess this is pessimistic locking with two-phase commit.

Testing serializable

In the first session:

mysql> set transaction isolation level serializable;
Query OK, 0 rows affected (0.01 sec)

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> insert into emp(ename, deptno) values ('Franck',40);
Query OK, 1 row affected (0.03 sec)

While this transaction was running, I opened a concurrent session:

mysql> set transaction isolation level serializable;
Query OK, 0 rows affected (0.01 sec)

mysql> start transaction;
Query OK, 0 rows affected (0.01 sec)

mysql> SELECT @@TX_ISOLATION;
+----------------+
| @@TX_ISOLATION |
+----------------+
| SERIALIZABLE   |
+----------------+
1 row in set (0.00 sec)

mysql> select count(*) from emp where deptno=40;

The second session waits. Of course, serializable can be implemented with heavy locks, but that does not scale. Here I am only reading, and I would expect MVCC (Multi-Version Concurrency Control) non-blocking reads.

But the documentation clearly states that the serializable transaction isolation level is not currently available for end-user transactions.

After a while, my select came back:

mysql> select count(*) from emp where deptno=40;
+----------+
| count(*) |
+----------+
|        0 |
+----------+
1 row in set (1 min 42.92 sec)

mysql>

But the reason is that my first session had been killed:

mysql> select * from temps;
ERROR 1927 (HY000): Lost connection to backend server: connection closed by peer (@@Xpand-Monitor:node-2)
mysql>

Actually, I am not even sure that the wait was a lock, because with a single session:

mysql> set transaction isolation level serializable;
Query OK, 0 rows affected (0.00 sec)

mysql> start transaction;
Query OK, 0 rows affected (0.00 sec)

mysql> delete from dept where deptno=40;
ERROR 1205 (40000): [30720] MVCC serializable conflict: This transaction conflicted with a serializable transaction: `test`.`dept` Primary key: (40); try restarting transaction
mysql>

mysql> drop table emp;
Query OK, 0 rows affected (0.06 sec)

mysql> drop table dept;
Query OK, 0 rows affected (0.04 sec)

mysql>

I dropped the tables to make sure nothing was left locked.

Distribution

On my 3-node cluster, the table is created with 3 shards, visible here as slices:

mysql> show create table dept;

CREATE TABLE `dept` (
  `deptno` int(11) not null,
  `dname` text CHARACTER SET utf8,
  `loc` text CHARACTER SET utf8,
  `description` text CHARACTER SET utf8,
  PRIMARY KEY (`deptno`) /*$ DISTRIBUTE=1 */
) CHARACTER SET utf8 /*$ SLICES=3 */;
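The /*$ SLICES=3 */ and /*$ DISTRIBUTE=1 */ comments show the distribution options. According to the Xpand documentation, the slice count can also be changed after creation; I did not test this on the trial, so the statement below is only a sketch:

```sql
-- Hypothetical: re-slice the table, e.g. to 6 slices
ALTER TABLE dept SLICES = 6;
```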

(Screenshots: the SkySQL console Overview of the cluster.)

I create an index on hiredate, analyze the tables, and look at an execution plan:

mysql> create index emp_hiredate on emp(hiredate);
Query OK, 0 rows affected (0.25 sec)

mysql> analyze table dept;
+-----------+---------+----------+----------+
| Table     | Op      | Msg_type | Msg_text |
+-----------+---------+----------+----------+
| test.dept | analyze | status   | OK       |
+-----------+---------+----------+----------+
1 row in set (0.01 sec)

mysql> analyze table emp;
+----------+---------+----------+----------+
| Table    | Op      | Msg_type | Msg_text |
+----------+---------+----------+----------+
| test.emp | analyze | status   | OK       |
+----------+---------+----------+----------+
1 row in set (0.01 sec)

mysql> explain select * from dept natural 
       join emp where emp.hiredate>now() order by hiredate;

+----------------------------------------------------------------------------------+-----------+-----------+
| Operation                                                                        | Est. Cost | Est. Rows |
+----------------------------------------------------------------------------------+-----------+-----------+
| sigma_sort KEYS=[(2 . "hiredate") ASC]                                           |     92.23 |     14.00 |
|   nljoin                                                                         |     45.63 |     14.00 |
|     filter (2.hiredate > cast(datetime_trunc(current_timestamp(), 0), datetime)) |     13.43 |     14.00 |
|       stream_combine                                                             |     13.15 |     14.00 |
|         index_scan 2 := emp.__idx_emp__PRIMARY                                   |     10.75 |     14.00 |
|     index_scan 1 := dept.__idx_dept__PRIMARY, deptno = 2.deptno                  |      2.30 |      1.00 |
+----------------------------------------------------------------------------------+-----------+-----------+
6 rows in set (0.03 sec)

This is interesting. From what I read in the doc, stream_combine is where the distributed operations (the index scans) are combined. Here, the whole emp table is read from all slices and then filtered on hiredate. This is not optimal, probably because the index itself is sharded. Let's re-create it as ascending:

mysql> alter table emp drop index emp_hiredate;
Query OK, 0 rows affected (0.07 sec)

mysql> create index emp_hiredate on emp(hiredate asc);
Query OK, 0 rows affected (0.23 sec)

mysql> explain select * from dept natural
       join emp where emp.hiredate>now() order by hiredate;
+-----------------------------------------------------------------------------------------------------------+-----------+-----------+
| Operation                                                                                                 | Est. Cost | Est. Rows |
+-----------------------------------------------------------------------------------------------------------+-----------+-----------+
| nljoin                                                                                                    |     89.41 |     14.00 |
|   msjoin KEYS=[(2 . "hiredate") ASC]                                                                      |     57.21 |     14.00 |
|     stream_merge KEYS=[(2 . "hiredate") ASC]                                                              |     17.21 |     14.00 |
|       index_scan 2 := emp.emp_hiredate, hiredate > cast(datetime_trunc(current_timestamp(), 0), datetime) |      9.41 |     14.00 |
|     index_scan 2 := emp.__idx_emp__PRIMARY, empno = 2.empno                                               |      2.30 |      1.00 |
|   index_scan 1 := dept.__idx_dept__PRIMARY, deptno = 2.deptno                                             |      2.30 |      1.00 |
+-----------------------------------------------------------------------------------------------------------+-----------+-----------+
6 rows in set (0.01 sec)

This looks better, with the filter pushed down: the index is read and filtered, then joined to the table with a merge join on the primary key, and then to dept with a nested loop.

Since dept is quite static, I can decide to broadcast it to all nodes:

mysql> alter table dept replicas=allnodes;
Query OK, 0 rows affected (0.22 sec)

mysql> explain select * from dept natural
       join emp where emp.hiredate>now() order by hiredate;

+--------------------------------------------------------------------------------------+-----------+-----------+
| Operation                                                                            | Est. Cost | Est. Rows |
+--------------------------------------------------------------------------------------+-----------+-----------+
| sigma_sort KEYS=[(2 . "hiredate") ASC]                                               |     44.16 |      4.71 |
|   nljoin                                                                             |     15.21 |      4.71 |
|     index_scan 1 := dept.__idx_dept__PRIMARY                                         |      2.30 |      1.01 |
|     stream_combine                                                                   |     12.78 |      4.67 |
|       filter (1.deptno = 2.deptno)                                                   |     11.31 |      4.67 |
|         filter (2.hiredate > cast(datetime_trunc(current_timestamp(), 0), datetime)) |     11.03 |     14.00 |
|           index_scan 2 := emp.__idx_emp__PRIMARY                                     |     10.75 |     14.00 |
+--------------------------------------------------------------------------------------+-----------+-----------+
7 rows in set (0.02 sec)

Here dept is local and is the driving table for the nested loop to the distributed emp. I would have expected a hash join here, but from the documentation, hash_join does not fit Xpand's streaming model. I would have liked to test the performance of those nested loops to remote nodes, the stream_combine and stream_merge, but Xpand is proprietary software and the free trial was not enough for that.