阅读(2939) (0)

Mycat2 执行计划管理与注释

2021-09-09 15:08:41 更新

Mycat2的执行计划管理可以实现以下功能(v1.18以后)

  1. 通过缓存多个执行计划,使用SQL参数计算它们的执行代价,从中挑选执行代价最小的执行计划
  2. 带有注释的SQL与不带注释的SQL进行绑定,使得处理不带注释的SQL的时候使用带注释SQL的执行计划

添加执行计划

BASELINE ADD select * from db1.travelrecord n join db1.company s on n.id = s.id and n.id = 1
BASELINE_ID STATUS
572448969439842322  OK

作用:执行计划管理添加baseline以及优化器得出的plan

此时我们使用explain语句查看执行计划

EXPLAIN SELECT * FROM db1.travelrecord n JOIN db1.company s ON n.id = s.id AND n.id = 1
plan
MycatHashJoin(condition=[=($0, $6)], joinType=[inner])
  MycatView(distribution=[[db1.travelrecord]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
  MycatView(distribution=[[db1.company]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
Each(targetName=c0, sql=SELECT * FROM db1_0.travelrecord_0 WHERE (`id` = ?) union all SELECT * FROM db1_0.travelrecord_1 WHERE (`id` = ?))
Each(targetName=c1, sql=SELECT * FROM db1_1.travelrecord_0 WHERE (`id` = ?) union all SELECT * FROM db1_1.travelrecord_1 WHERE (`id` = ?))
Each(targetName=prototype, sql=SELECT * FROM db1.company WHERE (`id` = ?))

它得出使用HashJoin实现

如果我们想使用MERGE_JOIN实现JOIN则怎么办?

BASELINE ADD /*+MYCAT:use_merge_join(n,s) */select * from db1.travelrecord n join db1.company s on n.id = s.id and n.id = 1
BASELINE_ID STATUS
572448969439842322  OK

可以留意到此处的BASELINE_ID与上面的BASELINE_ID是相同的,说明它们的baseline sql以及SQL约束是相同的.

查看内存中的PLAN信息

BASELINE LIST;
BASELINE_ID PARAMETERIZED_SQL   PLAN_ID EXTERNALIZED_PLAN   FIXED   ACCEPTED
572448969439842322  select * from db1.travelrecord n join db1.company s on n.id = s.id and n.id = ? 572809558016135252  MycatHashJoin(condition=[=($0, $6)], joinType=[inner])
  MycatView(distribution=[[db1.travelrecord]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
  MycatView(distribution=[[db1.company]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
    false   true
572448969439842322  /*+MYCAT:use_merge_join(n,s) */ select * from db1.travelrecord n join db1.company s on n.id = s.id and n.id = ? 572810594013417563  MycatSortMergeJoin(condition=[=($0, $6)], joinType=[inner])
  MycatMergeSort(sort0=[$0], dir0=[ASC])
    MycatView(distribution=[[db1.travelrecord]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
  MycatMergeSort(sort0=[$0], dir0=[ASC])
    MycatView(distribution=[[db1.company]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
    true    true

可以看到内存中存在两个执行计划

此时我们再次对不带hint的SQL进行explain查看执行计划

EXPLAIN SELECT * FROM db1.travelrecord n JOIN db1.company s ON n.id = s.id AND n.id = 1
plan
MycatSortMergeJoin(condition=[=($0, $6)], joinType=[inner])
  MycatMergeSort(sort0=[$0], dir0=[ASC])
    MycatView(distribution=[[db1.travelrecord]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
  MycatMergeSort(sort0=[$0], dir0=[ASC])
    MycatView(distribution=[[db1.company]], conditions=[=($0, CAST(?0):BIGINT NOT NULL)])
Each(targetName=c0, sql=SELECT * FROM db1_0.travelrecord_0 WHERE (`id` = ?) ORDER BY (`id` IS NULL), `id`)
Each(targetName=c0, sql=SELECT * FROM db1_0.travelrecord_1 WHERE (`id` = ?) ORDER BY (`id` IS NULL), `id`)
Each(targetName=c1, sql=SELECT * FROM db1_1.travelrecord_0 WHERE (`id` = ?) ORDER BY (`id` IS NULL), `id`)
Each(targetName=c1, sql=SELECT * FROM db1_1.travelrecord_1 WHERE (`id` = ?) ORDER BY (`id` IS NULL), `id`)
Each(targetName=prototype, sql=SELECT * FROM db1.company WHERE (`id` = ?) ORDER BY (`id` IS NULL), `id`)

此时我们看到它已经选择了MERGE_JOIN实现.但是现在执行计划是有可能根据参数id变化的(使用代价分析挑选).

如果我们想把SQL永远与MERGE_JOIN实现的执行计划进行一对一绑定,怎么办?

BASELINE FIX /*+MYCAT:use_merge_join(n,s) */SELECT * FROM db1.travelrecord n JOIN db1.company s ON n.id = s.id AND n.id = 1
BASELINE_ID STATUS
572448969439842322  OK

此时已经在内存中进行绑定,对于SQL

SELECT * FROM db1.travelrecord n JOIN db1.company s ON n.id = s.id AND n.id = ?

永远都会选择MERGE_JOIN实现

此时还不足够,我们需要持久化该执行计划的绑定关系,不这样做,Mycat重启后将会丢失这个执行计划

BASELINE PERSIST 572448969439842322

这样就可以保存整个baseline,mycat2在启动的时候会自动加载该baseline以及它的执行计划.