PostgreSQLスキルアップノート（自己啓発のための個人サイト）
実行計画(EXPLAIN)・結合

マニュアルへのリンクは9.2としています。


━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
■■■■ PostgreSQL スキルアップノート
■■■■
■◆■■ 実行計画(EXPLAIN)・結合
■■■■
■■■■
■■■■ 使用環境：PostgreSQL9.1.7 (CentOS6.2)
         2013/02/18 
         2013/03/15
                                                                   (C) 2013 ohdb
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

EXPLAINを使った実行計画の確認。結合編
ごく簡単な内容のみです。


【マニュアル】

EXPLAIN→●[マニュアル]

第14章性能に関するヒント・EXPLAINの利用→●[マニュアル]


【その他】

Lets's postgres・スロークエリの改善→●[記事]
Lets's postgres・スロークエリの分析→●[記事]



■１■ 概要
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━



　Nested Loop　　　　・・・■２■
　Merge Join　　　　 ・・・今回は省略
　Hash & Hash Join　 ・・・■３■




■２■ Nested Loop
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

小さなテーブルt1を駆動側にして、大きなテーブルt2に対してNested Loopによる結合を行う。



１．準備
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

【□】 テーブルを作成

DROP TABLE if exists t1,t2;
CREATE TABLE t1 (c1 int PRIMARY KEY,c2 int);
CREATE TABLE t2 (c1 int PRIMARY KEY,c2 timestamp,c3 int);


【□】 小規模テーブル「t1」へ20件インサート

WITH RECURSIVE tmp(c1,c2) AS (
    SELECT 1,(random()*20)::integer 
    UNION ALL 
    SELECT c1+1,(random()*20)::integer 
      FROM tmp WHERE c1 < 20
)
INSERT INTO t1 SELECT * FROM tmp;


【□】 大規模テーブル「t2」へ100万件インサート

WITH RECURSIVE tmp(c1,c2,c3) AS (
    SELECT 1, current_timestamp,(random()*1000000)::int 
    UNION ALL 
    SELECT c1+1, c2 + '1 hours' ,(random()*1000000)::int 
      FROM tmp WHERE c1 < 1000000
)
INSERT INTO t2 SELECT * FROM tmp;



※上記で使用しているランダム値生成は簡易的なものなので注意して下さい。
  詳細は別ページにて記載してあります。



２．実行計画確認
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

【□】 EXPLAIN (ANALYZE,VERBOSE,BUFFERS) SELECT * FROM t1 LEFT JOIN t2 ON t1.c2=t2.c1;

                                                        QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.00..14357.57 rows=2140 width=24) (actual time=0.042..0.144 rows=20 loops=1)
   Output: t1.c1, t1.c2, t2.c1, t2.c2, t2.c3
   Buffers: shared hit=59 read=5
   ->  Seq Scan on public.t1  (cost=0.00..31.40 rows=2140 width=8) (actual time=0.019..0.027 rows=20 loops=1)
         Output: t1.c1, t1.c2
         Buffers: shared read=1
   ->  Index Scan using t2_pkey on public.t2  (cost=0.00..6.68 rows=1 width=16) (actual time=0.004..0.004 rows=0 loops=20)
         Output: t2.c1, t2.c2, t2.c3
         Index Cond: (t1.c2 = t2.c1)
         Buffers: shared hit=59 read=4
 Total runtime: 0.192 ms
(11 行)



上記の骨子は以下の通り

 Nested Loop Left Join 
   ->  Seq Scan on public.t1 
   ->  Index Scan using t2_pkey on public.t2
         Index Cond: (t1.c2 = t2.c1)





■３■ Hash & Hash Join
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

小さなテーブルt1、t2に対してHash Joinによる結合を行う。


１．準備
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

【□】 テーブルを作成

DROP TABLE if exists t1,t2;
CREATE TABLE t1 (c1 int PRIMARY KEY,c2 int);
CREATE TABLE t2 (c1 int PRIMARY KEY,c2 int);


【□】 小規模テーブル「t1」へ20件インサート

WITH RECURSIVE tmp(c1,c2) AS (
    SELECT 1,(random()*20)::int 
    UNION ALL 
    SELECT c1+1,(random()*20)::int 
      FROM tmp WHERE c1 < 20
)
INSERT INTO t1 SELECT * FROM tmp;

【□】 小規模テーブル「t2」へ20件インサート


WITH RECURSIVE tmp(c1,c2) AS (
    SELECT 1,(random()*20)::int 
    UNION ALL 
    SELECT c1+1,(random()*20)::int 
      FROM tmp WHERE c1 < 20
)
INSERT INTO t2 SELECT * FROM tmp;


【□】 EXPLAIN (ANALYZE,VERBOSE,BUFFERS) SELECT * FROM t1 LEFT JOIN t2 ON t1.c2=t2.c1;

                                                     QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Hash Left Join  (cost=58.15..121.65 rows=2140 width=16) (actual time=0.030..0.051 rows=20 loops=1)
   Output: t1.c1, t1.c2, t2.c1, t2.c2
   Hash Cond: (t1.c2 = t2.c1)
   Buffers: shared hit=2
   ->  Seq Scan on public.t1  (cost=0.00..31.40 rows=2140 width=8) (actual time=0.005..0.017 rows=20 loops=1)
         Output: t1.c1, t1.c2
         Buffers: shared hit=1
   ->  Hash  (cost=31.40..31.40 rows=2140 width=8) (actual time=0.015..0.015 rows=20 loops=1)
         Output: t2.c1, t2.c2
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         Buffers: shared hit=1
         ->  Seq Scan on public.t2  (cost=0.00..31.40 rows=2140 width=8) (actual time=0.002..0.003 rows=20 loops=1)
               Output: t2.c1, t2.c2
               Buffers: shared hit=1
 Total runtime: 0.078 ms



上記の骨子は以下の通り
 Hash Left Join 
   ->  Seq Scan on public.t1  
   ->  Hash 
         ->  Seq Scan on public.t2 




※上記で使用しているランダム値生成は簡易的なものなので注意して下さい。
  詳細は別ページにて記載してあります。



以上