在Redshift上使用comment语句

一、背景

Redshift是用于大量数据分析的列式数据仓库。Redshift并不支持所有PostgreSQL的语法。

关于Redshift支持的的限制,可参考这里 Unsupported PostgreSQL functions 的文档。

在这篇文章中,明确提到Comment information functions是不被支持的。

不过,在Redshift上可使用comment命令来为数据库的表、列添加注释。

二、添加注释

1、为表添加注释

请替换如下命令和表名为实际值。

COMMENT ON TABLE category IS 'This table stores tickets category for sales';

添加完毕返回如下结果:

COMMENT executed successfully
Execution time: 0.17s

2、为列添加注释

请替换如下命令和表名为实际值。

COMMENT ON COLUMN category.catname IS 'Short name of the game and show.';

添加完毕返回如下结果:

COMMENT executed successfully
Execution time: 0.17s

三、查询注释

1、查看表定义

Redshift由于不支持PostgreSQL的注释,因此即便通过comment语句添加注释成功,再次查询表结构时候,也不会显示注释。

执行如下命令查看表定义。

show table category;

返回结果如下。

CREATE TABLE public.category (
    catid smallint NOT NULL ENCODE raw distkey,
    catgroup character varying(10) ENCODE lzo,
    catname character varying(10) ENCODE lzo,
    catdesc character varying(50) ENCODE lzo
)
DISTSTYLE KEY
SORTKEY ( catid );  

可以看到确实没有包含注释。

2、通过查看数据字典显示表注释和列注释

WITH tables AS (
    SELECT c.oid,
           ns.nspname as schema_name,
           c.relname as table_name,
           d.description as table_description
    FROM pg_catalog.pg_class AS c
    JOIN pg_catalog.pg_namespace AS ns
      ON c.relnamespace = ns.oid
    LEFT JOIN pg_catalog.pg_description d
      on c.oid = d.objoid
     and d.objsubid = 0
     WHERE ns.nspname not in ('pg_catalog') 
)
SELECT 
       c.table_schema as schema_name,
       c.table_name AS table_name,
       t.table_description,
       c.column_name AS column_name,
       c.ordinal_position,
       d.description as column_description
from tables t
join information_schema.columns c
    on c.table_schema = t.schema_name
   and c.table_name = t.table_name
left join pg_catalog.pg_description d
   ON d.objoid = t.oid
  AND d.objsubid = c.ordinal_position
  AND d.objsubid > 0
where 1=1
and coalesce(table_description, column_description) is not null
order by
    t.schema_name,
    t.table_name,
    c.ordinal_position

返回结果如下:

schema_name | table_name | table_description                            | column_name | ordinal_position | column_description              
------------+------------+----------------------------------------------+-------------+------------------+---------------------------------
public      | category   | This table stores tickets category for sales | catid       |                1 |                                 
public      | category   | This table stores tickets category for sales | catgroup    |                2 |                                 
public      | category   | This table stores tickets category for sales | catname     |                3 | Short name of the game and show.
public      | category   | This table stores tickets category for sales | catdesc     |                4 |                                 

3、显示列注释和更多信息

显示列的注释

请替换如下命令的Schema名称和表名为实际值。

SELECT c.table_schema,
       c.table_name,
       c.column_name,
       c.ordinal_position,
       c.column_default,
       c.data_type,
       d.description
FROM information_schema.columns c
  INNER JOIN pg_class c1 ON c.table_name = c1.relname
  INNER JOIN pg_catalog.pg_namespace n
          ON c.table_schema = n.nspname
         AND c1.relnamespace = n.oid
  LEFT JOIN pg_catalog.pg_description d
         ON d.objsubid = c.ordinal_position
        AND d.objoid = c1.oid
WHERE c.table_schema = 'public'
AND   c.table_name = 'category'

返回结果如下:

table_schema | table_name | column_name | ordinal_position | column_default | data_type         | description                     
-------------+------------+-------------+------------------+----------------+-------------------+---------------------------------
public       | category   | catid       |                1 |                | smallint          |                                 
public       | category   | catgroup    |                2 |                | character varying |                                 
public       | category   | catname     |                3 |                | character varying | Short name of the game and show.
public       | category   | catdesc     |                4 |                | character varying |                                 

4、只输出有注释的列

执行如下SQL。(不需要替换Schema和表名)

WITH tables AS
(
  SELECT OID, relname AS TABLE FROM pg_class
),
columns AS
(
  SELECT ordinal_position AS objsubid,
         table_name AS TABLE,
         column_name AS COLUMN
  FROM information_schema.columns
)
SELECT t.table,
       c.COLUMN,
       d.description
FROM pg_catalog.pg_description d
  LEFT JOIN tables t ON d.objoid = t.oid
  LEFT JOIN COLUMNS c
         ON d.objsubid = c.objsubid
        AND t.table = c.table
WHERE d.objsubid > 0

即可返回有注释的列。

table    | column  | description                     
---------+---------+---------------------------------
category | catname | Short name of the game and show.

四、参考文档

https://docs.aws.amazon.com/redshift/latest/dg/r_COMMENT.html

https://gist.github.com/alexanderlz/7302623