对MySQL数据库特定字段进行全文检索
一、背景
现有数据库某字段保存有大量文本,现在需要对多条数据进行全文检索,以确定哪些条数据包含关键字。此时可使用MySQL全文检索。在搜索之前,需要为响应的字段增加索引,这一步会产生较大的系统资源消耗,请谨慎操作。
使用方法如下文。
二、确认要全文检索的字段
首先查看表结构。
use wordpress;
desc wp_posts;
返回结果如下。
+-----------------------+-----------------+------+-----+---------------------+----------------+
| Field | Type | Null | Key | Default | Extra |
+-----------------------+-----------------+------+-----+---------------------+----------------+
| ID | bigint unsigned | NO | PRI | NULL | auto_increment |
| post_author | bigint unsigned | NO | MUL | 0 | |
| post_date | datetime | NO | | 0000-00-00 00:00:00 | |
| post_date_gmt | datetime | NO | | 0000-00-00 00:00:00 | |
| post_content | longtext | NO | MUL | NULL | |
| post_title | text | NO | | NULL | |
| post_excerpt | text | NO | | NULL | |
| post_status | varchar(20) | NO | | publish | |
| comment_status | varchar(20) | NO | | open | |
| ping_status | varchar(20) | NO | | open | |
| post_password | varchar(255) | NO | | | |
| post_name | varchar(200) | NO | MUL | | |
| to_ping | text | NO | | NULL | |
| pinged | text | NO | | NULL | |
| post_modified | datetime | NO | | 0000-00-00 00:00:00 | |
| post_modified_gmt | datetime | NO | | 0000-00-00 00:00:00 | |
| post_content_filtered | longtext | NO | | NULL | |
| post_parent | bigint unsigned | NO | MUL | 0 | |
| guid | varchar(255) | NO | | | |
| menu_order | int | NO | | 0 | |
| post_type | varchar(20) | NO | MUL | post | |
| post_mime_type | varchar(100) | NO | | | |
| comment_count | bigint | NO | | 0 | |
+-----------------------+-----------------+------+-----+---------------------+----------------+
23 rows in set (0.06 sec)
执行SQL语句,确认内容是在正确的字段。
select * from wp_posts where post_status='publish' limit 1
查询后,发现内容是存储在名为post_content
字段。
三、查询字段是否有索引
以表明wp_posts
为例,执行如下命令:
show index from wp_posts;
返回结果如下:
mysql> show index from wp_posts;
+----------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment | Index_comment | Visible | Expression |
+----------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
| wp_posts | 0 | PRIMARY | 1 | ID | A | 3731 | NULL | NULL | | BTREE | | | YES | NULL |
| wp_posts | 1 | post_name | 1 | post_name | A | 1907 | 191 | NULL | | BTREE | | | YES | NULL |
| wp_posts | 1 | type_status_date | 1 | post_type | A | 11 | NULL | NULL | | BTREE | | | YES | NULL |
| wp_posts | 1 | type_status_date | 2 | post_status | A | 12 | NULL | NULL | | BTREE | | | YES | NULL |
| wp_posts | 1 | type_status_date | 3 | post_date | A | 3731 | NULL | NULL | | BTREE | | | YES | NULL |
| wp_posts | 1 | type_status_date | 4 | ID | A | 3731 | NULL | NULL | | BTREE | | | YES | NULL |
| wp_posts | 1 | post_parent | 1 | post_parent | A | 342 | NULL | NULL | | BTREE | | | YES | NULL |
| wp_posts | 1 | post_author | 1 | post_author | A | 2 | NULL | NULL | | BTREE | | | YES | NULL |
+----------+------------+------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+---------------+---------+------------+
8 rows in set (0.04 sec)
mysql>
可看到这个索引列表中Column_name
字段名称位置,没有上一步搜索的post_content
字段。因此就需要为其添加索引。
四、为字段增加索引
现在给post_content
字段添加索引。命令如下。
ALTER TABLE wp_posts
ADD FULLTEXT INDEX blog_text (post_content);
根据数据量,这一步消耗的时间可能很长,对CPU消耗很大,需要在非生产高峰、维护窗口执行。CPU使用情况可以通过RDS界面进行监控。
本例中数据量很小,因此几十秒执行完毕,返回结果如下:
Query OK, 0 rows affected, 1 warning (31.13 sec)
Records: 0 Duplicates: 0 Warnings: 1
五、使用全文检索
查询语句如下,其中post_content
是列的名字,查询内容是关键字video
。命令如下。统计有多少条记录匹配。
SELECT count(post_title)
FROM wp_posts
WHERE MATCH(post_content) AGAINST('video')
and post_status='publish';
返回结果有49条。
+---------------------+
| count(post_content) |
+---------------------+
| 49 |
+---------------------+
1 row in set (0.06 sec)
查询其标题。
SELECT post_title
FROM wp_posts
WHERE MATCH(post_content) AGAINST('video')
and post_status='publish'
ORDER by post_modified desc
limit 10;
由此确认全文检索正常。
最后修改于 2025-09-04