合 PG使用插件pg_prewarm实现数据预加热
Tags: PG优化插件pg_prewarm预加热
在MySQL中,在配置参数innodb_buffer_pool_dump_at_shutdown=1
后,若在正常关闭MySQL时,就可以将内存缓冲区的信息 dump到一个文件内部(该文件名为ib_buffer_pool,为MySQL 5.6新特性),然后启动时通过加载该文件内部的块(需要配置innodb_buffer_pool_load_at_startup=1
),实现对内存缓冲区的预热,从而提高数据库重启后的查询性能。
在PostgreSQL中,也有这种功能,只需要安装pg_prewarm插件即可。更好的消息是,从PG 9.4开始,pg_prewarm插件融入了 PostgreSQL发行版中,无需下载编译安装程序。
安装插件pg_prewarm
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 | C:\Users\lhrxxt>psql -U postgres -h 192.168.66.35 -p 15433 Password for user postgres: psql (13.3) Type "help" for help. postgres=# select * from pg_available_extensions where name like '%prewarm%' order by name; name | default_version | installed_version | comment ------------+-----------------+-------------------+----------------------- pg_prewarm | 1.2 | | prewarm relation data (1 row) postgres=# \dx List of installed extensions Name | Version | Schema | Description --------------------+---------+------------+------------------------------------------------------------------------ pageinspect | 1.8 | public | inspect the contents of database pages at a low level pg_stat_statements | 1.8 | public | track planning and execution statistics of all SQL statements executed plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language (3 rows) postgres=# create extension pg_prewarm ; CREATE EXTENSION postgres=# \dx List of installed extensions Name | Version | Schema | Description --------------------+---------+------------+------------------------------------------------------------------------ pageinspect | 1.8 | public | inspect the contents of database pages at a low level pg_prewarm | 1.2 | public | prewarm relation data pg_stat_statements | 1.8 | public | track planning and execution statistics of all SQL statements executed plpgsql | 1.0 | pg_catalog | PL/pgSQL procedural language (4 rows) postgres=# \dx pg_prewarm List of installed extensions Name | Version | Schema | Description ------------+---------+--------+----------------------- pg_prewarm | 1.2 | public | prewarm relation data (1 row) postgres=# \dx+ pg_prewarm Objects in extension "pg_prewarm" Object description ------------------------------------------------------- function autoprewarm_dump_now() function autoprewarm_start_worker() function pg_prewarm(regclass,text,text,bigint,bigint) (3 rows) postgres=# \sf+ pg_prewarm CREATE OR REPLACE FUNCTION public.pg_prewarm(regclass, mode text DEFAULT 'buffer'::text, fork text DEFAULT 'main'::text, first_block bigint DEFAULT NULL::bigint, last_block bigint DEFAULT NULL::bigint) RETURNS bigint LANGUAGE c PARALLEL SAFE 1 AS '$libdir/pg_prewarm', $function$pg_prewarm$function$ |
主要函数pg_prewarm
的参数含义如下:
regclass:要做prewarm的表名
mode:prewarm模式。prefetch表示异步地将数据预加载到os cache;read表示同步预取,最终结果和 prefetch 一样,但它是同步方式,支持所有平台;buffer表示同步读入PG的shared buffer,默认为 buffer
fork:relation fork的类型。一般用main,其他类型有visibilitymap和fsm,默认为main
first_block & last_block:first_block 表示开始 prewarm 的数据块,last_block 表示最后 prewarm 的数据块。表的first_block=0,last_block可通过pg_class的relpages字段获得
RETURNS int8:函数返回pg_prewarm处理的block数目(整型),pg_prewarm 函数返回的是加载后的数据块数
autoprewarm_dump_now
表示在服务器启动期间没有配置自动预热功能时,可以使用此命令启动自动预热工作程序。autoprewarm_start_worker
立马对 autoprewarm. blocks文件进行更新,如果自动预热进程当前没有运行,那么希望在下次重启之后运行它,这样做会很有用。
pg_prewarm使用
pg_prewarm 模块可以将数据预先加载到数据库缓存,也可以预先加载到操作系统缓存。
所以,预热有两种方式,
一种是手动调用pg_prewarm函数,用于将当前所需的数据装入内存。
另一种是自动执行,要要设置shared_preload_libraries参数。设置完毕后,系统将自动运行一个后台工作进程postgres: autoprewarm master
,它定期将shared_buffers中的内容写入到文件 autoprewarm. blocks
中,以便在重新启动数据库后,快速加载该文件内部的数据块,实现预热功能。
配置shared_preload_libraries参数:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | postgres=# show shared_preload_libraries; shared_preload_libraries -------------------------- pg_stat_statements (1 row) postgres=# alter system set shared_preload_libraries=pg_stat_statements,pg_prewarm; ALTER SYSTEM [pg13@lhrpg pgdata]$ pg_ctl restart waiting for server to shut down.... done server stopped waiting for server to start....2021-08-03 17:20:04.458 CST [6464] LOG: redirecting log output to logging collector process 2021-08-03 17:20:04.458 CST [6464] HINT: Future log output will appear in directory "pg_log". done server started [pg13@lhrpg pgdata]$ postgres=# show shared_preload_libraries; shared_preload_libraries -------------------------------- pg_stat_statements, pg_prewarm (1 row) |
注意:
1、对于shared_preload_libraries,多个参数不要整体放在单引号中,可以每个单引号内一个参数,例如:
1 2 3 | alter system set shared_preload_libraries=pg_stat_statements,pg_prewarm; -- 或 alter system set shared_preload_libraries='pg_stat_statements','pg_prewarm'; |
2、变量shared_preload_libraries指定一个或者多个要在服务器启动时预载入的共享库。它包含一个由逗号分隔的库名列表,其中每个名称都会按LOAD命令的方式解析。项之间的空格会被忽略,如果需要在库名中包含空格或者逗号,请把库名放在双引号内。这个参数只能在服务器启动时设置。如果指定的库没有找到,服务器将无法启动。
3、重启数据库后可以看到多了一个进程postgres: autoprewarm master
,如下所示:
1 2 3 4 5 6 7 8 9 10 11 | [pg13@lhrpg pgdata]$ ps -ef | grep pg13 pg13 6464 0 0 17:20 ? 00:00:00 /pg13/pg13/bin/postgres pg13 6465 6464 0 17:20 ? 00:00:00 postgres: logger pg13 6467 6464 0 17:20 ? 00:00:00 postgres: checkpointer pg13 6468 6464 0 17:20 ? 00:00:00 postgres: background writer pg13 6469 6464 0 17:20 ? 00:00:00 postgres: walwriter pg13 6470 6464 0 17:20 ? 00:00:00 postgres: autovacuum launcher pg13 6471 6464 0 17:20 ? 00:00:00 postgres: stats collector pg13 6472 6464 0 17:20 ? 00:00:00 postgres: autoprewarm master pg13 6473 6464 0 17:20 ? 00:00:00 postgres: logical replication launcher pg13 6679 6464 0 17:23 ? 00:00:00 postgres: postgres postgres 172.17.0.1(33765) idle |
文件autoprewarm.blocks:
1 2 | [pg13@lhrpg pgdata]$ ll -lrt autoprewarm.blocks -rw------- 1 pg13 postgres 3549 Aug 3 17:25 autoprewarm.blocks |
功能测试
我们来使用虚拟机测试一下,把shared_buffers为128MB,我们创建一个75MB的表。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 | [pg13@lhrpg pgdata]$ pgbench -i -s5 dropping old tables... NOTICE: table "pgbench_accounts" does not exist, skipping NOTICE: table "pgbench_branches" does not exist, skipping NOTICE: table "pgbench_history" does not exist, skipping NOTICE: table "pgbench_tellers" does not exist, skipping creating tables... generating data (client-side)... 500000 of 500000 tuples (100%) done (elapsed 0.30 s, remaining 0.00 s) vacuuming... creating primary keys... done in 2.86 s (drop tables 0.00 s, create tables 0.01 s, client-side generate 1.38 s, vacuum 0.50 s, primary keys 0.97 s). [pg13@lhrpg pgdata]$ postgres=# show shared_buffers; shared_buffers ---------------- 128MB (1 row) postgres=# select pg_size_pretty(pg_total_relation_size('pgbench_accounts')); pg_size_pretty ---------------- 75 MB (1 row) postgres=# select count(*) from pgbench_accounts; count -------- 500000 (1 row) |
接下来进行全表扫描测试:
1 2 3 4 5 6 7 8 9 10 | postgres=# explain (analyze,buffers) select * from pgbench_accounts; QUERY PLAN --------------------------------------------------------------------------------------------------------------------------- Seq Scan on pgbench_accounts (cost=0.00..13197.00 rows=500000 width=97) (actual time=0.077..135.820 rows=500000 loops=1) Buffers: shared hit=2144 read=6053 Planning: Buffers: shared hit=9 dirtied=1 Planning Time: 0.196 ms Execution Time: 179.849 ms (6 rows) |
首次运行shared hit 2144,read 6053,时间需要180ms。