pg_rman 1.3 Official Documentation


http://ossc-db.github.io/pg_rman/index.html

Name

pg_rman -- manages backup and recovery of PostgreSQL.

Synopsis

pg_rman has the features below:

  • Takes a backup of entire database including tablespaces with just one command.
  • Can recover from a backup with just one command.
  • Supports incremental backup and compression of backup files, so backups take less disk space.
  • Manages backup versions and shows a catalog of the backups.
  • Supports storage snapshot.

DATE is the start time of the target backup in ISO-format (YYYY-MM-DD HH:MI:SS). Prefix match is used to compare DATE and backup files.
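The prefix match described above can be illustrated with a short Python sketch. It only models the matching rule stated in the text; the function name `match_backups` and the sample start times are hypothetical, and this is not pg_rman's internal code.

```python
from datetime import datetime


def match_backups(date_prefix, start_times):
    """Return the start times whose ISO-format text begins with date_prefix."""
    return [
        started
        for started in start_times
        if started.strftime("%Y-%m-%d %H:%M:%S").startswith(date_prefix)
    ]


# Hypothetical catalog contents, for illustration only.
backups = [
    datetime(2009, 9, 11, 12, 0, 0),
    datetime(2009, 9, 11, 18, 30, 0),
    datetime(2009, 10, 1, 3, 0, 0),
]

# A shorter prefix selects more backups; a full timestamp selects at most one.
same_day = match_backups("2009-09-11", backups)
exact = match_backups("2009-09-11 18:30:00", backups)
```

Under this rule, "2009-09-11" matches both backups taken on that day, while the full timestamp matches only the second one.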

pg_rman supports the following commands. See also Options for details of OPTIONS.

    • init: Initialize a backup catalog.
    • backup: Take an online backup.
    • restore: Restore from a backup.
    • show: Show backup history. The detail option shows additional information about each backup.
    • validate: Validate backup files. Backups that have not been validated cannot be used for restore or incremental backup.
    • delete: Delete backup files.
    • purge: Remove deleted backups from the backup catalog.

Description

pg_rman is a utility program to back up and restore PostgreSQL databases. It takes a physical online backup of the whole database cluster, archive WALs, and server logs.

pg_rman supports taking backups from a standby-site with PostgreSQL 9.0 or later, and also supports storage snapshot backup.

Initialize a backup catalog

First, you need to create a "backup catalog" to store backup files and their metadata.

It is recommended to set log_directory, archive_mode and archive_command in postgresql.conf before initializing the backup catalog. If these parameters are set, pg_rman can adjust its configuration file to match them. In this case, you have to specify the database cluster path for PostgreSQL, either in the PGDATA environment variable or with the -D/--pgdata option.

Backup

The mode of backup can be one of the following types.

  • Full backup
    • Backup a whole database cluster.
  • Incremental backup
    • Backup only files or pages modified after the last verified backup with the same timeline.
  • Archive WAL backup
    • Backup only archive WAL files.

pg_rman can also back up PostgreSQL server log files.

Validate backup data

It is necessary to validate the data backed up by pg_rman. pg_rman uses a file size check and CRC for validation.

It is recommended to validate backup files as soon as possible after the backup. An unvalidated backup cannot be used for restore or as the base of an incremental backup.
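As a rough illustration of what a size-plus-CRC check does, the following Python sketch compares a file's contents against a recorded size and checksum. It assumes `zlib.crc32` as a stand-in; the CRC variant pg_rman actually uses and the way it records these values in the catalog are not described here, so the helper names and sample payload are hypothetical.

```python
import zlib


def checksum(data: bytes) -> int:
    """CRC-32 of the data as an unsigned 32-bit integer (stand-in for pg_rman's CRC)."""
    return zlib.crc32(data) & 0xFFFFFFFF


def is_valid(data: bytes, expected_size: int, expected_crc: int) -> bool:
    """A file passes only if both its size and its checksum match the recorded values."""
    return len(data) == expected_size and checksum(data) == expected_crc


# Values recorded at backup time (hypothetical payload).
payload = b"example backup file contents"
recorded_size = len(payload)
recorded_crc = checksum(payload)

# Later, at validation time:
intact = is_valid(payload, recorded_size, recorded_crc)
truncated = is_valid(payload[:-1], recorded_size, recorded_crc)
altered = is_valid(payload[:-1] + b"X", recorded_size, recorded_crc)
```

The size check catches truncation cheaply, and the checksum catches changes that keep the size the same.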

View backup information

The show command outputs a list of backups.

The show detail command shows more detailed information.

The fields are:

  • StartTime : timestamp when backup was started
  • EndTime : timestamp when backup ended
  • Mode : mode of backup.
    • FULL : full backup
    • INCR : incremental backup
    • ARCH : archive WAL backup
  • Data : size of read data files
  • ArcLog : size of read archive WAL files
  • SrvLog : size of read server log files
  • Size/Total : size of backup ( = written size)
  • Compressed : whether backup is compressed or not
  • TLI/CurTLI : timeline ID of PostgreSQL
  • ParentTLI : former timeline ID of PostgreSQL
  • Status: status of backup. Possible values are:
    • OK : backup is done successfully and validated.
    • DONE : backup is done successfully.
    • RUNNING : backup is still running.
    • DELETING : backup is being deleted.
    • DELETED : backup has been deleted.
    • ERROR : some errors occur during backup.
    • CORRUPT : backup is unavailable because it does not pass validation.

In addition, when you specify the date shown in the “Start” field, you can see the detailed information of that backup.

Restore

pg_rman restores the backed-up data into the target database cluster path.

The PostgreSQL server must be stopped before restoring. In addition, do not erase the original database cluster, because pg_rman has to check the timeline ID and data checksum status from it. The restore command saves unarchived transaction logs and deletes all database files. You can retry recovery until a new backup is taken. After restoring files, pg_rman creates recovery.conf in $PGDATA. This file contains the recovery parameters, and you can modify it if needed.

pg_rman configures the GUC parameters related to recovery when restoring. The configuration file it writes depends on the versions of PostgreSQL and pg_rman, as listed below. If needed, modify the file manually, then start the server to perform PITR.

  • PostgreSQL's version is lower than 12: pg_rman creates and configures $PGDATA/recovery.conf
  • PostgreSQL's version is 12 or higher, and pg_rman's version is 1.3.12 or less: pg_rman appends recovery-related configurations to $PGDATA/postgresql.conf and creates $PGDATA/recovery.signal.
  • PostgreSQL's version is 12 or higher, and pg_rman's version is higher than 1.3.12: pg_rman creates and configures $PGDATA/pg_rman_recovery.conf, and appends include directive to the $PGDATA/postgresql.conf. If there is an include directive added when pg_rman was restored in the past, delete it. And it creates $PGDATA/recovery.signal.
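The three cases above can be summarized as a small decision function. This is only a sketch of the rules as listed; the function names are hypothetical, and the returned file names are relative to $PGDATA.

```python
def parse_version(text):
    """Turn a dotted version such as '1.3.12' into a tuple that compares numerically."""
    return tuple(int(part) for part in text.split("."))


def recovery_config_files(pg_major, pg_rman_version):
    """Files under $PGDATA that pg_rman writes at restore time, per the list above."""
    if pg_major < 12:
        return ["recovery.conf"]
    if parse_version(pg_rman_version) <= (1, 3, 12):
        # Settings are appended directly to postgresql.conf.
        return ["postgresql.conf", "recovery.signal"]
    # Settings go into a separate file pulled in by an include directive.
    return ["pg_rman_recovery.conf", "postgresql.conf", "recovery.signal"]
```

For example, `recovery_config_files(14, "1.3.14")` falls into the third case. Comparing versions as integer tuples avoids the string-comparison pitfall where "1.3.9" would sort after "1.3.12".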

It is recommended to take a full backup as soon as possible after recovery succeeds, and to manually remove the recovery-related parameters configured by pg_rman. The reason is that, since recovery.conf was integrated into postgresql.conf in PostgreSQL 12, PostgreSQL may fail to work with HA cluster software even after recovery is done. Pacemaker, an HA cluster software, first starts the PostgreSQL server as a standby and then decides whether to promote it. The server does not start properly in that case because the recovery-related parameters configured by pg_rman unexpectedly remain in effect. For example, with PostgreSQL 12 or higher and a pg_rman version higher than 1.3.12, you need to remove the include directive from $PGDATA/postgresql.conf and remove $PGDATA/pg_rman_recovery.conf.

If --recovery-target-timeline is not specified, the last checkpoint’s TimeLineID in control file ($PGDATA/global/pg_control) will be a restore target. If pg_control is not present, TimeLineID in the full backup used by the restore will be a restore target.

When specifying --recovery-target-time, make sure to specify a timestamp greater than (or equal to) the EndTime of the full backup that you want to use as the base.

If the archive WALs were not compressed at the time of backup, archive WALs that do not exist in the archive storage area are restored as symbolic links. When pg_rman is used in combination with peripheral tools (e.g. PG-REX) that are not designed for this behavior, specify the --hard-copy option to perform physical copying.

Delete backups

The delete command deletes all backup files taken before the specified date that are not required by other incremental backups. Incremental backups depend on an earlier validated full backup.

The following example deletes the backup files that are not needed to recover to 12:00 on 11 September 2009.

Remove deleted backups

Although the delete command removes the actual data from the file system, some catalog information about the deleted backups remains. To remove it, execute the purge command.

Standby-site Backup

If you use the replication feature of PostgreSQL 9.0 or later, you can take a backup from the standby-site. The basic usage is the same as with a single master server, so only the points that need attention are described here.

Archive WALs must also be taken when you take a backup of the standby-site. So, you need to prepare a shared disk or similar so that the archive area of the master can be accessed from the standby, or set archive_mode to 'always' at the standby-site.

In the latter case, copy the primary's archive WALs (including history files) when the standby-site is created, to make sure that you can back up all the files required for restoring. You can delete old archive WALs at the time of backup using --keep-arclog-files / --keep-arclog-days. However, since only the archive of the server being backed up is subject to deletion, the master's archived WALs are not deleted if you take a backup at the standby-site.

Taking a backup from the standby-site requires different options from the usual case. Specify the database cluster on the standby-site with the -D/--pgdata option, and the master-site with the connection options (-d/--dbname, -h/--host, -p/--port). In addition, specify how to connect to the standby-site with the standby connection options (--standby-host, --standby-port).

The following example assumes the environment below.

  • the hostname of master-site: master
  • the hostname of standby-site: localhost
  • the port number of standby-site: 5432
  • the database cluster path of standby-site: /home/postgres/pgdata_sby

Then, the backup from standby-site can be done with the below command:
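For scripted use, the invocation for this environment can be sketched in Python as an argument list. Every flag below appears in the Options section of this document; the function name `standby_backup_command` is hypothetical, the backup mode is assumed to be full, and the backup catalog location is assumed to come from the BACKUP_PATH environment variable.

```python
def standby_backup_command(pgdata, master_host, standby_host, standby_port, mode="full"):
    """Build the argv for a backup taken from the standby-site."""
    return [
        "pg_rman", "backup",
        f"--pgdata={pgdata}",                # database cluster on the standby-site
        f"--backup-mode={mode}",
        f"--host={master_host}",             # connection option: the master-site
        f"--standby-host={standby_host}",    # standby connection options
        f"--standby-port={standby_port}",
    ]


cmd = standby_backup_command(
    pgdata="/home/postgres/pgdata_sby",
    master_host="master",
    standby_host="localhost",
    standby_port=5432,
)
print(" ".join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a host where pg_rman is installed.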

Examples

In this example, consider a PostgreSQL server with the following configuration.

And the PGDATA and BACKUP_PATH are set as environment variables.

Initialize a backup catalog.

This creates the configuration file for pg_rman, named pg_rman.ini. All pg_rman commands load their default configuration from this file.

For this example, we use the following configuration.

Then, take a backup. The first backup must be a full backup. Here, we also back up the server log files.

Check the result by show command.

The status of the backup we have just taken is DONE, because we have not validated it yet. So, run the validate command next.

Now the status has been changed to OK.
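The init, backup, show and validate steps walked through so far can be sketched as a small Python driver. The subcommands and flags are the ones documented in this manual; the function names, the catalog path passed in, and the choice to stop at the first failure are assumptions made for illustration.

```python
import subprocess


def backup_cycle(backup_path):
    """Return the pg_rman invocations for one init/backup/show/validate cycle."""
    common = [f"--backup-path={backup_path}"]
    return [
        ["pg_rman", "init", *common],
        ["pg_rman", "backup", "--backup-mode=full", "--with-serverlog", *common],
        ["pg_rman", "show", *common],       # status is DONE at this point
        ["pg_rman", "validate", *common],   # status becomes OK on success
    ]


def run_all(commands, runner=subprocess.run):
    """Run each command in order; check=True stops at the first non-zero exit code."""
    for argv in commands:
        runner(argv, check=True)


commands = backup_cycle("/path/to/backup_catalog")  # hypothetical path
```

Calling `run_all(commands)` executes the steps on a host where pg_rman is installed and PGDATA is set. Restore is deliberately left out of this sketch, since it requires the PostgreSQL server to be stopped first.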

Now let's restore the backup data. Before doing so, the PostgreSQL server must be stopped.

pg_rman has configured the recovery-related parameters. Modify them if necessary. In this example, we use them without modification and perform PITR to the latest database state.

Options

pg_rman accepts the following command line parameters. Some of them can also be specified as environment variables.

Common options

As a general rule, paths for data location need to be specified as absolute paths; relative paths are not allowed.

  • -D PATH / --pgdata=PATH

    • The absolute path of database cluster. Required on backup and restore.
  • -A PATH / --arclog-path=PATH

    • The absolute path of archive WAL directory. Required on backup and restore.
  • -S PATH / --srvlog-path=PATH

    • The absolute path of server log directory. Required on backup with server logs and restore.
  • -B PATH / --backup-path=PATH

    • The absolute path of backup catalog. Always required.
  • -c / --check

    • If specified, pg_rman does not perform actual jobs but only checks parameters and required resources. The option is typically used with the --verbose option to verify the operation.
  • -v / --verbose

    • If specified, pg_rman works in verbose mode.
  • -P / --progress

    • If specified, pg_rman keeps showing the number of files it has processed during backup or restore.

Backup options

  • -b { full | incremental | archive } / --backup-mode={ full | incremental | archive }

    • Specify backup target files. Available options are full backup, incremental backup, and archive backup. Abbreviated forms (prefix match) are also available. For example, -b f means full backup.

      • full : Whole database backup and archive backup
      • incremental : Incremental backup and archive backup
      • archive : Only archive backup
  • -s / --with-serverlog

    • Back up server log files if specified.
  • -Z / --compress-data

    • Compress backup files with zlib if specified. When the option is omitted, no compression is performed. When compressing, all backup files are compressed, including configuration files and backup_label.
  • -C / --smooth-checkpoint

    • A checkpoint is performed on every backup. If this option is specified, a smooth checkpoint is performed. See also the second argument of pg_start_backup().
  • --keep-data-generations / --keep-data-days

    • Specify how long backed-up data files will be kept. --keep-data-generations is the number of backup generations to keep; --keep-data-days is the number of days to keep them. If both options are specified together, only old files that exceed both settings are deleted.
  • --keep-arclog-files / --keep-arclog-days

    • Specify how long already-archived WAL files will be kept. --keep-arclog-files is the number of files to keep; --keep-arclog-days is the number of days to keep them. When you take a backup, only files that have already been backed up and that exceed the specified condition are deleted from the archive log directory ($ARCLOG_PATH). If both options are given together, pg_rman deletes only files that are old enough under both conditions.
  • --keep-srvlog-files / --keep-srvlog-days

    • Specify how long backed-up server log files will be kept. --keep-srvlog-files is the number of files to keep; --keep-srvlog-days is the number of days to keep them. When you take a backup, only files that exceed the specified condition are deleted from the server log directory (log_directory). These options take effect when you specify the --with-serverlog and --srvlog-path options in the backup command. If both options are given together, pg_rman deletes only files that are old enough under both conditions.
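The "both conditions" rule shared by the --keep-* option pairs can be sketched as follows. This is a simplified model of the rule as stated above: it treats every entry as one generation or file and ignores dependencies between full and incremental backups. The function name `expired` and its parameters are hypothetical.

```python
from datetime import datetime, timedelta


def expired(timestamps, now, keep_count=None, keep_days=None):
    """Return the entries that exceed every retention limit that was given."""
    cutoff = None if keep_days is None else now - timedelta(days=keep_days)
    result = []
    for rank, stamp in enumerate(sorted(timestamps, reverse=True)):
        checks = []
        if keep_count is not None:
            checks.append(rank >= keep_count)   # beyond the newest N entries
        if cutoff is not None:
            checks.append(stamp < cutoff)       # older than DAY days
        # With both limits given, an entry is removed only if it fails both.
        if checks and all(checks):
            result.append(stamp)
    return result


now = datetime(2024, 1, 10)
stamps = [datetime(2024, 1, d) for d in (9, 8, 5, 1)]
both = expired(stamps, now, keep_count=2, keep_days=3)
```

Here `both` contains the entries from 5 and 1 January: they are outside the two newest entries and also older than three days. An entry that exceeds only one of the two limits is kept.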

Restore options

The parameters that start with --recovery are the same as the Recovery Target parameters.

  • --recovery-target-timeline TIMELINE
    • Specifies recovering into a particular timeline. If not specified, the current timeline from $PGDATA/global/pg_control is used.
  • --recovery-target-time TIMESTAMP
    • Specifies the time stamp up to which recovery will proceed. If not specified, recovery continues to the latest time.
  • --recovery-target-xid XID
    • Specifies the transaction ID up to which recovery will proceed. If not specified, recovery continues to the latest xid.
  • --recovery-target-inclusive
    • Specifies whether to stop just after the specified recovery target (true), or just before the recovery target (false). Default is true.
  • --recovery-target-action { pause | promote | shutdown }
    • Specifies what action the server should take once the recovery target is reached. The default is pause, which means recovery will be paused. promote means the recovery process will finish and the server will start to accept connections. shutdown will stop the server after reaching the recovery target. This option is available in versions higher than 1.3.12.

The following parameter determines the behavior of restore.

  • --hard-copy

    • The archive WALs are copied to the archive WAL storage area. If not specified, pg_rman makes symbolic links to the archive WALs stored in the backup catalog directory. If the files were compressed with the -Z option when the backup was taken, they are always decompressed and copied to the archive WAL storage area.
  • -G PATH / --pgconf-path=PATH

    • The recovery-related parameters are configured when restoring. If you manage postgresql.conf in a location different from the database cluster (using data_directory and so on), specify its absolute path. This option is available in versions higher than 1.3.14.

Catalog options

  • -a / --show-all
    • Show also deleted backups.

Connection options

Parameters to connect PostgreSQL server.

  • -d DBNAME / --dbname=DBNAME
    • The database name to execute pg_start_backup() and pg_stop_backup().
  • -h HOSTNAME / --host=HOSTNAME
    • Specifies the host name of the machine on which the server is running. If the value begins with a slash, it is used as the directory for the Unix domain socket.
  • -p PORT / --port=PORT
    • Specifies the TCP port or local Unix domain socket file extension on which the server is listening for connections.
  • -U USERNAME / --username=USERNAME
    • User name to connect as.
  • -w / --no-password
    • Never issue a password prompt. If the server requires password authentication and a password is not available by other means such as a .pgpass file, the connection attempt will fail. This option can be useful in batch jobs and scripts where no user is present to enter a password.
  • -W / --password
    • Force pg_rman to prompt for a password before connecting to a database. This option is never essential, since pg_rman will automatically prompt for a password if the server demands password authentication. However, pg_rman will waste a connection attempt finding out that the server wants a password. In some cases it is worth typing -W to avoid the extra connection attempt.

Standby connection options

Parameters to connect standby server. They are used only when you get backup from the standby-site.

  • --standby-host
    • Specifies the host name of the machine on which the standby server is running. If the value begins with a slash, it is used as the directory for the Unix domain socket.
  • --standby-port
    • Specifies the TCP port or local Unix domain socket file extension on which the server is listening for connections.

Generic options

  • --help
    • Print help, then exit.
  • -V / --version
    • Print version information, then exit.
  • -! / --debug
    • Show debug information.

Way to pass options

Some parameters can be specified as command line arguments, environment variables, or in the configuration file, as follows:

Short | Long | Environment variable | Conf file | Description | Remarks
------|------|----------------------|-----------|-------------|--------
-h | --host | PGHOST | | database server host or socket directory |
-p | --port | PGPORT | | database server port |
-d | --dbname | PGDATABASE | | database to connect |
-U | --username | PGUSER | | user name to connect as |
 | | PGPASSWORD | | password used to connect |
-w | --no-password | | | never prompt for password |
-W | --password | | | force password prompt |
-D | --pgdata | PGDATA | Yes | location of the database storage area |
-B | --backup-path | BACKUP_PATH | Yes | location of the backup storage area |
-A | --arclog-path | ARCLOG_PATH | Yes | location of archive WAL storage area |
-S | --srvlog-path | SRVLOG_PATH | Yes | location of server log storage area |
-b | --backup-mode | BACKUP_MODE | Yes | backup mode (full, incremental, or archive) |
-s | --with-serverlog | WITH_SERVERLOG | Yes | also backup server log files | specify boolean type in environment variable or configuration file
-Z | --compress-data | COMPRESS_DATA | Yes | compress data backup with zlib | specify boolean type in environment variable or configuration file
-C | --smooth-checkpoint | SMOOTH_CHECKPOINT | Yes | do smooth checkpoint before backup | specify boolean type in environment variable or configuration file
 | --standby-host | STANDBY_HOST | Yes | standby server host or socket directory |
 | --standby-port | STANDBY_PORT | Yes | standby server port |
 | --keep-data-generations | KEEP_DATA_GENERATIONS | Yes | keep GENERATION of full data backup |
 | --keep-data-days | KEEP_DATA_DAYS | Yes | keep enough data backup to recover to DAY days ago |
 | --keep-srvlog-files | KEEP_SRVLOG_FILES | Yes | keep NUM of serverlogs |
 | --keep-srvlog-days | KEEP_SRVLOG_DAYS | Yes | keep serverlog modified in DAY days |
 | --keep-arclog-files | KEEP_ARCLOG_FILES | Yes | keep NUM of archived WAL |
 | --keep-arclog-days | KEEP_ARCLOG_DAYS | Yes | keep archived WAL modified in DAY days |
 | --recovery-target-timeline | RECOVERY_TARGET_TIMELINE | Yes | recovering into a particular timeline |
 | --recovery-target-xid | RECOVERY_TARGET_XID | Yes | transaction ID up to which recovery will proceed |
 | --recovery-target-time | RECOVERY_TARGET_TIME | Yes | time stamp up to which recovery will proceed |
 | --recovery-target-inclusive | RECOVERY_TARGET_INCLUSIVE | Yes | whether we stop just after the recovery target |
 | --recovery-target-action | RECOVERY_TARGET_ACTION | Yes | action the server should take once the recovery target is reached | available in versions higher than 1.3.12
 | --hard-copy | HARD_COPY | Yes | how to restore archive WAL | specify boolean type in environment variable or configuration file
  • The names of variables in the configuration file are the same as the long option names or the names of the environment variables.
  • For security reasons, the password cannot be specified on the command line or in the configuration file.

This utility, like most other PostgreSQL utilities, also uses the environment variables supported by libpq (see Environment Variables).

Restrictions

pg_rman has the following restrictions.

  • pg_rman must be able to read the database cluster directory and write to the backup catalog directory. For example, you need to mount the disk where the backup catalog is placed over NFS from the database server.
  • The block sizes of pg_rman and the server must match. BLCKSZ and XLOG_BLCKSZ must also match.
  • If there are unreadable files or directories in the database cluster directory, WAL directory or archived WAL directory, the backup or restore will fail.
  • When taking an incremental backup, pg_rman checks whether the timeline ID of the target database is the same as that of the full backup in the backup list. However, pg_rman does not check whether the data itself matches that full backup. So, it is possible to take an incremental backup on top of a full backup from a database that has the same timeline ID but different data.

When taking a backup from the standby-site, pg_rman has the following additional restrictions.

  • The replication environment must be set up correctly, or the backup will not finish.
  • You cannot take backups on the master and a standby at the same time.
  • You cannot take backups on multiple standbys at the same time either.
  • Basically, a backup from the standby-site is used for restoring on the MASTER. pg_rman does not automatically treat the backup as one to be restored on a standby.
  • If you want to restore the backup on a STANDBY, you have to manage the archive logs yourself.

When using storage snapshots, pg_rman has the following additional restrictions.

  • If your snapshot does not carry file update times, an incremental backup is the same as a full backup.
  • This is because pg_rman decides between full and incremental backup based on file update times. If files have no update time because of the storage snapshot specification, pg_rman performs a full backup every time.
  • You cannot take a backup from storage that is running on one side only with a split mirror snapshot.
  • Before you execute pg_rman, you should perform a storage “RESYNC”.
  • After pg_rman performs a backup with a split mirror snapshot, the storage will be “SPLITTED” (running on one side).
    • pg_rman performs the SPLIT command to take the snapshot, but does not perform the RESYNC command.
  • You cannot take snapshots from storage of different vendors at the same time.
  • You cannot use storage from vendors that require different commands for taking snapshots.
  • The script and commands for taking storage snapshots must be executable.
  • Root authority is expected to be needed for taking snapshots or mounting volumes. So the user who runs pg_rman must be granted permission to execute every command in the script.
  • If you use LVM (Logical Volume Manager), root authority is needed for the mount, umount, lvcreate, lvremove and lvscan commands. Grant the user these commands through sudo so that they can be executed without a password.

Details

Recovery to Point-in-Time

pg_rman can recover to a point in time if a timeline, transaction ID, or timestamp is specified at recovery. pg_xlogdump (PostgreSQL 9.3 or later; renamed pg_waldump in PostgreSQL 10) or xlogdump (9.2 or earlier) is a useful tool to check the contents of WAL files and determine the recovery target. See Continuous Archiving and Point-in-Time Recovery (PITR) for details.

Configuration file

Parameters are specified in the form “name=value” in the configuration file. Quotes are required if the value contains whitespace. Comments start with “#”. Spaces and tabs are ignored except inside values.
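The format rules above can be made concrete with a minimal parser sketch in Python. It follows only the rules stated here; how pg_rman treats escape sequences, or a "#" that follows an unquoted value, is not specified in this document, so those details are assumptions. The function names and the sample settings are illustrative.

```python
def parse_value(rest):
    """Extract a value: a quoted string is taken as-is, otherwise text up to a '#'."""
    rest = rest.strip()
    if rest[:1] in ("'", '"'):
        end = rest.find(rest[0], 1)
        if end != -1:
            return rest[1:end]
    # Assumption: a trailing '#' starts a comment after an unquoted value.
    return rest.split("#", 1)[0].strip()


def parse_config(text):
    """Parse 'name=value' lines, skipping blank lines and '#' comment lines."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, rest = line.partition("=")
        if sep:
            settings[name.strip()] = parse_value(rest)
    return settings


sample = """
# retention settings
BACKUP_MODE = F   # abbreviated form of full
ARCLOG_PATH = '/home/postgres/arc log'
KEEP_DATA_DAYS=10
"""
config = parse_config(sample)
```

All values are returned as strings; converting booleans and integers is left to the caller.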

Exit codes

pg_rman returns exit codes for each error status.

Code | Name | Description
-----|------|------------
0 | SUCCESS | Succeeded.
1 | HELP | Print a help, then exit.
2 | ERROR | Generic error.
3 | FATAL | Exit because of repeated errors
4 | PANIC | Unknown critical condition.
10 | ERROR_SYSTEM | I/O or system error.
11 | ERROR_NOMEM | Out of memory.
12 | ERROR_ARGS | Invalid input parameters.
13 | ERROR_INTERRUPTED | Interrupted by user. (Ctrl+C etc.)
14 | ERROR_PG_COMMAND | SQL error.
15 | ERROR_PG_CONNECT | Cannot connect to PostgreSQL server.
20 | ERROR_ARCHIVE_FAILED | Cannot archive WAL files.
21 | ERROR_NO_BACKUP | Backup file not found.
22 | ERROR_CORRUPTED | Backup file is broken.
23 | ERROR_ALREADY_RUNNING | Cannot start because another pg_rman is running.
24 | ERROR_PG_INCOMPATIBLE | Version conflicted with PostgreSQL server.
25 | ERROR_PG_RUNNING | Cannot restore because PostgreSQL server is running.
26 | ERROR_PID_BROKEN | postmaster.pid file is broken.
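When pg_rman is driven from a script, these codes can be mapped to names for logging or branching. The sketch below copies the values from the table above; the class name `PgRmanExit` and the helper `describe` are hypothetical.

```python
from enum import IntEnum


class PgRmanExit(IntEnum):
    """Exit codes as listed in the table above."""
    SUCCESS = 0
    HELP = 1
    ERROR = 2
    FATAL = 3
    PANIC = 4
    ERROR_SYSTEM = 10
    ERROR_NOMEM = 11
    ERROR_ARGS = 12
    ERROR_INTERRUPTED = 13
    ERROR_PG_COMMAND = 14
    ERROR_PG_CONNECT = 15
    ERROR_ARCHIVE_FAILED = 20
    ERROR_NO_BACKUP = 21
    ERROR_CORRUPTED = 22
    ERROR_ALREADY_RUNNING = 23
    ERROR_PG_INCOMPATIBLE = 24
    ERROR_PG_RUNNING = 25
    ERROR_PID_BROKEN = 26


def describe(returncode):
    """Name for a pg_rman exit code, or a placeholder for codes not in the table."""
    try:
        return PgRmanExit(returncode).name
    except ValueError:
        return f"UNKNOWN({returncode})"
```

A wrapper might, for instance, retry later on `ERROR_ALREADY_RUNNING` but alert immediately on `ERROR_CORRUPTED`, using the `returncode` attribute of a completed `subprocess.run` call.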

External Scripts

This is the script for taking snapshots and mounting file systems. If you want to add your own external script, write it to conform to the external script interface, referring to the manuals of your storage. See Interface Specification for what you need to implement.

The external script performs the operations needed to take several snapshots in a single execution.

To use an external script, place it in the backup catalog directory and name it “snapshot_script”.

A sample external script is provided for LVM (Logical Volume Manager).

Commands Specification

  • Input

    • first argument

    • identifier for performing

    • second argument

    • If cleanup is specified, an error does not stop the process; only warning messages are output.
    • this argument is for resync, umount, unfreeze only.
  • Output

    • Making snapshot volume by performing split operation.
    • Freezing filesystem I/O by performing freeze operation.
    • Mounting to snapshot volume by performing mount operation.
    • standard output
    • output tablespace name in the snapshot by performing split operation.
    • output tablespace name and its directory by performing mount operation. The template is =
    • If the command completes without any errors, output “SUCCESS”; otherwise output nothing. If the command is split or mount, output it on the last line.
    • standard error output
    • output log messages.
    • return value
    • Not Specified.

Interface Specification

    • Making procedure for getting snapshots.
    • Output tablespace name in the snapshots to standard output.
    • If database cluster is included in a snapshot, the tablespace name should be “PG-DATA”.
    • If some tablespaces are included in a snapshot, output them with multi lines.
    • Making procedure for removing snapshot made by split operation.
    • If you don’t need to remove snapshot, like split mirror snapshot, you don’t make any procedure.
    • If cleanup is specified and occurring errors, output warning messages and continue to get rest snapshots.
    • Making procedure for mounting the snapshot made by split operation to the filesystem.
    • output tablespace name and its directory(absolute path) to standard output. The template is [=].
    • If database cluster is included in a snapshot, the tablespace name should be “PG-DATA”.
    • If some tablespaces are included in a snapshot, output them with multi lines.
    • Making procedure for unmounting the snapshot made by mount operation.
    • If cleanup is specified and occurring errors, output warning messages and continue to unmount rest snapshots.
    • Making procedure for freezing filesystem IO.
    • If your storage doesn’t need to freeze filesystem IO, you don’t make any procedure.
    • Making procedure for unfreezing filesystem IO by performing freeze operation.
    • If your storage doesn’t need to freeze filesystem IO, you don’t make any procedure.
    • If cleanup is specified and occurring errors, output warning messages and continue to unfreeze rest filesystems.
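To illustrate the output contract of the mount operation, the following Python sketch parses what such a script writes to standard output. It assumes each mapping line has the form name=directory, which is inferred from the description above since the template placeholders are not shown, and that “SUCCESS” is the last line. The function name `parse_mount_output` is hypothetical.

```python
def parse_mount_output(stdout):
    """Parse mount-operation output into {tablespace name: mounted directory}."""
    lines = [line.strip() for line in stdout.splitlines() if line.strip()]
    # The script reports success only by printing SUCCESS on its last line.
    if not lines or lines[-1] != "SUCCESS":
        raise RuntimeError("snapshot script did not report SUCCESS")
    mounts = {}
    for line in lines[:-1]:
        name, sep, directory = line.partition("=")
        if not sep:
            raise ValueError(f"unexpected line: {line!r}")
        mounts[name] = directory
    return mounts


# PG-DATA is the name used when the database cluster itself is in the snapshot.
mounts = parse_mount_output("PG-DATA=/mnt/snapshot_lvm/pgdata\nSUCCESS\n")
```

Because the return value of the script is not specified, the sketch relies only on standard output to decide whether the operation succeeded.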

Explanation for sample script for LVM(Logical Volume Manager)

  • split

    • perform lvcreate command as root authority against a volume for getting snapshot.

    • Above example is getting snapshot for logical volume “LogVolume00”.

    • The snapshot name, the snapshot size and the logical volume to snapshot are configurable.

    • Output tablespace names for backup to standard output.

  • resync

    • perform lvremove command as root authority against a volume for getting snapshot.

  • mount

    • perform mount command as root authority against a volume for getting snapshot.

    • Above example is mounting snapshot volume made by split operation to “/mnt/snapshot_lvm/pgdata”.

    • Mounting directory name can be parameterized.

    • Output tablespace names and the corresponding mount directory names to standard output.

  • umount

    • perform umount command as root authority against a volume for getting snapshot.

  • freeze

    • Nothing to do (no file system freeze is needed).
  • unfreeze

    • Nothing to do (no file system freeze is needed).

Download

pg_rman RPM packages and source code are available from the project's download page.

Installation

pg_rman can be installed in the same way as standard contrib modules.

No need to register to databases.

Build from source

The module can be built with pgxs.

Install from rpm package

Download the rpm whose name contains the PostgreSQL version and OS version of your environment.

Requirements

  • PostgreSQL

    PostgreSQL 12, 13, 14, 15, 16

  • OS

    RHEL 7, 8, 9

See Also

Backup and Restore
