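Installing and Configuring Greenplum on an MPP Architecture: Advanced Edition (Enterprise Configuration)
Tags: Original, Greenplum, Installation and Deployment, MPP, Distributed Database, Enterprise Data Warehouse
Introduction
Operating system requirements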
Greenplum Database 6 runs on the following operating system platforms:
- Red Hat Enterprise Linux 64-bit 8.7 or later (As of Greenplum Database version 6.20. See the following Note)
- Red Hat Enterprise Linux 64-bit 7.x (See the following Note.)
- Red Hat Enterprise Linux 64-bit 6.x
- CentOS 64-bit 7.x
- CentOS 64-bit 6.x
- Ubuntu 18.04 LTS
- Oracle Linux 64-bit 7, using the Red Hat Compatible Kernel (RHCK)
Dependency package requirements
GPDB 6 on RHEL/CentOS 6/7 systems
Greenplum Database 6 requires the following software packages on RHEL/CentOS 6/7 systems (they are installed automatically as dependencies when you install the Greenplum RPM package):
- apr
- apr-util
- bash
- bzip2
- curl
- krb5
- libcgroup (RHEL/CentOS 6)
- libcgroup-tools (RHEL/CentOS 7)
- libcurl
- libevent
- libxml2
- libyaml
- zlib
- openldap
- openssh-client
- openssl
- openssl-libs (RHEL/CentOS 7)
- perl
- readline
- rsync
- R
- sed (used by gpinitsystem)
- tar
- zip
VMware Greenplum Database 6 client software requires these operating system packages:
- apr
- apr-util
- libyaml
- libevent
Greenplum Database 6 uses Python 2.7.18, which is included with the product installation (and not installed as a package dependency).
Software download
The software can be downloaded from either of two places:
1. Download the RPM package from the Greenplum GitHub releases page (https://github.com/greenplum-db/gpdb/releases).
2. Register and log in to the Pivotal website (https://network.pivotal.io/products/vmware-tanzu-greenplum) and download it there.
The installation package is about 66 MB:

    wget https://github.com/greenplum-db/gpdb/releases/download/6.25.3/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm

    # Same file through a download accelerator
    wget https://ghproxy.com/https://github.com/greenplum-db/gpdb/releases/download/6.25.3/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm
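Environment provisioning
This article uses a cluster with 1 master, 1 standby master, and 4 segment hosts as the example. All nodes run CentOS 7.6.1810, and the environment is initialized with Docker.
Each segment host runs 4 primary segment instances and 4 mirror segment instances, 8 instances in total.
The cluster therefore has 16 primary instances and 16 mirror instances, plus 1 master instance and 1 standby master instance, for a total of 34 database instances.
The master serves clients on port 5432. On each segment host, the primaries use ports 6000-6003 and the corresponding mirrors use ports 7000-7003.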
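The instance count and port ranges above follow from simple arithmetic. A minimal sketch, assuming illustrative variable names (they are not Greenplum parameters) and that each segment host numbers its ports consecutively from the base port:

```shell
# Illustrative variables only; the values come from the cluster plan above.
seg_hosts=4              # sdw1..sdw4
primaries_per_host=4     # primary segment instances per host
primary_port_base=6000
mirror_port_base=7000

primary_total=$((seg_hosts * primaries_per_host))
mirror_total=$primary_total                              # one mirror per primary
total_instances=$((primary_total + mirror_total + 2))    # plus master and standby master

echo "primaries=$primary_total mirrors=$mirror_total total=$total_instances"

# Ports used on every segment host: base, base+1, ...
i=0
while [ "$i" -lt "$primaries_per_host" ]; do
    echo "primary port $((primary_port_base + i)), mirror port $((mirror_port_base + i))"
    i=$((i + 1))
done
```

Changing `seg_hosts` or `primaries_per_host` shows how quickly the instance count grows when planning a larger cluster.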
IP | Hostname | Memory | CPU | Disk | OS | Data directory | Ports | Remarks |
---|---|---|---|---|---|---|---|---|
172.72.6.50 | mdw | 64G | 32 | 200G | CentOS 7.6 | /opt/greenplum/data/master | 5432, 28080 | Master host |
172.72.6.51 | smdw | 64G | 32 | 200G | CentOS 7.6 | /opt/greenplum/data/master | 5432, 28080 | Standby host |
172.72.6.52 | sdw1 | 64G | 32 | 4T | CentOS 7.6 | /opt/greenplum/data/primary /opt/greenplum/data/mirror | Primary: 6000-6003 Mirror: 7000-7003 | Segment host 1, with 4 primary and 4 mirror instances |
172.72.6.53 | sdw2 | 64G | 32 | 4T | CentOS 7.6 | /opt/greenplum/data/primary /opt/greenplum/data/mirror | Primary: 6000-6003 Mirror: 7000-7003 | Segment host 2, with 4 primary and 4 mirror instances |
172.72.6.54 | sdw3 | 64G | 32 | 4T | CentOS 7.6 | /opt/greenplum/data/primary /opt/greenplum/data/mirror | Primary: 6000-6003 Mirror: 7000-7003 | Segment host 3, with 4 primary and 4 mirror instances |
172.72.6.55 | sdw4 | 64G | 32 | 4T | CentOS 7.6 | /opt/greenplum/data/primary /opt/greenplum/data/mirror | Primary: 6000-6003 Mirror: 7000-7003 | Segment host 4, with 4 primary and 4 mirror instances |
172.72.6.59 | | | | | | | | VIP, floats between mdw and smdw |
    # Network
    docker network create --subnet=172.72.0.0/16 lhrnw

    docker rm -f mdw
    docker run -itd --name mdw -h mdw \
      --net=lhrnw --ip 172.72.6.50 \
      -p 64350:5432 -p 28080:28080 \
      -v /sys/fs/cgroup:/sys/fs/cgroup \
      --privileged=true \
      --add-host='mdw:172.72.6.50' \
      --add-host='smdw:172.72.6.51' \
      --add-host='sdw1:172.72.6.52' \
      --add-host='sdw2:172.72.6.53' \
      --add-host='sdw3:172.72.6.54' \
      --add-host='sdw4:172.72.6.55' \
      lhrbest/lhrcentos76:9.2 \
      /usr/sbin/init

    docker rm -f smdw
    docker run -itd --name smdw -h smdw \
      --net=lhrnw --ip 172.72.6.51 \
      -p 64351:5432 -p 28081:28080 \
      -v /sys/fs/cgroup:/sys/fs/cgroup \
      --privileged=true \
      --add-host='mdw:172.72.6.50' \
      --add-host='smdw:172.72.6.51' \
      --add-host='sdw1:172.72.6.52' \
      --add-host='sdw2:172.72.6.53' \
      --add-host='sdw3:172.72.6.54' \
      --add-host='sdw4:172.72.6.55' \
      lhrbest/lhrcentos76:9.2 \
      /usr/sbin/init

    docker rm -f sdw1
    docker run -itd --name sdw1 -h sdw1 \
      --net=lhrnw --ip 172.72.6.52 \
      -v /sys/fs/cgroup:/sys/fs/cgroup \
      --privileged=true \
      --add-host='mdw:172.72.6.50' \
      --add-host='smdw:172.72.6.51' \
      --add-host='sdw1:172.72.6.52' \
      --add-host='sdw2:172.72.6.53' \
      --add-host='sdw3:172.72.6.54' \
      --add-host='sdw4:172.72.6.55' \
      lhrbest/lhrcentos76:9.2 \
      /usr/sbin/init

    docker rm -f sdw2
    docker run -itd --name sdw2 -h sdw2 \
      --net=lhrnw --ip 172.72.6.53 \
      -v /sys/fs/cgroup:/sys/fs/cgroup \
      --privileged=true \
      --add-host='mdw:172.72.6.50' \
      --add-host='smdw:172.72.6.51' \
      --add-host='sdw1:172.72.6.52' \
      --add-host='sdw2:172.72.6.53' \
      --add-host='sdw3:172.72.6.54' \
      --add-host='sdw4:172.72.6.55' \
      lhrbest/lhrcentos76:9.2 \
      /usr/sbin/init

    docker rm -f sdw3
    docker run -itd --name sdw3 -h sdw3 \
      --net=lhrnw --ip 172.72.6.54 \
      -v /sys/fs/cgroup:/sys/fs/cgroup \
      --privileged=true \
      --add-host='mdw:172.72.6.50' \
      --add-host='smdw:172.72.6.51' \
      --add-host='sdw1:172.72.6.52' \
      --add-host='sdw2:172.72.6.53' \
      --add-host='sdw3:172.72.6.54' \
      --add-host='sdw4:172.72.6.55' \
      lhrbest/lhrcentos76:9.2 \
      /usr/sbin/init

    docker rm -f sdw4
    docker run -itd --name sdw4 -h sdw4 \
      --net=lhrnw --ip 172.72.6.55 \
      -v /sys/fs/cgroup:/sys/fs/cgroup \
      --privileged=true \
      --add-host='mdw:172.72.6.50' \
      --add-host='smdw:172.72.6.51' \
      --add-host='sdw1:172.72.6.52' \
      --add-host='sdw2:172.72.6.53' \
      --add-host='sdw3:172.72.6.54' \
      --add-host='sdw4:172.72.6.55' \
      lhrbest/lhrcentos76:9.2 \
      /usr/sbin/init

    # Copy the Greenplum RPM into every container
    docker cp /soft/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm mdw:/soft/
    docker cp /soft/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm smdw:/soft/
    docker cp /soft/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm sdw1:/soft/
    docker cp /soft/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm sdw2:/soft/
    docker cp /soft/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm sdw3:/soft/
    docker cp /soft/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm sdw4:/soft/
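The six `docker run` commands differ only in host name, IP address, and (for mdw and smdw) the published ports. A minimal dry-run sketch of how the common part could be generated from one host list follows. It only prints the commands, omits the `-p` port mappings, and the function name `print_run_cmd` is hypothetical:

```shell
# Dry run: prints one "docker run" command per host instead of executing it.
# Host names, IPs, network and image are copied from the commands above.
hosts="mdw:172.72.6.50 smdw:172.72.6.51 sdw1:172.72.6.52 sdw2:172.72.6.53 sdw3:172.72.6.54 sdw4:172.72.6.55"

# Build the shared --add-host arguments once.
add_hosts=""
for entry in $hosts; do
    add_hosts="$add_hosts --add-host=$entry"
done
add_hosts=${add_hosts# }    # drop the leading space

print_run_cmd() {
    # $1 = host name, $2 = IP address
    echo "docker run -itd --name $1 -h $1 --net=lhrnw --ip $2" \
         "-v /sys/fs/cgroup:/sys/fs/cgroup --privileged=true" \
         "$add_hosts lhrbest/lhrcentos76:9.2 /usr/sbin/init"
}

for entry in $hosts; do
    # Split "name:ip" with POSIX parameter expansion.
    print_run_cmd "${entry%%:*}" "${entry#*:}"
done
```

Keeping the host list in one place avoids the six-way copy and paste, at the cost of handling the master-only port mappings separately.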
Docker environment:

    [root@lhrxxt ~]# docker ps
    CONTAINER ID   IMAGE                     COMMAND            CREATED          STATUS          PORTS                                                           NAMES
    0480b0933003   lhrbest/lhrcentos76:9.2   "/usr/sbin/init"   3 seconds ago    Up 2 seconds    22/tcp, 3389/tcp                                                sdw4
    565d75621bae   lhrbest/lhrcentos76:9.2   "/usr/sbin/init"   5 seconds ago    Up 4 seconds    22/tcp, 3389/tcp                                                sdw3
    5a3fcdb956ab   lhrbest/lhrcentos76:9.2   "/usr/sbin/init"   5 seconds ago    Up 4 seconds    22/tcp, 3389/tcp                                                sdw2
    a514d478aa22   lhrbest/lhrcentos76:9.2   "/usr/sbin/init"   6 seconds ago    Up 5 seconds    22/tcp, 3389/tcp                                                sdw1
    19b1d6b0ef9b   lhrbest/lhrcentos76:9.2   "/usr/sbin/init"   14 seconds ago   Up 13 seconds   22/tcp, 3389/tcp, 0.0.0.0:64351->5432/tcp, :::64351->5432/tcp   smdw
    6692b298d8cd   lhrbest/lhrcentos76:9.2   "/usr/sbin/init"   15 seconds ago   Up 14 seconds   22/tcp, 3389/tcp, 0.0.0.0:64350->5432/tcp, :::64350->5432/tcp   mdw
Set the hostnames

    # Run each command on the corresponding host only
    hostnamectl set-hostname mdw
    hostnamectl set-hostname smdw
    hostnamectl set-hostname sdw1
    hostnamectl set-hostname sdw2
    hostnamectl set-hostname sdw3
    hostnamectl set-hostname sdw4
Adjust kernel parameters
In a Docker environment, the following settings must be applied on the Docker host.

    # The blockdev read-ahead size should be set to 16384
    /sbin/blockdev --getra /dev/sdb
    /sbin/blockdev --setra 16384 /dev/sdb

    echo 'export LANG=en_US.UTF-8' >> /etc/profile

    # Disable SELinux
    sed -i s/SELINUX=enforcing/SELINUX=disabled/g /etc/selinux/config
    setenforce 0

    # Firewall: open all ports, then stop and disable the service
    systemctl status firewalld.service
    systemctl disable firewalld.service
    systemctl start firewalld
    firewall-cmd --add-port=0-65535/tcp --permanent
    firewall-cmd --add-port=0-65535/udp --permanent
    firewall-cmd --reload
    firewall-cmd --list-ports
    systemctl stop firewalld.service

    # Keep IPC objects when a user logs out
    sed -i 's/.*RemoveIPC.*/RemoveIPC=no/' /etc/systemd/logind.conf
    echo "RemoveIPC=no" >> /usr/lib/systemd/system/systemd-logind.service
    systemctl daemon-reload
    systemctl restart systemd-logind.service

    # Resource limits
    ls -l /lib64/security/pam_limits.so
    echo "session required /lib64/security/pam_limits.so" >> /etc/pam.d/login

    cat >> /etc/security/limits.conf <<"EOF"
    * soft nofile 655350
    * hard nofile 655350
    * soft nproc 655350
    * hard nproc 655350
    gpadmin soft priority -20
    EOF

    sed -i 's/4096/655350/' /etc/security/limits.d/20-nproc.conf
    cat /etc/security/limits.d/20-nproc.conf

    ulimit -HSn 65535

    # Kernel parameters
    cat >> /etc/sysctl.conf <<"EOF"
    fs.file-max=9000000
    fs.inotify.max_user_instances = 1000000
    fs.inotify.max_user_watches = 1000000
    kernel.pid_max=4194304
    kernel.shmmax = 4398046511104
    kernel.shmmni = 4096
    kernel.shmall = 4000000000
    kernel.sem = 32000 1024000000 500 32000
    vm.overcommit_memory=1
    vm.overcommit_ratio=95
    net.ipv4.ip_forward=1
    vm.swappiness=20
    vm.dirty_background_bytes = 0
    vm.dirty_background_ratio = 5
    vm.dirty_bytes = 0
    vm.dirty_expire_centisecs = 600
    vm.dirty_ratio = 10
    vm.dirty_writeback_centisecs = 100
    vm.vfs_cache_pressure = 500
    vm.min_free_kbytes = 2097152
    EOF
    sysctl -p

    # SSH connection limits
    cat >> /etc/ssh/sshd_config <<"EOF"
    MaxSessions 1000
    MaxStartups 1000
    # MaxStartups 50:80:200
    EOF
    systemctl restart sshd

    # Restart the containers (run on the Docker host)
    docker restart mdw smdw sdw1 sdw2 sdw3 sdw4
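After `sysctl -p`, it can be useful to confirm that the running kernel actually reports the intended values. A minimal read-only sketch follows. The helper names `sysctl_path` and `check_sysctl` are hypothetical, only a few of the keys set above are checked, and the sketch assumes a Linux `/proc/sys` tree:

```shell
# Read-only check: compares live kernel settings with expected values.
sysctl_path() {
    # Map a sysctl key such as vm.swappiness to /proc/sys/vm/swappiness.
    echo "/proc/sys/$(echo "$1" | tr '.' '/')"
}

check_sysctl() {
    # $1 = key, $2 = expected value
    path=$(sysctl_path "$1")
    if [ ! -r "$path" ]; then
        echo "SKIP $1 (not readable)"
        return 0
    fi
    # /proc separates multi-value settings with tabs; normalise to single spaces.
    actual=$(tr -s '\t' ' ' < "$path")
    if [ "$actual" = "$2" ]; then
        echo "OK   $1 = $actual"
    else
        echo "DIFF $1 = $actual (expected $2)"
    fi
}

check_sysctl vm.swappiness 20
check_sysctl vm.overcommit_ratio 95
check_sysctl kernel.shmmni 4096
check_sysctl kernel.sem "32000 1024000000 500 32000"
```

Because containers share the host kernel, running the same check inside a container is a quick way to see which host settings are visible there.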
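Edit the /etc/hosts file
In Greenplum, by convention the master host is called mdw and the segment hosts are called sdw; dw stands for Data Warehouse.

    cat >> /etc/hosts <<"EOF"
    172.72.6.50 mdw
    172.72.6.51 smdw
    172.72.6.52 sdw1
    172.72.6.53 sdw2
    172.72.6.54 sdw3
    172.72.6.55 sdw4
    EOF

mdw and sdw are only host aliases and do not affect how programs resolve IP addresses.
In a Docker environment, the contents of /etc/hosts are reset every time a container restarts, but this does not affect the normal operation of Greenplum.
You can also avoid the problem by passing "--add-host='mdw:172.72.6.50'" when creating the container.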
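Because the host entries may need to be re-applied after a container restart, it helps if re-applying them never creates duplicates. A minimal sketch of an idempotent append, assuming a scratch file in place of /etc/hosts (the variable `HOSTS_FILE` and both function names are hypothetical):

```shell
# Append each mapping only if that exact line is not already present.
# HOSTS_FILE is a scratch file here; pointing it at /etc/hosts is left to the reader.
HOSTS_FILE=${HOSTS_FILE:-$(mktemp)}

add_host_entry() {
    # $1 = IP address, $2 = host name
    line="$1 $2"
    grep -qxF "$line" "$HOSTS_FILE" 2>/dev/null || echo "$line" >> "$HOSTS_FILE"
}

populate_hosts() {
    add_host_entry 172.72.6.50 mdw
    add_host_entry 172.72.6.51 smdw
    add_host_entry 172.72.6.52 sdw1
    add_host_entry 172.72.6.53 sdw2
    add_host_entry 172.72.6.54 sdw3
    add_host_entry 172.72.6.55 sdw4
}

populate_hosts
populate_hosts    # the second run adds nothing
cat "$HOSTS_FILE"
```

`grep -qxF` matches the whole line literally, so an entry for `sdw1` is not mistaken for a prefix of some longer host name.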
Create the user and cluster host files (all nodes)
Create the gpadmin user on all nodes:

    groupadd -g 530 gpadmin
    useradd -g 530 -u 530 -m -d /home/gpadmin -s /bin/bash gpadmin
    chown -R gpadmin:gpadmin /home/gpadmin
    echo "gpadmin:lhr" | chpasswd
    echo "gpadmin ALL=(ALL) NOPASSWD: ALL" >> /etc/sudoers

On all nodes, create an all_hosts file containing the hostnames of every node:

    su - gpadmin
    mkdir -p /home/gpadmin/conf/
    cat > /home/gpadmin/conf/all_hosts <<"EOF"
    mdw
    smdw
    sdw1
    sdw2
    sdw3
    sdw4
    EOF

On all nodes, create a seg_hosts file containing the hostnames of all segment hosts:

    cat > /home/gpadmin/conf/seg_hosts <<"EOF"
    sdw1
    sdw2
    sdw3
    sdw4
    EOF
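Since the two host files are maintained by hand, a quick consistency check can catch a segment host that was left out of all_hosts. A minimal sketch, assuming a scratch directory in place of /home/gpadmin/conf (the function name `missing_hosts` is hypothetical):

```shell
# Every segment host should also appear in the all-hosts list.
# CONF_DIR is a scratch directory standing in for /home/gpadmin/conf.
CONF_DIR=$(mktemp -d)
printf '%s\n' mdw smdw sdw1 sdw2 sdw3 sdw4 > "$CONF_DIR/all_hosts"
printf '%s\n' sdw1 sdw2 sdw3 sdw4 > "$CONF_DIR/seg_hosts"

missing_hosts() {
    # Print the lines of $2 that are absent from $1 (whole-line, literal match).
    grep -vxFf "$1" "$2"
}

# grep -v exits 0 only when it printed something, i.e. when a host is missing.
if missing=$(missing_hosts "$CONF_DIR/all_hosts" "$CONF_DIR/seg_hosts"); then
    echo "segment hosts missing from all_hosts: $missing"
else
    echo "host files are consistent"
fi
```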
Configure SSH mutual trust
Run the following on the master node (172.72.6.50) only:

    ./sshUserSetup.sh -user root -hosts "mdw smdw sdw1 sdw2 sdw3 sdw4" -advanced -noPromptPassphrase
    ./sshUserSetup.sh -user gpadmin -hosts "mdw smdw sdw1 sdw2 sdw3 sdw4" -advanced -noPromptPassphrase

    chmod 600 /home/gpadmin/.ssh/config
    # Test
    gpssh -f all_hosts date

1. To simplify later maintenance, configure mutual trust for both the root and gpadmin users.
2. The config file must have mode 600; otherwise SSH reports the error "Bad owner or permissions on /home/gpadmin/.ssh/config".
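The permission requirement in note 2 can be checked and corrected with a few lines of shell. A minimal sketch, assuming a scratch file in place of /home/gpadmin/.ssh/config and GNU `stat` (the function name `ensure_mode_600` is hypothetical):

```shell
# Check that an SSH client config file has mode 600 and fix it if not.
# A scratch file stands in for /home/gpadmin/.ssh/config; stat -c is GNU coreutils.
ssh_config=$(mktemp)
chmod 644 "$ssh_config"    # simulate an overly permissive file

ensure_mode_600() {
    mode=$(stat -c '%a' "$1")
    if [ "$mode" != "600" ]; then
        echo "fixing mode $mode on $1"
        chmod 600 "$1"
    fi
}

ensure_mode_600 "$ssh_config"
stat -c '%a %n' "$ssh_config"
```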
Install the Greenplum software (all nodes)
Run the following as root on all nodes:

    yum install -y openssh openssh-clients openssh-server \
      initscripts net-tools telnet which wget passwd lrzsz sudo unzip \
      tree traceroute lsof file tar systemd sysstat curl \
      strace less stress bash bash-completion sed zip \
      yum-utils ethtool iputils screen

    yum install -y apr apr-util bash bzip2 curl krb5 libcurl \
      libevent libxml2 libyaml zlib openldap openssh openssl \
      openssl-libs perl readline rsync R sed tar zip krb5-devel \
      libcgroup-tools libnsl

    # Main packages (download only, into /soft)
    yum install -y apr apr-util bzip2 krb5-devel libyaml perl rsync zip libevent --downloadonly --downloaddir=/soft

    # Install the downloaded RPM files manually
    rpm -ivh net-tools*.rpm
    rpm -ivh libcgroup*.rpm
    rpm -ivh libcgroup-tools*.rpm
    rpm -ivh keyutils-libs-devel*.rpm
    rpm -ivh libcom_err-devel*.rpm
    rpm -ivh libkadm5*.rpm
    rpm -ivh krb5-libs*.rpm --nodeps
    rpm -ivh libkadm5*.rpm --nodeps
    rpm -ivh krb5-devel*.rpm --nodeps
    rpm -ivh libselinux-devel*.rpm --nodeps
    rpm -ivh libverto-devel*.rpm
    rpm -ivh apr*.rpm
    rpm -ivh apr-util*.rpm
    rpm -ivh bzip2*.rpm
    rpm -ivh libyaml*.rpm
    rpm -ivh perl*.rpm --nodeps
    rpm -ivh rsync*.rpm
    rpm -ivh zip*.rpm
    rpm -ivh libevent*.rpm

    rpm -ivh /soft/open-source-greenplum-db-6.25.3-rhel7-x86_64.rpm

The software is installed under /usr/local by default. Give gpadmin ownership of the installation directories:

    chown -R gpadmin:gpadmin /usr/local/greenplum-db
    chown -R gpadmin:gpadmin /usr/local/greenplum-db-6.25.3

Create the directory that will hold the cluster data:

    mkdir -p /opt/greenplum/data/
    chown -R gpadmin:gpadmin /opt/greenplum
Configure environment variables

    # All nodes
    echo ". /usr/local/greenplum-db/greenplum_path.sh" >> /home/gpadmin/.bashrc

    # Master and standby master only
    echo "export MASTER_DATA_DIRECTORY=/opt/greenplum/data/master/gpseg-1" >> /home/gpadmin/.bashrc
    echo "export PGDATABASE=postgres" >> /home/gpadmin/.bashrc
    echo "export PGPORT=5432" >> /home/gpadmin/.bashrc

    source /home/gpadmin/.bashrc
Initialize the database (important)
This is the most critical step of the whole procedure.
The Greenplum configuration file templates are located in the /usr/local/greenplum-db/docs/cli_help/gpconfigs directory; gpinitsystem_config is the template used to initialize Greenplum.
On the master node:

    [gpadmin@mdw ~]$ cd /usr/local/greenplum-db/docs/cli_help/gpconfigs
    [gpadmin@mdw gpconfigs]$ ll
    total 52
    -rw-r--r-- 1 root root 2422 Feb 26 03:09 gpinitsystem_config
    -rw-r--r-- 1 root root 4511 Feb 26 03:09 gpinitsystem_singlenode
    -rw-r--r-- 1 root root 2321 Feb 26 03:09 gpinitsystem_test
    -rw-r--r-- 1 root root  359 Feb 26 03:09 hostfile_exkeys
    -rw-r--r-- 1 root root  119 Feb 26 03:09 hostfile_gpchecknet_ic1
    -rw-r--r-- 1 root root  119 Feb 26 03:09 hostfile_gpchecknet_ic2
    -rw-r--r-- 1 root root   87 Feb 26 03:09 hostfile_gpcheckperf
    -rw-r--r-- 1 root root  255 Feb 26 03:09 hostfile_gpexpand
    -rw-r--r-- 1 root root  237 Feb 26 03:09 hostfile_gpinitsystem
    -rw-r--r-- 1 root root   96 Feb 26 03:09 hostfile_gpssh_allhosts
    -rw-r--r-- 1 root root   87 Feb 26 03:09 hostfile_gpssh_segonly
    -rw-r--r-- 1 root root   44 Feb 26 03:09 hostlist_singlenode
On the master node, create an initialization config named initgp_config based on the template, and set its parameters to match the architecture planned above:

    # On all nodes: create the master directory on the master and standby master,
    # and the primary and mirror directories on the segment hosts
    # (creating all three directories on every node also works)
    su - gpadmin

    # Master and standby master nodes
    mkdir -p /opt/greenplum/data/master

    # Segment nodes
    mkdir -p /opt/greenplum/data/primary
    mkdir -p /opt/greenplum/data/mirror

    # Master node: each segment host gets 4 primary and 4 mirror instances
    cat > /home/gpadmin/conf/initgp_config <<"EOF"
    declare -a DATA_DIRECTORY=(/opt/greenplum/data/primary /opt/greenplum/data/primary /opt/greenplum/data/primary /opt/greenplum/data/primary)
    declare -a MIRROR_DATA_DIRECTORY=(/opt/greenplum/data/mirror /opt/greenplum/data/mirror /opt/greenplum/data/mirror /opt/greenplum/data/mirror)

    ARRAY_NAME="lhrgp"
    SEG_PREFIX=gpseg
    PORT_BASE=6000
    MIRROR_PORT_BASE=7000
    MASTER_PORT=5432
    MASTER_HOSTNAME=mdw
    MASTER_DIRECTORY=/opt/greenplum/data/master
    DATABASE_NAME=lhrgpdb
    MACHINE_LIST_FILE=/home/gpadmin/conf/seg_hosts
    EOF
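The number of entries in DATA_DIRECTORY determines how many primary instances each segment host gets, and MIRROR_DATA_DIRECTORY should list the same number of entries. A minimal sketch of a pre-flight check follows. It sources a sample config in a bash subshell (the file uses bash array syntax) and derives the port ranges. The scratch file repeats only the relevant lines, and bash is assumed to be installed:

```shell
# Source a gpinitsystem-style config in a bash subshell and compare array sizes.
# The sample file repeats only the relevant lines from initgp_config.
cfg=$(mktemp)
cat > "$cfg" <<"EOF"
declare -a DATA_DIRECTORY=(/opt/greenplum/data/primary /opt/greenplum/data/primary /opt/greenplum/data/primary /opt/greenplum/data/primary)
declare -a MIRROR_DATA_DIRECTORY=(/opt/greenplum/data/mirror /opt/greenplum/data/mirror /opt/greenplum/data/mirror /opt/greenplum/data/mirror)
PORT_BASE=6000
MIRROR_PORT_BASE=7000
EOF

# Output format: <primaries> <mirrors> <primary port range> <mirror port range>
summary=$(bash -c '
    . "$1"
    n=${#DATA_DIRECTORY[@]}
    m=${#MIRROR_DATA_DIRECTORY[@]}
    echo "$n $m $PORT_BASE-$((PORT_BASE + n - 1)) $MIRROR_PORT_BASE-$((MIRROR_PORT_BASE + m - 1))"
' _ "$cfg")

set -- $summary
echo "primaries per host: $1 (ports $3), mirrors per host: $2 (ports $4)"
[ "$1" -eq "$2" ] || echo "WARNING: primary and mirror directory counts differ"
```

Pointing `cfg` at the real initgp_config before running gpinitsystem gives a cheap sanity check of the planned layout.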
On the master node, run the initialization command:

    su - gpadmin
    gpinitsystem -c /home/gpadmin/conf/initgp_config -e=lhr -s smdw -P 5432 -S /opt/greenplum/data/master/gpseg-1 -m 500 -b 256MB
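The gpinitsystem call packs several settings into short flags. The dry-run sketch below assembles the same command from named variables and only prints it. The variable names are illustrative, the `-e` password option is deliberately left out, and the flag descriptions in the comments reflect my reading of the Greenplum 6 gpinitsystem help, so check them against `gpinitsystem --help` on your installation:

```shell
# Dry run: builds the gpinitsystem command line and prints it without executing.
config_file=/home/gpadmin/conf/initgp_config     # -c: cluster configuration file
standby_host=smdw                                # -s: standby master host
standby_port=5432                                # -P: standby master port
standby_dir=/opt/greenplum/data/master/gpseg-1   # -S: standby master data directory
max_conn=500                                     # -m: max_connections
shared_buf=256MB                                 # -b: shared_buffers

init_cmd="gpinitsystem -c $config_file -s $standby_host -P $standby_port -S $standby_dir -m $max_conn -b $shared_buf"
echo "$init_cmd"
```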