Multi-site setup

Concept

Each facility has a database server. For each site the facility library is the main library. The libraries from other facilities are replications. These replications are read-only. This means that every site is independent but still benefits from the work and libraries another facility is doing.

In Postgres terms:
Use a Streaming Replication for Multi-Site setups.

Requirements

In each facility you have a database server up and running with Postgres installed.

 

In this example Linux is used to setup the database servers. It should however work with any other operating system as well.

Create primary database server

Configure primary server

  1. change postgres.conf to enable networking

  2. create a replication user

  3. allow remote access in pg_hba.conf

This is the database server in our main facility - e.g. Munich

initdb -D /var/lib/postgresql/daselement-munich

change postgres.conf to enable networking

vim /var/lib/postgresql/daselement-munich/postgres.conf # uncomment and change the line: listen_addresses = '*' # and change the port to 5433 to avoid conflicts with any default instances that are running on port 5432 port = 5433 # start the Postgres server pg_ctl -D /var/lib/postgresql/daselement-munich start

Create a user accounts

# postgres is the default postgres database psql --port 5433 postgres # create user that is allowed to create new databases and read/write information CREATE ROLE dbuser LOGIN password 'password'; ALTER USER dbuser CREATEDB; # create replication user [local]:5433 postgres=# create user repuser replication;

Create a replication user

Allow remote access in pg_hba.conf

add line to allow access for the replication user

 

The primary system is now ready. Let’s move on to the replica system!

Create replica database server

Configure replication server

Now we are at a clean database server in another facility - e.g. Vancouver


In the replica directory you will find these two files:

standby.signal

this file defines that the database that it is a replica

postgresql.auto.conf

contains information to connect to the primary server


Now start the replica server:

 

Any changes made in the primary database get synced/streamed to the replication server(s). If the replication server is down it will add the changes once it’s back online!

Configure library

In the other facility copy and import the library (.lib) from main facility:

  1. update the library root

  2. update the database information

  3. lock the library (Lock Library option) to prevent editing and ingesting of new elements!


An option is to setup the main facility to automatically sync the Proxy Files to the other facilities when an element is ingested. This can be done with an additional transcoding task and a custom command task.

 

Advanced Setup

If you want to take it a step further it’s a good idea to look into these topics:

  • high availability

  • load balancing

  • connection pooling

  • automatic fail-over

Here is a great video tutorial using PGPool-II to deal with these topics.

Monitoring

To monitor the databases you can for example use pgwatch2

 

Troubleshooting

psql - commands

command

description

command

description

\l

list all databases

\dt

 

\du

list all users

\dRp

lists available publications

\dRs

lists available subscriptions

 

ALTER USER user_name WITH PASSWORD 'new_password';

change user password

 

Can not start Postgres Database

Solution is to fix the permissions on the folder. For example like this:

 

initdb: command not found

The initdb is located in the installation directory of Postgres.

Fix for Ubuntu:

 

pg_ctl: command not found

The pg_ctl is located in the installation directory of Postgres.

Fix for Ubuntu:

 

Unable to connect to server; FATAL: role “postgres“ does not exist

Seems like there is no user in the database called “postgres”.
To create a user use this command: