Creating a PostgreSQL database container

We need to test our code against a PostgreSQL database. This is the database that we will be deploying the code in production against.

While the abstraction layer in SQLAlchemy aims to reduce the differences, there are some differences in the behavior of the databases. 

For example, in /thoughts_backend/api_namespace.py, the following line is case-insensitive, which is the behavior that we want: 

query = (query.filter(ThoughtModel.text.contains(search_param)))

Translating that to PostgreSQL, it is case-sensitive, which requires you to check it. This would be a bug in production if testing with SQLite and running in PostgreSQL.

The replaced code, using ilike for the expected behavior, is as follows:

param = f'%{search_param}%'
query = (query.filter(ThoughtModel.text.ilike(param)))

We kept the old code in a comment to show this issue.

To create a database container, we need to define the corresponding Dockerfile. We store all the files in the docker/db/ subdirectory. Let's take a look at Dockerfile and its different parts. The whole file can be found on GitHub (https://github.com/PacktPublishing/Hands-On-Docker-for-Microservices-with-Python/blob/master/Chapter03/docker/db/Dockerfile). This Dockerfile can be divided into the following stages:

  1. Using the ARG keyword, define the basic PostgreSQL configuration such as the name of the database, user, and password. They get set in environment variables so that the PostgreSQL commands can use them.
These commands are for local development only. They'll need to match with the environment set up. The ARG keyword defines a parameter for Dockerfile at build time. We'll see how they are set up as input parameters in the docker-compose.yaml file.

The ARG elements are also defined as ENV variables, so we keep them defined as environment variables:

# This Dockerfile is for localdev purposes only, so it won't be
# optimised for size
FROM alpine:3.9

# Add the proper env variables for init the db
ARG POSTGRES_DB
ENV POSTGRES_DB $POSTGRES_DB
ARG POSTGRES_USER
ENV POSTGRES_USER $POSTGRES_USER
ARG POSTGRES_PASSWORD
ENV POSTGRES_PASSWORD $POSTGRES_PASSWORD
ARG POSTGRES_PORT
ENV LANG en_US.utf8
EXPOSE $POSTGRES_PORT

# For usage in startup
ENV POSTGRES_HOST localhost
ENV DATABASE_ENGINE POSTGRESQL
# Store the data inside the container, as we don't care for
# persistence
RUN mkdir -p /opt/data
ENV PGDATA /opt/data
  1. Install the postgresql package and all its dependencies, such as Python 3 and its compilers. We will need them to be able to run the application code:
RUN apk update
RUN apk add bash curl su-exec python3
RUN apk add postgresql postgresql-contrib postgresql-dev
RUN apk add python3-dev build-base linux-headers gcc libffi-dev
  1. Install and run the postgres-setup.sh script:
# Adding our code
WORKDIR /opt/code

RUN mkdir -p /opt/code/db
# Add postgres setup
ADD ./docker/db/postgres-setup.sh /opt/code/db/
RUN /opt/code/db/postgres-setup.sh

This initializes the database, setting the correct user, password, and so on. Note that this doesn't create the specific tables for our application yet. 

As part of our initialization, we create the data files inside the container. This means that the data won't persist after the container stops. This is a good thing for testing, but, if you want to access the data for debug purposes, remember to keep the container up.
  1. Install the requirements for our application and specific commands to run in the database container:
## Install our code to prepare the DB
ADD ./ThoughtsBackend/requirements.txt /opt/code

RUN pip3 install -r requirements.txt
  1. Copy the application code and database commands stored in docker/db. Run the prepare_db.sh script, which creates the application database structure. In our case, it sets up the thoughts table:
## Need to import all the code, due dependencies to initialize the DB
ADD ./ThoughtsBackend/ /opt/code/
# Add all DB commands
ADD ./docker/db/* /opt/code/db/

## get the db ready
RUN /opt/code/db/prepare_db.sh

This script first starts the PostgreSQL database running in the background, then calls init_db.py, and then gracefully stops the database.

Keep in mind that, in each of the steps of Dockerfile, in order to access the database, it needs to be running, but it will also be stopped at the end of each step. In order to avoid corruption of the data or the abrupt killing of the process, be sure to use the  stop_postgres.sh script until the end. Though PostgreSQL will normally recover for an abruptly stopped database, it will slow the startup time.
  1. To start the database in operation, the CMD is just the postgres command. It needs to run with the postgres user:
# Start the database in normal operation
USER postgres
CMD ["postgres"]

To run the database service, we need to set it up as part of the docker-compose file:

    db:
build:
context: .
dockerfile: ./docker/db/Dockerfile
args:
# These values should be in sync with environment
# for development. If you change them, you'll
# need to rebuild the container
- POSTGRES_DB=thoughts
- POSTGRES_USER=postgres
- POSTGRES_PASSWORD=somepassword
- POSTGRES_PORT=5432
ports:
- "5432:5432"

Note that the args parameter will set up the ARG values during the build. We also route the PostgreSQL port to allow access to the database.

You can now build and start the server:

$ docker-compose up build
$ docker-compose up db
Creating ch3_db_1 ... done
Attaching to ch3_db_1
...
db_1 | 2019-06-02 13:55:38.934 UTC [1] LOG: database system is ready to accept connections

In a different Terminal, you can use a PostgreSQL client to access the database. I recommend the fantastic pgcli. You can check out its documentation (https://www.pgcli.com/).

You can use also the official  psql client or any other PostgreSQL client of your preference. The documentation for the default client can be found here:  https://www.postgresql.org/docs/current/app-psql.html.

Here, we use the PGPASSWORD environment variable to show that the password is the previously configured one:

$ PGPASSWORD=somepassword pgcli -h localhost -U postgres thoughts
Server: PostgreSQL 11.3
Version: 2.0.2
Chat: https://gitter.im/dbcli/pgcli
Mail: https://groups.google.com/forum/#!forum/pgcli
Home: http://pgcli.com
postgres@localhost:thoughts> select * from thought_model
+------+------------+--------+-------------+
| id | username | text | timestamp |
|------+------------+--------+-------------|
+------+------------+--------+-------------+
SELECT 0
Time: 0.016s

Being able to access the database is useful for debugging purposes.