Deployment¶
Mozilla deploys Ichnaea in an Amazon AWS environment, and there are some optional dependencies on specific AWS services like Amazon S3. The documentation assumes you are also using an AWS environment, but Ichnaea can be run in non-AWS environments as well.
Mozilla’s Production Deployment¶
Mozilla’s deployment of Ichnaea looks something like this:
The required parts are:
One or more WebApp workers running the user-facing web page and APIs. Mozilla uses 20 EC2 instances in an Auto Scaling Group (ASG), behind an Elastic Load Balancer (ELB).
One or more Async workers that run Celery tasks to process observations, update the station database, create map tiles, export data, and perform other background work. Mozilla uses 5 EC2 instances in an ASG.
A Celery scheduler to schedule periodic tasks. Mozilla uses an EC2 instance.
A MySQL or compatible database, to store station data. Mozilla uses Amazon’s Relational Database Service (RDS), MySQL 5.7, in Multi-AZ mode. The user-facing website does not write to a database, and reads from a read-only replica.
A Redis cache server, for cached data, Celery task queues, and observation data pipelines. Mozilla uses Amazon’s ElastiCache Redis, in Multi-AZ mode.
The optional parts are:
An S3 asset bucket to store map tiles and public data like cell exports. Mozilla uses CloudFront as a CDN in front of the asset bucket.
An S3 backup bucket to store observation samples.
An Admin node, to provide interactive access to the cluster and to run database migrations. Mozilla uses an EC2 instance.
DNS entries to publish on the Internet. Mozilla uses AWS’s Route 53.
Optional parts not shown on the diagram:
A statsd-compatible metrics server. Mozilla uses InfluxDB Cloud.
A log aggregator. Mozilla uses Google Cloud Logging.
Sentry, for aggregating captured exceptions. Mozilla uses a self-hosted instance.
MySQL / Amazon RDS¶
The application is written and tested against MySQL 5.7.x, or the same version on Amazon RDS. MariaDB 10.5 has also been tested in the development environment.
The default configuration works for the most part, but ensure you are using UTF-8 to store strings. For example, in my.cnf:
[mysqld]
character-set-server = utf8
collation-server = utf8_general_ci
init-connect='SET NAMES utf8'
The WebApp frontend role only needs access to a read-only version of the database, for example a read-replica. The Async Worker backend role needs access to the read-write primary database.
You need to create a database called location and a user with DDL privileges for that database.
Mozilla’s deployment processes 500 to 1000 million observations a day. We have had issues in the past with replica lag, related binary log sizes, and transaction log sizes. Replica lag and disk usage should be monitored, for example with AWS RDS metrics. The transaction history length can be monitored via the metric trx_history.length. Mozilla reduced replica lag to 2 seconds or less by increasing innodb_log_file_size from 125 MB to 2 GB, and by controlling observation throughput dynamically. See Processing Backlogs and Rate Control for more information.
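If you run your own replica rather than relying on RDS metrics, replica lag can also be read from the server itself. The following is a minimal sketch, assuming MySQL 5.7 (where the statement is SHOW SLAVE STATUS) and a monitoring user that is allowed to run it. The function names, REPLICA_HOST, and MONITOR_USER are placeholders, not part of Ichnaea.

```shell
# Print the Seconds_Behind_Master value found in `SHOW SLAVE STATUS\G`
# output read from stdin.
replica_lag_seconds() {
    awk -F': ' '/Seconds_Behind_Master/ { print $2 }'
}

# Succeed when the lag read from stdin is at most $1 seconds.
# A NULL or missing value (replication not running) counts as a failure.
replica_lag_ok() {
    lag=$(replica_lag_seconds)
    case "$lag" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$lag" -le "$1" ]
}

# Usage on a replica (REPLICA_HOST and MONITOR_USER are placeholders):
#   mysql -h REPLICA_HOST -u MONITOR_USER -p -e 'SHOW SLAVE STATUS\G' | replica_lag_ok 2
```

The threshold of 2 seconds in the usage line mirrors the figure quoted above; pick whatever limit fits your own deployment.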
Redis / Amazon ElastiCache¶
The application uses Redis:
As the Celery backend and broker for the Async backend workers,
As queues for observation data,
As an API key cache, and
As storage for API key rate limits and usage.
You can install a standard Redis or use Amazon ElastiCache (Redis). The application is tested against Redis 3.2.
Amazon S3¶
The application uses Amazon S3 for various tasks, including backup of observations, export of the aggregated cell table and hosting of the data map image tiles.
All of these are triggered by asynchronous jobs and you can disable them if you are not hosted in an AWS environment.
If you use Amazon S3 you might want to configure a lifecycle policy to delete old export files after a couple of days and observation data after one year.
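A lifecycle policy of this kind can be expressed as a small JSON document and applied with the AWS CLI. The sketch below only generates the JSON. The function name, the prefixes, the bucket names, and the choice of 3 days for "a couple of days" are all assumptions for illustration; check the actual key layout in your buckets before applying anything.

```shell
# Print an S3 lifecycle configuration that expires objects under PREFIX
# after DAYS days. Both arguments are supplied by the caller.
lifecycle_config() {
    prefix=$1
    days=$2
    cat <<EOF
{
  "Rules": [
    {
      "ID": "expire-after-${days}-days",
      "Filter": {"Prefix": "${prefix}"},
      "Status": "Enabled",
      "Expiration": {"Days": ${days}}
    }
  ]
}
EOF
}

# Usage (prefixes and bucket names are hypothetical):
#   lifecycle_config "export/" 3 > export-lifecycle.json
#   aws s3api put-bucket-lifecycle-configuration --bucket ASSET_BUCKET \
#       --lifecycle-configuration file://export-lifecycle.json
#   lifecycle_config "backups/" 365 > backup-lifecycle.json
#   aws s3api put-bucket-lifecycle-configuration --bucket BACKUP_BUCKET \
#       --lifecycle-configuration file://backup-lifecycle.json
```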
Statsd / Sentry¶
The application uses Statsd to aggregate metrics and Sentry to log exception messages.
To use Statsd and Sentry, you need to configure them via environment variables as detailed in the config section.
Installation of Statsd and Sentry is outside the scope of this documentation.
Logging¶
The application logs to stdout by default. The WebApp logs using the MozLog format, while the Async workers have more traditional logs. If you want to view logs across a deployment, a logging aggregation system is needed. This is outside the scope of this documentation.
Image Tiles¶
The code includes functionality to render out image tiles for a data map of places where observations have been made. These can be stored in an S3 bucket, allowing them to be viewed on the website.
You can trigger this functionality periodically via a cron job, by calling the application container with the map argument.
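As a rough sketch of such a cron job, the function below runs the image once with the map argument. It reuses the image tag and env file from the Docker Config section of this document. ENV_FILE, IMAGE, the schedule, and the script path are illustrative assumptions, and your setup may need extra docker options (for example volumes or AWS credentials) that are not shown here.

```shell
# Render the data map tiles once. No -it flags are passed because cron
# provides no terminal. ENV_FILE and IMAGE are illustrative overrides.
render_map_tiles() {
    docker run --rm --env-file "${ENV_FILE:-env.txt}" \
        "${IMAGE:-mozilla/location:2021.11.23}" map
}

# Example crontab entry, assuming a wrapper script that defines and then
# calls render_map_tiles is stored at the placeholder path below:
#   0 3 * * * /opt/ichnaea/render_map_tiles.sh
```

Under cron the working directory is usually not your project directory, so setting ENV_FILE to an absolute path is the safer choice.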
Docker Config¶
The development section describes how to set up an environment used for working on and developing Ichnaea itself. For a production install, you should use pre-packaged docker images, instead of installing and setting up the code from Git.
Docker images are published to https://hub.docker.com/r/mozilla/location.
When the main branch is updated (such as when a pull request is merged), an image is uploaded with a label matching the commit hash, such as 082156a5a8714a0db0b78f7b405ced2153184c1b, as well as the latest tag. This is deployed to the stage deployment, and the deployed commit can be viewed at /__version__ on stage.
When it is ready for production, it is tagged with the date, such as 2021.08.16, and is deployed to production. The deployed tag and commit can be viewed at /__version__ on prod, and the available tags at https://github.com/mozilla/ichnaea/tags.
Pull the desired docker image:
docker pull mozilla/location:2021.11.23
To test if the image was downloaded successfully, you can create a container and open a shell inside of it:
docker run -it --rm mozilla/location:2021.11.23 shell
Close the container again, either via exit or Ctrl-D.
Next create the application config as a docker environment file, for example called env.txt:
DB_READONLY_URI=mysql+pymysql://USER:PASSWORD@HOSTNAME:3306/location
DB_READWRITE_URI=mysql+pymysql://USER:PASSWORD@HOSTNAME:3306/location
SQLALCHEMY_URL=mysql+pymysql://USER:PASSWORD@HOSTNAME:3306/location
GEOIP_PATH=/mnt/geoip/GeoLite2-City.mmdb
REDIS_URI=redis://HOST:PORT/0
SECRET_KEY=change_this_value_or_it_will_not_be_secret
You can use either a single database user with DDL/DML privileges, or separate users for DDL (SQLALCHEMY_URL), read-write (DB_READWRITE_URI), and read-only (DB_READONLY_URI) privileges.
See Environment variables for additional options.
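Before starting any containers, it can help to confirm that the env file defines every variable from the example above. This is a small sketch; the function name missing_env_vars is made up for illustration, and the list covers only the six variables shown in this document, not every option Ichnaea supports.

```shell
# Print each variable from the example env file that FILE does not define
# with a non-empty value. Prints nothing when all six are present.
missing_env_vars() {
    for name in DB_READONLY_URI DB_READWRITE_URI SQLALCHEMY_URL \
                GEOIP_PATH REDIS_URI SECRET_KEY; do
        grep -q "^${name}=." "$1" || echo "$name"
    done
}

# Usage:
#   missing_env_vars env.txt
```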
Database Setup¶
The user with DDL privileges and a database called location need to be created manually. If multiple users are used, the initial database setup will create the read-only / read-write users. Something like this should work in a mysql shell:
CREATE DATABASE location;
CREATE USER 'read'@'%' IDENTIFIED BY 'read-password';
GRANT SELECT ON location.* TO 'read'@'%';
CREATE USER 'write'@'%' IDENTIFIED BY 'write-password';
GRANT SELECT, INSERT, UPDATE, DELETE ON location.* TO 'write'@'%';
CREATE USER 'admin'@'%' IDENTIFIED BY 'admin-password';
GRANT ALL PRIVILEGES ON location.* TO 'admin'@'%';
quit
These usernames and passwords need to match the database connection URLs in the env.txt file. Next up, run the initial database setup:
docker run -it --rm --env-file env.txt \
mozilla/location:2021.11.23 shell alembic stamp base
And update the database schema to the latest version:
docker run -it --rm --env-file env.txt \
mozilla/location:2021.11.23 shell alembic upgrade head
The last command needs to be run whenever you upgrade to a new version of Ichnaea. You can inspect available database schema changes via alembic with the history and current sub-commands.
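The history and current sub-commands can be run with the same container pattern as the commands above. The wrapper below is only a convenience sketch; the function name ichnaea_alembic is an assumption, while the image tag and env file are the ones used throughout this document.

```shell
# Run an alembic sub-command inside a throwaway container, using the same
# image, env file, and "shell" entry point as the setup commands above.
ichnaea_alembic() {
    docker run -it --rm --env-file env.txt \
        mozilla/location:2021.11.23 shell alembic "$@"
}

# Usage:
#   ichnaea_alembic current   # revision the database is currently at
#   ichnaea_alembic history   # list of known revisions
```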
An API key will be needed to use the service, and a testing one can be created now that the database is available. The following command creates one called test:
docker run -it --rm --env-file env.txt \
mozilla/location:2021.11.23 shell /app/ichnaea/scripts/apikey.py create test
GeoIP¶
The application uses a MaxMind GeoIP City database for various tasks. It works with both the commercially available and the open-source GeoLite databases in binary format.
You can download the GeoLite database for free from MaxMind after signing up for a GeoLite2 account.
Download and untar the downloaded file. Put the GeoLite2-City.mmdb into a directory accessible to docker (for example /opt/geoip). The directory or file can be mounted into the running docker containers.
You can update this file on a regular basis. Typically once a month is enough for the GeoLite database. Make sure to stop any containers accessing the file before updating it, and start them again afterwards. The application code does not tolerate the file being changed underneath it.
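The stop, replace, start order described above could be scripted roughly as follows. This is a sketch under assumptions: the function name update_geoip, the GEOIP_DIR override, and the container names in the usage line are hypothetical, and the default directory is the /opt/geoip example from this section.

```shell
# Replace the GeoIP database while no container is reading it.
# Arguments: path to the new .mmdb file, then the containers to restart.
# Each step runs only if the previous one succeeded.
update_geoip() {
    new_db=$1
    shift
    docker stop "$@" &&
        cp "$new_db" "${GEOIP_DIR:-/opt/geoip}/GeoLite2-City.mmdb" &&
        docker start "$@"
}

# Usage (container names are placeholders for whatever you named yours):
#   update_geoip /tmp/GeoLite2-City.mmdb ichnaea-web ichnaea-worker ichnaea-scheduler
```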
Docker Runtime¶
Finally you are ready to start containers for the three different application roles.
There is a web frontend, a task worker and a task scheduler role. The scheduler role is limited to a single running container. You need to make sure to never have two containers for the scheduler running at the same time. If you use multiple physical machines, the scheduler must only run on one of them.
The web app and task worker roles both scale out and you can run as many of them as you want. You can tune the web instance with GUNICORN_WORKERS and similar variables; see docker/run_web.sh for details. You can run a single docker container per physical/virtual machine, or multiple with a system like Kubernetes.
All roles communicate via the database and Redis only, so they can be run on different virtual or physical machines. The task workers load balance their work internally via data structures in Redis.
If you run multiple web frontend roles, you need to put a load balancer in front of them. The application does not use any sessions or cookies, so the load balancer can simply route traffic via round-robin.
You can configure the load balancer to use the /__lbheartbeat__ HTTP endpoint to check for application health.
If you want to use docker as your daemon manager run:
docker run -d --env-file env.txt \
--volume /opt/geoip:/mnt/geoip \
mozilla/location:2021.11.23 scheduler
The /opt/geoip directory is the directory on the docker host, with the GeoLite2-City.mmdb file inside it. The /mnt/geoip/ directory corresponds to the GEOIP_PATH setting in the env.txt file.
The two other roles are started in the same way:
docker run -d --env-file env.txt \
--volume /opt/geoip:/mnt/geoip \
mozilla/location:2021.11.23 worker
docker run -d --env-file env.txt \
--volume /opt/geoip:/mnt/geoip \
-p 8000:8000/tcp \
mozilla/location:2021.11.23 web
The web role takes the additional -p argument, which maps port 8000 inside the container to port 8000 of the docker host machine.
You can put a web server (e.g. Nginx) in front of the web role and proxy pass traffic to the docker container running the web frontend.
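When docker itself acts as the daemon manager, named containers with a restart policy are easier to manage. The helper below is a sketch, not part of Ichnaea: the function name start_role, the ichnaea- name prefix, and the choice of the unless-stopped restart policy are assumptions, while the other options repeat the commands above.

```shell
# Start one role as a named, automatically restarted container.
# First argument: the role (scheduler, worker, or web).
# Remaining arguments: extra docker run options, such as a port mapping.
start_role() {
    role=$1
    shift
    docker run -d --name "ichnaea-$role" --restart unless-stopped \
        --env-file env.txt --volume /opt/geoip:/mnt/geoip \
        "$@" mozilla/location:2021.11.23 "$role"
}

# Usage:
#   start_role scheduler
#   start_role worker
#   start_role web -p 8000:8000/tcp
```

Because docker refuses to create two containers with the same name, a fixed name also prevents accidentally starting a second scheduler on the same host. It does not protect against a second scheduler on another machine.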
Runtime Checks¶
To check whether or not the application is running, you can check the web role, via:
curl -i http://localhost:8000/__heartbeat__
This should produce output like:
HTTP/1.1 200 OK
Content-Type: application/json
Date: Mon, 10 Jan 2022 23:34:25 GMT
Content-Length: 193
Connection: keep-alive
{"database": {"up": true, "time": 2, "alembic_version": "3be4004781bc"},
"geoip": {"up": true, "time": 0, "age_in_days": 4, "version": "2022-01-06T16:20:21Z"},
"redis": {"up": true, "time": 1}}
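For scripted checks, the response body can be scanned for any backend that is not up. The function below is a deliberately simple text match rather than a JSON parser. It assumes that a failing backend is reported as "up": false in the same layout as the sample output above, and the name heartbeat_ok is made up for this example.

```shell
# Succeed only if the __heartbeat__ body read from stdin reports at least
# one backend as up and none as down. Matches text, does not parse JSON.
heartbeat_ok() {
    body=$(cat)
    printf '%s' "$body" | grep -q '"up": *true' &&
        ! printf '%s' "$body" | grep -q '"up": *false'
}

# Usage:
#   curl -s http://localhost:8000/__heartbeat__ | heartbeat_ok && echo healthy
```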
The __lbheartbeat__ endpoint has simpler output and doesn’t check the database / Redis backend connections. The application is designed to degrade gracefully and continue to work with limited capabilities without working database and Redis backends.
The __version__ endpoint shows what version of the software is currently running.
To test one of the HTTP API endpoints, you can use:
curl -H "X-Forwarded-For: 81.2.69.192" \
"http://localhost:8000/v1/geolocate?key=test"
Change the command if you used a name other than test for the first API key in the Database Setup.
This should produce output like:
{"location": {"lat": 51.5142, "lng": -0.0931}}
Test this with different IP addresses like 8.8.8.8 to make sure the database file was picked up correctly.
Upgrade¶
In order to upgrade a running installation of Ichnaea to a new version, first check and get the docker image for the new version, for example:
docker pull mozilla/location:2021.11.23
Next, stop all containers running the scheduler and task worker roles.
If you use docker’s own daemon support, the ps, stop and rm commands can be used to accomplish this.
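A sketch of that sequence, assuming the containers were started with explicit names via docker run --name (the names below are hypothetical, as is the function name stop_roles):

```shell
# Stop the given containers, then remove them so the names can be reused
# by containers started from the new image.
stop_roles() {
    docker stop "$@" && docker rm "$@"
}

# Usage:
#   docker ps --filter name=ichnaea          # see what is running
#   stop_roles ichnaea-scheduler ichnaea-worker
```

If your containers have no names, the IDs listed by docker ps can be passed to stop_roles instead.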
Now run the database migrations found in the new image:
docker run -it --rm --env-file env.txt \
mozilla/location:2021.11.23 shell alembic upgrade head
The web app role can work with both the old database and new database schemas. The worker role might require the new database schema right away.
Start containers for the scheduler, worker and web roles based on the new image.
Depending on how you run your web tier, switch the traffic over from the old web containers to the new ones. Once all traffic is going to the new web containers, stop the old web containers.