add docker-compose documentation for deploying on aws/gcp/digital ocean
Created by: ggilmore
Overview
This PR adds documentation for deploying Sourcegraph via docker-compose on each of the major cloud providers: AWS, GCP, and Digital Ocean. Each set of instructions gives recommendations on what machine size to choose and how much storage to give the machine, plus documentation pointers for creating backups of the resulting Sourcegraph instance.
This PR also tells people what deployment option they should choose when they're starting out:
| Deployment Type | Suggested for | Setup time | Multi-machine? | Auto healing? | Monitoring? |
|---|---|---|---|---|---|
| Single-container server | Local testing | 60 seconds | Impossible | No | No |
| Docker Compose | Small & medium production deployments | 5 minutes | Possible | No | Yes |
| Kubernetes | Medium & large highly-available cluster deployments | 30 minutes | Easily | Yes | Yes |
A later PR will give instructions for migrating off of `sourcegraph/server` to Docker Compose.
Notes:
Startup scripts
Each of the cloud provider docs provides a startup script to initialize the instance. Each of the scripts follows this general flow (a condensed sketch of these steps appears after the list):

- Install `git`
- Clone https://github.com/sourcegraph/deploy-sourcegraph-docker to the installation directory (AWS: `/home/ec2-user/deploy-sourcegraph-docker`, GCP/DO: `/root/deploy-sourcegraph-docker`) and check out the latest release
- Prepare the external disk (used for storing all of Docker's data) and mount it into the file system
  - Using the external disk's device name provided in the startup script, check whether the disk already has an existing file system
    - If it's unformatted, format it with `xfs` for AWS and `ext4` for GCP/DO
    - If the disk already has a file system, do nothing (this prevents data from being destroyed if the user is using a disk that had a backup restored onto it)
  - Notes:
    - The same init script can be used both on the initial install and when spinning up a VM that has existing Sourcegraph data on it.
    - Digital Ocean's web UI doesn't allow you to attach an existing volume to a new droplet during the creation process; you can only do so after the instance is initialized. I'll need to investigate instructions for how to spin up a backup on DO (simply run the init script after you log in for the first time?).
  - Add the external disk's UUID to `/etc/fstab` so that the disk is automatically mounted on reboots.
    - A lot of cloud provider documentation says that you can't rely on the device name being stable between reboots. Some cloud providers allow you to give names to disks, but the resulting label in `/dev/` wasn't predictable in my experience. I think using the device name is fine for now, since it's only used for the initial format and the UUID is what actually gets stored in `/etc/fstab`.
- Install Docker
- Install `jq`
- Use `jq` to modify `/etc/docker/daemon.json`, changing the `data-root` setting to point to `/mnt/docker-data`.
  - All of docker-compose's data is stored in Docker volumes. Telling Docker to store all of its data under `/mnt/docker-data` allows customers to easily snapshot/back up all of the user data in a workflow similar to what they have with `sourcegraph/server`.
- Restart the Docker daemon to pick up these changes
- Install `docker-compose`
- Run `docker-compose up -d` to start the Sourcegraph services and restart them on reboots
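For concreteness, here is that flow condensed into a single illustrative script (package installation steps omitted). This is a sketch, not a copy of the actual per-provider scripts: the device name, file system type, release tag, and compose directory below are placeholder assumptions, and the real scripts differ per provider as noted above.

```bash
#!/usr/bin/env bash
# Illustrative sketch only -- the shipped scripts differ per provider.
# DEVICE, RELEASE_TAG, and the compose directory are placeholders.
set -euo pipefail

DEVICE='/dev/sdb'                               # provider-specific device name
DATA_ROOT='/mnt/docker-data'
INSTALL_DIR='/root/deploy-sourcegraph-docker'   # /home/ec2-user/... on AWS
RELEASE_TAG='vX.Y.Z'                            # placeholder: latest release tag

git clone https://github.com/sourcegraph/deploy-sourcegraph-docker "$INSTALL_DIR"
git -C "$INSTALL_DIR" checkout "$RELEASE_TAG"

# Only format the disk if it has no file system yet, so that a disk
# restored from a backup snapshot keeps its existing Sourcegraph data.
if ! blkid "$DEVICE" >/dev/null; then
  mkfs.ext4 "$DEVICE"   # xfs on AWS, ext4 on GCP/DO
fi

mkdir -p "$DATA_ROOT"
mount "$DEVICE" "$DATA_ROOT"

# Mount by UUID on future boots: device names aren't stable across
# reboots, but the file system UUID is. The fs type here must match
# the mkfs call above.
UUID="$(blkid -o value -s UUID "$DEVICE")"
echo "UUID=$UUID $DATA_ROOT ext4 defaults,nofail 0 2" >> /etc/fstab

# Point Docker's data-root at the external disk so that all volumes
# (i.e. all user data) live under /mnt/docker-data.
[ -f /etc/docker/daemon.json ] || echo '{}' > /etc/docker/daemon.json
jq --arg dataRoot "$DATA_ROOT" '. + {"data-root": $dataRoot}' \
  /etc/docker/daemon.json > /tmp/daemon.json
mv /tmp/daemon.json /etc/docker/daemon.json
systemctl restart docker

# Start the Sourcegraph services; assuming the compose file sets a
# restart policy (e.g. restart: always), they come back up on reboot.
cd "$INSTALL_DIR/docker-compose"
docker-compose up -d
```

The `nofail` mount option and the `echo '{}'` bootstrap of `daemon.json` are defensive choices in this sketch, not necessarily what the shipped scripts do.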
Backups
Docker Compose doesn't offer any built-in tool to mass-backup Docker volumes, and Docker's documentation only offers instructions for creating/restoring a tarball of volumes at the individual-container level. (I haven't tested this yet, but going down this route would involve custom scripting and would require managing these tarballs somewhere, such as an S3 bucket.)
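For reference, the per-container approach from Docker's documentation looks roughly like the following; the container name and data path are hypothetical placeholders:

```bash
# Back up one container's volumes into a tarball on the host by
# mounting them into a throwaway container.
# "some-container" and /data are hypothetical placeholders.
docker run --rm --volumes-from some-container \
  -v "$(pwd)":/backup \
  alpine tar cvf /backup/backup.tar /data

# Restoring is the same trick in reverse.
docker run --rm --volumes-from some-container \
  -v "$(pwd)":/backup \
  alpine tar xvf /backup/backup.tar -C /
```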
As mentioned above, most of our customers currently solve this by mounting an external disk to the `~/.sourcegraph` directory. I have replicated this workflow by telling Docker to store all of its data (including volumes) under `/mnt/docker-data`. The docs also contain pointers for how to take snapshots of the external disk.
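For illustration, snapshotting that external disk from the CLI might look like this; the volume ID, disk name, and zone are placeholders, and the docs point at each provider's own snapshot documentation:

```bash
# AWS: snapshot the EBS volume backing /mnt/docker-data
# (the volume ID is a placeholder).
aws ec2 create-snapshot \
  --volume-id vol-0123456789abcdef0 \
  --description "Sourcegraph docker-data backup"

# GCP: snapshot the persistent disk (names and zone are placeholders).
gcloud compute disks snapshot sourcegraph-docker-data \
  --zone us-central1-a \
  --snapshot-names sourcegraph-docker-data-backup
```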
In addition to the above method, I also call out that using an external Postgres service is an easy way to back up all of Sourcegraph's user data.
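With a managed Postgres service, backups reduce to the provider's automated snapshots or standard Postgres tooling. As a sketch, a `pg_dump` invocation might look like this (all connection details are hypothetical placeholders):

```bash
# All connection details are hypothetical placeholders.
pg_dump \
  --host my-postgres.example.com \
  --username sourcegraph \
  --dbname sourcegraph \
  --file sourcegraph-backup.sql
```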
Notes:
I have personally tested these docs for AWS, GCP, and DO. I have also tested restoring backups on AWS and GCP (but not DO, for the reasons stated above).
I would still like someone to take an in-depth look at this PR and its wording, and actually test out my instructions, but this PR needs to be merged now so that customers can take a look.