last update: pre-super2018

Notes

BEFORE DOING ANYTHING HERE, make sure EVERYTHING in Pre-migration setup for onsite uber is completed.  Those steps can be done any time before the event (a week or two in advance is fine), but the steps below should only be taken when you are ready to migrate the data onsite:

See Migrating Ubersystem offsite from onsite for the reverse instructions.

We enable PostgreSQL database replication.  This means that as soon as data is written to the onsite database, it is replicated immediately out to the cloud server DB.  Replication is nice to have, but rely on it only as a last resort.  If things are going normally, always transfer data around using backups; don't assume the replication worked correctly.  In particular, if the internet is down for an extended period of time (more than an hour or so), the DB may not fully replicate back out.
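If you want to confirm replication is actually flowing, PostgreSQL exposes the pg_stat_replication view on the master. A minimal sketch (assuming shell access to the master and a postgres superuser; the helper function name is ours, not anything in uber):

```shell
# Sketch: succeed only if at least one attached standby reports state
# 'streaming' (the healthy state in pg_stat_replication).
# Takes the raw output of:  psql -U postgres -Atc "SELECT state FROM pg_stat_replication"
check_streaming() {
  echo "$1" | grep -qx 'streaming'
}

# usage (run on whichever server is currently the master):
#   check_streaming "$(psql -U postgres -Atc 'SELECT state FROM pg_stat_replication')" \
#     && echo "replicating" || echo "NOT replicating"
```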

If something catastrophic happens to the onsite server (say, the sprinklers go off), the point of replication is that we can, in theory, point everything back at the cloud server, promote the cloud server's DB from standby to master, and pick things back up with very little downtime.  This procedure still needs proper documentation; the key step is touching a 'trigger file' on the cloud server, which puts the DB into master mode.  Some details are here: https://github.com/magfest/ubersystem-puppet/blob/master/templates/pg-recovery.conf.erb#L9
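The promotion itself is basically one command. A hedged sketch only: the trigger file path below is a placeholder; the real path is whatever trigger_file is set to in the pg-recovery.conf.erb template linked above.

```shell
# Promote the cloud DB from standby to master by touching the trigger file.
# PATH BELOW IS A PLACEHOLDER -- check pg-recovery.conf.erb for the real one.
TRIGGER_FILE=${TRIGGER_FILE:-/var/lib/postgresql/failover.trigger}

promote_to_master() {
  touch "$TRIGGER_FILE" \
    && echo "trigger file written; postgres will exit recovery and accept writes"
}
```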

Before you start

Set aside a solid two hours for this, and loudly let techops and registration know that it is about to happen: Ubersystem will go offline for a little while.  Do this with two people who double-check every step with each other, and call out which server you are on.  For example, say "running command X on rams1" and have the other person look at the terminal and say "agree".  Be paranoid.

Setup config

  1. Get a pull request going that sets up the following in production-config:
    1. at_the_con=True for rams1.uber.magfest.org
    2. replication settings enabled for rams1.uber.magfest.org and the cloud uber
    3. onsite_uber_address set to https://onsite.uber.magfest.org
    4. redirect_all_traffic_onsite TRUE for cloud server (labs.uber.magfest.org.yaml)
    5. Example PRs:
      1. PR showing how to set this up for production https://github.com/magfest/production-config/pull/124/files
      2. optional: set replication settings for testing staging servers if you want to test it https://github.com/magfest/production-config/pull/122/files
  2. Merge those PRs
  3. Set secret settings on mcp.magfest.net for replication password
    1. edit the following YAML files in /home/dom/sysadmin/deploy/puppet/hiera/nodes/external/secret
      1. common.yaml → change replication passwords here to something new
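The end state of step 1 looks roughly like this in the per-server config files. This is a sketch only: the exact key names, casing, and file layout are whatever the example PRs above actually show.

```yaml
# rams1.uber.magfest.org.yaml (onsite server) -- sketch, see example PRs
at_the_con: true
onsite_uber_address: https://onsite.uber.magfest.org
# ...plus the replication settings for rams1 and the cloud uber

# labs.uber.magfest.org.yaml (cloud server) -- sketch
redirect_all_traffic_onsite: true
```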

Migrate the server onsite for real

  1. set maintenance mode by creating a file named maintenance.html on the cloud server; all traffic will be redirected to it.
    1. echo "ubersystem is down for quick maintenance and will return shortly" > /var/www/maintenance.html
  2. on both cloud server and on rams1:
    1. supervisorctl stop all
  3. paranoia: on the cloud server, manually change the following config settings in development.ini (these will get overridden when we migrate back, so don't worry):
    1. set SEND_EMAILS to False
    2. set SEND_SMS to False
    3. comment out AWS key
    4. comment out STRIPE key
    5. comment out the Twilio keys: twilio_sid, twilio_token, panels_twilio_sid, panels_twilio_token
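After step 3 the relevant part of the cloud server's development.ini should look roughly like this. The option names here are guesses based on the list above; use whatever names and casing the file already contains.

```ini
; cloud server development.ini after neutering (sketch -- names are guesses)
send_emails = False
send_sms = False
; aws_access_key = ...            <- commented out
; stripe_secret_key = ...         <- commented out
; twilio_sid = ...
; twilio_token = ...
; panels_twilio_sid = ...
; panels_twilio_token = ...
```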
  4. Copy static file content to onsite server
    1. band stage plots, W-9s, pics/bios - /usr/local/uber/plugins/guests/uploaded_files
    2. mivs game images (if magprime):
      1. /usr/local/uber/plugins/mivs/uploaded_files
      2. /usr/local/uber/plugins/mivs/screenshots (no longer used, but probably keep for safety's sake)
    3. mits game images (if magprime) - /usr/local/uber/plugins/mits/pictures
    4. attendance graphs history JSON file - /usr/local/uber/plugins/uber_analytics/uber_analytics/static/analytics/extra-attendance-data.json
    5. chown rams:rams these files
    6. Take a quick glance through each plugin dir to make sure you got all the static files
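One way to do step 4 without fat-fingering a path is to generate the copy commands, eyeball them, and only then run them. A sketch, not verbatim: the helper function name is ours, the hostname is an argument you supply, and the paths are the ones listed above (assuming rams1 can ssh to the cloud server).

```shell
# Sketch: print the rsync/chown commands to pull static content from the
# cloud server onto the onsite box. Review the output, then pipe it to sh.
print_static_sync_cmds() {
  src_host=${1:?usage: print_static_sync_cmds <cloud-hostname>}
  for d in \
      /usr/local/uber/plugins/guests/uploaded_files \
      /usr/local/uber/plugins/mivs/uploaded_files \
      /usr/local/uber/plugins/mivs/screenshots \
      /usr/local/uber/plugins/mits/pictures ; do
    # trailing slash on the source copies the dir's contents, not the dir itself
    echo "rsync -az $src_host:$d/ $d/"
  done
  json=/usr/local/uber/plugins/uber_analytics/uber_analytics/static/analytics/extra-attendance-data.json
  echo "rsync -az $src_host:$json $json"
  echo "chown -R rams:rams /usr/local/uber/plugins"
}
```

Usage: `print_static_sync_cmds labs.uber.magfest.org | less` to review, then pipe to `sh` on rams1.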
  5. back up the database (from mcp); this will make a copy in /home/dom/backup
    1. Make sure the correct server names are in the following file, then run it
    2. /home/dom/sysadmin/backup-all-production-dbs.sh
  6. find the backup file: ls -altr /home/dom/backup/
  7. VERIFY THE BACKUP LOOKS OK: use 'bzcat <filename> | less'
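A quick scriptable version of the verify step, as a sketch: it assumes the backup script produces plain-format pg_dump output compressed with bzip2, whose header contains the literal line '-- PostgreSQL database dump'. If the dumps are in a different format, fall back to eyeballing with bzcat | less.

```shell
# Sketch: sanity-check that a backup file is a readable bzip2'd pg_dump.
verify_backup() {
  f=$1
  if bzcat "$f" 2>/dev/null | head -5 | grep -q 'PostgreSQL database dump'; then
    echo "$f: looks like a pg_dump"
  else
    echo "$f: DOES NOT look like a pg_dump -- inspect manually with bzcat | less"
    return 1
  fi
}

# usage:  verify_backup "$(ls -t /home/dom/backup/* | head -1)"
```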