  • LAMP Stack App on AWS with Docker, RDS and phpMyAdmin

    Intro

    I was recently tasked with migrating an old LAMP-stack app (PHPv5) running on a CentOS 6 server to the newer PHPv7 on a CentOS 8 machine, and ensuring that the code didn’t break in the PHP upgrade. I figured the best way to do that would be to use Docker to simulate PHP 7 on a CentOS 8 machine running on my laptop.

    However, the plan changed and instead of deploying the new app on a CentOS 8 machine, it was decided that we would deploy the app to its own EC2 instance. Since I was already using Docker, and since I no longer had to plan for a CentOS 8 deployment, I decided to use Ubuntu 20.04 for the EC2 instance. I installed docker and docker-compose, and adapted the code to use the official php-apache and phpMyAdmin Docker images. I also decided to use AWS RDS MySQL, and to use the EC2 instance to implement logical backups of the MySQL DB to AWS S3.

    The rest of this article consists of more detailed notes on how I went about all of this:

    • Dockerizing a LAMP-stack Application
      • php-apache docker image
      • creating dev and prod versions
      • updating code from PHPv5 to PHPv7
      • handling env variables
      • Adding a phpMyAdmin interface
    • AWS RDS MySQL Setup
      • rdsadmin overview
      • creating additional RDS users
      • connecting from a server
    • AWS EC2 Deployment
      • virtual machine setup
      • deploying prod version with:
        • Apache proxy with SSL Certification
        • OS daemonization
      • MySQL logical backups to AWS S3

    Dockerizing a LAMP-stack Application

    php-apache docker image

    I’ll assume the reader is somewhat familiar with Docker. I was given a code base in a dir called DatasetTracker developed several years ago with PHPv5. The first thing to do was to set up a git repo for the sake of development efficiency, which you can find here.

    Next, I had to try and get something working. The key with Docker is to find the official image and RTFM. In this case, you want the latest php-apache image, which leads to the first line of your Dockerfile being: FROM php:7.4-apache. When you start up this container, you get an Apache instance that interprets PHP code within the dir /var/www/html and listens on port 80.

    creating dev and prod versions

    I decided to set up two deployment tiers: dev and prod. The dev tier is chiefly for local development, wherein changes to the code do not require you to restart the Docker container. You also want PHP settings that allow you to debug the code. The only hiccup I experienced in getting this to work was understanding how PHP extensions are activated within a Docker context. It turns out that the php-apache image ships with command-line helpers for this: pecl, docker-php-ext-install, and docker-php-ext-enable. In my case, I needed three extensions for the dev version of the code: xdebug, mysqli, and bcmath. Through trial and error I found that you could activate those extensions with the three RUN lines in the middle of the Dockerfile (see below).

    You can also switch PHP’s settings to ‘development’ mode by copying the bundled php.ini-development file. In summary, the essence of a php-apache Dockerfile for development is as follows:

    FROM php:7.4-apache
    
    # xdebug is installed from PECL and must then be explicitly enabled;
    # mysqli and bcmath are bundled and only need docker-php-ext-install
    RUN pecl install xdebug && docker-php-ext-enable xdebug
    RUN docker-php-ext-install mysqli
    RUN docker-php-ext-install bcmath
    
    # use PHP's 'development' settings (verbose errors, etc.)
    RUN cp /usr/local/etc/php/php.ini-development /usr/local/etc/php/php.ini

    When you run a container based on this image, you just need to volume-mount the dir containing your PHP code to /var/www/html to get instant updates, and map the container’s port 80 to a free local port for local development.
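
    For example, a hypothetical build-and-run invocation for local development (the image name and host port are arbitrary choices, not from the original project):

    docker build -t datasettracker-dev .
    # mount the code dir and expose Apache's port 80 on local port 8080
    docker run -d -p 8080:80 -v "$(pwd)/DatasetTracker:/var/www/html" datasettracker-dev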

    Next, we need to write a docker-compose file so that this image runs as a container alongside a phpMyAdmin application, and to coordinate the environment variables needed to connect to the remote AWS RDS mysql instance.

    An aspect of the setup that required a bit of thought was how to log into phpMyAdmin. The documentation for the Docker image was a bit confusing. In the end though, I determined that you really only need one env variable, PMA_HOST, passed to the phpMyAdmin container through the docker-compose file. This env variable just needs to point to your remote AWS RDS instance. phpMyAdmin is really just an interface to your mysql instance, so you then log in through the interface with your mysql credentials. (See .env-template in the repo.)

    (NOTE: you might first need to also pass env variables for PMA_USER and PMA_PASSWORD to get it to work once, and then you can remove these; I am not sure why this seems to be needed.)
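
    Putting this together, a minimal sketch of the docker-compose file might look like the following (the service names, host ports, and ./src path are hypothetical; the DB_* variables are assumed to come from the .env file):

    version: "3"
    services:
      datasettracker:
        build: .
        ports:
          - "8080:80"
        volumes:
          - ./src:/var/www/html
        environment:
          - DB_HOST=${DB_HOST}
          - DB_USER=${DB_USER}
          - DB_PASSWORD=${DB_PASSWORD}
          - DB_NAME=${DB_NAME}
    
      phpmyadmin:
        image: phpmyadmin/phpmyadmin
        ports:
          - "8081:80"
        environment:
          - PMA_HOST=${DB_HOST}   # your AWS RDS endpoint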

    updating code from PHPv5 to PHPv7

    Once I had an application running through docker-compose, I was able to edit the code to make it compatible with PHPv7. This included, amongst other things, replacing mysql_connect with mysqli_connect, and replacing hard-coded mysql credentials with code for grabbing such values from env variables. A big help was using the VSCode extension intelephense, which readily flags mistakes and code that is deprecated in PHPv7.
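
    To give a flavour of the changes, here is a hypothetical before/after of the connection code (variable and env names are illustrative, not taken from the project):

    <?php
    // PHPv5 (removed in PHP 7):
    // $link = mysql_connect('db-host', 'user', 'hardcoded-password');
    // mysql_select_db('tracker_db', $link);
    
    // PHPv7: mysqli, with credentials read from env variables
    $link = mysqli_connect(
        getenv('DB_HOST'),
        getenv('DB_USER'),
        getenv('DB_PASSWORD'),
        getenv('DB_NAME')
    );
    if (!$link) {
        die('Connection failed: ' . mysqli_connect_error());
    }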

    AWS RDS MySQL Setup

    rdsadmin overview

    Note: discussions about ‘databases’ can be ambiguous. Here, I shall use ‘DB’ or ‘DB instance’ to refer to the mysql host/server, and ‘db’ to refer to the internal mysql collection of tables that you select with the syntax `use [db name];`. As such, a mysql DB instance can have multiple dbs within it.

    In order to migrate the mysql database from our old CentOS 6 server to an RDS instance, I first used the AWS RDS interface to create a mysql DB instance.

    When I created the mysql DB instance via the AWS RDS interface, I assumed that the user I created was the root user with all privileges. But this is not the case! Behind the scenes, RDS creates a user called rdsadmin, and this user holds all the cards.

    To see the privileges of a given user, you need to use SHOW GRANTS FOR 'user'@'host'. Note: you need to provide the exact host associated with the user you are interested in; if you are not sure what the host is for the user, you first need to run:

    SELECT user, host FROM mysql.user WHERE user='user';

    In the case of an RDS DB instance, rdsadmin is created so that it can only log into the DB instance from the instance’s own host machine, so you need to issue the following command to view the permissions of the rdsadmin user:

    SHOW GRANTS FOR 'rdsadmin'@'localhost';

    I’ll call the user that you initially create via the AWS console the ‘admin’ user. You can view the admin’s privileges by running SHOW GRANTS; which yields the following result:

    GRANT SELECT, INSERT, UPDATE, DELETE, CREATE, 
    DROP, RELOAD, PROCESS, REFERENCES, INDEX, 
    ALTER, SHOW DATABASES, CREATE TEMPORARY TABLES, 
    LOCK TABLES, EXECUTE, REPLICATION SLAVE, 
    REPLICATION CLIENT, CREATE VIEW, SHOW VIEW, 
    CREATE ROUTINE, ALTER ROUTINE, CREATE USER, 
    EVENT, TRIGGER ON *.* TO `admin`@`%` 
    WITH GRANT OPTION

    The final part, WITH GRANT OPTION, is MySQL for “you can give any of these permissions to another user”. So this user will let you create a dedicated user for each db you create.

    If you compare these privileges with those for rdsadmin, you’ll see that rdsadmin has the following extra privileges:

    SHUTDOWN, FILE, SUPER, CREATE TABLESPACE, CREATE ROLE, DROP ROLE, SERVICE_CONNECTION_ADMIN, SET_USER_ID, SYSTEM_USER

    Several of these privileges, such as shutdown, correspond to operations that can instead be performed via the AWS console. In summary, rdsadmin is created in such a way that you can never use it directly, and you will never need to. The admin user has plenty of permissions, and it is worth considering best practices as to whether to use the admin user when connecting from one’s application.

    I personally think that it is good general practice to have a separate db for each deployment tier of an application. So if you are developing an app with, say, a ‘development’, ‘stage’, and ‘production’ deployment tier, then it’s wise to create a separate db for each tier. Alternatively, you might want to have the non-production tiers share a single db. The one thing that I believe is certain though is that you need a dedicated db for production, that it needs to have logical backups (i.e. mysqldump to file) carried out regularly, and that you ideally never edit the prod db directly (or, if you do, that you do so with much fear and trembling).
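
    As a hypothetical example (names are mine, not from the project), with the admin user you could create one db per tier:

    CREATE DATABASE myapp_dev;
    CREATE DATABASE myapp_stage;
    CREATE DATABASE myapp_prod;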

    Is it a good practice to have multiple dbs on a single DB instance? This totally depends on the nature of the applications and their expected load on the DB instance. Assuming that you do have multiple applications using dbs on the same DB instance, you might want to consider creating a specialized user for each application, so that the compromise of one user does not compromise ALL your applications. In that case, the role of the admin is ONLY to create users whose credentials will be used to connect an application to its db. The next section shows how to accomplish that.

    creating additional RDS users

    So let’s assume that you want to create a user whose sole purpose is to enable an application deployed on some host HA (application host) to connect to the host on which the DB instance is running, Hdb (DB host). Connect to the RDS DB instance with your admin user credentials and run:

    CREATE USER 'newuser'@'%' IDENTIFIED BY 'newuser_password';
    GRANT ALL PRIVILEGES ON db_name.* TO 'newuser'@'%';
    FLUSH PRIVILEGES;

    This will create user ‘newuser’ with full privileges on the database db_name (but no global privileges). The ‘user’@’%’ syntax means “this user connecting from any host”.

    Of course, if you want to be extra secure, you can specify that the user can only connect from specific hosts by replacing the wildcard ‘%’ with a specific host, running the commands once per allowed host.
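
    For instance, a host-restricted variant might look like this (the IP address is a hypothetical application-server address):

    CREATE USER 'newuser'@'10.0.1.25' IDENTIFIED BY 'newuser_password';
    GRANT ALL PRIVILEGES ON db_name.* TO 'newuser'@'10.0.1.25';
    FLUSH PRIVILEGES;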

    As an aside, if you want to know the name of the host you are currently connecting from, then run:

    mysql> SELECT USER() ;
    +-------------------------------------------+
    | USER()                                    |
    +-------------------------------------------+
    | admin@c-XX-XX-XXX-XXX.hsd1.sc.comcast.net |
    +-------------------------------------------+
    1 row in set (0.07 sec)

    In this case, the host ‘c-XX-XX-XXX-XXX.hsd1.sc.comcast.net’ has been determined as pointing to my home’s public IP address (assigned by my ISP). (I assume that under the hood mysql has done a reverse DNS lookup of my public IP address, something like nslookup MYPUBLIC_IPADDRESS, and reports the hostname because it is presumably more stable than my present IP address.)

    enabling a user to change their password

    As of Nov 2022, there seems to be an issue with phpMyAdmin whereby a user created in this way cannot change his/her own password through the phpMyAdmin interface. Presumably the underlying SQL command used to change the user’s password requires certain global privileges (and this user has none). A temporary workaround is to connect to the DB instance with your admin user and run:

    GRANT CREATE USER ON *.* TO 'username'@'%' WITH GRANT OPTION;

    connecting from a server

    One thing that threw me for a while was the need to explicitly white-list IP addresses to access the DB instance. When I created the instance, I selected the option to be able to connect to the database from a public IP address. I assumed that this meant that, by default, all IP addresses were permitted. However, this is not the case! Rather, when you create the DB instance, RDS will determine the public IP address of your machine (in my case – my laptop at my home public IP address), and apply that to the inbound rule of the AWS security group attached to the DB instance.

    In order to connect our application running on a remote server, you need to go to that security group in the AWS console and add another inbound rule of type MySQL/Aurora allowing connections from the IP address of your server.
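
    If you prefer the command line, an equivalent rule can be added with the AWS CLI (the security-group ID and IP address below are placeholders):

    aws ec2 authorize-security-group-ingress \
        --group-id sg-0123456789abcdef0 \
        --protocol tcp \
        --port 3306 \
        --cidr 203.0.113.10/32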

    AWS EC2 Deployment

    virtual machine setup

    I chose Ubuntu server 20.04 for my OS with a single core and 20GB of storage. (The data will be stored in the external DB and S3 resources, so not much storage is needed.) I added 4GB of swap space and installed docker and docker-compose.
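
    For reference, a sketch of the swap and Docker setup, assuming Ubuntu 20.04’s stock packages:

    # add 4GB of swap
    sudo fallocate -l 4G /swapfile
    sudo chmod 600 /swapfile
    sudo mkswap /swapfile
    sudo swapon /swapfile
    echo '/swapfile none swap sw 0 0' | sudo tee -a /etc/fstab
    
    # install docker and docker-compose from the Ubuntu repositories
    sudo apt update
    sudo apt install -y docker.io docker-compose
    sudo usermod -aG docker $USER   # optional: run docker without sudo (re-login required)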

    apache proxy with SSL Certification

    I used AWS Route 53 to create two endpoints pointing to the public IP address of the EC2 instance. To expose the two docker applications to the outside world, I installed apache on the EC2 instance and proxied these two endpoints to ports 6050 and 6051. I also used certbot to obtain SSL certificates. The apache config looks like this:

    <IfModule mod_ssl.c>
    <Macro SSLStuff>
        ServerAdmin webmaster@localhost
        ErrorLog ${APACHE_LOG_DIR}/error.log
        CustomLog ${APACHE_LOG_DIR}/access.log combined
        Include /etc/letsencrypt/options-ssl-apache.conf
        SSLCertificateFile /etc/letsencrypt/live/xxx/fullchain.pem
        SSLCertificateKeyFile /etc/letsencrypt/live/xxx/privkey.pem
    </Macro>
    
    <VirtualHost _default_:443>
        Use SSLStuff
        DocumentRoot /var/www/html
    </VirtualHost>
    
    <VirtualHost *:443>
        Use SSLStuff
        ServerName dataset-tracker.astro-prod-it.aws.umd.edu
        ProxyPass / http://127.0.0.1:6050/
        ProxyPassReverse / http://127.0.0.1:6050/
    </VirtualHost>
    
    <VirtualHost *:443>
        Use SSLStuff
        ServerName dataset-tracker-phpmyadmin.astro-prod-it.aws.umd.edu
        ProxyPass / http://127.0.0.1:6051/
        ProxyPassReverse / http://127.0.0.1:6051/
        RequestHeader set X-Forwarded-Proto "https"
        RequestHeader set X-Forwarded-Port "443"
    </VirtualHost>
    </IfModule>
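
    Note that this config relies on mod_macro, mod_ssl, mod_proxy, mod_proxy_http, and mod_headers being enabled, and on certbot having already generated the certificate files it references. On Ubuntu’s stock Apache packaging, the setup is roughly:

    sudo a2enmod macro ssl proxy proxy_http headers
    sudo apt install -y certbot python3-certbot-apache
    sudo certbot --apache
    sudo systemctl restart apache2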

    OS daemonization

    Once you clone the code for the applications to the EC2 instance, you can start it in production mode with:

    docker-compose -f docker-compose.prod.yml up -d

    … where the flag ‘-d’ (‘detached’) means to start it in the background.

    One of the nice things about using Docker is that it becomes super easy to set up your application as a system service by simply adding restart: always to your docker-compose file. This setting tells Docker to restart the container if it exits due to an internal error, or if the Docker service is itself restarted. This means that if the EC2 instance crashes or is otherwise rebooted, then Docker (which, being a system service, will itself start automatically) will automatically restart the application.
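
    In docker-compose terms, that is just one extra line per service, e.g. (service name hypothetical):

    services:
      datasettracker:
        # ... build, ports, environment as before ...
        restart: always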

    MySQL logical backups to AWS S3

    Finally, we need to plan for disaster recovery. If the EC2 instance gets messed up, or the AWS RDS instance gets messed up, then we need to be able to restore the application as easily as possible.

    The application code is safe, thanks to github, and so we just need to make sure that we never lose our data. RDS performs regular disk backups, but I personally prefer to create logical backups because, in the event that the disk becomes corrupted, I feel wary about trying to find a past ‘uncorrupted’ state of the disk. Logical backups to file do not rely on the integrity of the entire disk, and thereby arguably provide a simpler and therefore less error-prone means to preserve data.

    (This is in accordance with my general philosophy of preferring to back up files rather than disk images. If something serious goes wrong at the level of e.g. disk corruption, I generally prefer to ‘start afresh’ with a clean OS and copy over files as needed, rather than try to restore a previous snapshot of a disk. This approach also helps maintain disk cleanliness, since disks tend to accumulate garbage over time.)

    To achieve these backups, create an S3 bucket on AWS and call it e.g. ‘mysql-backups’. Then install s3fs, an open-source tool for mounting S3 buckets onto a Linux file system, with sudo apt install s3fs.

    Next, add the following line to /etc/fstab:

    mysql-backups /path/to/dataset-tracker-mysql-backups fuse.s3fs allow_other,passwd_file=/home/user/.passwd-s3fs 0 0

    Next, you need to create an AWS IAM user with permissions for full programmatic access to your S3 bucket. Obtain the Access key ID and Secret access key for that user and place them into a file /home/user/.passwd-s3fs (restricted with chmod 600, which s3fs requires) in the format:

    [Access key ID]:[Secret access key]

    Now you can mount the S3 bucket by running sudo mount -a (which will read the /etc/fstab file).

    Check that the dir has successfully mounted by running df -h and/or by creating a test file within the dir /path/to/dataset-tracker-mysql-backups and checking in the AWS S3 console that that file has been placed in the bucket.

    Finally, we need to write a script, run by a daily cronjob, that performs a mysql dump of your db to a file in this S3-mounted dir, and that maintains a history of backups by removing old/obsolete backup files. You can see the script used in this project here, which was adapted from this article. Add this as a daily cronjob, and it will place a .sql file in your S3 dir and remove obsolete versions.
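
    A minimal sketch of such a script, assuming the mysql client is installed on the EC2 instance and the connection details are available as env variables (all names, paths, and the 30-day retention window are hypothetical):

    #!/bin/bash
    # dump the db into the S3-mounted dir and prune backups older than 30 days
    BACKUP_DIR=/path/to/dataset-tracker-mysql-backups
    TIMESTAMP=$(date +%Y%m%d_%H%M%S)
    
    mysqldump --host="$DB_HOST" --user="$DB_USER" --password="$DB_PASSWORD" \
        "$DB_NAME" > "$BACKUP_DIR/${DB_NAME}_${TIMESTAMP}.sql"
    
    find "$BACKUP_DIR" -name "${DB_NAME}_*.sql" -mtime +30 -delete

    Then add a crontab entry along these lines (script path hypothetical) to run it daily:

    0 2 * * * /home/user/mysql-backup.sh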

  • WordPress Backup Restoration Practice

    Intro

    This is Part II of a multi-part series on creating and managing a WordPress-Next.js app. In the first part, we set up a WordPress (WP) instance on AWS EC2 with OpenLiteSpeed, MySQL on RDS, and S3 for our media storage. The WP site will serve as a headless CMS for a hypothetical client to manage his/her content; it will act as the data source for Next.js builds to generate a pre-rendered site deployed on AWS CloudFront.

    In this part, having set up our WP backend, we will practice restoring the site in the event that the EC2 instance gets bricked.

    Rationale

    In this section, I am going to argue that, given our setup so far, the only thing we need to be able to do is recreate a corrupted EC2 instance.

    One of the advantages of (essentially) storing all of our application’s state in AWS RDS and S3 is that these managed services provide backup functionality out of the box.

    S3 has insane multi-AZ “durability” out of the box; to quote a quick Google search, “at least 99.999999999% annual durability, or 11 nines… [; t]hat means that even with one billion objects, you would likely go a hundred years without losing a single one!” You could make your S3 content even more durable by replicating across regions, but this is plenty durable for anything I’ll be building any time soon.

    RDS durability requires a little more understanding. On a default setup, snapshots are taken of the DB’s disk storage every day, and retained for a week. RDS uses these backups to provide you with the means to restore the state of your DB to any second of your choosing between your oldest and most recent backups. This is your “retention period”. At extra cost, you can set the retention period to be as far back as 35 days, and you can also create manual snapshots to be stored indefinitely.
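
    As an example of adjusting this, the retention period can be changed with the AWS CLI (the instance identifier below is a placeholder):

    aws rds modify-db-instance \
        --db-instance-identifier my-wp-db \
        --backup-retention-period 35 \
        --apply-immediately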

    These RDS backups, while good and all, do not guard against certain worst-case scenarios. If, say, your DB instance crashes during a write operation and corrupts the physical file, then you will need to notice the problem and restore a backup before the last uncorrupted state falls outside the retention period; this could mean losing data if you do not also regularly check the WP site to make sure everything is in order.

    Perhaps even worse is the case where some mishap occurs with your data that does not show up in an immediate or obvious way when you visit your site. For example, suppose you or a client installs a crappy plugin that deletes a random bunch of your content, which you are not going to know about unless you do a thorough audit of your site (and who’s got time for that!).

    For this reason, you also really want to create regular logical backups of your DB and save them to e.g. S3 Glacier.

    EC2 Restoration

    We’ll practice reconstructing a working EC2 instance to host our WP site. This section assumes that you have set up everything as prescribed in Part I of this series.

    First, in order to create a more authentic restoration simulation, you might want to stop your existing/working EC2 instance. Before you do that, though, consider whether or not you have noted the details for connecting to your WP DB; if not, first note them down somewhere “safe”. Also be aware that if you did not associate an Elastic IP address with your EC2 instance, then its public IP address will change when you restart it later.

    Next, create a new EC2 Ubuntu 20.04 instance, open ports 80 and 7080, ssh into it, and run the following:

    sudo apt update
    sudo apt upgrade -y
    curl -k https://raw.githubusercontent.com/litespeedtech/ols1clk/master/ols1clk.sh -o ols1clk.sh
    sudo bash ols1clk.sh -w

    The ols1clk.sh script will print to screen the details of the server and WP it is about to install. Save those details temporarily and proceed to install everything.

    In the AWS RDS console, go to your active mysql instance, go to its security group, and add an inbound rule allowing connections from the security group attached to your new EC2 instance. Copy the RDS endpoint.

    Back in the EC2 instance, check that you can connect to the RDS instance by running:

    mysql --host RDS_ENDPOINT -u WPDB_USER -p

    … and entering the password for this user. Also make sure you can use the WP database within this DB instance, for example:
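
    mysql> USE wp_db_name;
    mysql> SHOW TABLES;

    (The db name here is a placeholder for whatever your WP database is called.)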

    If the connection works, then open the file /usr/local/lsws/wordpress/wp-config.php in your editor and replace the mysql connection details with those corresponding to your RDS instance and WP DB_NAME. You also need to add the following two lines in order to override the site base URL stored in the DB, which still points at your previous EC2 instance:

    define('WP_HOME', 'http://[NEW_IP_ADDRESS]');
    define('WP_SITEURL', 'http://[NEW_IP_ADDRESS]');

    … where NEW_IP_ADDRESS is taken from your new EC2 console.
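
    For reference, the mysql connection details mentioned above live in the standard wp-config.php constants, which end up looking something like this (all values are placeholders):

    define('DB_NAME', 'wordpress_db');
    define('DB_USER', 'wp_user');
    define('DB_PASSWORD', 'wp_password');
    define('DB_HOST', 'my-wp-db.xxxxxxxxxxxx.us-east-1.rds.amazonaws.com');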

    Now you can go to NEW_IP_ADDRESS in a browser and expect to find a somewhat functioning WP site. If you try navigating to a post, though, you will get a 404 error. To fix this, you need to go into the OLS admin console on port 7080, log in using the new credentials generated by ols1clk.sh, go to the “Server Configuration” section and, under the “General” tab, in the “Rewrite Control” table, set the “Auto Load from .htaccess” field to “Yes”. Now you should be able to navigate around posts.

    (Side note: it’s very surprising to me that ols1clk.sh, in installing WP, does not set this field to Yes.)

    The images will not work, of course, because the DB records their locations under the address of the old EC2 instance, which is not running. So in an actual restoration scenario, we would need to point the previous hostname from the old instance to the new instance, and then set up s3fs (which I am not going to test right now).

    Having gone through this backup restoration practice, you can switch back to the old EC2 instance and, since we had to play around with our login credentials on a non-SSL site, it is a good idea to update your WP login password.