Thursday, September 1, 2016

Storing HTTP Sessions with Amazon Elastic File System (EFS)

Amazon recently released the Elastic File System (EFS) to the general public after a long beta period.  According to Amazon, EFS is a distributed file system that offers high throughput, low latency, automatic scaling and fault tolerance.  In US regions, EFS costs $0.30/GB-month at the time of writing.  You can NFS-mount an EFS file system and share it among multiple EC2 instances, either inside a Virtual Private Cloud (VPC) or outside one via ClassicLink.

There are many applications that can take advantage of a distributed file system.  For example, if you are running a web application on multiple machines in a cluster, you can use EFS to store user HTTP sessions so that every node sees the same session data.  One of the benefits is that EFS costs far less than running ElastiCache, a Memcached-compatible service offered by Amazon.

Setting up EFS in a VPC is easy, and Amazon provides a step-by-step tutorial on how to do this.  Once you have EFS and a security group set up, you can use cloud-init to mount the EFS file system automatically at EC2 instance launch by following the Amazon instructions at

http://docs.aws.amazon.com/efs/latest/ug/mount-fs-auto-mount-onreboot.html

This involves adding a script during the Launch Instance wizard of the EC2 management console. The script installs the NFS client and writes an entry in the /etc/fstab file to mount the EFS file system on /mnt/efs with the following line:


 echo "$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone).file-system-id.efs.aws-region.amazonaws.com:/    /mnt/efs   nfs4    defaults" >> /etc/fstab


The mount target DNS name is constructed dynamically from the instance's own Availability Zone, so that you don't accidentally mount your file system through a mount target in the wrong Availability Zone.
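
To make the pieces of that line easier to see, here is a minimal sketch that builds the same fstab entry from its parts.  The function names efs_dns_name and efs_fstab_entry are hypothetical helpers used only for illustration, not part of the Amazon instructions, and us-east-1a, fs-12345678 and us-east-1 are placeholder values.

```shell
# Hypothetical helpers; the names are illustrative only.

# Print the per-Availability-Zone mount target DNS name.
efs_dns_name() {
    # $1 = availability zone, $2 = file system ID, $3 = AWS region
    printf '%s.%s.efs.%s.amazonaws.com\n' "$1" "$2" "$3"
}

# Print an fstab entry with the same four fields as the line above.
efs_fstab_entry() {
    # $1..$3 as above, $4 = mount point
    printf '%s:/ %s nfs4 defaults\n' "$(efs_dns_name "$1" "$2" "$3")" "$4"
}

# Example with placeholder values (prints one fstab line):
efs_fstab_entry us-east-1a fs-12345678 us-east-1 /mnt/efs
```

In a real launch script, the Availability Zone would come from the instance metadata call shown earlier, and the printed line would be appended to /etc/fstab.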

For my own web application running on Amazon Linux nodes, I did not want the hassle of cloud-init, so I opted for a different approach: auto-mounting EFS by appending the following line to /etc/rc.d/rc.local:


 mount -t nfs4 -o nfsvers=4.1 $(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone).file-system-id.efs.aws-region.amazonaws.com:/ /mnt/efs


Note that you must replace file-system-id and aws-region with your own values.  For example, aws-region would be us-east-1 if your EC2 instances run in that region.  The /mnt/efs directory must also exist before the mount command runs.

In case you're wondering, /etc/fstab is a static file: you cannot run scripts or use environment variables in it, so the curl command that looks up the Availability Zone cannot live in /etc/fstab directly.  This is why I chose to append the mount command to /etc/rc.d/rc.local instead.
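
If you want the rc.local approach to be a little more defensive, the sketch below shows one possible way to do it.  It is not what I originally ran; is_mounted and mount_efs are hypothetical function names, and the 5-second metadata timeout is an arbitrary assumption.  The idea is to skip the mount when it is already in place, create the mount point if it is missing, and give up cleanly if the metadata service does not answer.

```shell
# Hypothetical helpers; names and the timeout value are assumptions.

# Succeed if $1 is listed as a mount point in the mounts table.
# $2 defaults to /proc/mounts and is a parameter only so the check
# can be tried against a sample file.
is_mounted() {
    awk -v mp="$1" '$2 == mp { found = 1 } END { exit !found }' "${2:-/proc/mounts}"
}

# Mount the EFS file system unless it is already mounted.
# $1 = file system ID, $2 = AWS region, $3 = mount point
mount_efs() {
    is_mounted "$3" && return 0
    # Ask the instance metadata service for this instance's Availability Zone.
    az=$(curl -sf --max-time 5 \
        http://169.254.169.254/latest/meta-data/placement/availability-zone) || return 1
    [ -n "$az" ] || return 1
    mkdir -p "$3" || return 1
    mount -t nfs4 -o nfsvers=4.1 "$az.$1.efs.$2.amazonaws.com:/" "$3"
}

# In /etc/rc.d/rc.local you would then call, with your own values:
#   mount_efs file-system-id aws-region /mnt/efs
```

The call is left commented out because it needs root privileges and a real file system ID.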


Got Too Many Session Files?

If your web application uses PHP and leaves a lot of expired session files behind, you should consider disabling the default session garbage collection and replacing it with a scheduled cron job.  With the default configuration (session.gc_probability = 1 and session.gc_divisor = 100), there is a 1% chance that PHP will garbage collect session files on any given request.  On EFS, file operations such as directory listings are slower than on a local file system, so your web application can become less responsive every time the garbage collector kicks in.

To turn off session garbage collection, set session.gc_probability to 0 in php.ini and restart your web server.  Next, add the following cron job to garbage collect session files.
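
If you manage several nodes, you may prefer to script the php.ini change.  The following is only a sketch: disable_session_gc is a hypothetical helper name, and /etc/php.ini is an assumed location that may differ on your system.  It rewrites an existing, uncommented session.gc_probability line and reports an error if no such line exists.

```shell
# Hypothetical helper; the name and the php.ini path are assumptions.

# Set session.gc_probability to 0 in the given php.ini file.
# $1 = path to php.ini
disable_session_gc() {
    ini_file="$1"
    tmp_file=$(mktemp) || return 1
    # Rewrite the active (uncommented) setting, leaving other lines alone.
    sed 's/^[[:space:]]*session\.gc_probability[[:space:]]*=.*/session.gc_probability = 0/' \
        "$ini_file" > "$tmp_file" || { rm -f "$tmp_file"; return 1; }
    if grep -q '^session\.gc_probability = 0$' "$tmp_file"; then
        # Overwrite in place so the file keeps its owner and permissions.
        cat "$tmp_file" > "$ini_file"
        status=$?
    else
        echo "session.gc_probability not found in $ini_file" >&2
        status=1
    fi
    rm -f "$tmp_file"
    return "$status"
}

# Example (run as root, then restart the web server):
#   disable_session_gc /etc/php.ini
```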


 */5 * * * * find /mnt/efs -type f -mmin +60 -delete >/dev/null 2>&1


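This will run every 5 minutes and delete all files under /mnt/efs whose modification time is more than 60 minutes old.  Keep in mind that the command matches every file under /mnt/efs, not just session files, so use that file system for session data only.  Now, you're good to go!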
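
If the file system ever holds anything besides sessions, a narrower cleanup is safer.  Here is a minimal sketch, assuming PHP's default file session handler, which names its files sess_ followed by the session ID.  The purge_sessions function name is hypothetical, and the 60-minute age simply mirrors the cron job above.

```shell
# Hypothetical helper; the name is illustrative only.

# Delete PHP session files older than a given age.
# $1 = directory holding the session files, $2 = maximum age in minutes
purge_sessions() {
    # Only regular files named sess_* are touched; everything else is kept.
    find "$1" -type f -name 'sess_*' -mmin +"$2" -delete
}

# Example cron-driven call, with the same values as above:
#   purge_sessions /mnt/efs 60
```

Matching on the sess_ prefix means a stray upload or log file that lands in the same directory will not be deleted by the job.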