Musings of an 8-bit Vet

A student coder’s diary.

Managing Rails Environment Variables

Environment variables are a set of dynamic named values that affect the way running processes behave on a computer. In almost all operating systems each process has a private set of environment variables, and by default when a process is created it inherits a duplicate environment from the parent process, except where explicit changes are made.
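
To see this inheritance in action, here is a quick sketch in Ruby (the GREETING variable is hypothetical, used only for illustration): a value set in the parent process’s environment is visible to any child process it spawns.

Demonstrating environment inheritance in Ruby
ENV['GREETING'] = 'hello'  # set a variable in this (parent) process
system("echo $GREETING")   # the spawned child shell inherits it and prints "hello"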

Typically environment variables are set in your .bash_profile, .bashrc, or .zshrc.

Examples include $PATH, $PWD, $SHELL, and $EDITOR. Typing env in your terminal will show you all the environment variables currently set. You can read a single variable with echo $<env_var_name>, or, if you don’t know its exact name, by piping the output of env to grep (or ack, or ag), for example:

Piping env output to grep
$ env | grep -i ssh
SSH_AUTH_SOCK=/private/tmp/com.apple.launchd.okWDAcFz3u/Listeners

The Rails framework also relies on environment variables, typically to select the test, development, or production environment. These variables are exposed through the hash-like ENV object, and they can be explored from the Rails console:

Fetching environment values from the ENV hash
irb(main):001:0> ENV['RAILS_ENV']
=> "development"

If you’ve ever found that a passing test suite is suddenly failing for no reason, it could be because RAILS_ENV is set to “development” rather than “test”. The takeaway here is: Environment Variables Configure Behavior
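
A quick way to confirm which environment you are in is from the Rails console; a minimal check looks like this:

Checking the current Rails environment
Rails.env         # => "development"
Rails.env.test?   # => false, while the test suite expects true
ENV['RAILS_ENV']  # => "development"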

Environment variables also hold configuration for other services, such as the secret token used to sign Rails sessions, and the keys used to access cloud services like AWS or email services like SendGrid. Often we unknowingly set these variables in files that are committed to GitHub, where public repositories are readable by anyone.

This has become such a problem that services such as AWS monitor public repos to see if credentials have been compromised, and will send a warning notice to the credential owners within minutes of the git commit that exposed them. A cursory search of Stack Overflow will reveal sad tales of credentials that were scraped from GitHub and then used to run up very expensive bills that the credential owners were liable for.

But what about ‘secrets.yml’ in Rails?

The secrets.yml file added in Rails 4.1 was meant to address this issue, but the problem is that it’s not really secret (the file Rails generates hard-codes the development and test keys in plaintext), and it’s a bit convoluted to use. As commonly used, it violates these two basic rules:

  1. Don’t store environment variables in any file that is committed to a repo.
  2. Manage your .gitignore, and make sure your team excludes files that contain credentials.

The Solution?

Like most things in the Ruby world, if it’s a common problem there is probably a gem for it.

In this case there are two: dotenv and figaro.

They both approach the problem in similar ways; I’ll cover the basics of using dotenv.

Step 1: Gemfile

Add dotenv gem to the Gemfile
source 'https://rubygems.org'

ruby '2.2.0'
gem 'rails', '4.2.0'

gem 'dotenv-rails', groups: [:development, :test]

Then run bundle install

Step 2: Update your .gitignore

Add the .env file to .gitignore
log/*
.bundle/

.env

This is important: you don’t want to create the .env file before it is ignored, because if it gets accidentally committed it takes extra steps to remove (git rm --cached .env, plus rotating any keys that were exposed).

Step 3: Create the .env file

The .env file is a hidden file that will hold your environment variables.

Example .env file
SECRET_KEY_BASE='2398sljae23lkj;2'
S3_BUCKET_NAME='portal-device'
AWS_ACCESS_KEY_ID='052kjJKW620'
SENDGRID_API='Twer2590lkQ5'
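
Once dotenv-rails is loaded, the values in .env are merged into the same ENV hash as any other environment variable, so (sketching with the example values above) you can read them anywhere in your app:

Reading dotenv values from ENV
ENV['S3_BUCKET_NAME']  # => "portal-device"
ENV['SENDGRID_API']    # => "Twer2590lkQ5"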

Step 4: Configure secrets.yml and your initializers

Your Rails app needs to know how to get to these ENV variables, so you’ll pass them in from either secrets.yml or the initializers for your service, e.g.

secrets.yml
development:
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
  sendgrid_key: <%= ENV["SENDGRID_API"] %>
  aws_access_key: <%= ENV["AWS_ACCESS_KEY_ID"] %>

test:
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
  sendgrid_key: <%= ENV["SENDGRID_API"] %>
  aws_access_key: <%= ENV["AWS_ACCESS_KEY_ID"] %>

production:
  secret_key_base: <%= ENV["SECRET_KEY_BASE"] %>
  sendgrid_key: <%= ENV["SENDGRID_API"] %>
  aws_access_key: <%= ENV["AWS_ACCESS_KEY_ID"] %>
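
Alternatively, an initializer can read ENV directly. The sketch below is an illustration, not project code: the file name, the SMTP settings, and the SENDGRID_USER variable are assumptions.

Example config/initializers/sendgrid.rb (hypothetical)
ActionMailer::Base.smtp_settings = {
  address:        'smtp.sendgrid.net',
  port:           587,
  authentication: :plain,
  user_name:      ENV['SENDGRID_USER'],  # assumed variable, not in the .env above
  password:       ENV['SENDGRID_API']
}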

Step 5: Set the variables in production

Make sure to set your environment variables on your production server, either from its web interface or via the command line. For Heroku it looks like this:

heroku config:set SECRET_KEY_BASE=45FEW%dak27ks

Step 6: Distribute to your development team

Make the .env file available to your collaborators, but do it via a secure means; that means don’t use Dropbox, Slack, HipChat, etc. Also make sure that everyone has pulled the git commit that updated .gitignore to ignore the .env file and added the dotenv gem.

Resources

Rails Guides: Configuring Rails Applications

Rails App Project: Rails Environment Variables

dotenv gem

figaro gem

Brandon Hilkert Blog: Using Rails 4.1 Secrets for Configuration

Upgrading OS X PostgreSQL App to 9.4

Postgres 9.4 has finally shipped. This long-awaited release includes numerous upgrades, including jsonb, a native binary JSON type, and improved scalability and index performance.

Upgrading to Postgres 9.4.0 is not as simple as merely replacing your current 9.3.x install with the newer version. The release notes indicate that users who wish to upgrade must first migrate their existing data from prior versions using one of two strategies: pg_dumpall or pg_upgrade.

There are typically two ways that OS X users install Postgres on their machine: via the native Postgres.app (the one with the blue elephant icon) or via the Homebrew package manager.

I’ll provide instructions for Postgres.app using pg_dumpall to upgrade on OS X.

If you installed with homebrew follow the excellent instructions found here: Keita’s Blog: Homebrew and PostgreSQL 9.4

1: Export your existing databases to a SQL script file

Postgres stores databases in version-specific directories, so before we get started we need to know the data directory for the currently installed version. Fire up the psql console and find out with SHOW data_directory;. Your output may look different, but what we want is the path that is returned. On my machine it looked like this:

Using psql to locate the data directory
$ psql
Null display is "[NULL]".
Expanded display is used automatically.
psql (9.3.5.2)
Type "help" for help.

localhost @David=# SHOW data_directory;

                      data_directory
-----------------------------------------------------------
 /Users/David/Library/Application Support/Postgres/var-9.3
(1 row)

localhost @David=# \q

$

Make a note of this directory. Once the data has been migrated and you’ve confirmed everything is working you’ll probably want to delete the old copy to free up space.

Next export the existing 9.3 databases out of that directory prior to upgrading to 9.4:

Using pg_dumpall to create a single SQL script file for migration to 9.4
mkdir pg_migrate
cd pg_migrate
pg_dumpall > db.out

It’s worth noting that, like many Unix commands, pg_dumpall gives no feedback while it is running. Depending on how many databases you have and how large they are, this may take a while. When you are returned to your shell prompt, run ls; if you see db.out you are ready to go.

2: Download the latest Postgres.app

Click the elephant icon in the OS X menubar and choose quit to stop the server.

Go to your Applications directory and find the Postgres app (the blue elephant icon). Rename it to Postgres.old.

Go to http://postgresapp.com and download the latest version. Double click the downloaded zip file and drag the new Postgres app into your Applications folder.

Double click the icon and wait for the application to start. From either the elephant icon in the menu bar or the psql command check the version number. It should be 9.4.0.

3: Import the SQL dump into the new server directory.

From the directory where you ran pg_dumpall in step 1 run the following command:

psql -f db.out postgres

Unlike the original pg_dumpall command, this one prints a lot of activity to the screen, including some warnings that look like errors. This is expected output, so be patient while the databases are recreated in the new 9.4 server directory.

Once you get your command prompt back, go into the psql console and check that your databases are all there with the command \l. At a minimum you should have output like the example below, where one database typically has your OS X or postgres admin username, two are templates, and one is postgres:

$ psql
Null display is "[NULL]".
Expanded display is used automatically.
psql (9.4.0)
Type "help" for help.

localhost @David=# \l
                                       List of databases
            Name             | Owner | Encoding |   Collate   |    Ctype    | Access privileges
-----------------------------+-------+----------+-------------+-------------+-------------------
 David                       | David | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 postgres                    | David | UTF8     | en_US.UTF-8 | en_US.UTF-8 |
 template0                   | David | UTF8     | en_US.UTF-8 | en_US.UTF-8 | =c/David         +
                             |       |          |             |             | David=CTc/David
 template1                   | David | UTF8     | en_US.UTF-8 | en_US.UTF-8 | David=CTc/David  +
                             |       |          |             |             | =c/David
(4 rows)

localhost @David=# \q
$

4: Clean up the export file and the old data directory

Verify that your applications can access your databases, and that everything is working as expected. Once you are comfortable with that you have some cleaning up to do.

Back up or delete the db.out file.

If you are using Postgres.app, delete the Postgres.old application from your Applications folder.

Finally, using the SHOW data_directory; command from step 1, verify that the new directory is not the same as the old one; then you can safely delete the old database directory. In my case the old directory was ~/Library/Application Support/Postgres/var-9.3, and the new one is now:

$ psql
Null display is "[NULL]".
Expanded display is used automatically.
psql (9.4.0)
Type "help" for help.

localhost @David=# SHOW data_directory;
                      data_directory
-----------------------------------------------------------
 /Users/David/Library/Application Support/Postgres/var-9.4
(1 row)

You can remove or back up the old directory at your discretion.

That completes your upgrade to 9.4.0. For any patch revisions, i.e. 9.4.x, that come later you can simply upgrade the postgres application by downloading the newer Postgres.app and installing it over the old version.

Parsing Command Line Input

One of the biggest challenges with a terminal or command-line interface (CLI) is parsing out the relevant pieces of the input. In most web and touch-based interfaces this kind of input is constrained by input widgets.

We recently completed an ‘Event Reporter’ project, which allows the user to interact with a library of data about attendees at an event. Commands include: find last_name Smith, load data.csv, save to results.csv, or queue count and queue print.

Ruby provides a capable set of methods for breaking input into strings and arrays, which then allows you to manipulate and extract the parts that are meaningful so they can be passed as messages to the other classes in the application.
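
String#split and Array#join do most of the heavy lifting; a minimal sketch:

Breaking a command string apart
input = "find last_name Smith".split(" ")  # => ["find", "last_name", "Smith"]
input.first                                # => "find", the command itself
input[1..-1].join(" ")                     # => "last_name Smith", everything after it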

The basic CLI requires a REPL loop that Reads input, Evaluates it, Prints a response, and repeats until some condition is met that terminates the loop.

These are examples of how we tackled some of the problems we came up against:

The initial REPL loop
  def start
    printer.intro
    until finished?
      printer.command_request
      process_command
    end
    printer.ending
  end

In the code snippet above, printer.command_request and process_command handle the request for input and then pass that input off for processing.

The process_command method is a case statement with boolean conditions based on the input received; for each type of command request it breaks the command down to extract the required content. For example, the load command has two forms: a single term that loads a default file, and a term with a filename that loads the given file. One strategy looks like this:

Parsing the LOAD request
when load?
  if @input.length > 1
    load_file(@input.last)
  else
    load_file
  end
  [...]
end

Here we already know that the second (or last) part of the command is going to be the filename, so we use length to determine that condition, and then pass the corresponding string out of the @input array to the load_file method. load_file has its own logic to check whether a file exists and to handle malformed file names, since that is not a command-parsing responsibility.
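
That load_file method might look something like the sketch below; the default filename, the AttendeeRepository loader, and the printer messages are hypothetical stand-ins, not our exact project code.

A hypothetical load_file with a default filename
def load_file(filename = 'event_attendees.csv')  # assumed default, for illustration
  if File.exist?(filename)
    @repository = AttendeeRepository.load(filename)  # hypothetical loader
    printer.file_loaded(filename)
  else
    printer.file_not_found(filename)
  end
end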

When evaluating the find command there are three components: find itself, the criteria (state, city, etc.), and the term to search against (‘CO’ or ‘Denver’). However, the term may consist of multiple words, such as ‘New Orleans’. Because we decompose the command into an array split on spaces, the search term needs to be reconstituted. One way to approach that might look like this:

Parsing the FIND request with multi-word term
when find?
  if @input.length > 2
    value = @input[2..-1].join(" ")
    @repository.find_by(@input[1], value)
    printer.results(@repository.results_count)
  else
    printer.not_a_valid_command
  end
  [...]
end

In a similar strategy to the load command, we evaluate how long the input is, then go after the criteria and the term. The criteria is the second element of the array, right after the command, so it is handled with @input[1] (remembering that Ruby is zero-indexed).

The search term is all the remaining string elements of the array, so we recover it by re-combining the elements from @input[2] to the end and joining them with a space between: @input[2..-1].join(" "). The resulting criteria and term are then handed off to the repository’s find_by(criteria, search_term) method.
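
To make the recombination concrete, here is the same logic run by hand on a multi-word search:

Worked example: parsing "find city New Orleans"
@input = "find city New Orleans".split(" ")  # => ["find", "city", "New", "Orleans"]
criteria = @input[1]                         # => "city"
term     = @input[2..-1].join(" ")           # => "New Orleans"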

As part of the evaluation of our project, all of this parsing logic should eventually be refactored into a separate class of its own, but that’s a topic for another post.