An Iterative Approach to Building and Deploying an API

In startup land, one of the most common admonitions is to build the API first. This way, other developers can access your data, there’s only one codebase to worry about, and behavior is consistent everywhere.

Part One – Building the API skeleton with YAML
Part Two – From YAML to ElasticSearch
Part Three – Writing Chef Scripts for Amazon Deployment
Part Four – Writing Fabric Scripts for Code Deployment

For more information on why you should build your API first, read this blog post.

Today I’m going to show you how to build a basic API in Tornado and gunicorn. Later we’ll add an ElasticSearch backend, and then use Chef scripts and Fabric for instant cloud deployment on a free Amazon EC2 micro instance.

Part 1: Building the API Structure

Instead of taking the traditional route of showing you everything working in an auto-magic fashion, we’re going to build this project step by step, just as you would in the real world.

Think of this as a walkthrough of how to reason about a project and grow it into something more complicated.

Our API description is very basic, and consists of the following:

merchantapi:
  description: Deals API for finding active deals for your merchant.
  path: /v1/<merchant_name>/deals?status=active
  options:
    - active
    - inactive
  version: 1.0
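
To make the path template concrete, here is a minimal sketch of how a client might build a request URL from it. The deals_url helper and the ALLOWED_STATUSES tuple are hypothetical names introduced for illustration, and the sketch assumes Python 3's urllib.parse rather than the Python 2 environment used elsewhere in this post.

```python
from urllib.parse import quote, urlencode

# Hypothetical: the two values listed under `options` in the description
ALLOWED_STATUSES = ("active", "inactive")

def deals_url(base, merchant_name, status="active"):
    """Fill in /v1/<merchant_name>/deals?status=<status> for a given host."""
    if status not in ALLOWED_STATUSES:
        raise ValueError("status must be one of %s" % (ALLOWED_STATUSES,))
    # quote() escapes characters that are not safe in a single path segment
    path = "/v1/%s/deals" % quote(merchant_name, safe="")
    return "%s%s?%s" % (base.rstrip("/"), path, urlencode({"status": status}))

print(deals_url("http://localhost:8000", "homedepot"))
```

Rejecting unknown statuses on the client side mirrors the check the server will eventually perform, so a typo fails before any request is sent.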

Let’s start writing code!

Building the Web App

Start by creating a new Python virtual environment, and a new project directory:

virtualenv --distribute apiEnv
source apiEnv/bin/activate
pip install tornado gunicorn nose requests
mkdir elasticAPI
cd elasticAPI
git init

Next, open up your favorite text editor (emacs is the answer), and save the following file as webapp.py:

# Run with:
#   $ gunicorn -k egg:gunicorn#tornado webapp:app
 
from tornado.web import Application, RequestHandler
 
class MainHandler(RequestHandler):
    def get(self):
        self.write("Hello, world")
 
app = Application([
    (r"/", MainHandler)
])

Let’s verify that this works by writing a test for it. Create a new directory called tests, and inside it create a file named TestApi.py:

import requests
 
def test_serverup():
    r = requests.get('http://localhost:8000')
 
    assert r.status_code == 200

Now we can run nosetests in our elasticAPI directory, and see that our server runs:

gunicorn -k egg:gunicorn#tornado webapp:app &
nosetests
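
Because gunicorn is started in the background, nosetests can begin before the server is accepting connections. Here is a minimal sketch of one way to guard against that. It assumes Python 3 and a hypothetical wait_for_port helper, which is not part of nose, requests, or gunicorn:

```python
import socket
import time

def wait_for_port(host, port, timeout=5.0, interval=0.1):
    """Return True once a TCP connection to host:port succeeds, False on timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            # create_connection raises an OSError subclass if nothing is listening
            conn = socket.create_connection((host, port), timeout=interval)
            conn.close()
            return True
        except OSError:
            time.sleep(interval)
    return False
```

One possible use is to call wait_for_port("localhost", 8000) from a module-level setup() function in the test file, which nose runs before the tests in that module.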

If we did everything right, we should now have a functioning tornado server running with gunicorn. Let’s save the work we’ve done so far by adding it to git:

git add webapp.py
git add tests
git commit

When we enter git commit, we should now be prompted to enter a message about what we’ve written in our code so far. It helps to be verbose.

Building the RESTful API

For simplicity’s sake, we’re going to implement just one piece of the API: a GET for deals. Let’s start by writing the tests we’d like to pass, and then implement the code to make them pass.

According to Wikipedia, a RESTful API implements the verbs of HTTP. These include GET, POST, DELETE, and PUT. Let’s add those cases to our API tests:

import requests
 
def test_serverup():
    r = requests.get('http://localhost:8000')
 
    assert r.status_code == 200
 
def test_api_get():
    r = requests.get('http://localhost:8000/v1/homedepot/deals?status=active')
 
    assert r.status_code == 200
 
def test_api_post():
    r = requests.post('http://localhost:8000/v1/homedepot/deals?status=active')
 
    assert r.status_code == 403
 
def test_api_put():
    r = requests.put('http://localhost:8000/v1/homedepot/deals?status=active')
 
    assert r.status_code == 403
 
def test_api_delete():
    r = requests.delete('http://localhost:8000/v1/homedepot/deals?status=active')
 
    assert r.status_code == 403

These tests make sure we get a status code of 200 for a correctly formed URL request, and a 403 Forbidden for the other methods we aren’t implementing. Now, let’s write the code to make those tests pass:

# Run with:
#   $ gunicorn -k egg:gunicorn#tornado webapp:app
 
from tornado.web import Application, RequestHandler, HTTPError
 
class MainHandler(RequestHandler):
    def get(self):
        self.write("Hello, world")
 
class DealsHandler(RequestHandler):
    def get(self, merchant_name):
        self.write(merchant_name)
 
    def post(self, merchant_name):
        raise HTTPError(403)
 
    def delete(self, merchant_name):
        raise HTTPError(403)
 
    def put(self, merchant_name):
        raise HTTPError(403)
 
app = Application([
    (r"/", MainHandler),
    (r"/v1/(.*)/deals", DealsHandler)
])
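
The route pattern decides what ends up in merchant_name: Tornado passes each regex capture group to the handler method as a positional argument. The sketch below uses only the standard re module to show what (.*) captures. The stricter [^/]+ pattern is an alternative offered as an assumption, not something the original code uses, and the trailing $ is there to approximate matching against the whole path. The capture helper is a hypothetical name.

```python
import re

loose = re.compile(r"/v1/(.*)/deals$")      # pattern used in webapp.py
strict = re.compile(r"/v1/([^/]+)/deals$")  # alternative: one path segment only

def capture(pattern, path):
    """Return the captured merchant name, or None if the path does not match."""
    match = pattern.match(path)
    return match.group(1) if match else None

print(capture(loose, "/v1/homedepot/deals"))
print(capture(loose, "/v1/home/depot/deals"))
print(capture(strict, "/v1/home/depot/deals"))
```

The query string (?status=active) is not part of the path, so it plays no role in route matching.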

Restart gunicorn so that it serves the new code, verify that everything works by running nosetests, then commit our changes:

nosetests
git commit -a

Adding Example Data to the API With YAML

YAML is a way to write human-readable data in text files. We’ll use it to mock up the data we want the API to read. YAML files look like this:

company_name:
  homedepot

active:
  - {25% percent off: ZYZZ, Buy One Get One: REDDIT}

inactive:
  - {0% off: DIVIDEBYZERO, Buy None Get A Ton: FREEONETONTRUCK}

Create a new directory named data, and save the file above as homedepot.yaml in the data/ directory.

While we’re at it, let’s add another company. Save the following as lowes.yaml in the data/ directory:

company_name:
  lowes

active:
  - {50% off fun: YCOMBINATOR, 1 Free Like: INSTAGRAM}

inactive:
  - {100 Free Likes: GTL}

Loading Test Data with IPython

Great! Now we’ve got two pieces of data, and we can tell exactly what our responses will look like. But now we’ve got to install pyyaml. Let’s do that now, and try loading our data in an ipython session:

pip install pyyaml ipython
ipython
Python 2.7.3 (default, Apr 20 2012, 22:39:59) 
Type "copyright", "credits" or "license" for more information.
 
IPython 0.13 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.
 
In [1]: a = open("data/lowes.yaml")
 
In [2]: import yaml
 
In [3]: b = yaml.safe_load(a)
 
In [4]: b
Out[4]: 
{'active': [{'1 Free Like': 'INSTAGRAM', '50% off fun': 'YCOMBINATOR'}],
 'company_name': 'lowes',
 'inactive': [{'100 Free Likes': 'GTL'}]}
 
In [5]: exit()
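
With the structure from Out[4] in hand, picking the deals for a status is plain dictionary and list indexing. The sketch below hard-codes the parsed lowes data so that it runs without pyyaml, and uses json.dumps to approximate the body a client would receive, since Tornado's write() serializes a dict to JSON. The deals_for helper is a hypothetical name.

```python
import json

# The dictionary pyyaml produced for data/lowes.yaml in the session above
lowes = {
    'active': [{'1 Free Like': 'INSTAGRAM', '50% off fun': 'YCOMBINATOR'}],
    'company_name': 'lowes',
    'inactive': [{'100 Free Likes': 'GTL'}],
}

def deals_for(record, status):
    """Return the mapping of deal name to coupon code for a status."""
    # Each status holds a one-element list wrapping a single mapping
    return record[status][0]

print(json.dumps(deals_for(lowes, 'inactive')))
print(json.dumps(deals_for(lowes, 'active'), sort_keys=True))
```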

Rewriting API With YAML Test Data

So now we know what our data looks like to Python when loaded by pyyaml. We can now write our tornado server to load and print the data from these YAML files:

# Run with:
#   $ gunicorn -k egg:gunicorn#tornado webapp:app
 
import yaml
import glob # Allows us to load all the files in the data directory
 
from tornado.web import Application, RequestHandler, HTTPError
 
def getData():
    data = {}
    # Load every YAML file in the data directory
    for path in glob.iglob("data/*.yaml"):
        with open(path) as f:
            record = yaml.safe_load(f)
        # Use company_name as the key for lookups in the dictionary
        data[record['company_name']] = record
    return data
 
dataDictionary = getData()
 
class MainHandler(RequestHandler):
    def get(self):
        self.write("Hello, world")
 
class DealsHandler(RequestHandler):
    def get(self, merchant_name):
        status = self.request.arguments['status'][0] # Active or Inactive, TODO: VERIFY THIS!
        self.write(dataDictionary[merchant_name][status][0])
 
    def post(self, merchant_name):
        raise HTTPError(403)
 
    def delete(self, merchant_name):
        raise HTTPError(403)
 
    def put(self, merchant_name):
        raise HTTPError(403)
 
app = Application([
    (r"/", MainHandler),
    (r"/v1/(.*)/deals", DealsHandler)
])
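
getData() assumes that every file has a company_name and that no two files share one; a later file silently replaces an earlier one with the same name. Here is a minimal sketch of a stricter variant, operating on already-parsed records so that it runs without pyyaml. index_by_company is a hypothetical helper, not part of the project:

```python
def index_by_company(records):
    """Build {company_name: record}, rejecting missing or duplicate names."""
    data = {}
    for record in records:
        name = record.get('company_name')
        if not name:
            raise ValueError("record is missing company_name: %r" % (record,))
        if name in data:
            raise ValueError("duplicate company_name: %s" % name)
        data[name] = record
    return data

# Abbreviated stand-ins for the two parsed YAML files
records = [
    {'company_name': 'homedepot', 'active': [{}], 'inactive': [{}]},
    {'company_name': 'lowes', 'active': [{}], 'inactive': [{}]},
]
print(sorted(index_by_company(records)))
```

Failing at load time keeps a malformed data file from surfacing later as a confusing server error inside a request.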

Great! Now we can start adding tests to make sure our data exists, and returns properly. If you looked above, you may have noticed that we have a few potentially big bugs in our code now.

Somebody can make a bad request and get a server error instead of a meaningful response. Let’s add those test cases at the end of our existing tests/TestApi.py:

def test_existing_data():
    r = requests.get('http://localhost:8000/v1/homedepot/deals?status=active') 
    assert r.status_code == 200 # Make sure our API reads existing data
    r = requests.get('http://localhost:8000/v1/lowes/deals?status=active') 
    assert r.status_code == 200 # For both our files!
 
def test_nonexistent_data():
    r = requests.get('http://localhost:8000/v1/nonedepot/deals?status=inactive')
    assert r.status_code == 404 # Make sure we get a 404 for a missing merchant

def test_improper_query():
    r = requests.get('http://localhost:8000/v1/homedepot/deals?blastus=djantive111')
    assert r.status_code == 400 # And a bad request for bad requests!

And now we can run nosetests, and see that our test_existing_data() works, but we’re not yet returning the proper error codes for nonexistent data and improper queries.

Formalizing the API

Let’s formalize our API by creating a description. In your root elasticAPI directory, create a new file called APIDescription.yaml:

merchantapi:
  description: Deals API for finding active deals for your merchant.
  path: /v1/<merchant_name>/deals?status=active
  options:
    - active
    - inactive
  version: 1.0

Finalizing the API and Error Handling

So now we can create a root API location to let people know how the API works, and also to verify the passed options:

# Run with:
#   $ gunicorn -k egg:gunicorn#tornado webapp:app
 
import yaml
import glob # Allows us to load all the files in the data directory
 
from tornado.web import Application, RequestHandler, HTTPError
 
def getData():
    data = {}
    # Load every YAML file in the data directory
    for path in glob.iglob("data/*.yaml"):
        with open(path) as f:
            record = yaml.safe_load(f)
        # Use company_name as the key for lookups in the dictionary
        data[record['company_name']] = record
    return data

def getAPIDescription():
    with open("APIDescription.yaml") as f:
        return yaml.safe_load(f)
 
apiDescription = getAPIDescription()
dataDictionary = getData()
allowableOptions = apiDescription['merchantapi']['options']
 
class MainHandler(RequestHandler):
    def get(self):
        self.write("Hello, world")
 
class APIHandler(RequestHandler):
    def get(self):
        self.write(apiDescription)

class DealsHandler(RequestHandler):
    def get_key_or_error(self, arguments, key):
        # Return the first value for key if it is an allowed option, else 400
        if (key in arguments) and (arguments[key][0] in allowableOptions):
            return arguments[key][0]
        raise HTTPError(400)
 
    def get(self, merchant_name):
        status = self.get_key_or_error(self.request.arguments, 'status')
        if merchant_name in dataDictionary:
            response = dataDictionary[merchant_name][status]
            self.write(response[0])
        else:
            raise HTTPError(404)
 
    def post(self, merchant_name):
        raise HTTPError(403)
 
    def delete(self, merchant_name):
        raise HTTPError(403)
 
    def put(self, merchant_name):
        raise HTTPError(403)
 
app = Application([
    (r"/", MainHandler),
    (r"/v1/", APIHandler),
    (r"/v1/(.*)/deals", DealsHandler)
])
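
The handler's decision logic can be separated from Tornado and checked without a running server. This is a sketch under assumptions: resolve_deals is a hypothetical function, the sample data is abbreviated, and query values are taken to be plain strings (under Python 3, Tornado's request.arguments holds byte strings, which would need decoding first).

```python
ALLOWED = ['active', 'inactive']
DATA = {
    'lowes': {
        'company_name': 'lowes',
        'active': [{'50% off fun': 'YCOMBINATOR'}],
        'inactive': [{'100 Free Likes': 'GTL'}],
    },
}

def resolve_deals(data, allowed, merchant_name, arguments):
    """Return (status_code, body), checking in the same order as DealsHandler.get."""
    values = arguments.get('status')
    # Missing or unrecognized status -> 400 Bad Request
    if not values or values[0] not in allowed:
        return 400, None
    # Unknown merchant -> 404 Not Found
    if merchant_name not in data:
        return 404, None
    return 200, data[merchant_name][values[0]][0]

print(resolve_deals(DATA, ALLOWED, 'lowes', {'status': ['active']}))
print(resolve_deals(DATA, ALLOWED, 'nonedepot', {'status': ['inactive']}))
print(resolve_deals(DATA, ALLOWED, 'lowes', {'blastus': ['djantive111']}))
```

The three calls correspond to the existing-data, nonexistent-data, and improper-query tests, so the expected status codes are 200, 404, and 400 respectively.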

Alright! Restart gunicorn one more time, and we should be able to verify that everything works by running nosetests and seeing everything pass. Go ahead and do it.

In Part 2, we’ll add our ElasticSearch backend to our existing data. Finally, in Part 3 we’ll write our Chef scripts and a Fabfile to deploy our API to the cloud.

In the meantime, you can view the final code and the Chef scripts on GitHub.