In startup land, one of the most common admonitions is to build the API first. That way other developers can access your data, there’s only one codebase to worry about, and behavior is consistent everywhere.
Part One – Building the API skeleton with YAML
Part Two – From YAML to ElasticSearch
Part Three – Writing Chef Scripts for Amazon Deployment
Part Four – Writing Fabric Scripts for Code Deployment
For more information on why you should build your API first, read this blog post.
Today I’m going to show you how to build a basic API in Tornado and gunicorn. Later we’ll add an ElasticSearch backend, and then use Chef scripts and Fabric for instant cloud deployment on a free Amazon EC2 micro instance.
Part 1: Building the API Structure
Instead of taking the traditional route of showing you everything working auto-magically, we’re going to build this project step by step, just as you would in the real world.
Think of this as a guide to thinking through a project and developing it into something more complicated.
Our API description is very basic, and consists of the following:
    merchantapi:
        description: Deals API for finding active deals for your merchant.
        path: /v1/<merchant_name>/deals?status=active
        options: active inactive
        version: 1.0
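To make the description concrete, here is a minimal sketch of how one request URL breaks down into the pieces the API cares about. The function name describe_request is hypothetical and is not part of the project; it only uses the standard library.

```python
try:
    from urllib.parse import urlsplit, parse_qs  # Python 3
except ImportError:
    from urlparse import urlsplit, parse_qs  # Python 2


def describe_request(url):
    """Break a deals URL into (version, merchant_name, status)."""
    parts = urlsplit(url)
    # "/v1/homedepot/deals" -> ["v1", "homedepot", "deals"]
    segments = [s for s in parts.path.split("/") if s]
    if len(segments) != 3 or segments[2] != "deals":
        raise ValueError("not a deals URL: %r" % url)
    version, merchant_name, _ = segments
    # parse_qs maps each query key to a list of values
    status = parse_qs(parts.query).get("status", [None])[0]
    return version, merchant_name, status


example = describe_request("/v1/homedepot/deals?status=active")
```

Here example holds the version, the merchant name, and the status as three separate values. A URL with no status parameter yields None in the last slot, which is exactly the case our error handling has to deal with later.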
Let’s start writing code!
Building the Web App
Start by creating a new Python virtual environment, and a new project directory:
    virtualenv --distribute apiEnv
    source apiEnv/bin/activate
    pip install tornado gunicorn nose requests
    mkdir elasticAPI
    cd elasticAPI
    git init
Next, open up your favorite text editor (emacs is the answer), and save the following file as webapp.py:
    # Run with:
    # $ gunicorn -k egg:gunicorn#tornado webapp:app
    from tornado.web import Application, RequestHandler


    class MainHandler(RequestHandler):
        def get(self):
            self.write("Hello, world")


    app = Application([
        (r"/", MainHandler)
    ])
Let’s verify that this works by writing a test for it. Create a new directory called tests, and in it create a file named TestApi.py:
    import requests


    def test_serverup():
        r = requests.get('http://localhost:8000')
        assert r.status_code == 200
Now we can run nosetests in our elasticAPI directory, and see that our server runs:
    gunicorn -k egg:gunicorn#tornado webapp:app &
    nosetests
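One thing to watch for: because gunicorn is started in the background, the tests can begin before the server is actually listening. Here is a rough sketch of a helper that waits for the port to open. The name wait_for_port and its default values are assumptions made for illustration; nothing like it exists in the project code.

```python
import socket
import time


def wait_for_port(host="localhost", port=8000, timeout=5.0):
    """Return True once host:port accepts TCP connections, False on timeout."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            conn = socket.create_connection((host, port), timeout=0.5)
        except (socket.error, socket.timeout):
            # Nothing is listening yet; pause briefly and try again
            time.sleep(0.1)
        else:
            conn.close()
            return True
    return False
```

If you want to use something like this, one option is to call it from a module-level setup function in the test file, so that every test starts only after the server is reachable.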
If we did everything right, we should now have a functioning tornado server running with gunicorn. Let’s save the work we’ve done so far by adding it to git:
    git add webapp.py
    git add tests
    git commit
When we enter git commit, we should now be prompted to enter a message about what we’ve written in our code so far. It helps to be verbose.
Building the RESTful API
For simplicity’s sake, we’re going to implement just one piece of the API: a GET for deals. Let’s start by writing the tests we’d like to pass, and then implement the code that makes them pass.
According to Wikipedia, a RESTful API implements the verbs of HTTP. These include GET, POST, DELETE, and PUT. Let’s add those cases to our API tests:
    import requests


    def test_serverup():
        r = requests.get('http://localhost:8000')
        assert r.status_code == 200


    def test_api_get():
        r = requests.get('http://localhost:8000/v1/homedepot/deals?status=active')
        assert r.status_code == 200


    def test_api_post():
        r = requests.post('http://localhost:8000/v1/homedepot/deals?status=active')
        assert r.status_code == 403


    def test_api_put():
        r = requests.put('http://localhost:8000/v1/homedepot/deals?status=active')
        assert r.status_code == 403


    def test_api_delete():
        r = requests.delete('http://localhost:8000/v1/homedepot/deals?status=active')
        assert r.status_code == 403
These tests make sure we get a status code of 200 for a correctly formed URL request, and a 403 Forbidden for all the other requests we aren’t implementing. Now, let’s write the code to make those tests pass:
    # Run with:
    # $ gunicorn -k egg:gunicorn#tornado webapp:app
    from tornado.web import Application, RequestHandler, HTTPError


    class MainHandler(RequestHandler):
        def get(self):
            self.write("Hello, world")


    class DealsHandler(RequestHandler):
        def get(self, merchant_name):
            self.write(merchant_name)

        def post(self, merchant_name):
            raise HTTPError(403)

        def delete(self, merchant_name):
            raise HTTPError(403)

        def put(self, merchant_name):
            raise HTTPError(403)


    app = Application([
        (r"/", MainHandler),
        (r"/v1/(.*)/deals", DealsHandler)
    ])
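It’s worth seeing what the (.*) group in that route actually captures. The sketch below imitates the matching with plain re, on the assumption that the router matches the pattern against the request path only (no query string) and anchors it at the end. The strict pattern and the capture helper are hypothetical alternatives for comparison, not something the project uses.

```python
import re

# Assumption: the router anchors the pattern at both ends, which is
# emulated here with re.match (start) and a trailing "$" (end).
greedy = re.compile(r"/v1/(.*)/deals$")
strict = re.compile(r"/v1/([^/]+)/deals$")  # hypothetical tighter alternative


def capture(pattern, path):
    """Return the captured merchant name, or None if the path does not match."""
    m = pattern.match(path)
    return m.group(1) if m else None


simple = capture(greedy, "/v1/homedepot/deals")          # plain merchant name
nested = capture(greedy, "/v1/home/depot/deals")         # slashes are captured too
nested_strict = capture(strict, "/v1/home/depot/deals")  # rejected by [^/]+
empty = capture(greedy, "/v1//deals")                    # .* also matches nothing
```

The greedy group happily accepts slashes and even an empty merchant name, so the handler has to cope with odd values. That is fine for this tutorial since we look the name up in a dictionary, but a tighter pattern is something to consider if the name ever gets used for anything else.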
Verify that everything is working properly by running nosetests, then commit our changes. The -a flag stages the modified files that git already tracks:
    nosetests
    git commit -a
Adding Example Data to the API With YAML
YAML is a way to write human-readable data in text files. We’ll use it to mock up the data we want the API to read. YAML files look like this:
    company_name: homedepot
    active:
        - {25% percent off: ZYZZ, Buy One Get One: REDDIT}
    inactive:
        - {0% off: DIVIDEBYZERO, Buy None Get A Ton: FREEONETONTRUCK}
Create a new directory named data, and save the file above as homedepot.yaml in the data/ directory.
While we’re at it, let’s add another company. Save the following as lowes.yaml in the data/ directory:
    company_name: lowes
    active:
        - {50% off fun: YCOMBINATOR, 1 Free Like: INSTAGRAM}
    inactive:
        - {100 Free Likes: GTL}
Loading Test Data with IPython
Great! Now we’ve got two pieces of data, and we can tell exactly what our responses will look like. But now we’ve got to install pyyaml. Let’s do that now, and try loading our data in an ipython session:
    pip install pyyaml ipython
    ipython

    Python 2.7.3 (default, Apr 20 2012, 22:39:59)
    Type "copyright", "credits" or "license" for more information.

    IPython 0.13 -- An enhanced Interactive Python.
    ?         -> Introduction and overview of IPython's features.
    %quickref -> Quick reference.
    help      -> Python's own help system.
    object?   -> Details about 'object', use 'object??' for extra details.

    In [1]: a = open("data/lowes.yaml")

    In [2]: import yaml

    In [3]: b = yaml.load(a)

    In [4]: b
    Out[4]:
    {'active': [{'1 Free Like': 'INSTAGRAM', '50% off fun': 'YCOMBINATOR'}],
     'company_name': 'lowes',
     'inactive': [{'100 Free Likes': 'GTL'}]}

    In [5]: exit()
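Notice the shape of that output: each status maps to a one-element list that wraps a single dictionary of deal names and coupon codes. Here is a small sketch of pulling the deals out, using the printed structure copied in as a literal so it runs without the YAML file. The helper name deals_for is made up for this example.

```python
import json

# The structure IPython printed for data/lowes.yaml, copied as a literal.
lowes = {
    "company_name": "lowes",
    "active": [{"50% off fun": "YCOMBINATOR", "1 Free Like": "INSTAGRAM"}],
    "inactive": [{"100 Free Likes": "GTL"}],
}


def deals_for(company, status):
    """Return the deal-name -> coupon-code mapping for one status."""
    # Each status holds a one-element list wrapping a single mapping,
    # so index 0 gets us the dictionary itself.
    return company[status][0]


# Serialized with sorted keys so the output is stable across runs
active_json = json.dumps(deals_for(lowes, "active"), sort_keys=True)
```

That [0] index is why the handler code in the next section ends its lookup with [0] as well.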
Rewriting API With YAML Test Data
So now we know what our data looks like to Python when loaded by pyyaml. We can now write our tornado server to load and print the data from these YAML files:
    # Run with:
    # $ gunicorn -k egg:gunicorn#tornado webapp:app
    import glob  # Lets us find all the files in the data directory

    import yaml
    from tornado.web import Application, RequestHandler, HTTPError


    def getData():
        data = {}
        # Load every YAML file in the data directory
        for path in glob.iglob("data/*.yaml"):
            with open(path) as f:
                # safe_load parses plain data without constructing arbitrary objects
                company = yaml.safe_load(f)
            # Use the company_name as the key for dictionary lookups
            data[company['company_name']] = company
        return data


    dataDictionary = getData()


    class MainHandler(RequestHandler):
        def get(self):
            self.write("Hello, world")


    class DealsHandler(RequestHandler):
        def get(self, merchant_name):
            # Active or inactive. TODO: VERIFY THIS!
            status = self.request.arguments['status'][0]
            self.write(dataDictionary[merchant_name][status][0])

        def post(self, merchant_name):
            raise HTTPError(403)

        def delete(self, merchant_name):
            raise HTTPError(403)

        def put(self, merchant_name):
            raise HTTPError(403)


    app = Application([
        (r"/", MainHandler),
        (r"/v1/(.*)/deals", DealsHandler)
    ])
Great! Now we can start adding tests to make sure our data exists and is returned properly. If you looked closely above, you may have noticed that our code now has a few potentially big bugs.
Somebody can make a bad request and get a server error back instead of a meaningful response. Let’s add those test cases at the end of our existing tests/TestApi.py:
    def test_existing_data():
        r = requests.get('http://localhost:8000/v1/homedepot/deals?status=active')
        assert r.status_code == 200  # Make sure our API reads existing data
        r = requests.get('http://localhost:8000/v1/lowes/deals?status=active')
        assert r.status_code == 200  # For both our files!


    def test_nonexistant_data():
        r = requests.get('http://localhost:8000/v1/nonedepot/deals?status=inactive')
        assert r.status_code == 404  # Make sure we get a 404 for missing


    def test_improper_query():
        r = requests.get('http://localhost:8000/v1/homedepot/deals?blastus=djantive111')
        assert r.status_code == 400  # And a bad request for bad requests!
Now we can run nosetests and see that test_existing_data() passes, but we’re not returning the proper error codes for nonexistent data and improper queries.
Formalizing the API
Let’s formalize our API by creating a description. In your root elasticAPI directory, create a new file called APIDescription.yaml:
    merchantapi:
        description: Deals API for finding active deals for your merchant.
        path: /v1/<merchant_name>/deals?status=active
        options: active inactive
        version: 1.0
Finalizing the API and Error Handling
So now we can create a root API location to let people know how the API works, and also to verify the passed options:
    # Run with:
    # $ gunicorn -k egg:gunicorn#tornado webapp:app
    import glob  # Lets us find all the files in the data directory

    import yaml
    from tornado.web import Application, RequestHandler, HTTPError


    def getData():
        data = {}
        # Load every YAML file in the data directory
        for path in glob.iglob("data/*.yaml"):
            with open(path) as f:
                company = yaml.safe_load(f)
            # Use the company_name as the key for dictionary lookups
            data[company['company_name']] = company
        return data


    def getAPIDescription():
        with open("APIDescription.yaml") as f:
            return yaml.safe_load(f)


    apiDescription = getAPIDescription()
    dataDictionary = getData()
    # options is the space-separated string "active inactive"; split it so
    # the membership test below is an exact match, not a substring match
    allowableOptions = apiDescription['merchantapi']['options'].split()


    class MainHandler(RequestHandler):
        def get(self):
            self.write("Hello, world")


    class APIHandler(RequestHandler):
        def get(self):
            self.write(apiDescription)


    class DealsHandler(RequestHandler):
        def get_key_or_error(self, arguments, key):
            # Return the first value for key, or fail with 400 Bad Request
            if key in arguments and arguments[key][0] in allowableOptions:
                return arguments[key][0]
            raise HTTPError(400)

        def get(self, merchant_name):
            status = self.get_key_or_error(self.request.arguments, 'status')
            if merchant_name in dataDictionary:
                response = dataDictionary[merchant_name][status]
                self.write(response[0])
            else:
                raise HTTPError(404)

        def post(self, merchant_name):
            raise HTTPError(403)

        def delete(self, merchant_name):
            raise HTTPError(403)

        def put(self, merchant_name):
            raise HTTPError(403)


    app = Application([
        (r"/", MainHandler),
        (r"/v1/", APIHandler),
        (r"/v1/(.*)/deals", DealsHandler)
    ])
Alright! Now we should be able to verify that everything works by running nosetests and seeing everything pass! Go ahead and do it.
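If you’d like to check the error-handling rules without a running server, the same decisions can be sketched as a plain function. This is only an illustration: lookup_deals and the sample dictionary are hypothetical and are not part of webapp.py, though the data mirrors our homedepot.yaml file.

```python
def lookup_deals(data, allowed, merchant_name, status):
    """Return (http_status, body) for a deals request."""
    if status not in allowed:
        return 400, None  # missing or unknown ?status= value
    if merchant_name not in data:
        return 404, None  # no data file for this merchant
    # Each status holds a one-element list wrapping the deals mapping
    return 200, data[merchant_name][status][0]


# Sample data mirroring data/homedepot.yaml
sample = {
    "homedepot": {
        "company_name": "homedepot",
        "active": [{"25% percent off": "ZYZZ", "Buy One Get One": "REDDIT"}],
        "inactive": [{"0% off": "DIVIDEBYZERO",
                      "Buy None Get A Ton": "FREEONETONTRUCK"}],
    },
}
allowed = ["active", "inactive"]

ok = lookup_deals(sample, allowed, "homedepot", "active")
missing = lookup_deals(sample, allowed, "nonedepot", "inactive")
bad_query = lookup_deals(sample, allowed, "homedepot", None)
```

The three calls line up with test_existing_data, test_nonexistant_data, and test_improper_query. The status is validated before the merchant is looked up, the same order the handler uses.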
In Part 2, we’ll add an ElasticSearch backend to our existing data. Then, in Parts 3 and 4, we’ll write our Chef scripts and a Fabfile to deploy our API to the cloud.
In the meantime, you can view the final code and the Chef scripts on GitHub.