Understand WSGI by building a microframework

When you’re learning web development in Python, it’s tempting to go straight for higher-level frameworks like Django and Flask that abstract away the interactions between the web server and your application. While this is certainly a productive way of getting started, it’s a good idea to go back to the lower level at some point so you understand what these frameworks are doing. In this post, you will learn about the Web Server Gateway Interface (WSGI), the standard interface between web servers such as nginx and Apache on one side and Python applications on the other. You’ll do that by working from a simple “Hello, world!” example up to a microframework that supports Flask-like URL routing with decorators and templating, and lets you write your application logic as simple Django-like controller functions.

The simplest WSGI application

Take a look at this “Hello, world!” code.
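A minimal version of that application might look like the sketch below. It assumes Python 3, where the response body has to be bytes rather than a text string. The serve wrapper function is an addition made here so that importing the module does not start the server straight away.

```python
from wsgiref.simple_server import make_server


def hello_world(environ, start_response):
    """The WSGI application: a callable taking environ and start_response."""
    body = b"Hello, world!"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    # Under Python 3 the return value must be an iterable of bytes.
    return [body]


def serve(host="localhost", port=8000):
    """Start the reference server; blocks until interrupted with Ctrl+C."""
    server = make_server(host, port, hello_world)
    server.serve_forever()
```

Adding a call to serve() at the bottom of the file starts the server on port 8000.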

The make_server function that you imported from wsgiref.simple_server is part of the reference WSGI implementation included in the Python standard library. It returns a web server instance that you can start by calling its serve_forever method. make_server takes three arguments: the hostname, the port and the WSGI application itself. In your case, the hello_world function is the WSGI application. WSGI applications have to be Python callables, so either functions or instances of classes that implement the __call__ method.

Now let’s take a look at hello_world. You’ll notice that it takes two arguments: environ and start_response. environ is a dictionary that holds information about the execution environment and the request, such as the path, the query string, the body contents and the HTTP headers. start_response is a function that starts the HTTP response by setting the status code and the headers. Finally, the function returns a list with a single string in it (under Python 3, this must be a byte string). This is because the return value of a WSGI application must be an iterable. (Strings are iterable too, of course, but iterating over a string and writing it out one character at a time is pretty slow.)

Save this code in a file called helloworld.py and run it. Then open a browser and go to http://localhost:8000. You should see “Hello, world!”. (If you don’t, check that you don’t have anything else running on port 8000.)

What our framework will look like

So far, our application is very limited. In your browser, go to http://localhost:8000/helloworld/. You will see “Hello, world!” again. As it stands, your application returns the same response for every path you try. It would be much nicer if you could write something like this (MicroFramework is the framework class, which we will examine later):
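Here is a rough sketch of what that application code could look like. The MicroFramework and Response classes at the top are throwaway stand-ins, included only so the snippet runs on its own; the real versions are built step by step in the sections that follow. The route patterns and response bodies are illustrative assumptions.

```python
class Response:
    """Stand-in for the Response class described later in the post."""
    def __init__(self, body="", status="200 OK"):
        self.body = body
        self.status = status


class MicroFramework:
    """Stand-in that only records routes; the real class is examined later."""
    def __init__(self):
        self.routes = {}

    def route(self, pattern):
        def decorator(controller):
            self.routes[pattern] = controller
            return controller
        return decorator


app = MicroFramework()


@app.route(r"^/$")
def home(request):
    return Response("This is the home page.")


@app.route(r"^/helloworld/?$")
def hello_world(request):
    return Response("Hello, world!")
```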

We will get to the route decorators in a moment. For now, just take a look at what is happening in the home and hello_world controller functions.

Now your functions are taking a Request object as a parameter and returning a Response object. This is much cleaner than trying to take everything out of the WSGI environ dictionary and then calling start_response, setting headers manually every time. Here is the Request class. It’s just a wrapper that extracts information from the environ dictionary and makes it accessible in a more convenient way.

The Request object

Most of the code in this class is just extracting keys from environ, but a few lines deserve a mention. HTTP headers sent with the request are stored in environ in keys with the “HTTP_” prefix, so you can use a dict comprehension to extract them and store them in self.headers.
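The header extraction could be sketched like this. The extract_headers helper and the sample environ values are made up for illustration; in the class itself, the comprehension would sit inside the constructor.

```python
def extract_headers(environ):
    """Collect request headers from the HTTP_* keys of a WSGI environ dict."""
    return {
        key[len("HTTP_"):]: value
        for key, value in environ.items()
        if key.startswith("HTTP_")
    }


environ = {
    "REQUEST_METHOD": "GET",
    "HTTP_HOST": "localhost:8000",
    "HTTP_USER_AGENT": "ExampleBrowser/1.0",
}
headers = extract_headers(environ)
# headers == {"HOST": "localhost:8000", "USER_AGENT": "ExampleBrowser/1.0"}
```

Note that the keys keep the upper-case, underscore-separated form used by environ; this sketch only strips the prefix.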

GET and POST data

Parsing the query string to extract GET and POST data is also interesting. The urlparse module in the standard library (urllib.parse in Python 3) contains a function called parse_qs that takes a standard HTTP query string in the format key1=value1&key2=value2 (without the leading question mark) and converts it into a dictionary that maps keys to lists of values. To extract GET data and store it in self.GET, you can just call parse_qs(environ["QUERY_STRING"]).
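As a quick illustration, here is what parse_qs returns for a sample query string. The snippet assumes Python 3, and the query string itself is invented for the example.

```python
from urllib.parse import parse_qs  # Python 2: from urlparse import parse_qs

environ = {"QUERY_STRING": "name=Ada&tag=python&tag=wsgi"}
get_data = parse_qs(environ["QUERY_STRING"])
# Each key maps to a list, because a key may appear more than once.
# get_data == {"name": ["Ada"], "tag": ["python", "wsgi"]}
```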

Extracting POST data is a bit more complicated as it is contained in the body of the HTTP request. First you have to check if the HTTP method is POST, then read the content length, then get the POST query string by reading that number of bytes from the input stream. Finally, you call parse_qs on the query string.
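Those steps could be sketched as follows. This assumes Python 3 and a URL-encoded form body in UTF-8; the parse_post helper name is hypothetical, and a real framework would also need to handle other content types such as multipart uploads.

```python
import io
from urllib.parse import parse_qs


def parse_post(environ):
    """Return form fields from a URL-encoded POST body, or {} otherwise."""
    if environ.get("REQUEST_METHOD") != "POST":
        return {}
    try:
        length = int(environ.get("CONTENT_LENGTH") or 0)
    except ValueError:
        length = 0
    # wsgi.input is a byte stream; read only as many bytes as were announced.
    raw_body = environ["wsgi.input"].read(length)
    return parse_qs(raw_body.decode("utf-8"))


# Simulate a POST request with a small in-memory body.
body = b"title=Hello&tags=a&tags=b"
environ = {
    "REQUEST_METHOD": "POST",
    "CONTENT_LENGTH": str(len(body)),
    "wsgi.input": io.BytesIO(body),
}
post_data = parse_post(environ)
# post_data == {"title": ["Hello"], "tags": ["a", "b"]}
```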

Now you can build a request object like so:
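The sketch below puts the pieces described above into one possible Request class and then builds an instance from a test environ. It assumes Python 3; the attribute names method and path are guesses at a reasonable layout, and setup_testing_defaults from wsgiref.util is used here only to fill in the keys a real server would provide.

```python
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


class Request:
    """Convenience wrapper around the WSGI environ dictionary."""

    def __init__(self, environ):
        self.environ = environ
        self.method = environ.get("REQUEST_METHOD", "GET")
        self.path = environ.get("PATH_INFO", "/")
        self.headers = {
            key[len("HTTP_"):]: value
            for key, value in environ.items()
            if key.startswith("HTTP_")
        }
        self.GET = parse_qs(environ.get("QUERY_STRING", ""))
        self.POST = {}
        if self.method == "POST":
            length = int(environ.get("CONTENT_LENGTH") or 0)
            body = environ["wsgi.input"].read(length)
            self.POST = parse_qs(body.decode("utf-8"))
        self.groups = []  # filled in later with regex capture groups


environ = {"PATH_INFO": "/helloworld/", "QUERY_STRING": "lang=en"}
setup_testing_defaults(environ)  # fills in the remaining standard keys
request = Request(environ)
# request.path == "/helloworld/" and request.GET == {"lang": ["en"]}
```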

The Response object

What about Response? It follows a similar principle:
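One way to write such a class is sketched here. The keyword arguments and the default content type are assumptions made for this example, not necessarily the original signature.

```python
class Response:
    """Holds the status, headers and body of an HTTP response."""

    def __init__(self, body="", status="200 OK", headers=None,
                 content_type="text/html; charset=utf-8"):
        self.body = body
        self.status = status
        self.headers = {"Content-Type": content_type}
        if headers:
            self.headers.update(headers)

    @property
    def wsgi_headers(self):
        """Headers as the list of (name, value) tuples start_response expects."""
        return list(self.headers.items())


response = Response("<h1>Hello, world!</h1>")
# response.wsgi_headers == [("Content-Type", "text/html; charset=utf-8")]
```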

As in the Request class, most of the code here is just for convenience. You can assume that the default response code will be “200 OK” and allow the code to be set manually when it differs. In the same way, it is more natural to manipulate response headers as a dictionary, but they need to be passed to start_response as a list of tuples, so the wsgi_headers function, with the @property decorator, returns them in that structure. You will see how this is used when we take a look at the __call__ method in our framework class.

Now you can build different types of responses by passing different keyword arguments. How about a normal “200 OK” response?

What about a “404 Not Found”?

Or what if an error occurs?

And what if you want to issue a redirect? It is as simple as setting the “301” status code and adding a Location header.
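The four cases just mentioned could be written as shown below. The compact Response class follows the description given earlier and is included only so the snippet runs on its own; the bodies and the redirect target are made-up values.

```python
class Response:
    """Compact response class along the lines described above."""
    def __init__(self, body="", status="200 OK", headers=None):
        self.body = body
        self.status = status
        self.headers = {"Content-Type": "text/html; charset=utf-8"}
        self.headers.update(headers or {})

    @property
    def wsgi_headers(self):
        return list(self.headers.items())


# A normal response: the status defaults to "200 OK".
ok = Response("<h1>Welcome</h1>")

# A missing page.
not_found = Response("<h1>Not Found</h1>", status="404 Not Found")

# An unexpected error.
error = Response("<h1>Internal Server Error</h1>",
                 status="500 Internal Server Error")

# A permanent redirect: no body, just a status and a Location header.
redirect = Response(status="301 Moved Permanently",
                    headers={"Location": "/helloworld/"})
```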

The MicroFramework class

Look at the __call__ function. It gets an instance of Response by first building a Request object based on environ, then passing it to the framework’s dispatch method. Then it does the work of calling start_response with the status code and the headers included in the response. Notice how wsgi_headers is used.

Dispatching requests to controllers

The next interesting thing in the framework class is the dispatch method. In the constructor, you initialized a self.routes dictionary. It contains a mapping from regular expressions that represent request paths to controller functions. The method iterates through the regular expressions until it finds one that matches the request path, then it calls the associated controller function and returns the response from it to __call__. If no route matches the path, then it returns a “404” response generated by the not_found static method.

If an error occurs while executing the controller function, the framework grabs the stack trace, prints it, and returns a “500” response generated by the internal_error static method.
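Put together, the framework class described in this section might look like the sketch below. It assumes Python 3 and text bodies encoded as UTF-8. The pared-down Request and Response classes are included only to keep the example self-contained, and the error page contents are placeholders.

```python
import re
import traceback
from urllib.parse import parse_qs


class Request:
    """Pared-down request wrapper, included so the sketch is self-contained."""
    def __init__(self, environ):
        self.method = environ.get("REQUEST_METHOD", "GET")
        self.path = environ.get("PATH_INFO", "/")
        self.GET = parse_qs(environ.get("QUERY_STRING", ""))
        self.groups = []


class Response:
    """Pared-down response, included so the sketch is self-contained."""
    def __init__(self, body="", status="200 OK", headers=None):
        self.body = body
        self.status = status
        self.headers = {"Content-Type": "text/html; charset=utf-8"}
        self.headers.update(headers or {})

    @property
    def wsgi_headers(self):
        return list(self.headers.items())


class MicroFramework:
    def __init__(self):
        self.routes = {}  # regex pattern -> controller function

    def route(self, pattern):
        """Decorator that registers a controller for a path regex."""
        def decorator(controller):
            self.routes[pattern] = controller
            return controller
        return decorator

    @staticmethod
    def not_found(request):
        return Response("<h1>Not Found</h1>", status="404 Not Found")

    @staticmethod
    def internal_error(request):
        return Response("<h1>Internal Server Error</h1>",
                        status="500 Internal Server Error")

    def dispatch(self, request):
        """Call the first controller whose pattern matches the path."""
        for pattern, controller in self.routes.items():
            match = re.match(pattern, request.path)
            if match:
                request.groups = list(match.groups())
                try:
                    return controller(request)
                except Exception:
                    traceback.print_exc()  # print the stack trace
                    return self.internal_error(request)
        return self.not_found(request)

    def __call__(self, environ, start_response):
        response = self.dispatch(Request(environ))
        start_response(response.status, response.wsgi_headers)
        return [response.body.encode("utf-8")]


app = MicroFramework()


@app.route(r"^/$")
def home(request):
    return Response("<h1>Home</h1>")
```

Because the app object is itself a WSGI callable, it can be passed to make_server in place of the earlier hello_world function.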

The route decorator

How do routes get into the self.routes dictionary in the first place? That’s where the route decorator comes in. All it does is add a mapping from the regex provided as an argument to the decorator to the controller function itself. The regexes can also contain capture groups that are stored in the request.groups list and made available to controller functions, as in the following example:
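A simplified, standalone sketch of that idea follows. The module-level routes dictionary and route function stand in for the framework's self.routes and its route method, the /item/ URL is hypothetical, and the controller returns a plain string instead of a Response to keep the snippet short.

```python
import re
from types import SimpleNamespace

routes = {}  # stand-in for the framework's self.routes dictionary


def route(pattern):
    """Simplified, module-level version of the route decorator."""
    def decorator(controller):
        routes[pattern] = controller
        return controller
    return decorator


@route(r"^/item/(\d+)/?$")
def item(request):
    item_id = request.groups[0]  # captured as a string, e.g. "42"
    return "Showing item " + item_id


# Roughly what dispatch would do for a request to /item/42/:
pattern, controller = next(iter(routes.items()))
match = re.match(pattern, "/item/42/")
request = SimpleNamespace(path="/item/42/", groups=list(match.groups()))
result = controller(request)
# result == "Showing item 42"
```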

In this route regex, the capture group is a sequence of one or more numeric characters. The question mark (?) after the last slash makes the slash optional.

Integrating Jinja2

The framework we have built so far has no support for templating, but it is easy to integrate Jinja2 or any other template engine. Here is a simple example, which assumes that you have a template directory called “templates” with a file called “helloworld.html” in it.
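A sketch of such an integration is shown below. It assumes Jinja2 is installed and that helloworld.html uses a variable called name; the render helper is a hypothetical convenience function, not part of Jinja2 itself.

```python
from jinja2 import Environment, FileSystemLoader

# Templates are looked up in ./templates, relative to the working directory.
jinja_env = Environment(loader=FileSystemLoader("templates"), autoescape=True)


def render(template_name, **context):
    """Render a template file to a string with the given context variables."""
    return jinja_env.get_template(template_name).render(**context)


def hello_world(request):
    # In the framework, this string would be wrapped in a Response object.
    return render("helloworld.html", name="world")
```

With a template containing, say, `<h1>Hello, {{ name }}!</h1>`, the controller would produce the rendered HTML as a string.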

This post showed you how to build a very simple web framework in Python, but there is much more to do. For instance, our framework doesn’t support cookies or sessions, and there is no database access facility. Caching, form handling and other niceties that Django and Flask provide either out of the box or as plugins would also need to be added to turn it into a fully featured framework.

Why don’t you try to implement some of these features yourself?