Create an API with Flask


Now that we have created our own model, we can look at how to publish it so that it can be used in various applications. As a first step, we need an API that will later be accessible via the Internet. There are various frameworks for Python that can help us create this. Two of the best known are Django and Flask. In this tutorial, we use Flask, which is very lean and therefore enables a quick start.

However, we will only deal with the backend here. Although it can already be tested in the browser at the end of the chapter, publishing the model also requires a frontend, for example a website or an app, which then accesses our backend.

What is Flask?

Welcome to Flask - Flask Documentation

Flask is an extremely lean web framework that ships with its own web server for development, which makes it possible to get a first web interface running very quickly. Flask imposes few conventions, which leaves plenty of room for individual solutions but does not automatically lead to uniform, clean code. Flask uses the template engine Jinja, which could also be used to build a frontend. The bundled web server is intended for development only and should be replaced by a more stable and secure alternative when the application is released. We will also introduce the package Flask-CORS, which is needed when the frontend and backend are not served from the same origin, for example because they run on different servers.

Flask and Flask CORS can be installed directly via pip:

pip install Flask

pip install flask-cors

Building the basic structure

The entire functionality of our app is stored in the folder 3_1_Flask/app. We will use this later in Docker. In addition to main.py, the app also contains the models folder, in which we store our model from 1_2_Tensorflow, and utils, in which the familiar utils.py file is located.

We will build and explain the file main.py in this tutorial. The file can be viewed here in our Git.

First, we import all the required packages:

# Some model related imports
import pickle
import re
import pandas as pd
import tensorflow as tf

# New Flask related imports
from flask import Flask, request, jsonify
from flask_cors import CORS

# Our own helper function
from utils.utils import str_to_category

A new app object can then be created. We also enable CORS so that our backend can be used together with a separate frontend. If the frontend is also built directly with Flask and the Jinja template engine, this step is unnecessary and the flask_cors package is not required. In most cases, however, an app or a similar separate client serves as the frontend, and problems may arise without CORS, which is why we present the package here. All subsequent instructions are independent of this step.

app = Flask(__name__)
CORS(app)

In principle, Flask works very simply. We write ordinary Python functions and add a decorator to each of them. The decorator specifies the path and, optionally, further settings. The return value of the function is sent back as the response to the request. Flask offers helper functions such as jsonify(), which in this case converts a Python object into a JSON response:

@app.route('/hello')
def hello_world():
    return jsonify({'message': 'Hello World!'})

Since we also want to make a prediction using the trained models in this case, we can use our test script (see here) as a guide. We had already saved all the required models as files. They will only be accessed read-only during the predictions, which is why it is sufficient to read them in once at the beginning:

# Load model, encoder and scaler
model = tf.keras.models.load_model('./models/model.h5')

def load_pickle(path):
    # The with block closes the file again after reading
    with open(path, 'rb') as file:
        return pickle.load(file)

model_enc = load_pickle('./models/model_enc.pkl')
transmission_enc = load_pickle('./models/transmission_enc.pkl')
fuelType_enc = load_pickle('./models/fuelType_enc.pkl')
data_scaler = load_pickle('./models/data_scaler.pkl')
label_scaler = load_pickle('./models/label_scaler.pkl')

We then need an interface via which requests for a price prediction can be sent to our server. In contrast to our hello_world() function, however, these requests should also contain user-specific parameters.

Accepting entries

There are various options for accepting input. A simple and often useful variant is to explicitly specify the parameters as part of the path. We extend our Hello World example to an individual greeting function:

@app.route('/hello/<string:name>', methods=['GET'])
def hello_name(name: str):
    return jsonify({'message': f'Hello {name}!'})

Flask recognizes the corresponding parts of the path and even pays attention to the data type. The values found are then passed to our function as normal parameters so that they can simply be used as with any other Python function. In our case, however, we expect eight different inputs, the order of which must be exactly right with this method. We have therefore opted for a different variant.
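As a minimal sketch of how the type check in the path behaves, the following example uses Flask's built-in test client, so no server has to be started. The /square route and its function are hypothetical and only serve as an illustration; they are not part of main.py.

```python
from flask import Flask, jsonify

app = Flask(__name__)


@app.route('/hello/<string:name>', methods=['GET'])
def hello_name(name: str):
    return jsonify({'message': f'Hello {name}!'})


# Hypothetical route for illustration: the int converter only matches digits
@app.route('/square/<int:number>', methods=['GET'])
def square(number: int):
    # number already arrives as an int, no manual conversion is needed
    return jsonify({'square': number * number})


with app.test_client() as client:
    greeting = client.get('/hello/REACH').get_json()
    squared = client.get('/square/4').get_json()
    # A value that does not fit the converter matches no route at all
    not_found = client.get('/square/four').status_code
```

A request whose path segment does not fit the converter is answered with 404 Not Found, because Flask finds no matching route.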

As you can see in the decorator above, this time we have set the additional parameter methods=['GET'], which selects the HTTP method. GET is already the default, but we could also specify POST or allow several methods, which would then have to be handled in the function. If we had an HTML form as a frontend, for example, POST would be more appropriate. Since we do not have a frontend yet and simply want to test our API later, we use GET. This allows us to encode our parameters in a query string directly in the URL and call it from a browser without any other tools.

A query string is part of a URL and can contain several named parameters. A ? marks its beginning. It is followed by pairs of the form field1=value1, which are joined using & (some systems also accept ;). As all names and values are part of the URL, they must be URL-encoded. For example, a space is replaced by + or %20.
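The encoding rules can be tried out with Python's standard library alone. The following sketch builds a query string from a few of the tutorial's parameters and parses it again; the value 'Golf SV' is only an illustrative input containing a space.

```python
from urllib.parse import parse_qs, quote, quote_plus, urlencode, urlsplit

# A subset of the parameters used for the car-price request
params = {
    'model': 'T-Roc',
    'year': 2019,
    'fuelType': 'petrol',
}

# urlencode() joins the pairs with '&' and encodes names and values
query = urlencode(params)
url = f'http://localhost:5000/api/car-price?{query}'

# A space becomes '+' with quote_plus() and '%20' with quote()
encoded_plus = quote_plus('Golf SV')
encoded_percent = quote('Golf SV')

# Parsing reverses the encoding; every value comes back as a list of strings
decoded = parse_qs(urlsplit(url).query)
```

Note that every parsed value is a string, so numbers such as the year still have to be converted on the server side.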

With this method, we cannot declare in the route which parameters we need. Instead, Flask provides all transferred parameters in the request object as a dictionary-like structure, request.args.

@app.route('/api/car-price', methods=['GET'])
def predict():
    # Get data from request
    data = request.args
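To illustrate what request.args contains, here is a minimal, self-contained sketch. The /echo route is hypothetical and not part of main.py; it only returns what it received. All values arrive as strings unless a conversion is requested explicitly.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)


# Hypothetical route for illustration only
@app.route('/echo', methods=['GET'])
def echo():
    # request.args behaves like a read-only dictionary of strings
    received = request.args.to_dict()
    # get() with type=int returns None if the value is missing or not an int
    year = request.args.get('year', type=int)
    return jsonify({'received': received, 'year': year})


with app.test_client() as client:
    payload = client.get('/echo?model=T-Roc&year=2019').get_json()
    missing = client.get('/echo?model=T-Roc').get_json()
```

Here payload['received'] holds the raw strings, while payload['year'] has been converted to an integer; in the second request the year is simply None.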

As we always need all parameters as input for our model and the data types must also be correct, we check both before we process the parameters further:

if 'model' not in data:
    return jsonify({'message': 'Please provide model'}), 400
elif str_to_category(data['model']) not in model_enc.categories_[0]:
    return jsonify({'message': 'Model not supported'}), 400
else:
    car_model = str_to_category(data['model'])
 
    if 'year' not in data:
        ...

For the car model, for example, we expect the parameter model. We first check whether it is contained in the request at all. If this is the case, we then make sure that the specified car model can be converted by our encoder, that is, that it was included in our training data. If one of the conditions is not met, we return an error message together with a matching status code. The message should name the problem briefly and specifically to make the API easier to use later.

If unexpected errors occur during program execution so that our function cannot complete normally, Flask takes care of sending an error response to the client, usually with the status code 500 Internal Server Error. Outside of debug mode, no more detailed description is transmitted, so the client cannot identify the problem more precisely (which is often intentional). In our case, however, we want to inform the client about an incorrect input and help to correct it if possible. To do this, we use the status code 400 Bad Request and add a specific error message.
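Since the same checks are repeated for every parameter, the pattern for the numeric inputs could be captured in a small helper. This is only a sketch: the function read_number and its error messages are hypothetical and not taken from main.py, and the plain dictionary stands in for request.args.

```python
def read_number(data, name, cast=float, minimum=0):
    """Return (value, None) on success or (None, error message) on failure."""
    # 1. The parameter must be present at all
    if name not in data:
        return None, f'Please provide {name}'
    # 2. The string must be convertible to the expected type
    try:
        value = cast(data[name])
    except (TypeError, ValueError):
        return None, f'{name} must be a valid {cast.__name__}'
    # 3. The value must lie in a plausible range
    if value < minimum:
        return None, f'{name} must be at least {minimum}'
    return value, None


# A plain dictionary stands in for request.args here
args = {'year': '2019', 'mpg': 'forty', 'engineSize': '-1'}

year, year_error = read_number(args, 'year', cast=int)
mpg, mpg_error = read_number(args, 'mpg')
tax, tax_error = read_number(args, 'tax')
engine_size, engine_size_error = read_number(args, 'engineSize')
```

Inside the route, a non-empty error message would then be returned with jsonify() and the status code 400.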

If the parameter is available and valid, we convert and save it temporarily in order to check the other parameters in the same way. Once we have done this for all parameters, we can construct our dataframe and use our model to predict a possible price:

# Create dataframe
df = pd.DataFrame({
    'model': [car_model],
    'year': [year],
    'transmission': [transmission],
    'mileage': [mileage],
    'fuelType': [fuel_type],
    'tax': [tax],
    'mpg': [mpg],
    'engineSize': [engine_size]
})
 
# Encode and scale data
df.loc[:, 'model'] = model_enc.transform(df.loc[:, ['model']])
df.loc[:, 'transmission'] = transmission_enc.transform(df.loc[:, ['transmission']])
df.loc[:, 'fuelType'] = fuelType_enc.transform(df.loc[:, ['fuelType']])
df = data_scaler.transform(df)
 
# Predict
result = model(df)
result = label_scaler.inverse_transform(result)[0, 0]
result = round(max(float(result), 0.0), 2)  # float() turns the NumPy scalar into a plain Python number
return jsonify({'predicted_price': result})

The steps for this are almost identical to our test script (see here), which is why we will not go into them any further. We have only added the last two lines to make the response of our server a little nicer: we round our prediction to two decimal places, set negative predictions to 0 and wrap the resulting price in a JSON dictionary. Of course, completely different responses can also be constructed here. It is only important that the frontend and backend agree on the format.
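The post-processing in the last two lines can also be looked at in isolation. The following sketch assumes a hypothetical helper format_price, which is not part of main.py, and uses an explicit float() conversion so that the value is a plain Python number before it is handed to jsonify().

```python
def format_price(raw_prediction):
    """Clamp a raw prediction to a non-negative price with two decimals."""
    # Convert to a plain Python float first
    price = float(raw_prediction)
    # Negative predictions make no sense for a price, so they become 0
    return round(max(price, 0.0), 2)


typical = format_price(26004.0312)
negative = format_price(-153.2)
```

Keeping this step in one place makes it easier to change the response format later, as long as the frontend is adapted accordingly.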

Run Flask

We can test our newly written API directly on our computer thanks to the web server supplied with Flask. All we need to do is call run() on the app object we created at the beginning:

if __name__ == '__main__':
    app.run(debug=True)

We can then simply execute our script like any other Python script. During development, it is also useful to set the parameter debug=True. This allows Flask to display the complete error message directly in the browser in the event of problems instead of hiding it behind the previously mentioned generic server error. If no other port is specified, the Flask server usually runs on port 5000, so you can reach it at localhost:5000. If you have not defined a route for the root path /, you will receive a 404 Not Found error at this address. In that case, simply append one of the defined paths. For the URL http://localhost:5000/hello/REACH, for example, we receive the following response:

{
  "message": "Hello REACH!"
}

However, the server is currently only reachable from your local computer. To change this, you can specify a host. The address 0.0.0.0 makes the server listen on all network interfaces of your machine, so that, depending on the other network settings, you can also access it from other devices in your network. You should only do this if you trust the other participants in your network.

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

We can now test our model by sending a corresponding query as a query string (we use the same inputs here as in the TensorFlow tutorial): http://localhost:5000/api/car-price?model=T-Roc&year=2019&transmission=manual&mileage=12123&fuelType=petrol&tax=145&mpg=42.7&engineSize=2.0. As expected, we receive the following output:

{
  "predicted_price": 26004.03
}

In the event of invalid entries (http://localhost:5000/api/car-price?model=Cybertruck), we receive error code 400 and our previously defined error message instead:

{
  "message": "Model not supported"
}

Production server

Instead of calling run() in the script, you can also start your server from the command line with Flask's own command. To do this, you must first store the name of your script in the environment variable FLASK_APP. The command is then simply flask run.

However, as the bundled server is only intended for development and scales poorly, we should replace it with another server. Of the numerous alternatives, we will use Gunicorn, as it works well with Flask and is easy to set up. Gunicorn only runs on UNIX systems and can be installed with pip. After changing to the /app folder, a single call on the command line is sufficient, similar to Flask's own server:

pip install gunicorn
gunicorn --bind 0.0.0.0:5000 main:app

Since we want to operate our server later within a Docker container and not on our own computer anyway, it is not a problem if the development takes place on a non-UNIX system or if Gunicorn doesn’t work on your local machine.
