How to build a scalable architecture for your Webservice — from Day 1

Carlos Schwabe
9 min readFeb 10, 2022

During my first few weeks at Brick I ran into an old dilemma that, I believe, every early stage startup finds itself in: should I spend more time building a robust architecture that, when the day comes, will handle an exponentially growing number of users, or, as many people do, just deploy some quick and dirty solution to Heroku, following the famous advice to move fast and break things? As it turned out, the answer was not so straightforward.

As it turns out, for the kind of task my webservice was supposed to do, even deploying to Heroku would demand a huge effort, because most of what we did back then was crawl and interact with insurance carriers' platforms, which definitely does not fit within Heroku's 30s request timeout. This limitation means we would need workers to perform our tasks and return the results afterwards, which is not very straightforward and also requires a paid plan.

The first solution we came up with was to simply launch a single EC2 instance running our code. This solved the time issue, but since our main code ran in Python (with Selenium, for that matter), our maximum concurrency was? You guessed it: 1.

The issue was not so much that this wasn't scalable; it was that it was simply impossible to run the MVP this way, even with a very small number of users.

The only trivial solution to that was to instantiate EC2 instances on the fly as requests came in. Can you see where this is going? We were heading toward a complex architecture for a simple MVP: using ECR for images, orchestrators, dealing with a plethora of AWS services simultaneously and spending a good amount of money.

Enter AWS Lambda and the Zappa framework. This free and simple framework helped us cut the time and resources we were throwing at building our MVP by a factor of 10, and, best of all, we can achieve a maximum concurrency of 1,000 requests without breaking a sweat.

My objective in this article is to go, step by step, from nothing to a functioning, asynchronous webservice that will help you build a minimally scalable MVP, something I wish I had access to when I started.

The only things you need to know here are some Python and AWS basics, as well as a little bit of Flask and basic SQL. I will not dive into the details of how Zappa works; just keep in mind that it does.

How will it work?

The idea behind this asynchronous API is to send a POST request to create a job, get its job id, wait for it to complete and then send a GET request to check the results. At a high level, this is how it is supposed to happen:
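To make that flow concrete, here is a rough sketch of what the client side could look like once the service is deployed (the base URL is a placeholder; the 'query' field and the status values match the routes we will build below):

import time
import requests

BASE_URL = 'https://<your-api-id>.execute-api.<region>.amazonaws.com/dev'  # placeholder

# 1. Create a job and grab its id
created = requests.post(f'{BASE_URL}/job', json={'query': 'aws lambda'}).json()
job_id = created['id']

# 2. Poll until the job has finished (or failed)
while True:
    job = requests.get(f'{BASE_URL}/job/{job_id}').json()
    if job['status'] in ('Finished', 'Failed'):
        break
    time.sleep(5)

print(job['result'])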

First things first: Creating the project structure

For the project to work properly, we need to create a simple structure with files that Zappa will need to work.

  1. venv: This will be our Python virtual environment. In this case I am using Python 3.8. To create it, just open the terminal, navigate to the project folder and type python -m venv <name_of_environment>; this should create the virtual environment. To activate it, just type .\<env_name>\Scripts\activate
  2. psycopg2: Since AWS Lambda runs on Amazon Linux, libraries that rely on custom DLLs or drivers on Windows need to be built for that environment. For that reason psycopg2, the lib we will use to interact with our database, must be prebuilt there. You can always instantiate an EC2 with the Amazon Linux AMI, but to save you the trouble, you can copy it from the GitHub repository at the end of this article.
  3. requirements.txt: You know what this is.
  4. zappa_settings.json: This is the configuration json that Zappa uses to create our lambda. Here we will state the things that tell Zappa where and how it should create our functions and gateways inside AWS.
  5. app.py: Here we will put all the routes of our API, using the Flask framework. Zappa automatically creates the bridge between these routes and the API Gateway.
  6. functions.py: This is the actual code that contains the functions we want to execute at runtime, as well as our database connection helper.

Side Note: since psycopg2 is now in our project folder, if you import psycopg2, your local Python interpreter will load the Linux version of it. To run your code locally, rename the psycopg2 folder to something else while working on your machine.
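With those files in place, the project folder should look roughly like this (the names come from the list above; the project and environment names are just examples):

example_project/
    venv/                  <- Python 3.8 virtual environment
    psycopg2/              <- Linux build copied from the GitHub repository
    requirements.txt
    zappa_settings.json
    app.py
    functions.py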

Let's get to it, then

  1. Create a directory where you want your project to be (you can create a git repository for it too), navigate there and run the command to create the virtual environment. Once it is done, just type .\<env_name>\Scripts\activate to activate your venv.
  2. Ok, leave the terminal alone for a moment and let's create our requirements file. We will need to install requests (it does not come in the Lambda runtime), Flask, zappa, psycopg2 (it will be used for local testing) and BeautifulSoup; a sketch of this file appears after the zappa_settings walkthrough below. Save the file and type pip install -r requirements.txt
  3. Now, let's move to the zappa_settings. Create a json file and open it with your preferred text editor:
{
    "dev": { // this is to tell if we are deploying to dev, prod...
        "app_function": "app.app", // name of our main Flask app object
        "project_name": "example_project",
        "runtime": "python3.8", // the runtime we will use
        "s3_bucket": "<your_bucket_name>", // bucket used to upload the deployment package
        "api_key_required": false, // we do not need API keys for now
        "timeout_seconds": 150, // lambda timeout (not gateway timeout)
        "aws_environment_variables": {
            "db_url": "url", // url for our database
            "db_port": "port", // port of db, for PG it is 5432
            "db_usr": "user", // database user
            "db_name": "name", // database name
            "db_pass": "password" // user password
        }
    }
}

Note that the Lambda timeout is set to 150s. API Gateway will return a timeout after 30s; however, the Lambda will keep running our asynchronous task.

There are several other parameters that you can check in the Zappa documentation and use for several other functionalities.
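As for the requirements file from step 2, it could look something like this (package names only; pin versions as you prefer, and psycopg2-binary is a common, easier-to-install alternative for local testing):

Flask
zappa
requests
beautifulsoup4
psycopg2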

Create the Database

For this exercise we will be using an RDS instance running a Postgres database. If you prefer MySQL, you can easily change it as you please. The good thing here is that we can use the AWS free tier for our database, incurring zero cost for your company.

I will go through each step of creating this database in AWS. If you already have a database or already know how to create one, just skip to the next section.

Access RDS

First, log in to your AWS account, navigate to the RDS panel and click on create database.

Configure your instance

When on the create database page, select the following configuration rules and insert a master password:

Take note of your master user and password, you will need them later.

Get the connection parameters

Once your database is created, go to the created database overview and copy the endpoint and port as well.

Create a table to store the job executions

After you have done that, use your preferred database client (I would recommend pgAdmin or DBeaver), connect to the database and create a table called “jobs” (the database name is probably “postgres”, which is the default database created when you instantiate RDS).

This table must have the following columns:

We will use a json field to store our results. It is not exactly good practice, since we might eventually lose data consistency, but for this use case it is just fine…

Just remember to set your PK as IDENTITY, so it will generate new ids whenever you insert data into it.
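Based on how the routes below read the rows (id, status, result, created_at), a DDL sketch along these lines should do the trick; adjust names and types as you see fit:

CREATE TABLE jobs (
    id          integer GENERATED ALWAYS AS IDENTITY PRIMARY KEY, -- auto-generated job id
    status      varchar(20),                                      -- e.g. 'Running', 'Finished', 'Failed'
    result      jsonb,                                            -- job results stored as json
    created_at  timestamp DEFAULT now()                           -- when the job was created
);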

Now that we have all the parameters we need, we can start actually coding.

Write the functions

The initial step is to write the code that you actually want to be executed. In our example, we will use a simple function that scrapes Google for the first search results and purposely waits 40 seconds. Since the goal here is just to show the concept of how to build this kind of architecture, we will not do any error handling or anything like that.

To deal with the HTML, we will be using BeautifulSoup, which we will later have to install in our virtual environment.

The code is as follows:
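(What follows is a minimal sketch: the function name get_google_results and the 40-second wait are the essential parts, while the request headers and parsing details are illustrative.)

import time

import requests
from bs4 import BeautifulSoup


def get_google_results(query):
    # Pretend to be a regular browser so Google returns a normal HTML page
    headers = {'User-Agent': 'Mozilla/5.0'}
    page = requests.get('https://www.google.com/search',
                        params={'q': query}, headers=headers)
    soup = BeautifulSoup(page.text, 'html.parser')
    # Collect the text of every result header found on the page
    results = [h.get_text() for h in soup.find_all('h3')]
    # Purposely wait 40 seconds to simulate a long-running job
    time.sleep(40)
    return results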

What this snippet does is simply mimic your browser's request to Google with a query string, parse the HTML looking for the headers and return a list of all headers found.

Now, we need to code the way we will interact with the database. We will need an INSERT script to create the new jobs, an UPDATE script to insert the job results and a SELECT script to retrieve the job results.
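A sketch of those helpers, plus the database connection function, could look like this (the environment variable names match zappa_settings.json above; the column names and the initial 'Running' status follow the table sketch from the previous section):

import json
import os

import psycopg2


def get_connection():
    # Connection parameters come from the environment variables set in zappa_settings.json
    return psycopg2.connect(
        host=os.environ['db_url'],
        port=os.environ['db_port'],
        user=os.environ['db_usr'],
        dbname=os.environ['db_name'],
        password=os.environ['db_pass'],
    )


def create_job():
    # INSERT a new job row and return its auto-generated id
    conn = get_connection()
    cur = conn.cursor()
    cur.execute("INSERT INTO jobs (status) VALUES ('Running') RETURNING id;")
    job_id = cur.fetchone()[0]
    conn.commit()
    conn.close()
    return job_id


def update_job(job_id, status, result):
    # UPDATE the job with its final status and results (stored as json)
    conn = get_connection()
    cur = conn.cursor()
    cur.execute("UPDATE jobs SET status = %s, result = %s WHERE id = %s;",
                (status, json.dumps(result), job_id))
    conn.commit()
    conn.close()


def retrieve_job(job_id):
    # SELECT every column for the given job id
    conn = get_connection()
    cur = conn.cursor()
    cur.execute("SELECT id, status, result, created_at FROM jobs WHERE id = %s;",
                (job_id,))
    rows = cur.fetchall()
    conn.close()
    return rows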

Ok, it is a little bit too long and we could create something more generic, however, for our purpose it works fine since it keeps the bulk of the query and DB interaction out of our API route call functions.

The last thing to do is code our app.py file, which will handle the incoming requests.

This file is essentially telling Flask what to do with the requests that are arriving on our TCP port. We use the @route decorator from flask to designate what will be executed on each route.

Here we will be dealing with async requests and, as much as we love Python, it certainly does not like to handle things in parallel. Fortunately for us, Flask has one little trick that will help us manage that: the call_on_close hook on the response object, which lets us return the response to the client and then run another function after the response is closed.

HealthCheck Route

@app.route('/', methods=['GET'])
def healthcheck():
    return jsonify({
        'status': 'Online'
    })

This is simply a GET request to tell us whether the API is online.

Get job Route

@app.route('/job/<job_id>', methods=['GET'])
def get_job(job_id):
    job_result = retrieve_job(job_id)

    if not job_result:
        # If the job does not exist, warn the client
        return jsonify({
            'id': job_id,
            'status': 'Not Created',
            'result': None,
            'created_at': None,
        })
    return jsonify({
        'id': job_id,
        'status': job_result[0][1],
        'result': job_result[0][2],
        'created_at': job_result[0][3]
    })

This is the route the client will use to check the job status and results. It is very straightforward: you call the DB function we created before and, if the result is not empty, you return all the data from the job.

Create job Route

@app.route('/job', methods=['POST'])
def create():
    data = flask.request.json
    j_id = create_job()
    # Return the job id to the client
    response = jsonify({
        'status': 'Created!',
        'id': j_id
    })

    # Flask will skip this function, write the return and come back to execute it
    @response.call_on_close
    def on_close():
        # Here we execute our function and log the results
        try:
            response_list = get_google_results(data['query'])
            status = 'Finished'
        except:
            # If anything happens during execution, mark the job as failed
            response_list = []
            status = 'Failed'
        update_job(j_id, status, {'content': response_list})

    return response, 201

Here is where things get interesting:

  1. We create a new row in the database and get its id (automatically generated by PostgreSQL).
  2. We use the call_on_close decorator on a new function that will execute our scraping and update the database entry with the job status and results.
  3. Flask will, behind the scenes, skip the wrapped function, return the "job created" response to the client and only afterwards execute the on_close function.

Putting it all together

Inside your app.py file you can put the following code snippet and we are now ready to deploy.
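A skeleton of that file, assuming functions.py exposes create_job, update_job, retrieve_job and get_google_results as sketched earlier, boils down to:

import flask
from flask import Flask, jsonify

from functions import create_job, update_job, retrieve_job, get_google_results

app = Flask(__name__)

# ... the healthcheck, get_job and create routes shown above go here ...

if __name__ == '__main__':
    # Local testing only; in production API Gateway and Lambda take over
    app.run(debug=True)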

You can always test this code locally; just don't forget to set your env vars correctly and make sure you are not leaving the Linux build of psycopg2 in your project directory.

Deploy the WebService

Now comes the easy part: using Zappa to deploy your code to AWS.

But before the actual deployment, just make sure that you have your AWS credentials on your computer and that the user behind those credentials has full access to IAM, CloudWatch, Lambda, EventBridge, S3 and CloudFormation.

Once you are done with that, the steps are quite simple:

  1. Open your terminal, navigate to your project folder and type .\<env_name>\Scripts\activate
  2. Now just type in zappa deploy dev and you should see the magic happening
  3. It will do some operations, zip your deployment package and then schedule a keep warm function, so you don’t have to worry about cold start times.
  4. At the end, it will show you the url in which the webservice is up. If anything has gone wrong, it will warn you and you can use zappa tail dev to acquire more information on exactly what happened.

Conclusion

Well, I hope you learned how to easily deploy any webservice to AWS without needing to know a lot about architecture or spending a lot of money.

You can extend the code here to create virtually any API you wish. Just keep in mind a small restriction of Lambda: your unzipped deployment package is limited to roughly 250MB (with about 512MB of ephemeral /tmp storage), so deploying ML models (especially neural nets) might be asking too much.

GitHub Project:

https://github.com/carlos-schwabe/scalable_architecture

Email (in case you have any questions):

carlos@brickseguros.com.br



Carlos Schwabe

Co-Founder at Brick Insurance and advanced analytics enthusiast