A Geographic Microservice

Will Carter
Published in Level Up Coding
12 min read · Mar 2, 2021


Part 2: Serverless, AWS API Gateway, Lambda and Mapbox

Quick Recap

In part 1, we completed the back end for an AWS microservice: a PostgreSQL relational database with the PostGIS geospatial extension enabled. The PostGIS extension lets the new cloud database store geographic data and perform geographic SQL queries.

Remember, vector-based geographic data can be stored in the database as points, lines and polygons.

Last time we successfully imported U.S. Congressional Districts polygon data sourced from the U.S. Census. The polygons make up each district’s borderlines. We used QGIS to connect to the new AWS cloud database and import district data up to the cloud.

Once populated, we were able to query the data in pgAdmin. Here are the first 100 districts, queried from the cb_2018_us_cd116_20m table.

SELECT id, geom, statefp, cd116fp, affgeoid, geoid, lsad, cdsessn
FROM public.cb_2018_us_cd116_20m
limit 100;
Querying geographic congressional district data from the cloud AWS RDS

Because the district borderline data is represented by polygons, we can display them on a map within pgAdmin using the Geometry Viewer, a very useful addition to pgAdmin.

District query result displayed visually on a map

From here, we want to move up the stack and expose this district data through a RESTful microservice API, which will be hosted on AWS using Lambda functions. From the AWS Lambda documentation:

With AWS Lambda, you can run code without provisioning or managing servers. You pay only for the compute time that you consume — there’s no charge when your code isn’t running. You can run code for virtually any type of application or back end service — all with zero administration. Just upload your code and Lambda takes care of everything required to run and scale your code with high availability. You can call it directly from any web or mobile app.

This type of setup, where we have a self-contained service, is a building block of a resilient application. In this case, the microservice will return Congressional Districts as GeoJSON, an open standard for geographic data, ready to be displayed on a web map.

After doing a bit of research, The Serverless Framework came recommended as a useful tool for working with AWS Lambda functions. With the Serverless Framework, we’ll be able to easily…

…develop and deploy your AWS Lambda functions, along with the AWS infrastructure resources they require. It’s a CLI that offers structure, automation and best practices out-of-the-box, allowing you to focus on building sophisticated, event-driven, serverless architectures, comprised of Functions and Events.

serverless framework

With the hope that Serverless will help deployment to AWS go as smoothly as possible, we will use the structure and function templates that it offers.

First, install Serverless globally:

$ npm install serverless -g
Installing the Serverless framework

Next, create a new project called aws-postgres-serverless by entering the following at the terminal prompt.

$ serverless

When asked, select AWS Node.js for the type of project.

Next, initialize the project.

$ cd aws-postgres-serverless
$ npm init
Initialize Serverless project

At this point the project has 4 files:

Initial template

In handler.js we can see a single Lambda function pre-written for us. It looks like this is a “hello world” function that delivers a message stating that the function executed successfully.

module.exports.hello = async (event) => {
  return {
    statusCode: 200,
    body: JSON.stringify(
      {
        message: 'Go Serverless v1.0! Your function executed successfully!',
        input: event,
      },
      null,
      2
    ),
  };
};

Let’s change the message slightly, so that we can be sure that our code is updating.

module.exports.hello = async (event) => {
  return {
    statusCode: 200,
    body: JSON.stringify(
      {
        message: 'Hooray! It works!',
        input: event,
      },
      null,
      2
    ),
  };
};

Next, take a look at serverless.yml, the configuration file for the project. It holds the project's settings as well as the definitions of the routes that the microservice will respond to.

Notice how this function entry references the hello function in handler.js.

serverless.yml function entry

In order to test the service and the hello function, we need to install another npm package, serverless-offline.

$ npm install --save serverless-offline
Installing serverless-offline

Next, replace (or comment out) the text in serverless.yml and add the following configuration settings. Indentation in serverless.yml matters, so be careful.

service: aws-postgres-serverless
frameworkVersion: '2'
provider:
  name: aws
  runtime: nodejs12.x
  lambdaHashingVersion: 20201221
functions:
  hello:
    handler: handler.hello
    events:
      - http:
          method: get
          path: /hello
plugins:
  - serverless-offline

Here is the full serverless.yml at this point. The service will expose one function named hello.

serverless.yml

Start the service in the test environment at the terminal.

$ serverless offline
Testing the service

If we visit the local service at the hello route…

http://localhost:3000/dev/hello

We can see that it’s working in our web browser. Hooray!

Hello World!

Now, before we can deploy the function up to the AWS cloud, we need to set up a user in AWS that has the permissions to interact with other parts of our AWS ecosystem, specifically the AWS RDS Postgres/PostGIS cloud database that was discussed and set up in part 1.

Configuring AWS IAM

To create the user, visit the AWS console and sign in with your account and password. Once signed in, navigate to the AWS IAM console. Here we will create a new user and add that user to a new group, granting the permissions necessary to interact with AWS RDS, API Gateway and Lambda.

First, create the aws-postgres-users group and give it the necessary permissions.

Create aws-postgres-users group
Grant the following policies:
AmazonRDSFullAccess
AWSLambdaFullAccess
IAMFullAccess
AmazonAPIGatewayAdministrator
AWSCloudFormationFullAccess

Then, create a new user named aws-postgres and add this user to the aws-postgres-users group. You will be presented with the Access key ID and the Secret access key for the user.

This information is shown only once, so take care to note it.

Now, back at the terminal, configure your security credentials with the following command, replacing the placeholders with the values noted above. This will allow the Serverless CLI to act with the credentials provided.

$ serverless config credentials --provider aws --key <your_access_key> --secret <your_secret_key>

Deploy!

Now we are ready to deploy to the cloud, which is done with the following command:

$ serverless deploy
Deployed microservice!

Then if we visit the service in a browser, we can see that the hello function works!

Hello from the cloud

A hello world function in a microservice is a decent start. However, the true goal of this post is to expose geospatial data from the AWS RDS PostgreSQL back end created in part 1.

To do this, first we need to install the pg npm package, a non-blocking PostgreSQL client for Node.js, allowing connections to our AWS RDS PostgreSQL database.

$ npm install --save pg
Install pg

Next we need to define our connection to the database in a new file called db.js with the following content. This is the connection to the AWS RDS database we created in part 1.

module.exports = {
database: 'carto_boundaries',
host: 'aws-postgres-postgis.x.us-east-1.rds.amazonaws.com',
user: 'awspostgres',
password: 'XXXXXX'
}

Replace the values with actual values based on the database created.

NOTE: If you commit this code to GitHub, don’t forget to add db.js to your .gitignore file. It contains the database credentials.

Hide your credentials!!!
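One way to keep the password out of the repository altogether is to read the connection settings from environment variables. The sketch below is only an illustration of that idea, not the setup used in this project: the buildDbConfig helper and the DB_* variable names are hypothetical.

```javascript
// Build the pg connection settings from environment variables instead of
// hard-coding them. buildDbConfig and the DB_* names are hypothetical.
function buildDbConfig(env) {
  const config = {
    database: env.DB_NAME || 'carto_boundaries',
    host: env.DB_HOST,
    user: env.DB_USER,
    password: env.DB_PASSWORD,
  };
  // Fail early with a readable message rather than a cryptic connection error.
  const missing = Object.keys(config).filter((key) => !config[key]);
  if (missing.length > 0) {
    throw new Error(`Missing database settings: ${missing.join(', ')}`);
  }
  return config;
}

// In db.js this would become: module.exports = buildDbConfig(process.env)
// Here a plain object stands in for process.env so the snippet runs anywhere.
const exampleConfig = buildDbConfig({
  DB_HOST: 'aws-postgres-postgis.x.us-east-1.rds.amazonaws.com',
  DB_USER: 'awspostgres',
  DB_PASSWORD: 'XXXXXX',
});
console.log(exampleConfig.database);
```

The rest of the code can keep requiring db.js exactly as before, since the exported object has the same four keys.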

Next, in handler.js, let’s add a new method called getDistricts that connects to the database and returns the first district. Here is the SQL for that query.

SELECT id, geom, statefp, cd116fp, affgeoid, geoid, lsad, cdsessn,
aland, awater
FROM public.cb_2018_us_cd116_20m
limit 1;

The SQL above drives the getDistricts function below:

const { Client } = require('pg')
const dbConfig = require('./db')

module.exports.getDistricts = (event, context, callback) => {
  // Open a new database connection for this invocation
  const client = new Client(dbConfig)
  client.connect()

  let sql = `
    SELECT id, geom, statefp, cd116fp, affgeoid, geoid, lsad, cdsessn,
    aland, awater
    FROM public.cb_2018_us_cd116_20m
    limit 1`.trim()

  client
    .query(sql)
    .then((res) => {
      // Return the rows as the JSON body of the HTTP response
      const response = {
        statusCode: 200,
        body: JSON.stringify(res.rows),
      }
      callback(null, response)
      client.end()
    })
    .catch((error) => {
      // Report the failure to the caller instead of crashing the function
      const errorResponse = {
        statusCode: error.statusCode || 500,
        body: `${error}`,
      }
      callback(null, errorResponse)
      client.end()
    })
}

In order to expose this new getDistricts function, we need to update serverless.yml, adding an http method and path for the new function under the functions key.

  getDistricts:
    handler: handler.getDistricts
    events:
      - http:
          method: get
          path: /getDistricts

Now, if we visit our new route in a web browser:

http://localhost:3000/dev/getDistricts

We are able to see that the connection worked, and here is the data returned for the first district in the database!

Results from the database

Things are looking good at this point.

However, the geom column value in the output above is not useful for our front end web map.

Raw geom column

Luckily, PostGIS can return results in GeoJSON format if we wrap the query in functions that transform the result. The getGeoJsonSqlFor helper function takes a non-geo SQL statement and returns a SQL statement that produces GeoJSON as its result. All the original SQL query columns are included in the returned GeoJSON.

SELECT GeoJSON from PostGIS
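A minimal sketch of such a helper, assuming the inner query exposes id and geom columns and does not end with a semicolon. It follows a widely used PostGIS pattern built on ST_AsGeoJSON, to_jsonb and jsonb_build_object, and may differ in detail from the helper in the project repository.

```javascript
// Wrap an ordinary SELECT so that PostGIS returns one GeoJSON
// FeatureCollection. Assumes the inner query has `id` and `geom` columns
// and no trailing semicolon.
function getGeoJsonSqlFor(sql) {
  return `
    SELECT jsonb_build_object(
      'type', 'FeatureCollection',
      'features', jsonb_agg(features.feature)
    )
    FROM (
      SELECT jsonb_build_object(
        'type', 'Feature',
        'id', inputs.id,
        'geometry', ST_AsGeoJSON(inputs.geom)::jsonb,
        'properties', to_jsonb(inputs) - 'id' - 'geom'
      ) AS feature
      FROM (${sql}) inputs
    ) features`.trim();
}

// Example: wrap a trimmed-down version of the district query.
const districtSql =
  'SELECT id, geom, statefp, cd116fp FROM public.cb_2018_us_cd116_20m LIMIT 1';
const geoSql = getGeoJsonSqlFor(districtSql);
console.log(geoSql);
```

Every column other than id and geom ends up in each feature's properties object, which is how the original query columns survive the transformation. Because the outer jsonb_build_object call is not aliased, PostgreSQL names the single result column jsonb_build_object.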

Now, in the getDistricts function, we pass the SQL to the getGeoJsonSqlFor function. When we return the result this time in the body of the response, we return jsonb_build_object from the first row of the geo query result, which will be a GeoJSON string that describes the district boundaries.

Returning GeoJSON from a SQL result
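As a rough illustration of that step, the sketch below builds the Lambda response from a pg query result. The buildGeoJsonResponse name is hypothetical, and the fake result only mimics the shape pg resolves with. Since pg parses jsonb columns into JavaScript objects, the value is stringified again for the response body.

```javascript
// Turn the result of the GeoJSON-wrapped query into a Lambda proxy response.
// buildGeoJsonResponse is a hypothetical helper name.
function buildGeoJsonResponse(res) {
  // The unaliased outer jsonb_build_object(...) column takes the function's name.
  const featureCollection = res.rows[0].jsonb_build_object;
  return {
    statusCode: 200,
    body: JSON.stringify(featureCollection),
  };
}

// Inside getDistricts this would be used roughly as:
//   client.query(getGeoJsonSqlFor(sql))
//     .then((res) => callback(null, buildGeoJsonResponse(res)))

// Stand-in for a real query result, so the snippet runs without a database.
const fakeResult = {
  rows: [{ jsonb_build_object: { type: 'FeatureCollection', features: [] } }],
};
const geoResponse = buildGeoJsonResponse(fakeResult);
console.log(geoResponse.body);
```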

Visit the /getDistricts route again and we see GeoJSON MultiPolygon data is being returned instead of the raw geom data as before.

GeoJSON for a congressional district

Examining the GeoJSON result more closely, notice that it consists of an array of features, each with a geometry that holds a coordinates array. The coordinates array contains a series of longitude and latitude pairs (GeoJSON lists longitude first), which follow the outline of the district border, forming a polygon for each district.

GeoJSON is made up of latitude and longitude coordinates
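To make the nesting concrete, here is a tiny hand-made FeatureCollection with the same overall shape, along with a function that counts the positions in a feature. The coordinates are invented for illustration and are not real district data.

```javascript
// A hand-made FeatureCollection shaped like the service output.
// The coordinates below are made up, not an actual district boundary.
const sample = {
  type: 'FeatureCollection',
  features: [
    {
      type: 'Feature',
      id: 1,
      properties: { statefp: '18', cd116fp: '01' },
      geometry: {
        type: 'MultiPolygon',
        // MultiPolygon -> polygons -> rings -> [longitude, latitude] positions
        coordinates: [
          [
            [
              [-87.5, 41.7],
              [-86.9, 41.7],
              [-86.9, 41.2],
              [-87.5, 41.2],
              [-87.5, 41.7], // a ring closes by repeating its first position
            ],
          ],
        ],
      },
    },
  ],
};

// Count the positions in every ring of every polygon of one feature.
function countPositions(feature) {
  return feature.geometry.coordinates
    .flat() // polygons -> rings
    .reduce((total, ring) => total + ring.length, 0);
}

console.log(countPositions(sample.features[0])); // 5
```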

So now, the microservice returns GeoJSON for a single district. We can easily build upon it to complete where we were headed at the end of part 1, returning districts for a particular state.

The districts table that we imported in part 1 has a statefp column, but this is not a user-friendly value to pass into an API, because statefp values are not commonly known. We want to accept the two letter state abbreviation from the user and have that drive the statefp parameter, as in the join query shown further down.

In order to make such a query, we need to add additional data to the database behind the service, specifically U.S. State data. This data needs to have the statefp as well as the stusps (two letter state abbreviation column) to complete the join.

Conveniently, the U.S. state borderline data available from the U.S. Census has the needed statefp, stusps, and name columns.

So, let’s use QGIS to import U.S. State data (including state boundary geo data) into a new table in the cloud database, as we did with the districts data earlier.

Importing U.S. State data into the cloud

With the new table, we can query a particular state in pgAdmin by its two letter state abbreviation (stusps).

State data is in the cloud

Further, we can use a join to get the districts for a particular state abbreviation.

SELECT districts.*, states.stusps as state_abbrev, states.name as state_name
FROM cb_2018_us_state_20m states
JOIN cb_2018_us_cd116_20m districts on districts.statefp = states.statefp
WHERE stusps = $1;

The SQL above drives a new getDistrictsForState function.

getDistrictsForState function
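The pieces such a function needs can be sketched as follows. This is an illustration under stated assumptions rather than the code from the repository: it assumes the route is declared with a {state} path parameter, so that API Gateway delivers the value in event.pathParameters.state, and getStateParam is a hypothetical helper name.

```javascript
// The join from above, kept as a parameterized query so the state
// abbreviation is never concatenated into the SQL text.
const districtsForStateSql = `
  SELECT districts.*, states.stusps AS state_abbrev, states.name AS state_name
  FROM cb_2018_us_state_20m states
  JOIN cb_2018_us_cd116_20m districts ON districts.statefp = states.statefp
  WHERE states.stusps = $1`.trim();

// Assumption: the route has a {state} path parameter (hypothetical setup).
function getStateParam(event) {
  const raw = (event.pathParameters && event.pathParameters.state) || '';
  const state = raw.trim().toUpperCase();
  // Accept exactly two letters; anything else is rejected before querying.
  return /^[A-Z]{2}$/.test(state) ? state : null;
}

// In the handler, the value is passed separately from the SQL text:
//   client.query(getGeoJsonSqlFor(districtsForStateSql), [state])
const requestedState = getStateParam({ pathParameters: { state: 'in' } });
console.log(requestedState);
```

Passing the abbreviation as the $1 value lets pg handle quoting, and validating it up front allows the function to answer a malformed request with a 400 instead of running a query.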

In order to expose getDistrictsForState as a new AWS Lambda function, we must again add a reference to it in serverless.yml.

Then, if we start the instance again and pass ‘IN’ for Indiana, we get all the districts for Indiana, in GeoJSON.

Indiana congressional districts in GeoJSON

Adjust Microservice Access

One more quick addition is necessary at this point in order for our service to be accessible from production on AWS.

In the response object returned, add the headers key with the following object as its value.

headers: {
"Access-Control-Allow-Origin": '*',
"Access-Control-Allow-Methods": 'GET'
}

This allows web pages served from any origin to call our service with GET requests.

If the following CORS policy error is familiar, you’ll understand why the headers value is needed.

Don’t get blocked by the CORS policy
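Rather than repeating the headers in every handler, they can be merged into each response by a small wrapper. This is a sketch only, and the withCors name is hypothetical.

```javascript
// The CORS headers from above, defined once.
const corsHeaders = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Methods': 'GET',
};

// Return a copy of the response with the CORS headers merged in.
// Any headers already on the response are kept.
function withCors(response) {
  return { ...response, headers: { ...response.headers, ...corsHeaders } };
}

// In a handler: callback(null, withCors(response))
const corsExample = withCors({ statusCode: 200, body: '[]' });
console.log(corsExample.headers);
```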

Finally, let’s push the latest AWS Lambda function (getDistrictsForState) to the AWS cloud microservice.

$ serverless deploy
Deploying the microservice

After the service is deployed, we can visit the production route in our browser.

Indiana districts, from an AWS microservice

If CA is desired, then change the parameter in the URL.

California districts, from an AWS microservice

The Map

Now that we have a microservice with the ability to return Congressional District GeoJSON data dynamically for each U.S. State, how easy is it to access this service to display districts on a map? Very.

In QGIS, we can easily add a districts layer from the service. Navigate to Layer > Add Layer > Add Vector Layer. Choose Protocol: HTTP(S). Be sure to select GeoJSON as the type. For the URI, append IN for Indiana to the service URL to retrieve the districts for that state.

Adding districts from the cloud into QGIS

How about a web map? Certainly.

Mapbox GL JS is a very capable JavaScript library (open source through its 1.x releases) for creating maps and displaying geographic data on them. GeoJSON is a common source format for web maps.

Here is a demonstration Mapbox-based web map that calls the microservice and displays the appropriate districts depending on the state value passed in the URL.

https://fergusdevelopmentllc.github.io/awsmap/index.html?state=IN
Indiana Congressional Districts

Here is the HTML and JavaScript that produces the map above.

On line 48, you will see where the service is called, passing the state abbreviation parameter.
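A minimal sketch of how such a page might read the state parameter and hand the service URL to Mapbox GL JS. The endpoint URL, the function names and the layer styling are placeholders chosen for illustration. The addSource and addLayer calls are standard Mapbox GL JS map methods.

```javascript
// Hypothetical endpoint; substitute the URL printed by `serverless deploy`.
const SERVICE_URL =
  'https://example.execute-api.us-east-1.amazonaws.com/dev/getDistrictsForState';

// Read ?state=IN from the page URL, falling back to a default state.
function getStateFromUrl(href, fallback = 'IN') {
  const state = new URL(href).searchParams.get('state');
  return state ? state.toUpperCase() : fallback;
}

// Add the districts as a GeoJSON source plus a fill layer. `map` is a
// mapboxgl.Map; call this from inside map.on('load', ...).
function addDistrictsLayer(map, state) {
  map.addSource('districts', {
    type: 'geojson',
    data: `${SERVICE_URL}/${state}`, // Mapbox GL JS fetches the URL itself
  });
  map.addLayer({
    id: 'districts-fill',
    type: 'fill',
    source: 'districts',
    paint: {
      'fill-color': '#3b82f6',
      'fill-opacity': 0.4,
      'fill-outline-color': '#1e3a8a',
    },
  });
}

// In the browser:
//   map.on('load', () =>
//     addDistrictsLayer(map, getStateFromUrl(window.location.href)))
const pageState = getStateFromUrl(
  'https://fergusdevelopmentllc.github.io/awsmap/index.html?state=CA'
);
console.log(pageState);
```

Pointing the source's data property at the URL keeps the page code short. Fetching the GeoJSON manually first would also work and makes it easier to fit the map to the returned districts.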

To see districts for another state, change the URL. For example, here is California.

https://fergusdevelopmentllc.github.io/awsmap/index.html?state=CA
California Congressional Districts

In part 3 of this project, we will explore more capabilities with PostGIS, aiming toward a way of finding areas of interest based on U.S. County populations.

Here is the GitHub repository for the serverless code discussed in this post.
