Commit 6752460

author
Matt C
committed
start of python version
1 parent 06d877c commit 6752460

10 files changed

Lines changed: 201 additions & 0 deletions

Lines changed: 10 additions & 0 deletions
@@ -0,0 +1,10 @@
*.swp
package-lock.json
__pycache__
.pytest_cache
.env
*.egg-info

# CDK asset staging directory
.cdk.staging
cdk.out
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
# The Predictive Lambda Pattern

![architecture](img/arch.png)

This pattern uses a container image inside Lambda to deploy a custom Python ML model that predicts the nearest Chipotle restaurant from your latitude and longitude.

Some Useful References:

| Author | Link |
| ------------- | ------------- |
| AWS Blog | [New for AWS Lambda – Container Image Support](https://aws.amazon.com/blogs/aws/new-for-aws-lambda-container-image-support/) |
| AWS Docs | [Lambda now supports container images](https://aws.amazon.com/about-aws/whats-new/2020/12/aws-lambda-now-supports-container-images-as-a-packaging-format/) |
| Yan Cui | [Package your Lambda function as a container image](https://lumigo.io/blog/package-your-lambda-function-as-a-container-image/) |
| Scikit Learn Docs | [User Guide](https://scikit-learn.org/stable/user_guide.html) |
| AWS ECR Gallery | [Python Lambda Image](https://gallery.ecr.aws/lambda/python) |
| Docker Docs | [CLI Reference](https://docs.docker.com/reference/) |

## What's Included In This Pattern?
This pattern uses scikit-learn to build a custom k-nearest-neighbours model that predicts the nearest Chipotle to a given latitude and longitude. The model is deployed in a container image attached to AWS Lambda.

### The Data
To see the data used for this model, look at the [Jupyter notebook](model/training/Chipotle.ipynb). The raw data came from [Kaggle](https://www.kaggle.com/jeffreybraun/chipotle-locations).

### The ML Model
This is a very simple model that demonstrates the concept (I didn't even check the accuracy, because it doesn't change the pattern). It uses [sklearn nearest neighbors](https://scikit-learn.org/stable/modules/neighbors.html) to predict the closest Chipotle location to a given lat/long.

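To make the idea concrete, here is a minimal sketch of a nearest-neighbour lookup in plain Python. It is not the code from this repository: the real model uses scikit-learn's nearest neighbours support, whereas this stand-in does a brute-force search with the haversine distance. The `LOCATIONS` list, the store names and both function names are made up for illustration.

```python
import math

# Hypothetical sample data: (name, latitude, longitude). The real model is
# trained on the Kaggle Chipotle locations dataset.
LOCATIONS = [
    ("Store A (hypothetical)", 39.10, -77.05),
    ("Store B (hypothetical)", 40.75, -73.99),
    ("Store C (hypothetical)", 34.05, -118.24),
]

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/long points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest(lat, lon, locations=LOCATIONS):
    """Return (name, distance_km) of the closest location by brute force."""
    name, store_lat, store_lon = min(
        locations, key=lambda loc: haversine_km(lat, lon, loc[1], loc[2])
    )
    return name, haversine_km(lat, lon, store_lat, store_lon)


if __name__ == "__main__":
    print(nearest(39.153198, -77.066176))
```

A brute-force scan like this is fine for a few thousand points; a library estimator mainly adds faster index structures and a consistent fit/query interface.
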
### Two Docker Containers
I use the Lambda image to train the ML model in one container, and a separate container for the deployed Lambda function. Training this way means the model is pickled in the same environment it will be deployed in, while anything needed only for training stays out of the deployed function, keeping it as lightweight as possible. You also end up with a built container image containing the raw data, the training logic and the trained model; these images could be archived to keep a history of your model.

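One way to make the "same environment" point tangible is to store a description of the training environment next to the pickled model and compare it at load time. This is only a sketch and not something this repository is known to do: the `dump_with_env` and `load_checked` helpers and the dict layout are hypothetical.

```python
import pickle
import sys


def current_env():
    """Describe the interpreter that is running right now."""
    return {"python": tuple(sys.version_info[:2])}


def dump_with_env(model):
    """Pickle the model together with the environment it was trained in."""
    return pickle.dumps({"env": current_env(), "model": model})


def load_checked(blob):
    """Unpickle and fail loudly if the runtime differs from training."""
    payload = pickle.loads(blob)
    if payload["env"] != current_env():
        raise RuntimeError(
            f"model trained on {payload['env']}, running on {current_env()}"
        )
    return payload["model"]


if __name__ == "__main__":
    blob = dump_with_env({"weights": [1, 2, 3]})  # stand-in for a real model
    print(load_checked(blob))
```

Training and deploying from the same base image makes a check like this pass by construction. Library versions matter as well, and this sketch does not record them.
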
### A Lambda Function
This is set up with a 15-second timeout and 4 GB of RAM to run the model comfortably.

### An API Gateway HTTP API
Set up as a proxy integration, so all requests hit the Lambda function.

## How Do I Test This Pattern?

Run `npm run deploy` from the base directory. The URL for the API Gateway is printed in the deploy output and is also available in the CloudFormation console. Open that URL in a browser with `?lat=39.153198&long=-77.066176` appended and you should get back a prediction.

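If you would rather build the test URL in code than by hand, a small sketch follows. The `API_URL` value is a placeholder and `prediction_url` is a hypothetical helper; substitute the URL from your own deployment.

```python
from urllib.parse import urlencode

# Placeholder: substitute the URL printed by your own deployment.
API_URL = "https://example.invalid/"


def prediction_url(base_url, lat, long):
    """Append the lat/long query string the pattern expects."""
    query = urlencode({"lat": lat, "long": long})
    return f"{base_url.rstrip('/')}/?{query}"


if __name__ == "__main__":
    print(prediction_url(API_URL, "39.153198", "-77.066176"))
```
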
## How Does It Work?

Most of the logic lives in the model folder. There are two Dockerfiles:
- Dockerfile - used by Lambda during the deploy
- TrainingDockerfile - used to spin up the container that trains our model

I have added the trained model to version control, but if you want to retrain it yourself, make sure Docker is running and then run:

```bash
cd model
./trainmodel.sh
```

This uses the Lambda Python image to run training/training.py and then copies the chipotle.pkl file out of the container. The requirements.txt is shared between the training container and the deployed container.

The logic that runs when we hit our URL is in model/deployment/app.py: it unpickles the model, makes a prediction and returns the response as a string.


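The unpickle, predict and respond flow described above might look roughly like the sketch below. This is an assumption about the shape of such a handler, not the contents of model/deployment/app.py: the pickled list stands in for chipotle.pkl, the store data is made up, and `predict` and `handler` are illustrative names.

```python
import pickle

# Stand-in for chipotle.pkl: a pickled list of (name, lat, long) tuples.
# The real file holds a trained scikit-learn model; the data here is made up.
MODEL_BYTES = pickle.dumps([
    ("Store A (hypothetical)", 39.10, -77.05),
    ("Store B (hypothetical)", 40.75, -73.99),
])

# Unpickle once at import time so warm invocations reuse the loaded model.
MODEL = pickle.loads(MODEL_BYTES)


def predict(lat, long):
    """Return the name of the closest stored point (squared-degree distance)."""
    return min(MODEL, key=lambda s: (s[1] - lat) ** 2 + (s[2] - long) ** 2)[0]


def handler(event, context):
    """Hypothetical Lambda entry point for an API Gateway proxy event."""
    params = event.get("queryStringParameters") or {}
    try:
        lat = float(params["lat"])
        long = float(params["long"])
    except (KeyError, TypeError, ValueError):
        return {"statusCode": 400, "body": "lat and long query parameters are required"}
    return {"statusCode": 200, "body": str(predict(lat, long))}


if __name__ == "__main__":
    event = {"queryStringParameters": {"lat": "39.153198", "long": "-77.066176"}}
    print(handler(event, None))
```

Loading the model at module level rather than inside the handler means the unpickling cost is paid once per container start instead of on every request.
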
## Useful CDK Commands

To manually create a virtualenv on macOS and Linux:

```
$ python3 -m venv .venv
```

After the init process completes and the virtualenv is created, you can use the following
step to activate your virtualenv.

```
$ source .venv/bin/activate
```

If you are on a Windows platform, you would activate the virtualenv like this:

```
% .venv\Scripts\activate.bat
```

Once the virtualenv is activated, you can install the required dependencies.

```
$ pip install -r requirements.txt
```

At this point you can now synthesize the CloudFormation template for this code.

```
$ cdk synth
```

To add additional dependencies, for example other CDK libraries, just add
them to your `setup.py` file and rerun the `pip install -r requirements.txt`
command.

## Useful commands

* `cdk ls` list all stacks in the app
* `cdk synth` emits the synthesized CloudFormation template
* `cdk deploy` deploy this stack to your default AWS account/region
* `cdk diff` compare deployed stack with current state
* `cdk docs` open CDK documentation

Enjoy!
Lines changed: 11 additions & 0 deletions
@@ -0,0 +1,11 @@
#!/usr/bin/env python3

from aws_cdk import core

from the_predictive_lambda.the_predictive_lambda_stack import ThePredictiveLambdaStack


app = core.App()
ThePredictiveLambdaStack(app, "the-predictive-lambda")

app.synth()
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
{
  "app": "python3 app.py",
  "context": {
    "@aws-cdk/core:enableStackNameDuplicates": "true",
    "aws-cdk:enableDiffNoFail": "true",
    "@aws-cdk/core:stackRelativeExports": "true",
    "@aws-cdk/aws-ecr-assets:dockerIgnoreSupport": true
  }
}
Binary file, 117 KB (not shown)
Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
-e .
Lines changed: 45 additions & 0 deletions
@@ -0,0 +1,45 @@
import setuptools


with open("README.md") as fp:
    long_description = fp.read()


setuptools.setup(
    name="the_predictive_lambda",
    version="0.0.1",

    description="An empty CDK Python app",
    long_description=long_description,
    long_description_content_type="text/markdown",

    author="author",

    package_dir={"": "the_predictive_lambda"},
    packages=setuptools.find_packages(where="the_predictive_lambda"),

    install_requires=[
        "aws-cdk.core==1.76.0",
    ],

    python_requires=">=3.6",

    classifiers=[
        "Development Status :: 4 - Beta",

        "Intended Audience :: Developers",

        "License :: OSI Approved :: Apache Software License",

        "Programming Language :: JavaScript",
        "Programming Language :: Python :: 3 :: Only",
        "Programming Language :: Python :: 3.6",
        "Programming Language :: Python :: 3.7",
        "Programming Language :: Python :: 3.8",

        "Topic :: Software Development :: Code Generators",
        "Topic :: Utilities",

        "Typing :: Typed",
    ],
)
Lines changed: 13 additions & 0 deletions
@@ -0,0 +1,13 @@
@echo off

rem The sole purpose of this script is to make the command
rem
rem     source .venv/bin/activate
rem
rem (which activates a Python virtualenv on Linux or Mac OS X) work on Windows.
rem On Windows, this command just runs this batch file (the argument is ignored).
rem
rem Now we don't need to document a Windows command for activating a virtualenv.

echo Executing .venv\Scripts\activate.bat for you
.venv\Scripts\activate.bat

the-predictive-lambda/python/the_predictive_lambda/__init__.py

Whitespace-only changes.
Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
from aws_cdk import core


class ThePredictiveLambdaStack(core.Stack):

    def __init__(self, scope: core.Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)

        # The code that defines your stack goes here
