| Key | Value |
|---|---|
| Services | EMR Serverless, S3 |
| Integrations | AWS CLI, Maven |
| Categories | Analytics; Big Data; Spark |
A demo application illustrating how to run a Java Spark job on EMR Serverless using LocalStack. The sample builds a Maven project, uploads the JAR to S3, creates an EMR Serverless application, and submits a Spark job — all locally without connecting to AWS.
- A valid LocalStack for AWS license. Your license provides a `LOCALSTACK_AUTH_TOKEN` to activate LocalStack.
- Docker
- `localstack` CLI
- `awslocal` CLI
- Java and Maven

To verify that the prerequisites are available:

```bash
make check
```

If you are not using `awslocal`, add the following profile to `~/.aws/config`:
```ini
[profile localstack]
region=us-east-1
output=json
endpoint_url = http://localhost:4566
```

And to `~/.aws/credentials`:
```ini
[localstack]
aws_access_key_id=test
aws_secret_access_key=test
```

Build the Java Spark application (`java-demo-1.0.jar` is included; you can also rebuild it):

```bash
make install
```

Export your auth token and start LocalStack:

```bash
export LOCALSTACK_AUTH_TOKEN=<your-auth-token>
make start
```

Before creating the EMR Serverless job, we need a JAR file containing the Java code. The `java-demo-1.0.jar` file is already in the current directory; alternatively, you can build the JAR yourself by following the steps below.
```bash
cd hello-world
mvn package
```

Create an S3 bucket and upload the JAR file:
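The contents of the `HelloWorld` class are not shown in this README; as a rough illustration, a minimal Spark entry point matching the job's `--class HelloWorld` parameter might look like the following (the actual class in `hello-world/` may differ — this sketch and its `appName` are assumptions):

```java
import org.apache.spark.sql.SparkSession;

// Hypothetical sketch of the HelloWorld entry point; the real class in
// hello-world/src may differ. EMR Serverless supplies the Spark runtime,
// so the class only needs to obtain a session and do its work.
public class HelloWorld {
    public static void main(String[] args) {
        SparkSession spark = SparkSession.builder()
                .appName("HelloWorld")
                .getOrCreate();

        // Trivial workload: print a greeting and the Spark version.
        System.out.println("Hello from Spark " + spark.version());

        spark.stop();
    }
}
```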
```bash
cd ..
make deploy
```

The next command creates an EMR Serverless Spark application (which will run Spark 3.3.0), starts it, and submits a job that runs the `HelloWorld` class:
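The `make deploy` target is not reproduced here; assuming it wraps the standard S3 CLI calls, the equivalent commands would be roughly as follows (the bucket name is a hypothetical placeholder — the Makefile defines the actual one):

```shell
# Hypothetical equivalent of `make deploy`; the real bucket name comes
# from the Makefile.
awslocal s3 mb s3://spark-demo-bucket
awslocal s3 cp java-demo-1.0.jar s3://spark-demo-bucket/
```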
```bash
make run
```

The Spark job is submitted with:

- Entry point: the JAR file in S3
- Main class: `--class HelloWorld`
- Spark logs written to `s3://<bucket>/logs/` (specified in the `logUri` parameter)
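Under the hood, `make run` presumably issues `awslocal emr-serverless` calls along these lines. The application name, bucket, and execution role ARN below are assumptions for illustration; the Makefile's actual values may differ (EMR release `emr-6.9.0` is the label that ships Spark 3.3.0):

```shell
# Hypothetical sketch of the CLI calls behind `make run`.
# Create the Spark application and capture its ID.
APPLICATION_ID=$(awslocal emr-serverless create-application \
  --type SPARK --release-label emr-6.9.0 --name spark-demo \
  --query applicationId --output text)

awslocal emr-serverless start-application --application-id $APPLICATION_ID

# Submit the job: entry point is the JAR in S3, main class HelloWorld,
# Spark logs written under s3://<bucket>/logs/ via the logUri parameter.
awslocal emr-serverless start-job-run \
  --application-id $APPLICATION_ID \
  --execution-role-arn arn:aws:iam::000000000000:role/emr-serverless-role \
  --job-driver '{
    "sparkSubmit": {
      "entryPoint": "s3://spark-demo-bucket/java-demo-1.0.jar",
      "sparkSubmitParameters": "--class HelloWorld"
    }
  }' \
  --configuration-overrides '{
    "monitoringConfiguration": {
      "s3MonitoringConfiguration": {"logUri": "s3://spark-demo-bucket/logs/"}
    }
  }'
```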
To stop the EMR Serverless application after the job completes:

```bash
awslocal emr-serverless stop-application --application-id $APPLICATION_ID
```

This code is available under the Apache 2.0 license.