Commit 1883f55

feat: adds k8s config options to Bytewax materialization engine (feast-dev#3518)
feat: adds k8s config options
Signed-off-by: adamschmidt <aschmidt1978@gmail.com>
1 parent ec08a55 commit 1883f55

2 files changed, with 41 additions and 2 deletions

docs/reference/batch-materialization/bytewax.md

Lines changed: 24 additions & 1 deletion
@@ -23,6 +23,8 @@ To configure secrets, first create them using `kubectl`:
 kubectl create secret generic -n bytewax aws-credentials --from-literal=aws-access-key-id='<access key id>' --from-literal=aws-secret-access-key='<secret access key>'
 ```
 
+If your Docker registry requires authentication to store/pull containers, you can use this same approach to store your repository access credential and use when running the materialization engine.
+
 Then configure them in the batch_engine section of `feature_store.yaml`:
 
 ``` yaml
@@ -40,6 +42,8 @@ batch_engine:
         secretKeyRef:
           name: aws-credentials
           key: aws-secret-access-key
+  image_pull_secrets:
+    - docker-repository-access-secret
 ```
 
 #### Configuration
@@ -51,9 +55,28 @@ batch_engine:
   type: bytewax
   namespace: bytewax
   image: bytewax/bytewax-feast:latest
+  image_pull_secrets:
+    - my_container_secret
+  service_account_name: my-k8s-service-account
+  annotations:
+    # example annotation you might include if running on AWS EKS
+    iam.amazonaws.com/role: arn:aws:iam::<account number>:role/MyBytewaxPlatformRole
+  resources:
+    limits:
+      cpu: 1000m
+      memory: 2048Mi
+    requests:
+      cpu: 500m
+      memory: 1024Mi
 ```
 
-The `namespace` configuration directive specifies which Kubernetes [namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/) jobs, services and configuration maps will be created in.
+**Notes:**
+
+* The `namespace` configuration directive specifies which Kubernetes [namespace](https://kubernetes.io/docs/concepts/overview/working-with-objects/namespaces/) jobs, services and configuration maps will be created in.
+* The `image_pull_secrets` configuration directive specifies the pre-configured secret to use when pulling the image container from your registry
+* The `service_account_name` specifies which Kubernetes service account to run the job under
+* `annotations` allows you to include additional Kubernetes annotations to the job. This is particularly useful for IAM roles which grant the running pod access to cloud platform resources (for example).
+* The `resources` configuration directive sets the standard Kubernetes [resource requests](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/) for the job containers to utilise when materializing data.
 
 #### Building a custom Bytewax Docker image
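Note on the `resources` values in the documentation change above: they use standard Kubernetes quantity notation, where a CPU suffix of `m` means millicores and a memory suffix of `Mi` means mebibytes (2^20 bytes). The sketch below is only an illustration of how to read those strings. The helper names `cpu_to_cores` and `memory_to_bytes` are hypothetical and are not part of Feast, Bytewax, or the Kubernetes client; only plain integers and the binary suffixes listed in the code are handled.

```python
# Hypothetical helpers (not part of Feast or Bytewax) for reading the
# Kubernetes quantity strings used in the `resources` example.

# Binary suffixes used for memory quantities; decimal suffixes are not handled.
_BINARY_SUFFIXES = {"Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40}


def cpu_to_cores(quantity: str) -> float:
    """Convert a CPU quantity such as '500m' or '2' to a number of cores."""
    if quantity.endswith("m"):
        return int(quantity[:-1]) / 1000  # 'm' means millicores
    return float(quantity)


def memory_to_bytes(quantity: str) -> int:
    """Convert a memory quantity such as '1024Mi' or '128' to bytes."""
    for suffix, factor in _BINARY_SUFFIXES.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: the value is already in bytes


# The values from the documentation example, written as a Python dict.
resources = {
    "limits": {"cpu": "1000m", "memory": "2048Mi"},
    "requests": {"cpu": "500m", "memory": "1024Mi"},
}

for kind, values in resources.items():
    cores = cpu_to_cores(values["cpu"])
    gib = memory_to_bytes(values["memory"]) / 2**30
    print(f"{kind}: {cores} cores, {gib} GiB")
```

Read this way, the example requests half a core and 1 GiB per materialization container and caps it at one core and 2 GiB.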

sdk/python/feast/infra/materialization/contrib/bytewax/bytewax_materialization_engine.py

Lines changed: 17 additions & 1 deletion
@@ -46,6 +46,17 @@ class BytewaxMaterializationEngineConfig(FeastConfigBaseModel):
     These environment variables can be used to reference Kubernetes secrets.
     """
 
+    image_pull_secrets: List[str] = []
+    """ (optional) The secrets to use when pulling the image to run for the materialization job """
+
+    resources: dict = {}
+    """ (optional) The resource requests and limits for the materialization containers """
+
+    service_account_name: StrictStr = ""
+    """ (optional) The service account name to use when running the job """
+
+    annotations: dict = {}
+    """ (optional) Annotations to apply to the job container. Useful for linking the service account to IAM roles, operational metadata, etc """
 
 class BytewaxMaterializationEngine(BatchMaterializationEngine):
     def __init__(
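Note on the new config fields: the remaining hunks of this file pass them unchanged into the pod template that `_create_job_definition` builds. The sketch below shows that mapping in isolation, using a plain dict in place of the config object. `build_pod_template`, `normalize_pull_secrets`, and the container name are hypothetical and do not appear in the commit. One assumption goes beyond the commit: the Kubernetes PodSpec defines `imagePullSecrets` as a list of objects with a `name` field, so the sketch wraps bare strings in that shape, whereas the commit forwards the configured list as-is.

```python
import json
from typing import Any, Dict, List, Union


def normalize_pull_secrets(
    secrets: List[Union[str, Dict[str, str]]]
) -> List[Dict[str, str]]:
    """Wrap bare secret names as {"name": ...}; pass dict entries through.

    Assumption of this sketch, not behaviour of the commit.
    """
    return [{"name": s} if isinstance(s, str) else s for s in secrets]


def build_pod_template(config: Dict[str, Any]) -> Dict[str, Any]:
    """Build only the parts of the pod template that this commit touches."""
    return {
        "metadata": {"annotations": config.get("annotations", {})},
        "spec": {
            "restartPolicy": "Never",
            "imagePullSecrets": normalize_pull_secrets(
                config.get("image_pull_secrets", [])
            ),
            "serviceAccountName": config.get("service_account_name", ""),
            "containers": [
                {
                    "name": "example",  # placeholder container name
                    "image": config["image"],
                    "resources": config.get("resources", {}),
                }
            ],
        },
    }


# Values taken from the documentation example in this commit.
config = {
    "image": "bytewax/bytewax-feast:latest",
    "image_pull_secrets": ["my_container_secret"],
    "service_account_name": "my-k8s-service-account",
    "annotations": {"example.com/owner": "data-platform"},  # made-up annotation
    "resources": {
        "limits": {"cpu": "1000m", "memory": "2048Mi"},
        "requests": {"cpu": "500m", "memory": "1024Mi"},
    },
}

template = build_pod_template(config)
print(json.dumps(template, indent=2))
```

When a field is omitted, the sketch falls back to the same empty defaults as the config class (`[]`, `{}`, `""`), so the template keys are always present.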
@@ -248,9 +259,14 @@ def _create_job_definition(self, job_id, namespace, pods, env):
                 "parallelism": pods,
                 "completionMode": "Indexed",
                 "template": {
+                    "metadata": {
+                        "annotations": self.batch_engine_config.annotations,
+                    },
                     "spec": {
                         "restartPolicy": "Never",
                         "subdomain": f"dataflow-{job_id}",
+                        "imagePullSecrets": self.batch_engine_config.image_pull_secrets,
+                        "serviceAccountName": self.batch_engine_config.service_account_name,
                         "initContainers": [
                             {
                                 "env": [
@@ -300,7 +316,7 @@ def _create_job_definition(self, job_id, namespace, pods, env):
                                         "protocol": "TCP",
                                     }
                                 ],
-                                "resources": {},
+                                "resources": self.batch_engine_config.resources,
                                 "securityContext": {
                                     "allowPrivilegeEscalation": False,
                                     "capabilities": {
