| 1 | +# Parallel Compilation |
| 2 | + |
| 3 | +Parallel compilation allows Feldera to compile multiple pipelines concurrently by distributing the workload across several compiler server pods. This reduces total compile time when many pipelines need to be compiled, especially in production environments. |
| 4 | + |
| 5 | +--- |
| 6 | + |
| 7 | +## How Parallel Compilation Works |
| 8 | + |
| 9 | +Feldera deploys the compiler server as a Kubernetes StatefulSet with **N** replicas. Each replica (pod) acts as a worker responsible for compiling a subset of pipelines. The assignment is deterministic: |
| 10 | + |
| 11 | +- Each pipeline is assigned to a worker using: |
| 12 | + `pipeline_id % N == worker_id` |
| 13 | + The worker ID is the pod index (e.g., `feldera-compiler-server-1` has worker ID `1`). |
| 14 | +- Each pod compiles only the pipelines assigned to it. |
| 15 | +- **Worker 0** (the pod with index 0) acts as the leader. All other workers transfer their compiled binaries to the leader over HTTP/HTTPS before marking the program as successfully compiled. |
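The assignment rule above can be sketched in a few lines of shell. This is only an illustration, not Feldera's implementation: the function names are hypothetical, and the sketch assumes the pipeline ID is available as a non-negative integer (how Feldera derives that integer is not described here).

```shell
# Hypothetical helpers illustrating the rule `pipeline_id % N == worker_id`.
# Assumption: pipeline IDs are non-negative integers.

# Extract the worker ID (the pod's ordinal index) from a pod name,
# e.g. "feldera-compiler-server-1" -> "1".
worker_id_from_pod() {
  printf '%s\n' "${1##*-}"
}

# Print the worker ID that owns a pipeline.
# $1 = pipeline ID, $2 = number of replicas (N).
assigned_worker() {
  echo $(( $1 % $2 ))
}

# Succeed if the given pod is the one that compiles the given pipeline.
# $1 = pod name, $2 = pipeline ID, $3 = number of replicas (N).
pod_compiles_pipeline() {
  [ "$(assigned_worker "$2" "$3")" -eq "$(worker_id_from_pod "$1")" ]
}
```

For example, with 3 replicas, `assigned_worker 7 3` prints `1`, so pipeline 7 would be compiled by `feldera-compiler-server-1`.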
| 16 | + |
| 17 | +To further accelerate builds, Feldera optionally supports [sccache](https://github.com/mozilla/sccache) with an S3-compatible backend. This allows workers to share compiled operator artifacts instead of rebuilding identical code. |
| 18 | + |
| 19 | +:::info |
| 20 | +Autoscaling based on workload is not yet supported. You must set the number of compiler server replicas at install time or scale them manually later. |
| 21 | +::: |
| 22 | + |
| 23 | +--- |
| 24 | + |
| 25 | +## Configuration |
| 26 | + |
| 27 | +### Enabling Parallel Compilation |
| 28 | + |
| 29 | +To enable parallel compilation with 3 compiler server replicas: |
| 30 | + |
| 31 | +**Via `values.yaml`** |
| 32 | + |
| 33 | +```yaml |
| 34 | +parallelCompilation: |
| 35 | +  # Enable parallel compilation features |
| 36 | +  enabled: true |
| 37 | +  # Number of compiler server replicas when parallel compilation is enabled |
| 38 | +  replicas: 3 |
| 39 | +  # ...other configuration |
| 40 | +``` |
| 41 | + |
| 42 | +**Via the Helm command line** |
| 43 | + |
| 44 | +```bash |
| 45 | +helm upgrade --install feldera \ |
| 46 | + oci://public.ecr.aws/feldera/feldera-chart --version "${VERSION}" \ |
| 47 | + --namespace feldera \ |
| 48 | + --set parallelCompilation.enabled=true \ |
| 49 | + --set parallelCompilation.replicas=3 \ |
| 50 | + # ...other configuration |
| 51 | +``` |
| 52 | + |
| 53 | +Once the release is deployed, you should see one compiler server pod per replica, for example: |
| 54 | + |
| 55 | +```bash |
| 56 | +kubectl get pods -n feldera |
| 57 | +``` |
| 58 | + |
| 59 | +``` |
| 60 | +NAME READY STATUS RESTARTS AGE |
| 61 | +feldera-compiler-server-0 1/1 Running 0 2m |
| 62 | +feldera-compiler-server-1 1/1 Running 0 2m |
| 63 | +feldera-compiler-server-2 1/1 Running 0 2m |
| 64 | +feldera-api-server-xxx 1/1 Running 0 2m |
| 65 | +feldera-kubernetes-runner-xxx 1/1 Running 0 2m |
| 66 | +feldera-db-0 1/1 Running 0 2m |
| 67 | +``` |
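To spot compiler server pods that are not yet running, the listing above can be filtered. The helper below is a hypothetical sketch: it assumes the default `kubectl get pods` column layout (name first, status third) and pod names ending in `compiler-server-<index>`, as in the example output.

```shell
# Hypothetical helper: reads `kubectl get pods` output on stdin and prints
# the names of compiler server pods whose STATUS column is not "Running".
# Assumes the default column order: NAME READY STATUS RESTARTS AGE.
not_running_compiler_pods() {
  awk '$1 ~ /compiler-server-[0-9]+$/ && $3 != "Running" { print $1 }'
}

# Usage against a live cluster (namespace taken from the example above):
#   kubectl get pods -n feldera | not_running_compiler_pods
```

Empty output means every compiler server pod reports `Running`. The check looks only at the STATUS column, so a pod that is `Running` but not ready is not flagged.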
| 68 | + |
| 69 | +--- |
| 70 | + |
| 71 | +### Setting Up sccache (Optional, Recommended) |
| 72 | + |
| 73 | +The compiler server has all dependencies precompiled in its target directory, so it only needs to compile the program generated from the pipeline SQL. |
| 74 | + |
| 75 | +With a single compiler server, sccache provides no benefit: the dependencies are already built, and because all pipelines are compiled in the same workspace, operators and other compiled artifacts are already shared. |
| 76 | + |
| 77 | +With multiple compiler servers, operators compiled on one server should be reusable by the others, and sccache makes that possible. |
| 78 | + |
| 79 | +Example: |
| 80 | +Pipeline A uses operator `xx` and is assigned to pod 0. Pod 0 builds `xx`. Later pipeline B, assigned to pod 1, also needs `xx`. Without sccache, pod 1 rebuilds `xx` from scratch. With sccache (S3/MinIO backend), pod 1 fetches the cached object files, avoiding a full rebuild. |
| 81 | + |
| 82 | +**1. Provision S3 Credentials** |
| 83 | + |
| 84 | +Use either IRSA (IAM Roles for Service Accounts) or a Kubernetes secret with S3 credentials. sccache uses these credentials to access the cache bucket. |
| 85 | + |
| 86 | +- **IRSA**: The compiler server checks for the `AWS_ROLE_ARN` and `AWS_WEB_IDENTITY_TOKEN_FILE` environment variables. |
| 87 | +- **Kubernetes Secret**: Create a secret containing your S3 access keys: |
| 88 | + |
| 89 | + ```bash |
| 90 | + kubectl create secret generic sccache-s3-secret -n feldera \ |
| 91 | + --from-literal=access_key_id="your-access-key" \ |
| 92 | + --from-literal=secret_access_key="your-secret-key" |
| 93 | + ``` |
| 94 | +The secret must define keys `access_key_id` and `secret_access_key`. You can configure the secret name in `values.yaml`. |
| 95 | + |
| 96 | + |
| 97 | +**2. Configure sccache in `values.yaml`** |
| 98 | + |
| 99 | +```yaml |
| 100 | +parallelCompilation: |
| 101 | +  enabled: true |
| 102 | +  replicas: 3 |
| 103 | +  # sccache configuration for sharing compilation artifacts between compiler servers |
| 104 | +  sccache: |
| 105 | +    # Enable sccache for compilation artifact caching (optional, recommended) |
| 106 | +    enabled: true |
| 107 | +    # S3 backend configuration for sccache |
| 108 | +    s3: |
| 109 | +      # S3 bucket name for cache storage |
| 110 | +      bucket: "sccache-bucket" |
| 111 | +      # Use SSL for S3 connections |
| 112 | +      # Set to true to use HTTPS/TLS |
| 113 | +      useSSL: false |
| 114 | +      # Key prefix for cache objects used by sccache |
| 115 | +      keyPrefix: "sccache" |
| 116 | +      # AWS region of the bucket |
| 117 | +      region: "us-east-1" |
| 118 | +      # Custom URL (<ip>:<port>) of a server you want to use, such as MinIO. |
| 119 | +      # Defaults to ${BUCKET}.s3-{REGION}.amazonaws.com for AWS S3 if not set. |
| 120 | +      # endpoint: "minio.extra.svc.cluster.local:9000" |
| 121 | +      # |
| 122 | +      # Server-side encryption (optional) |
| 123 | +      # serverSideEncryption: false |
| 124 | +      # |
| 125 | +      # Existing secret containing S3 credentials |
| 126 | +      # The secret must have keys: access_key_id and secret_access_key |
| 127 | +      # If IRSA is set up, you don't need to specify existingSecret; |
| 128 | +      # credentials are configured automatically via the AWS_ROLE_ARN and |
| 129 | +      # AWS_WEB_IDENTITY_TOKEN_FILE environment variables. |
| 130 | +      # existingSecret: "sccache-s3-secret" |
| 131 | +``` |
| 132 | + |
| 133 | + |
| 134 | +--- |
| 135 | + |
| 136 | +## Troubleshooting & FAQs |
| 137 | + |
| 138 | +- **Resource requirements:** |
| 139 | + |
| 140 | +Ensure your cluster nodes have enough resources to run the desired number of compiler server replicas. |
| 141 | + |
| 142 | +- **Pipeline stuck in a status:** |
| 143 | + |
| 144 | +If a pipeline is assigned to a worker pod that is not yet running or is unhealthy, it will not be compiled until that pod is up and running. Verify that all compiler server pods are running. |
| 145 | + |
| 146 | +- **SystemError: Failed to upload binary:** |
| 147 | + |
| 148 | +If a pipeline reports this error, a worker pod failed to upload its binary to `<_>-compiler-server-0`. |
| 149 | + |
| 150 | +You can check whether `<_>-compiler-server-0` is healthy using the `/cluster_healthz` endpoint. Adjust the binary upload configuration to your needs: for example, if an upgrade takes a while, set the retries and backoff interval so that pods have time to come up and receive the binary. |
| 151 | + |
| 152 | +- **error: process didn't exit successfully: `sccache .. rustc -vV`:** |
| 153 | + |
| 154 | +Check the `Errors` tab in the web console (enable `Verbatim errors` if required) for the full error explaining why sccache failed. |
| 155 | + |
| 156 | +Common causes include a misconfigured S3 bucket, endpoint, or credentials. |
| 157 | + |
| 158 | +- **Scaling with kubectl:** If you scale the compiler server StatefulSet using `kubectl` without restarting, the compiler server detects the change and panics with `SCALING DETECTED: StatefulSet has X replicas but compiler was started with Y workers`. The panic triggers a restart, which ensures correct work distribution. |
| 159 | + |
| 160 | +--- |
| 161 | + |
| 162 | +## Configuration Options Reference |
| 163 | + |
| 164 | +| Key | Description | Default/Example | |
| 165 | +|-----|-------------|-----------------| |
| 166 | +| `parallelCompilation.enabled` | Enable parallel compilation | `false` (ex: `true`) | |
| 167 | +| `parallelCompilation.replicas` | Number of compiler server pods | `1` (ex: `3`) | |
| 168 | +| `parallelCompilation.sccache.enabled` | Enable sccache build cache | `false` (ex: `true`) | |
| 169 | +| `parallelCompilation.sccache.s3.bucket` | S3/MinIO bucket for cache | `"sccache-bucket"` (ex: `"feldera-sccache"`) | |
| 170 | +| `parallelCompilation.sccache.s3.useSSL` | Use SSL/TLS for S3/MinIO | `false` (ex: `true`) | |
| 171 | +| `parallelCompilation.sccache.s3.region` | Bucket region | `"us-east-1"` (ex: `"us-east-1"`) | |
| 172 | +| `parallelCompilation.sccache.s3.keyPrefix` | Cache object key prefix | `"sccache"` (ex: `"sccache"` or `""`) | |
| 173 | + |
| 174 | +**Optional Configurations** |
| 175 | + |
| 176 | +| Key | Description | Example | |
| 177 | +|-----|-------------|---------| |
| 178 | +| `parallelCompilation.sccache.s3.existingSecret` | Secret for S3 credentials (omit if using IRSA) | `sccache-s3-secret` | |
| 179 | +| `parallelCompilation.sccache.s3.serverSideEncryption` | Enable server-side encryption with S3-managed keys (SSE-S3) | `false` | |
| 180 | +| `parallelCompilation.sccache.s3.endpoint` | Custom endpoint (e.g. MinIO) | `minio.mydomain.com:9000` | |
| 181 | + |
| 182 | +--- |