- [Step 1: Fetch features for batch scoring (method 1)](#step-1-fetch-features-for-batch-scoring-method-1)
- [Step 2: Fetch features for batch scoring (method 2)](#step-2-fetch-features-for-batch-scoring-method-2)
- [Step 3 (optional): Scaling `get_historical_features` to large datasets](#step-3-optional-scaling-get_historical_features-to-large-datasets)
- [User group 3: Data Scientists](#user-group-3-data-scientists)
- [Conclusion](#conclusion)
- [FAQ](#faq)
Data scientists or ML engineers can use the defined `FeatureService` (corresponding to model versions) and schedule regular jobs that generate batch predictions (or regularly retrain).
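As a rough illustration of such a scheduled batch-prediction job, the sketch below fakes the feature-retrieval step with a hard-coded pandas merge; in a real Feast repo you would instead call `store.get_historical_features(...)` with the entity dataframe and the `FeatureService` for your model version. The `fetch_batch_features` helper, the `trips_today` feature, and the threshold "model" are all hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for fetching features via a FeatureService;
# a real job would call the feature store here instead of merging
# a hand-built frame.
def fetch_batch_features(entity_df: pd.DataFrame) -> pd.DataFrame:
    features = pd.DataFrame(
        {"driver_id": entity_df["driver_id"], "trips_today": [3, 7]}
    )
    return entity_df.merge(features, on="driver_id")

def batch_predict(df: pd.DataFrame) -> pd.Series:
    # Placeholder "model": a real job would load a trained model artifact.
    return (df["trips_today"] > 5).astype(int)

# The scheduled job: build the entity frame, fetch features, score, persist.
entity_df = pd.DataFrame(
    {
        "driver_id": [1001, 1002],
        "event_timestamp": pd.to_datetime(["2022-01-01"] * 2),
    }
)
scored = fetch_batch_features(entity_df)
scored["prediction"] = batch_predict(scored)
```

The same loop works for regular retraining: swap the scoring step for a training step and write out a new model artifact instead of predictions.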
### Step 3 (optional): Scaling `get_historical_features` to large datasets
You may note that the above example uses the `to_df()` method to load the training dataset into memory, and may wonder how this scales if you have very large datasets.
`get_historical_features` actually returns a `RetrievalJob` object that lazily executes the point-in-time join. The `RetrievalJob` class is extended by each offline store to allow flushing results to, e.g., a data warehouse or data lake.
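The deferred-execution idea can be sketched in a few lines. This is a toy class, not Feast's actual `RetrievalJob`: the point here is only that constructing the job does no work, and the join runs when a materialization method like `to_df()` (or a storage-flushing method) is finally called.

```python
import pandas as pd

calls = {"n": 0}  # counts how many times the "join" actually runs

def run_join() -> pd.DataFrame:
    # Stand-in for the point-in-time join an offline store would execute.
    calls["n"] += 1
    return pd.DataFrame({"driver_id": [1001], "trips_today": [3]})

class LazyRetrievalJob:
    """Toy sketch of the lazy-execution pattern: nothing runs
    until a materialization method is called."""

    def __init__(self, query_fn):
        self._query_fn = query_fn  # deferred work, not yet executed

    def to_df(self) -> pd.DataFrame:
        # Materializes the full result into local memory.
        return self._query_fn()

    def persist(self, path: str) -> None:
        # Flushes results to storage instead of local memory,
        # analogous to offline-store-specific sink methods.
        self._query_fn().to_parquet(path)

job = LazyRetrievalJob(run_join)  # no join has executed yet
assert calls["n"] == 0
df = job.to_df()                  # the join runs here, on demand
```

Because execution is deferred, a store-specific subclass can skip `to_df()` entirely and write the joined result straight to cheap storage, which is what makes very large training datasets tractable.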