Commit 4aeae0f

Add more description to notebook in module 2
Signed-off-by: Danny Chiao <danny@tecton.ai>
1 parent 1ef3e19 commit 4aeae0f

File tree

1 file changed: +109 −9 lines

module_2/client/module_2_client.ipynb

Lines changed: 109 additions & 9 deletions
@@ -7,6 +7,26 @@
 "# Retrieving on demand features"
 ]
 },
+{
+"cell_type": "code",
+"execution_count": 1,
+"metadata": {},
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"Requirement already satisfied: Pygments in /Users/dannychiao/.pyenv/versions/3.8.10/envs/python-3.8/lib/python3.8/site-packages (2.11.2)\n",
+"\u001b[33mWARNING: You are using pip version 22.0.3; however, version 22.1 is available.\n",
+"You should consider upgrading via the '/Users/dannychiao/.pyenv/versions/3.8.10/envs/python-3.8/bin/python -m pip install --upgrade pip' command.\u001b[0m\u001b[33m\n",
+"\u001b[0mNote: you may need to restart the kernel to use updated packages.\n"
+]
+}
+],
+"source": [
+"%pip install Pygments"
+]
+},
 {
 "cell_type": "markdown",
 "metadata": {},
@@ -16,7 +36,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 1,
+"execution_count": 2,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -27,7 +47,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 2,
+"execution_count": 3,
 "metadata": {},
 "outputs": [],
 "source": [
@@ -46,12 +66,51 @@
 "metadata": {},
 "source": [
 "### model_v2 feature service\n",
-"This one leverages dummy `val_to_add` and `val_to_add_2` request data. Request data must be passed in as part of the `entity_df`. This may come from the same source that includes your labels for your model."
+"This one leverages dummy `val_to_add` and `val_to_add_2` request data. Request data must be passed in as part of the `entity_df`. This may come from the same source that includes your labels for your model.\n",
+"\n",
+"A quick reminder of the on demand feature view (ODFV) being used in this feature service:"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 12,
+"metadata": {},
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"@on_demand_feature_view(\n",
+"    sources=[driver_hourly_stats_view, val_to_add_request],\n",
+"    schema=[\n",
+"        Field(name=\"conv_rate_plus_val1\", dtype=Float64),\n",
+"        Field(name=\"conv_rate_plus_val2\", dtype=Float64),\n",
+"    ],\n",
+")\n",
+"def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:\n",
+"    df = pd.DataFrame()\n",
+"    df[\"conv_rate_plus_val1\"] = inputs[\"conv_rate\"] + inputs[\"val_to_add\"]\n",
+"    df[\"conv_rate_plus_val2\"] = inputs[\"conv_rate\"] + inputs[\"val_to_add_2\"]\n",
+"    return df\n",
+"\n"
+]
+}
+],
+"source": [
+"import dill\n",
+"print(dill.source.getsource(store.get_on_demand_feature_view(\"transformed_conv_rate\").udf))"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Now let's retrieve historical features from this feature service. The transformation will happen on the fly after the point-in-time join to produce the `conv_rate_plus_val1` and `conv_rate_plus_val2` features."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 4,
+"execution_count": 30,
 "metadata": {},
 "outputs": [
 {
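The `transformed_conv_rate` UDF added in this hunk is plain pandas, so its logic can be exercised outside Feast. A minimal sketch, assuming made-up input values (the column names come from the diff above; `conv_rate` would normally come from the feature store, the `val_to_add` columns from request data):

```python
import pandas as pd

# Mirror of the transformed_conv_rate UDF body shown in the diff: it adds
# the request-data values to the precomputed conv_rate feature.
def transformed_conv_rate(inputs: pd.DataFrame) -> pd.DataFrame:
    df = pd.DataFrame()
    df["conv_rate_plus_val1"] = inputs["conv_rate"] + inputs["val_to_add"]
    df["conv_rate_plus_val2"] = inputs["conv_rate"] + inputs["val_to_add_2"]
    return df

# Illustrative inputs (values are made up): in the notebook these columns
# arrive joined together in the entity_df passed to get_historical_features.
inputs = pd.DataFrame(
    {"conv_rate": [0.5, 0.7], "val_to_add": [1, 2], "val_to_add_2": [10, 20]}
)
out = transformed_conv_rate(inputs)
print(out)  # conv_rate_plus_val1: 1.5, 2.7; conv_rate_plus_val2: 10.5, 20.7
```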
@@ -98,12 +157,53 @@
 "metadata": {},
 "source": [
 "### model_v3 feature service\n",
-"This one generates geohash features"
+"This one generates geohash features from latitudes and longitudes. This is useful for generating features relating to geographic regions (e.g. a geohash of `qfb9c3mw8hte` represents a sub-region within the region represented by a `qfb` geohash).\n",
+"\n",
+"Let's look at the on demand feature view used in this feature service:"
+]
+},
+{
+"cell_type": "code",
+"execution_count": 13,
+"metadata": {},
+"outputs": [
+{
+"name": "stdout",
+"output_type": "stream",
+"text": [
+"@on_demand_feature_view(\n",
+"    sources=[driver_daily_features_view],\n",
+"    schema=[Field(name=f\"geohash_{i}\", dtype=String) for i in range(1, 7)],\n",
+")\n",
+"def location_features_from_push(inputs: pd.DataFrame) -> pd.DataFrame:\n",
+"    import pygeohash as gh\n",
+"\n",
+"    df = pd.DataFrame()\n",
+"    df[\"geohash\"] = inputs.apply(lambda x: gh.encode(x.lat, x.lon), axis=1).astype(\n",
+"        \"string\"\n",
+"    )\n",
+"\n",
+"    for i in range(1, 7):\n",
+"        df[f\"geohash_{i}\"] = df[\"geohash\"].str[:i].astype(\"string\")\n",
+"    return df\n",
+"\n"
+]
+}
+],
+"source": [
+"print(dill.source.getsource(store.get_on_demand_feature_view(\"location_features_from_push\").udf))"
+]
+},
+{
+"cell_type": "markdown",
+"metadata": {},
+"source": [
+"Now we retrieve features. This will compute the `geohash_X` features."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 6,
+"execution_count": 31,
 "metadata": {},
 "outputs": [
 {
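The truncation loop in `location_features_from_push` is independent of `pygeohash`: given a full geohash, it just takes prefixes of length 1 through 6. A standalone sketch, with a hardcoded example hash standing in for the `gh.encode(lat, lon)` call (the hash value is the one used illustratively in the markdown cell):

```python
import pandas as pd

# Stand-in for gh.encode(lat, lon): a fixed 12-character geohash string.
# In the real UDF this comes from pygeohash; hardcoding it lets the prefix
# logic run without that dependency.
df = pd.DataFrame({"geohash": ["qfb9c3mw8hte"]}).astype("string")

# Each geohash_i is the first i characters of the full hash; shorter
# prefixes cover progressively larger geographic regions.
for i in range(1, 7):
    df[f"geohash_{i}"] = df["geohash"].str[:i].astype("string")

print(df.iloc[0].tolist())
```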
@@ -162,12 +262,12 @@
 "metadata": {},
 "source": [
 "### model_v2 feature service\n",
-"This one leverages dummy `val_to_add` and `val_to_add_2` request data so this is passed into the `entity_rows` parameter"
+"This one leverages dummy `val_to_add` and `val_to_add_2` request data, so this is passed into the `entity_rows` parameter. The on demand transformation is executed on the fly, combining this request data with the pre-computed `conv_rate` feature from the online store."
 ]
 },
 {
 "cell_type": "code",
-"execution_count": 9,
+"execution_count": 32,
 "metadata": {},
 "outputs": [
 {
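At serving time the same arithmetic runs per request. A sketch of that combination with made-up values (the feature store call itself is omitted; `entity_row` and `conv_rate` are illustrative stand-ins for what `entity_rows` carries and what the online store would return):

```python
# Illustrative online-serving view: conv_rate plays the role of the
# precomputed feature read from the online store for this driver, while
# val_to_add / val_to_add_2 arrive with the request in entity_rows.
entity_row = {"driver_id": 1001, "val_to_add": 1000, "val_to_add_2": 2000}
conv_rate = 0.25  # made-up value standing in for the stored feature

# The on demand transformation then derives the output features on the fly.
features = {
    "conv_rate_plus_val1": conv_rate + entity_row["val_to_add"],
    "conv_rate_plus_val2": conv_rate + entity_row["val_to_add_2"],
}
print(features)  # {'conv_rate_plus_val1': 1000.25, 'conv_rate_plus_val2': 2000.25}
```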
@@ -202,7 +302,7 @@
 },
 {
 "cell_type": "code",
-"execution_count": 11,
+"execution_count": 33,
 "metadata": {},
 "outputs": [
 {

0 commit comments