Commit 9df3448: Update CONTRIBUTING doc
1 parent b9bb22a

File tree

7 files changed: +277 -13 lines changed


CONTRIBUTING.md

Lines changed: 259 additions & 2 deletions
@@ -1,5 +1,262 @@
# Contributing Guide

## Getting Started

The following guide will help you quickly run Feast on your local machine.

The main components of Feast are:

- **Feast Core** handles FeatureSpec registration, starts and monitors Ingestion
  jobs, and ensures that Feast's internal metadata stays consistent.
- **Feast Ingestion** subscribes to streams of FeatureRow and writes the feature
  values to the registered Stores.
- **Feast Serving** handles feature value retrieval requests from end users.

![Feast Components Overview](docs/assets/feast-components-overview.png)

**Pre-requisites**

- Java SDK version 8
- Python version 3.6 (or above) and pip
- Access to a Postgres database (version 11 or above)
- Access to a [Redis](https://redis.io/topics/quickstart) instance (tested on version 5.x)
- Access to [Kafka](https://kafka.apache.org/) brokers (tested on version 2.x)
- [Maven](https://maven.apache.org/install.html) version 3.6.x
- [grpc_cli](https://github.com/grpc/grpc/blob/master/doc/command_line_tool.md),
  which is useful for debugging and quick testing
- An overview of Feast specifications and [protos](./protos/feast)

> **Assumptions:**
>
> 1. Postgres is running on "localhost:5432" and has a database called "postgres" that
>    can be accessed with user "postgres" and password "password".
>    To use a different database name or credentials, update
>    "$FEAST_HOME/core/src/main/resources/application.yml"
>    or set these environment variables: DB_HOST, DB_USERNAME, DB_PASSWORD.
> 2. Redis is running locally and accessible from "localhost:6379".
> 3. Feast has admin access to BigQuery.
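To point Feast Core at a different Postgres instance without editing application.yml, the environment variables named in the assumptions above can be exported before starting Core. A minimal sketch; the values shown are the defaults from the assumptions, and whether DB_HOST should include the port depends on your application.yml, so treat these as placeholders:

```sh
# Variable names come from the assumptions above; adjust values to your setup.
export DB_HOST=localhost
export DB_USERNAME=postgres
export DB_PASSWORD=password
```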
```
# Clone the Feast 0.3-dev branch
# $FEAST_HOME refers to the root directory of this Feast Git repository

git clone -b 0.3-dev https://github.com/gojek/feast
cd feast
```
#### Starting Feast Core

```
cd $FEAST_HOME/core

# Check the default configuration for Feast Core in
# "$FEAST_HOME/core/src/main/resources/application.yml" and update it accordingly.
#
# Start the Feast Core gRPC server on localhost:6565
mvn spring-boot:run

# If Feast Core starts successfully, verify that the Stores are registered
# correctly, for example by using grpc_cli.
grpc_cli call localhost:6565 GetStores ''

# This should return something similar to the following.
# Note that you should change the BigQuery projectId and datasetId accordingly
# in "$FEAST_HOME/core/src/main/resources/application.yml"

store {
  name: "SERVING"
  type: REDIS
  subscriptions {
    name: ".*"
    version: ">0"
  }
  redis_config {
    host: "localhost"
    port: 6379
  }
}
store {
  name: "WAREHOUSE"
  type: BIGQUERY
  subscriptions {
    name: ".*"
    version: ">0"
  }
  bigquery_config {
    project_id: "my-google-project-id"
    dataset_id: "my-bigquery-dataset-id"
  }
}
```
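Each subscription above pairs a name regular expression with a version predicate. A small Python illustration of the matching semantics implied by `name: ".*"` and `version: ">0"`; the interpretation here is an assumption for illustration, not Feast's actual matching code:

```python
import re

def subscribed(name_pattern: str, version_pred: str,
               fs_name: str, fs_version: int) -> bool:
    # name is treated as a regex over feature set names;
    # ">0" is read as "any version greater than 0" (assumed semantics).
    name_ok = re.fullmatch(name_pattern, fs_name) is not None
    version_ok = fs_version > int(version_pred.lstrip(">"))
    return name_ok and version_ok

print(subscribed(".*", ">0", "driver", 1))     # True: both stores pick up "driver:1"
print(subscribed("driver", ">0", "rider", 1))  # False: name regex does not match
```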
#### Starting Feast Serving

Feast Serving requires administrators to provide an **existing** store name in Feast.
An instance of Feast Serving can only retrieve features from a **single** store.

> In order to retrieve features from multiple stores you must start **multiple**
> instances of Feast Serving. If you start several of them on a single host,
> make sure that they listen on different ports.

```
cd $FEAST_HOME/serving

# Start the Feast Serving gRPC server on localhost:6566 with store name "SERVING"
mvn spring-boot:run -Dspring-boot.run.arguments='--feast.store-name=SERVING'

# To verify that Feast Serving started successfully
grpc_cli call localhost:6566 GetFeastServingType ''

# This should return something similar to the following.
type: FEAST_SERVING_TYPE_ONLINE
```
#### Registering a FeatureSet

Create a new FeatureSet on Feast by sending a request to Feast Core. When a
feature set is successfully registered, Feast Core will start an **ingestion** job
that listens for new features in the FeatureSet. Note that Feast currently only
supports sources of type "KAFKA", so you must have access to a running Kafka broker
to register a FeatureSet successfully.

```
# Example of registering a new driver feature set.
# Note the source value; it assumes that you have access to a Kafka broker
# running on localhost:9092

grpc_cli call localhost:6565 ApplyFeatureSet '
feature_set {
  name: "driver"
  version: 1

  entities {
    name: "driver_id"
    value_type: INT64
  }

  features {
    name: "city"
    value_type: STRING
  }

  source {
    type: KAFKA
    kafka_source_config {
      bootstrap_servers: "localhost:9092"
    }
  }
}
'

# Check that the FeatureSet has been registered correctly.
# You should also see logs from Feast Core showing that the ingestion job started
grpc_cli call localhost:6565 GetFeatureSets ''
```
#### Ingestion and Population of Feature Values

```
# Install the Python SDK, which helps with writing FeatureRow messages to Kafka
cd $FEAST_HOME/sdk/python
pip3 install -e .
pip3 install pendulum

# Produce FeatureRow messages to Kafka so they will be ingested by Feast
# and written to the registered stores.
# Make sure the topic used below is the one assigned to the feature set
# ... producer.send("feast-driver-features" ...)
python3 - <<EOF
import logging

import pendulum
from google.protobuf.timestamp_pb2 import Timestamp
from kafka import KafkaProducer

from feast.types.FeatureRow_pb2 import FeatureRow
from feast.types.Field_pb2 import Field
from feast.types.Value_pb2 import Value

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

producer = KafkaProducer(bootstrap_servers="localhost:9092")

row = FeatureRow()

fields = [
    Field(name="driver_id", value=Value(int64_val=1234)),
    Field(name="city", value=Value(string_val="JAKARTA")),
]
row.fields.MergeFrom(fields)

timestamp = Timestamp()
timestamp.FromJsonString(pendulum.now("UTC").to_iso8601_string())
row.event_timestamp.CopyFrom(timestamp)

# The format is [FEATURE_SET_NAME]:[VERSION]
row.feature_set = "driver:1"

producer.send("feast-driver-features", row.SerializeToString())
producer.flush()
logger.info(row)
EOF

# Check that the ingested feature rows can be retrieved from Feast Serving
grpc_cli call localhost:6566 GetOnlineFeatures '
feature_sets {
  name: "driver"
  version: 1
}
entity_dataset {
  entity_names: "driver_id"
  entity_dataset_rows {
    entity_ids {
      int64_val: 1234
    }
  }
}
'
```
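Two string conventions appear in the ingestion example above: the feature set reference written into each row (the [FEATURE_SET_NAME]:[VERSION] format) and the Kafka topic name. The topic pattern is inferred from the single example shown, so treat it as an assumption rather than a documented rule:

```python
def feature_set_ref(name: str, version: int) -> str:
    # The value assigned to row.feature_set above, e.g. "driver:1"
    return f"{name}:{version}"

def feature_set_topic(name: str) -> str:
    # Assumed convention from the example: "feast-<feature set name>-features"
    return f"feast-{name}-features"

print(feature_set_ref("driver", 1))   # driver:1
print(feature_set_topic("driver"))    # feast-driver-features
```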
#### Tips for quickly running Postgres, Redis and Kafka locally with Docker

This guide assumes that the Docker service is using a bridge network (which
is usually the case if you are running Linux). Otherwise, you may need to
use different network options than shown below.

> `--net host` usually only works as expected when the Docker
> service is running in bridge networking mode.

```
# Start Postgres
docker run --name postgres --rm -it -d --net host -e POSTGRES_DB=postgres -e POSTGRES_USER=postgres \
  -e POSTGRES_PASSWORD=password postgres:12-alpine

# Start Redis
docker run --name redis --rm -it --net host -d redis:5-alpine

# Start Zookeeper (needed by Kafka)
docker run --rm \
  --net=host \
  --name=zookeeper \
  --env=ZOOKEEPER_CLIENT_PORT=2181 \
  --detach confluentinc/cp-zookeeper:5.2.1

# Start Kafka
docker run --rm \
  --net=host \
  --name=kafka \
  --env=KAFKA_ZOOKEEPER_CONNECT=localhost:2181 \
  --env=KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
  --env=KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
  --detach confluentinc/cp-kafka:5.2.1
```
## Code reviews

Code submission to Feast (including submission from project maintainers) requires review and approval.

@@ -11,9 +268,9 @@ Please submit a **pull request** to initiate the code review process. We use [pr

We conform to the [java google style guide](https://google.github.io/styleguide/javaguide.html)

- If using intellij please import the code styles:
+ If using Intellij please import the code styles:
https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml

### Go

- Make sure you apply `go fmt`.
+ Make sure you apply `go fmt`.

README.md

Lines changed: 1 addition & 1 deletion

@@ -13,7 +13,7 @@ It aims to:

## High Level Architecture

- ![Feast Architecture](arch.png)
+ ![Feast Architecture](docs/assets/arch.png)

The Feast platform is broken down into the following functional areas:

core/src/main/resources/application.yml

Lines changed: 6 additions & 1 deletion

@@ -41,7 +41,12 @@ feast:
      # Warehouse store type. For more information on the available types
      # see Store.proto. To provision feast without a warehouse store,
      # leave this value blank.
-     warehouse-type: ""
+     warehouse-type: BIGQUERY
+     warehouse-options:
+       projectId: my-google-project-id
+       datasetId: my-bigquery-dataset-id
+       # Same as feast.store.serving-options.subscriptions
+       subscriptions: ".*:>0"
  jobs:
    # Runner type for feature population jobs. Currently supported runner types are
    # DirectRunner and DataflowRunner.
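The subscriptions value (`.*:>0`) flattens a name regex and a version predicate into a single string. A hypothetical Python sketch of splitting it back apart; the colon encoding is an assumption inferred from the value shown, not documented parser behavior:

```python
def parse_subscription(sub: str):
    # ".*:>0" -> name regex ".*" and version predicate ">0".
    # Splitting on the last colon is an assumption about the encoding.
    name_pattern, version_pred = sub.rsplit(":", 1)
    return name_pattern, version_pred

print(parse_subscription(".*:>0"))  # ('.*', '>0')
```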
File renamed without changes (80.9 KB)

serving/src/main/java/feast/serving/configuration/ServingServiceConfig.java

Lines changed: 7 additions & 7 deletions

@@ -35,13 +35,6 @@ public class ServingServiceConfig {
  public ServingServiceConfig(FeastProperties feastProperties) {
    feastStoreName = feastProperties.getStoreName();
    jobStagingLocation = feastProperties.getJobStagingLocation();
-   if (!jobStagingLocation.contains("://")) {
-     throw new IllegalArgumentException(
-         String.format("jobStagingLocation is not a valid URI: %s", jobStagingLocation));
-   }
-   if (jobStagingLocation.endsWith("/")) {
-     jobStagingLocation = jobStagingLocation.substring(0, jobStagingLocation.length() - 1);
-   }
  }

  @Bean
@@ -91,6 +84,13 @@ public ServingService servingService(
    BigQueryConfig bqConfig = store.getBigqueryConfig();
    BigQuery bigquery = BigQueryOptions.getDefaultInstance().getService();
    Storage storage = StorageOptions.getDefaultInstance().getService();
+   if (!jobStagingLocation.contains("://")) {
+     throw new IllegalArgumentException(
+         String.format("jobStagingLocation is not a valid URI: %s", jobStagingLocation));
+   }
+   if (jobStagingLocation.endsWith("/")) {
+     jobStagingLocation = jobStagingLocation.substring(0, jobStagingLocation.length() - 1);
+   }
    if (!this.jobStagingLocation.startsWith("gs://")) {
      throw new IllegalArgumentException(
          "Store type BIGQUERY requires job staging location to be a valid and existing Google Cloud Storage URI. Invalid staging location: "

serving/src/main/java/feast/serving/service/RedisServingService.java

Lines changed: 4 additions & 2 deletions

@@ -24,6 +24,7 @@
  import feast.core.CoreServiceProto.GetFeatureSetsRequest.Filter;
  import feast.core.FeatureSetProto.EntitySpec;
  import feast.core.FeatureSetProto.FeatureSetSpec;
+ import feast.serving.ServingAPIProto.FeastServingType;
  import feast.serving.ServingAPIProto.GetBatchFeaturesResponse;
  import feast.serving.ServingAPIProto.GetFeastServingTypeRequest;
  import feast.serving.ServingAPIProto.GetFeastServingTypeResponse;
@@ -72,8 +73,9 @@ public RedisServingService(JedisPool jedisPool, SpecService specService, Tracer
  @Override
  public GetFeastServingTypeResponse getFeastServingType(
      GetFeastServingTypeRequest getFeastServingTypeRequest) {
-   // return GetFeastServingTypeResponse.newBuilder().setType().build();
-   return null;
+   return GetFeastServingTypeResponse.newBuilder()
+       .setType(FeastServingType.FEAST_SERVING_TYPE_ONLINE)
+       .build();
  }

  /** {@inheritDoc} */
