# Contributing Guide

## Getting Started

The following guide will help you quickly run Feast on your local machine.

The main components of Feast are:
- **Feast Core** handles FeatureSpec registration, starts and monitors ingestion
jobs, and ensures that Feast internal metadata is consistent.
- **Feast Ingestion** subscribes to streams of FeatureRow and writes the feature
values to registered stores.
- **Feast Serving** handles feature retrieval requests from end users.

![Feast Components Overview](docs/assets/feast-components-overview.png)

**Prerequisites**
- Java SDK version 8
- Python version 3.6 (or above) and pip
- Access to a Postgres database (version 11 or above)
- Access to a [Redis](https://redis.io/topics/quickstart) instance (tested on version 5.x)
- Access to [Kafka](https://kafka.apache.org/) brokers (tested on version 2.x)
- [Maven](https://maven.apache.org/install.html) version 3.6.x
- [grpc_cli](https://github.com/grpc/grpc/blob/master/doc/command_line_tool.md),
which is useful for debugging and quick testing
- An overview of Feast specifications and [protos](./protos/feast)
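
Throughout this guide, grpc_cli is used to call Feast services directly. As a quick sanity check, it can also list the services a running server exposes. This relies on gRPC server reflection, which may or may not be enabled in your build:

```
# List the gRPC services exposed by Feast Core once it is running
# (see "Starting Feast Core" below). Requires server reflection.
grpc_cli ls localhost:6565
```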

> **Assumptions:**
>
> 1. Postgres is running on "localhost:5432" and has a database called "postgres" which
> can be accessed with the credentials user "postgres" and password "password".
> To use a different database name and credentials, please update
> "$FEAST_HOME/core/src/main/resources/application.yml"
> or set these environment variables: DB_HOST, DB_USERNAME, DB_PASSWORD.
> 2. Redis is running locally and accessible from "localhost:6379".
> 3. Feast has admin access to BigQuery.

```
# Clone the Feast 0.3-dev branch.
# $FEAST_HOME refers to the root directory of this Feast Git repository.

git clone -b 0.3-dev https://github.com/gojek/feast
cd feast
```

#### Starting Feast Core

```
cd $FEAST_HOME/core

# Please check the default configuration for Feast Core in
# "$FEAST_HOME/core/src/main/resources/application.yml" and update it accordingly.
#
# Start the Feast Core gRPC server on localhost:6565
mvn spring-boot:run

# If Feast Core starts successfully, verify that the correct stores are
# registered, for example by using grpc_cli.
grpc_cli call localhost:6565 GetStores ''

# This should return something similar to the following.
# Note that you should update the BigQuery project_id and dataset_id accordingly
# in "$FEAST_HOME/core/src/main/resources/application.yml".

store {
  name: "SERVING"
  type: REDIS
  subscriptions {
    name: ".*"
    version: ">0"
  }
  redis_config {
    host: "localhost"
    port: 6379
  }
}
store {
  name: "WAREHOUSE"
  type: BIGQUERY
  subscriptions {
    name: ".*"
    version: ">0"
  }
  bigquery_config {
    project_id: "my-google-project-id"
    dataset_id: "my-bigquery-dataset-id"
  }
}
```

#### Starting Feast Serving

Feast Serving requires administrators to provide the name of an **existing** store in Feast.
An instance of Feast Serving can only retrieve features from a **single** store.
> In order to retrieve features from multiple stores you must start **multiple**
instances of Feast Serving. If you start multiple Feast Serving instances on a
single host, make sure that they are listening on different ports.

```
cd $FEAST_HOME/serving

# Start the Feast Serving gRPC server on localhost:6566 with store name "SERVING"
mvn spring-boot:run -Dspring-boot.run.arguments='--feast.store-name=SERVING'

# Verify that Feast Serving started successfully
grpc_cli call localhost:6566 GetFeastServingType ''

# This should return something similar to the following.
type: FEAST_SERVING_TYPE_ONLINE
```

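For example, a second instance serving the "WAREHOUSE" store could be started on another port. The `--feast.store-name` argument follows the pattern above; the property controlling the listen port is an assumption here, so verify the actual key in `$FEAST_HOME/serving/src/main/resources/application.yml` for your version:

```
cd $FEAST_HOME/serving

# Hypothetical second instance for the "WAREHOUSE" store on port 6567.
# "--grpc.port" is an assumed property name; check application.yml for the real key.
mvn spring-boot:run -Dspring-boot.run.arguments='--feast.store-name=WAREHOUSE --grpc.port=6567'
```
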
#### Registering a FeatureSet

Create a new FeatureSet in Feast by sending a request to Feast Core. When a
feature set is successfully registered, Feast Core will start an **ingestion** job
that listens for new features in the FeatureSet. Note that Feast currently only
supports sources of type "KAFKA", so you must have access to a running Kafka broker
to register a FeatureSet successfully.

```
# Example of registering a new driver feature set.
# Note the source value: it assumes that you have access to a Kafka broker
# running on localhost:9092.

grpc_cli call localhost:6565 ApplyFeatureSet '
feature_set {
  name: "driver"
  version: 1

  entities {
    name: "driver_id"
    value_type: INT64
  }

  features {
    name: "city"
    value_type: STRING
  }

  source {
    type: KAFKA
    kafka_source_config {
      bootstrap_servers: "localhost:9092"
    }
  }
}
'

# Check that the FeatureSet has been registered correctly.
# You should also see logs from Feast Core showing the ingestion job being started.
grpc_cli call localhost:6565 GetFeatureSets ''
```

#### Ingestion and Population of Feature Values

```
# Install the Python SDK, which helps with writing FeatureRow messages to Kafka.
cd $FEAST_HOME/sdk/python
pip3 install -e .
pip3 install pendulum

# Produce FeatureRow messages to Kafka so they will be ingested by Feast
# and written to the registered stores.
# Make sure the topic passed to producer.send() below is the topic assigned
# to the feature set, in this case "feast-driver-features".
python3 - <<EOF
import logging

import pendulum
from google.protobuf.timestamp_pb2 import Timestamp
from kafka import KafkaProducer

from feast.types.FeatureRow_pb2 import FeatureRow
from feast.types.Field_pb2 import Field
from feast.types.Value_pb2 import Value

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

producer = KafkaProducer(bootstrap_servers="localhost:9092")

row = FeatureRow()

fields = [
    Field(name="driver_id", value=Value(int64_val=1234)),
    Field(name="city", value=Value(string_val="JAKARTA")),
]
row.fields.MergeFrom(fields)

timestamp = Timestamp()
timestamp.FromJsonString(
    pendulum.now("UTC").to_iso8601_string()
)
row.event_timestamp.CopyFrom(timestamp)

# The format is [FEATURE_SET_NAME]:[VERSION]
row.feature_set = "driver:1"

producer.send("feast-driver-features", row.SerializeToString())
producer.flush()
logger.info(row)
EOF

# Check that the ingested feature rows can be retrieved from Feast Serving.
grpc_cli call localhost:6566 GetOnlineFeatures '
feature_sets {
  name: "driver"
  version: 1
}
entity_dataset {
  entity_names: "driver_id"
  entity_dataset_rows {
    entity_ids {
      int64_val: 1234
    }
  }
}
'
```

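The Kafka topic name used above follows a `feast-<feature set name>-features` pattern. The helper below just makes that assumed convention explicit; it is an illustration, not part of the Feast SDK:

```python
def feature_set_topic(feature_set_name: str) -> str:
    # Assumed convention, inferred from the "feast-driver-features"
    # topic used in this guide; verify against your Feast Core logs.
    return "feast-{}-features".format(feature_set_name)

print(feature_set_topic("driver"))  # feast-driver-features
```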
#### Tips for quickly running Postgres, Redis and Kafka locally with Docker

This guide assumes you are running the Docker service on a bridge network (which
is usually the case if you're running Linux). Otherwise, you may need to use
different network options than those shown below.

> `--net host` usually only works as expected when you're running the Docker
> service in bridge networking mode.

```
# Start Postgres
docker run --name postgres --rm -it -d --net host -e POSTGRES_DB=postgres -e POSTGRES_USER=postgres \
-e POSTGRES_PASSWORD=password postgres:12-alpine

# Start Redis
docker run --name redis --rm -it --net host -d redis:5-alpine

# Start Zookeeper (needed by Kafka)
docker run --rm \
  --net=host \
  --name=zookeeper \
  --env=ZOOKEEPER_CLIENT_PORT=2181 \
  --detach confluentinc/cp-zookeeper:5.2.1

# Start Kafka
docker run --rm \
  --net=host \
  --name=kafka \
  --env=KAFKA_ZOOKEEPER_CONNECT=localhost:2181 \
  --env=KAFKA_ADVERTISED_LISTENERS=PLAINTEXT://localhost:9092 \
  --env=KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR=1 \
  --detach confluentinc/cp-kafka:5.2.1
```

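Once the containers are up, a quick way to confirm that each service is reachable is to probe the ports used throughout this guide. This is a small sketch, assuming everything runs on localhost with the default ports:

```python
import socket

def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP server is accepting connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Ports used by the containers started above.
for name, port in [("Postgres", 5432), ("Redis", 6379),
                   ("Zookeeper", 2181), ("Kafka", 9092)]:
    print("{}: {}".format(name, "up" if is_listening("localhost", port) else "down"))
```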
## Code reviews

Code submission to Feast (including submission from project maintainers) requires review and approval.
Please submit a **pull request** to initiate the code review process. We use [pr

### Java

We conform to the [Java Google style guide](https://google.github.io/styleguide/javaguide.html).

If using IntelliJ, please import the code styles from
https://github.com/google/styleguide/blob/gh-pages/intellij-java-google-style.xml

### Go

Make sure you apply `go fmt`.