# Feast Python SDK Quickstart Guide

## Pre-requisites

* A working Feast Cluster: Consult your Feast admin or [install your own](install.md).

## Where to get it

Binary installers for the latest released version are available at the
 [PyPI](https://pypi.org/project/feast/):

```sh
 pip install feast
```

## Getting started

### Configuring Feast client
All interaction with feast cluster happens via an instance of `feast.sdk.client.Client`. 
It should be pointed to correct core/serving urls of the feast cluster:
```python
from feast.sdk.client import Client

fs = Client(
    core_url=FEAST_CORE_URL, 
    serving_url=FEAST_SERVING_URL, 
    verbose=True)
```

If you are running a local feast cluster, then `FEAST_CORE_URL` and `FEAST_SERVING_URL`
could be something like:
```python
FEAST_CORE_URL="localhost:8080"
FEAST_SERVING_URL="localhost:8081"
```

### Registering an entity and features
Entities and features could be registered explicitly beforehand or during the data ingestion.

Register your first entity:
```python
from feast.sdk.resources.entity import Entity

entity = Entity(
    name='word',
    description='word found in shakespearean works'
)

fs.apply(entity)
```
And a feature, that belongs to this entity:
```python
from feast.sdk.resources.feature import Feature, Datastore, ValueType

word_count_feature = Feature(
    entity='word',
    name='count',
    value_type=ValueType.INT32,                      
    description='number of times the word appears',
    tags=['tag1', 'tag2'],
    owner='bob@feast.com',
    uri='https://github.com/bob/example',
    warehouse_store=Datastore(id='WAREHOUSE'),
    serving_store=Datastore(id='SERVING')    
)

fs.apply(word_count_feature)
```
Read more on entity/feature fields here: [Entity Spec](../../docs/specs.md#entity-spec)


### Ingest data for your feature
Let's create a simple [pandas](https://pandas.pydata.org/) dataframe
```python
import pandas as pd

words_df = pd.DataFrame({
    'word': ['the', 'and', 'i', 'to', 'of', 'a', 'you', 'my', 'in', 'that', 'is', 'not', 'with', 'me', 'it'],
    'count': [28944, 27317, 21120, 20136, 17181, 14945, 13989, 12949, 11513, 11488, 9545, 8855, 8293, 8043, 8003]
    })
```
And import it into the feast store:
```python
from datetime import datetime
from feast.sdk.importer import Importer

STAGING_LOCATION = 'gs://your-bucket'  

importer = Importer.from_df(words_df, 
                           entity='word', 
                           owner='bob@feast.com',
                           id_column='word',
                           timestamp_value=datetime(2018, 1, 1),
                           staging_location=STAGING_LOCATION,
                           serving_store=Datastore(id='SERVING'),
                           warehouse_store=Datastore(id='WAREHOUSE'))
    
fs.run(importer)
```
This will start an import job and ingest data into both warehouse and serving feast stores.
You can also import data from a CSV file (`Importer.from_csv(...)`) 
or a BigQuery (`Importer.from_bq(...)`)

### Query feature data for training your models
Now, when you have some data in the feast store, you may want to retrieve that 
dataset and train your model. Creating a training dataset allows you to isolate the data 
that goes into the model training step, allowing for reproduction and traceability. 

Also, it's possible to retrieve only these features required 
for training, by specifying them in a `FeatureSet`:

```python
from feast.sdk.resources.feature_set import FeatureSet

training_fs = FeatureSet(entity="word", features=["word.count"])

dataset_info = fs.create_dataset(
                                training_fs, 
                                start_date="2010-01-01", 
                                end_date="2018-01-01")

train_df = fs.download_dataset_to_df(dataset_info, staging_location=STAGING_LOCATION)

# train your model
# ...
``` 

### Query feature data for serving your models
Feast provides a means for accessing stored features in a serving environment, 
at low latency and high load. 

You have to provide IDs of entities, features of which you want to get served:
```python
serving_fs = FeatureSet(entity="word", features=["word.count"])

serving_df = fs.get_serving_data(serving_fs, entity_keys=["you", "and", "i"])
```
This will return a resulting `serving_df` as following:

|     | word | count |
| --- | ---  | ---   |
| 0   | and	 | 27317 |
| 1   | you	 | 13989 |
| 2   | i	   | 21120 |