Nucleus

https://dashboard.scale.com/nucleus

Aggregate metrics in ML are not good enough. To improve production ML, you need to understand your models' qualitative failure modes, fix them by gathering more data, and curate diverse scenarios.

Scale Nucleus helps you:

Visualize your data
Curate interesting slices within your dataset
Review and manage annotations
Measure and debug your model performance

Nucleus is a new way—the right way—to develop ML models, helping us move away from the concept of one dataset and towards a paradigm of collections of scenarios.

Installation

Editable mode

$ pip install -e .

As a Normal Package

$ pip install git+ssh://git@github.com/scaleapi/nucleus-python-client.git

Usage

The first step to using the Nucleus library is instantiating a client object. The client abstraction authenticates the user and acts as the gateway for interacting with datasets, models, and model runs.

Create a client object

import nucleus
client = nucleus.NucleusClient("YOUR_API_KEY_HERE")

Create Dataset

response = client.create_dataset({"name": "My Dataset"})
dataset = client.get_dataset(response["dataset_id"])

List Datasets

datasets = client.list_datasets()

Delete a Dataset

Delete a dataset by specifying its dataset id. A response code of 200 indicates successful deletion.

client.delete_dataset("YOUR_DATASET_ID")

Append Items to a Dataset

You can append both local images and images from the web. Each image object is a dictionary with three fields:

datasetItem1 = {"image_url": "http://<my_image_url>", "reference_id": "my_image_name.jpg",
  "metadata": {"label": "0"}}

The append function expects a list of datasetItems to upload, like this:

response = dataset.append({"items": [datasetItem1]})
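As a minimal sketch of the item format described above, the helper below builds the three-field dictionary and wraps a list of items in the payload shape that append expects. The helper name make_dataset_item and its validation rules are assumptions made for illustration; they are not part of the Nucleus client.

```python
from typing import Any, Dict, Optional


def make_dataset_item(
    image_url: str,
    reference_id: str,
    metadata: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build one item dictionary with the three fields shown above.

    Hypothetical helper for illustration only, not a Nucleus API.
    """
    if not image_url:
        raise ValueError("image_url must be a non-empty URL or file path")
    if not reference_id:
        raise ValueError("reference_id must be a non-empty string")
    return {
        "image_url": image_url,
        "reference_id": reference_id,
        # Copy the metadata so later changes by the caller do not leak in.
        "metadata": dict(metadata or {}),
    }


items = [
    make_dataset_item("http://<my_image_url>", "my_image_name.jpg", {"label": "0"}),
]
payload = {"items": items}
# With a real dataset object this payload would be passed on as:
#   response = dataset.append(payload)
```

Building items through one function keeps the field names consistent across a large upload and catches empty values before any request is sent.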

If you're uploading a local image, you can specify a filepath as the image_url.

datasetItem2 = {"image_url": "./data_folder/my_img_001.png", "reference_id": "my_img_001.png",
  "metadata": {"label": "1"}}
response = dataset.append({"items": [datasetItem2]}, local=True)

For particularly large item uploads, consider using one of the example scripts located in references. These scripts upload items in batches for easier debugging.
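The batching idea can be sketched as follows. This is not the code from the scripts in references, which may work differently; chunked, upload_in_batches, and the default batch size of 100 are assumptions made for illustration. The upload call is passed in as a function so the sketch runs without a Nucleus connection.

```python
from typing import Any, Callable, Dict, Iterator, List, Tuple

Item = Dict[str, Any]


def chunked(items: List[Item], batch_size: int) -> Iterator[List[Item]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def upload_in_batches(
    items: List[Item],
    append_fn: Callable[[Dict[str, List[Item]]], Any],
    batch_size: int = 100,
) -> List[Tuple[int, Exception]]:
    """Send items batch by batch and collect failures instead of stopping.

    `append_fn` receives a {"items": [...]} payload. The return value lists
    (batch_number, error) pairs for every batch that raised an exception.
    """
    failures: List[Tuple[int, Exception]] = []
    for batch_number, batch in enumerate(chunked(items, batch_size)):
        try:
            append_fn({"items": batch})
        except Exception as exc:  # keep going so one bad batch is easy to isolate
            failures.append((batch_number, exc))
    return failures

# With a real dataset object:
#   failures = upload_in_batches(items, dataset.append, batch_size=100)
```

Recording the batch number of each failure makes it possible to retry or inspect only the affected slice of the upload.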

Get Dataset Info

Returns the dataset name, number of dataset items, model_runs, and slice_ids.

dataset.info

Access Dataset Items

There are three methods to access individual Dataset Items:

(1) Dataset Items are accessible by reference id

item = dataset.refloc("my_img_001.png")

(2) Dataset Items are accessible by index

item = dataset.iloc(0)

(3) Dataset Items are accessible by the dataset_item_id assigned internally

item = dataset.loc("dataset_item_id")
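To make the difference between the three lookups concrete, here is a toy in-memory stand-in. It is not the Nucleus implementation: the class ToyItemIndex, the example dataset_item_id values, and the second reference id are all hypothetical, and the real client resolves these lookups against the Nucleus service.

```python
from typing import Any, Dict, List

Item = Dict[str, Any]


class ToyItemIndex:
    """Illustrative stand-in showing three ways to address the same item."""

    def __init__(self, items: List[Item]) -> None:
        self._items = list(items)
        # Two dictionaries give direct lookup by either kind of identifier.
        self._by_reference = {item["reference_id"]: item for item in self._items}
        self._by_item_id = {item["dataset_item_id"]: item for item in self._items}

    def refloc(self, reference_id: str) -> Item:
        """Look up by the reference id chosen by the user at upload time."""
        return self._by_reference[reference_id]

    def iloc(self, index: int) -> Item:
        """Look up by position in the dataset."""
        return self._items[index]

    def loc(self, dataset_item_id: str) -> Item:
        """Look up by the internally assigned id."""
        return self._by_item_id[dataset_item_id]


index = ToyItemIndex([
    {"dataset_item_id": "hypothetical_id_0", "reference_id": "my_img_001.png"},
    {"dataset_item_id": "hypothetical_id_1", "reference_id": "my_img_002.png"},
])

# All three calls address the same underlying item.
same_item = (
    index.refloc("my_img_001.png")
    is index.iloc(0)
    is index.loc("hypothetical_id_0")
)
```

The reference id is the only identifier you control, so it is usually the most convenient key when joining Nucleus items back to your own files.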

Add Annotations

Upload ground truth annotations for the items in your dataset. Box2DAnnotation has the same format as described at https://dashboard.scale.com/nucleus/docs/api#add-ground-truth

response = dataset.annotate({"annotations": [Box2DAnnotation, ..., Box2DAnnotation]})

For particularly large payloads, please refer to the accompanying scripts in references.

Add Model

The model abstraction is intended to represent a unique architecture. Models are independent of any dataset.

response = client.add_model({"name": "My Model", "reference_id": "model-0.5", "metadata": {"iou_thr": 0.5}})

Create Model Run

In contrast to the model abstraction, the model run abstraction represents an experiment. A model run is associated with both a model and a dataset. It is meant to represent "the predictions of model y on dataset x".

Creating a model run returns a ModelRun object.

model_run = dataset.create_model_run({"reference_id": "model-0.5"})

Get ModelRun Info

Returns the associated model_id, human-readable name of the run, status, and user-specified metadata.

model_run.info

Upload Predictions to ModelRun

This method populates the model_run object with predictions. Returns the associated model_id, human-readable name of the run, status, and user-specified metadata. Takes a list of Box2DPredictions within the payload, where Box2DPrediction is formatted as described at https://dashboard.scale.com/nucleus/docs/api#upload-model-outputs

payload = {"annotations": [Box2DPrediction, ..., Box2DPrediction]}
model_run.predict(payload)

Accessing ModelRun Predictions

You can access the ModelRun predictions for an individual dataset_item through three methods:

(1) User-specified reference_id

model_run.refloc("my_img_001.png")

(2) Index

model_run.iloc(0)

(3) Internally maintained dataset_item_id

model_run.loc("dataset_item_id")

Commit ModelRun

The commit action indicates that the user is finished uploading predictions associated with this model run. Committing a model run kicks off Nucleus's internal processes to calculate performance metrics like IoU. After being committed, a ModelRun object becomes immutable.

model_run.commit()
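For readers unfamiliar with IoU (intersection over union), the sketch below computes it for two axis-aligned boxes using the standard definition. How Nucleus computes its metrics internally is not described here, so this is only an illustration. The (x, y, width, height) box layout and the function name box_iou are assumptions.

```python
from typing import Tuple

# Assumed layout: (x, y, width, height), with (x, y) the top-left corner.
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Return intersection area divided by union area for two boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Width and height of the overlapping region, clamped at zero when the
    # boxes do not overlap along that axis.
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    intersection = inter_w * inter_h
    union = aw * ah + bw * bh - intersection
    return intersection / union if union > 0 else 0.0


ground_truth = (0.0, 0.0, 2.0, 2.0)
prediction = (1.0, 1.0, 2.0, 2.0)
# Overlap area is 1 and union area is 4 + 4 - 1 = 7, so the IoU is 1/7.
iou = box_iou(ground_truth, prediction)
```

A prediction is commonly counted as a match when its IoU with a ground truth box meets a chosen threshold, such as the 0.5 stored under "iou_thr" in the model metadata example earlier.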
