Commit fbad798

Fix documentation links in README (+other fixes)
* Fix documentation links in introduction (contributing.md / operators.md etc)
* Cleanup the 'tutorials and examples' section in the README
* Switch to use the RTD theme
* rename gpu_benchmark-criteo.ipynb to criteo-example.ipynb
* Add basic requirements.txt / requirement-dev.txt files
* Remove redundant section headings in docs
* flake8/black/isort
1 parent b115642 commit fbad798

15 files changed

Lines changed: 153 additions & 133 deletions

README.md

Lines changed: 3 additions & 7 deletions
@@ -43,17 +43,13 @@ Requirements.yml

 ### Examples and Tutorials

-A workflow demonstrating the preprocessing and data-loading components of NVTabular can be found in the DeepLearningExamples tutorial on training Facebook's [Deep Learning Recommender Model (DLRM)](https://github.com/facebookresearch/dlrm/) on the [Criteo 1TB dataset](https://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/).
+An example demonstrating how to use NVTabular to preprocess the [Criteo 1TB dataset](https://labs.criteo.com/2014/02/kaggle-display-advertising-challenge-dataset/) can be found in the [criteo example notebook](examples/criteo-example.ipynb). This example also shows how to use NVTabular's data-loaders on the preprocessed data to train Facebook's [Deep Learning Recommender Model (DLRM)](https://github.com/facebookresearch/dlrm/).

-[ DLRM Criteo Workflow ](https://developer.nvidia.com/deep-learning-examples#rec-sys)
-
-We also have a simple tutorial that demonstrates similar functionality on a much smaller dataset, providing a pipeline for the [Rossman store sales dataset](https://www.kaggle.com/c/rossmann-store-sales) fed into a [fast.ai tabular data model](https://docs.fast.ai/tabular.html).
-
-[ Rossman Store Sales ](examples/gpu_benchmark-rossmann.ipynb)
+We also have a [simple tutorial](examples/rossmann-store-sales-example.ipynb) that demonstrates similar functionality on a much smaller dataset, providing a pipeline for the [Rossman store sales dataset](https://www.kaggle.com/c/rossmann-store-sales) fed into a [fast.ai tabular data model](https://docs.fast.ai/tabular.html).

 ### Contributing

-If you wish to contribute to the library directly please see [Contributing.md](https://github.com/nvidia/NVTabular/blob/master/CONTRIBUTING.md). We are in particular interested in contributions or feature requests for feature engineering or preprocessing operations that you have found helpful in your own workflows.
+If you wish to contribute to the library directly please see [Contributing.md](./CONTRIBUTING.md). We are in particular interested in contributions or feature requests for feature engineering or preprocessing operations that you have found helpful in your own workflows.

 ### Learn More


docs/source/Operators.md

Lines changed: 0 additions & 1 deletion
This file was deleted.

docs/source/api/index.rst

Lines changed: 0 additions & 1 deletion
@@ -3,7 +3,6 @@ API Documentation

 .. toctree::
    :maxdepth: 2
-   :caption: API Documentation:

    Workflow <workflow>
    Operators <ops/index>

docs/source/api/ops/index.rst

Lines changed: 0 additions & 1 deletion
@@ -3,7 +3,6 @@ Operators

 .. toctree::
    :maxdepth: 2
-   :caption: Operators:

    Categorify <categorify>
    FillMissing <fillmissing>

docs/source/conf.py

Lines changed: 32 additions & 13 deletions
@@ -13,12 +13,11 @@
 import os
 import sys

-sys.path.insert(0, os.path.abspath("../../."))
-
-import recommonmark
-from recommonmark.transform import AutoStructify
+import sphinx
 from recommonmark.parser import CommonMarkParser

+sys.path.insert(0, os.path.abspath("../../."))
+

 # -- Project information -----------------------------------------------------

@@ -36,11 +35,13 @@
 # extensions coming with Sphinx (named 'sphinx.ext.*') or your custom
 # ones.
 extensions = [
+    "sphinx_rtd_theme",
     "recommonmark",
     "nbsphinx",
     "sphinx.ext.autodoc",
     "sphinx.ext.coverage",
     "sphinx.ext.napoleon",
+    "sphinx.ext.viewcode",
 ]

 # Add any paths that contain templates here, relative to this directory.
@@ -57,7 +58,7 @@
 # The theme to use for HTML and HTML Help pages. See the documentation for
 # a list of builtin themes.
 #
-html_theme = "alabaster"
+html_theme = "sphinx_rtd_theme"

 # Add any paths that contain custom static files (such as style sheets) here,
 # relative to this directory. They are copied after the builtin static files,
@@ -67,13 +68,31 @@
 source_parsers = {".md": CommonMarkParser}
 source_suffix = [".rst", ".md"]

-def setup(app):
-    app.add_config_value('recommonmark_config', {
-        'enable_math': True,
-        'enable_eval_rst': True,
-        'auto_code_block': True,
-    }, True)
-    app.add_transform(AutoStructify)
+nbsphinx_allow_errors = True
+html_show_sourcelink = False
+
+# certain references in the README couldn't be autoresolved here,
+# hack by forcing to the either the correct documentation page (examples)
+# or to a blob on the repo
+_REPO = "https://github.com/NVIDIA/NVTabular/blob/master/"
+_URL_MAP = {
+    "./examples": "examples/index",
+    "examples/rossmann-store-sales-example.ipynb": "examples/rossmann",
+    "examples/criteo-example.ipynb": "examples/criteo",
+    "./CONTRIBUTING": _REPO + "/CONTRIBUTING.md",
+    "./Operators": _REPO + "/Operators.md",
+}
+
+
+class GitHubDomain(sphinx.domains.Domain):
+    def resolve_any_xref(self, env, docname, builder, target, node, contnode):
+        resolved = _URL_MAP.get(target)
+        print("resolver", target, resolved)
+        if resolved:
+            contnode["refuri"] = resolved
+            return [("github:any", contnode)]
+        return []


-nbsphinx_allow_errors = True
+def setup(app):
+    app.add_domain(GitHubDomain)
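
The `_URL_MAP` lookup that this `conf.py` change introduces can be tried outside of Sphinx. The following is a minimal standalone sketch, not the committed code: `repo_url` and `resolve_target` are hypothetical helper names introduced here for illustration. One detail worth noting is that `_REPO` already ends with a slash, so the committed expression `_REPO + "/CONTRIBUTING.md"` produces `.../blob/master//CONTRIBUTING.md`; the sketch assumes a single slash is what was intended and joins the parts accordingly.

```python
from typing import Optional

# Same base URL as in the commit's conf.py.
_REPO = "https://github.com/NVIDIA/NVTabular/blob/master/"


def repo_url(path: str) -> str:
    """Hypothetical helper: join _REPO and a repo-relative path with one slash."""
    return _REPO.rstrip("/") + "/" + path.lstrip("/")


# Keys mirror the commit's _URL_MAP; repo blob values are built with repo_url()
# instead of plain string concatenation (an assumption about the intent).
_URL_MAP = {
    "./examples": "examples/index",
    "examples/rossmann-store-sales-example.ipynb": "examples/rossmann",
    "examples/criteo-example.ipynb": "examples/criteo",
    "./CONTRIBUTING": repo_url("CONTRIBUTING.md"),
    "./Operators": repo_url("Operators.md"),
}


def resolve_target(target: str) -> Optional[str]:
    """Hypothetical stand-in for the lookup done in resolve_any_xref.

    Returns the mapped URL or doc path, or None when the target is unknown.
    """
    return _URL_MAP.get(target)


if __name__ == "__main__":
    for target in ("./CONTRIBUTING", "examples/criteo-example.ipynb", "./missing"):
        print(target, "->", resolve_target(target))
```

In the committed `GitHubDomain`, the same lookup result is written to `contnode["refuri"]`, and an empty list is returned when the target is not in the map.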

docs/source/examples/criteo.ipynb

Lines changed: 1 addition & 1 deletion
@@ -1 +1 @@
-../../../examples/gpu_benchmark-criteo.ipynb
+../../../examples/criteo-example.ipynb

docs/source/examples/index.rst

Lines changed: 1 addition & 2 deletions
@@ -2,8 +2,7 @@ Examples
 ========

 .. toctree::
-   :maxdepth: 4
-   :caption: Examples:
+   :maxdepth: 2

    Rossmann Example <rossmann>
    Criteo Example <criteo>

docs/source/index.rst

Lines changed: 1 addition & 2 deletions
@@ -8,12 +8,11 @@ Welcome to NVTabular's documentation!

 .. toctree::
    :maxdepth: 3
-   :caption: Contents:

    Introduction <Introduction>
    How it Works <HowItWorks>
-   API Documentation <api/index>
    Examples <examples/index>
+   API Documentation <api/index>


 Indices and tables

examples/rossmann-store-sales-example.ipynb

Lines changed: 3 additions & 3 deletions
@@ -264,7 +264,7 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### Tensorflow\n",
+    "### Tensorflow\n",
     "\n",
     "`KerasSequenceDataset` wraps a lightweight iterator around a `dataset` object to handle chunking, shuffling, and application of any workflows (which can be applied online as a preprocessing step). For column names, can use either a list of string names or a list of TensorFlow `feature_columns` that will be used to feed the network"
    ]
@@ -324,8 +324,8 @@
    "cell_type": "markdown",
    "metadata": {},
    "source": [
-    "#### PyTorch\n",
-    "`workflow.ds_to_tensors` maps a symbolic dataset object to `cat_features`, `cont_features`, `labels` PyTorch tenosrs by iterating through the dataset and concatenating the results. Note that this means that the whole of the dataset is _in memory_. For larger than memory datasets, see the example in [gpu_benchmark-criteo.ipynb](./gpu_benchmark-criteo.ipynb) leveraing PyTorch `ChainDataset`s."
+    "### PyTorch\n",
+    "`workflow.ds_to_tensors` maps a symbolic dataset object to `cat_features`, `cont_features`, `labels` PyTorch tenosrs by iterating through the dataset and concatenating the results. Note that this means that the whole of the dataset is _in memory_. For larger than memory datasets, see the example in [criteo-example.ipynb](./criteo-example.ipynb) leveraing PyTorch `ChainDataset`s."
    ]
   },
   {
