A collection of datasets from multiple sources to be used for demonstrations in data science courses.
A data dictionary (or data description) is provided for some of the datasets in this repo. Click on the dataset of interest in the list below to learn more about the available attributes.
Click to see data dictionary for:
| Variable | Description |
|---|---|
| CRIM | Crime rate |
| ZN | Percentage of residential land zoned for lots over 25,000 ft² |
| INDUS | Percentage of land occupied by non-retail business |
| CHAS | Does tract bound Charles River (= 1 if tract bounds river, = 0 otherwise) |
| NOX | Nitric oxide concentration (parts per 10 million) |
| RM | Average number of rooms per dwelling |
| AGE | Percentage of owner-occupied units built prior to 1940 |
| DIS | Weighted distances to five Boston employment centers |
| RAD | Index of accessibility to radial highways |
| TAX | Full-value property tax rate per $10,000 |
| PTRATIO | Pupil-to-teacher ratio by town |
| LSTAT | Percentage of lower status of the population |
| MEDV | Median value of owner-occupied homes in $1000s |
| CAT.MEDV | Is median value of owner-occupied homes in tract above $30,000 (CAT.MEDV = 1) or not (CAT.MEDV = 0) |
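
According to the descriptions above, `CAT.MEDV` appears to be derived from `MEDV` (recorded in $1000s) with a $30,000 cutoff. The sketch below shows one way to sanity-check that relationship using only the Python standard library. It is an illustration, not part of this repo: the CSV layout, the sample rows, and the helper names `cat_medv` and `find_mismatches` are all assumptions, and the sample values are made up rather than taken from the dataset.

```python
import csv
import io

# Hypothetical rows for illustration only; these values are NOT from the dataset.
# The column names follow the data dictionary; the CSV layout is an assumption.
SAMPLE_CSV = """MEDV,CAT.MEDV
24.0,0
34.7,1
30.0,0
"""


def cat_medv(medv_thousands: float) -> int:
    """Return 1 if the median value (in $1000s) is strictly above $30,000, else 0."""
    return 1 if medv_thousands > 30.0 else 0


def find_mismatches(csv_text: str) -> list[int]:
    """Return 0-based indices of data rows whose CAT.MEDV disagrees with MEDV."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        i
        for i, row in enumerate(reader)
        if cat_medv(float(row["MEDV"])) != int(row["CAT.MEDV"])
    ]


mismatches = find_mismatches(SAMPLE_CSV)
print(mismatches)
```

To run the same check on a real file, pass its text to `find_mismatches` in place of `SAMPLE_CSV`. The strict comparison follows the word "above" in the description, so a value of exactly 30.0 maps to 0.
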
Click to see data dictionary for:
| Variable | Description |
|---|---|
| var1 | some description |
| var2 | some description |