Feather investigation #894
Changes from 37 commits
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -1316,3 +1316,33 @@ def test_list_qualities(self): | |
| qualities = openml.datasets.list_qualities() | ||
| self.assertEqual(isinstance(qualities, list), True) | ||
| self.assertEqual(all([isinstance(q, str) for q in qualities]), True) | ||
|
|
||
| def test_get_dataset_cache_format_pickle(self): | ||
| # Feather format can't be tested without installing pyarrow | ||
|
mfeurer marked this conversation as resolved.
Outdated
|
||
| # this test case checks if pickle option works | ||
| dataset = openml.datasets.get_dataset(1) | ||
| self.assertEqual(type(dataset), OpenMLDataset) | ||
| self.assertEqual(dataset.name, 'anneal') | ||
| self.assertGreater(len(dataset.features), 1) | ||
| self.assertGreater(len(dataset.qualities), 4) | ||
|
|
||
| X, y, categorical, attribute_names = dataset.get_data() | ||
| self.assertIsInstance(X, pd.DataFrame) | ||
| self.assertEqual(X.shape, (898, 39)) | ||
| self.assertEqual(len(categorical), X.shape[1]) | ||
| self.assertEqual(len(attribute_names), X.shape[1]) | ||
|
|
||
| def test_get_dataset_cache_format_feather(self): | ||
| # The feather format requires pyarrow to be installed | ||
| # this test case checks that the feather option works | ||
| dataset = openml.datasets.get_dataset('iris', cache_format='feather') | ||
|
Collaborator
Could you please add a test that the correct files are actually written to disk?
Collaborator
I'm afraid this doesn't test the right thing. We change the cache directory for unit tests, so you'd need to use a slightly different data_folder here (https://github.com/openml/openml-python/blob/develop/openml/testing.py#L92) |
||
| self.assertEqual(type(dataset), OpenMLDataset) | ||
| self.assertEqual(dataset.name, 'iris') | ||
| self.assertGreater(len(dataset.features), 1) | ||
| self.assertGreater(len(dataset.qualities), 4) | ||
|
|
||
| X, y, categorical, attribute_names = dataset.get_data() | ||
| self.assertIsInstance(X, pd.DataFrame) | ||
| self.assertEqual(X.shape, (150, 5)) | ||
| self.assertEqual(len(categorical), X.shape[1]) | ||
| self.assertEqual(len(attribute_names), X.shape[1]) | ||
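
The review request above (checking that the expected cache files actually land on disk) could be approached with a small helper like the one below. This is only a sketch under stated assumptions: `EXPECTED_FILES`, `missing_cache_files`, and the file names are hypothetical and are not taken from openml-python; the temporary directory stands in for the test-specific cache directory the reviewer mentions.

```python
import os
import tempfile

# Assumed file names, for illustration only; the names the library
# really writes may differ and should be looked up in its source.
EXPECTED_FILES = {
    "pickle": ["dataset.pkl"],
    "feather": ["dataset.feather", "dataset.feather.attributes.pkl"],
}


def missing_cache_files(data_folder, cache_format):
    """Return the expected cache files that are absent from data_folder."""
    if cache_format not in EXPECTED_FILES:
        raise ValueError(f"unknown cache_format: {cache_format!r}")
    return [
        name
        for name in EXPECTED_FILES[cache_format]
        if not os.path.isfile(os.path.join(data_folder, name))
    ]


# A temporary directory stands in for the test cache directory.
with tempfile.TemporaryDirectory() as data_folder:
    # Simulate a cache write that has produced only the data file so far.
    open(os.path.join(data_folder, "dataset.feather"), "wb").close()
    before = missing_cache_files(data_folder, "feather")

    # Simulate the attribute file being written as well.
    open(os.path.join(data_folder, "dataset.feather.attributes.pkl"), "wb").close()
    after = missing_cache_files(data_folder, "feather")
```

In an actual unit test, `data_folder` would presumably be derived from the cache directory set up for the test run rather than a fresh temporary directory, and the check would follow the `get_data()` call, for example as `self.assertEqual(missing_cache_files(data_folder, 'feather'), [])`.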
I'm somewhat surprised that this one isn't annotated. @Neeratyoy could you please add to your stack to figure out why this is legal given that we have mypy running?
Sure.
Just to confirm the task: I have to check why the missing annotation for cache_format was never caught.
Yes, that's correct.
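
One possible explanation, offered as an assumption about the project's configuration rather than a finding: with its default settings, mypy treats an unannotated parameter as `Any` and does not report it, so a partially annotated function passes unless `--disallow-incomplete-defs` or `--disallow-untyped-defs` is enabled. The sketch below uses a hypothetical stand-in function (`get_dataset_sketch`, not the library's real signature) and a small runtime helper that lists parameters lacking annotations.

```python
import inspect


def get_dataset_sketch(dataset_id: int, cache_format="pickle") -> str:
    # Hypothetical stand-in: cache_format is deliberately left unannotated.
    # Under default mypy settings this parameter is treated as Any.
    return f"{dataset_id}.{cache_format}"


def unannotated_parameters(func):
    """Return the names of parameters of func that carry no annotation."""
    signature = inspect.signature(func)
    return [
        name
        for name, parameter in signature.parameters.items()
        if parameter.annotation is inspect.Parameter.empty
    ]


missing = unannotated_parameters(get_dataset_sketch)
```

Checking which of the two mypy flags, if any, are set in the project's mypy configuration would confirm or rule out this explanation.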