Commit d940e0e

Prepare release 0.14 (#1262)
* Bump version number and add changelog
* Incorporate feedback from Pieter
* Fix unit test
* Make assert less strict
* Update release notes
* Fix indent
1 parent abf9506 commit d940e0e

3 files changed

Lines changed: 57 additions & 22 deletions

doc/progress.rst

Lines changed: 46 additions & 16 deletions
@@ -6,25 +6,55 @@
 Changelog
 =========
 
+0.14.0
+~~~~~~
+
+**IMPORTANT:** This release paves the way towards a breaking update of OpenML-Python. From version
+0.15, functions that had the option to return a pandas DataFrame will return a pandas DataFrame
+by default. This version (0.14) emits a warning if you still use the old access functionality.
+More concretely:
+
+* In 0.15 we will drop the ability to return dictionaries in listing calls and only provide
+  pandas DataFrames. To disable warnings in 0.14 you have to request a pandas DataFrame
+  (using ``output_format="dataframe"``).
+* In 0.15 we will drop the ability to return datasets as numpy arrays and only provide
+  pandas DataFrames. To disable warnings in 0.14 you have to request a pandas DataFrame
+  (using ``dataset_format="dataframe"``).
+
+Furthermore, from version 0.15, OpenML-Python will no longer download datasets and dataset metadata
+by default. This version (0.14) emits a warning if you don't explicitly specify the desired behavior.
+
+Please see the pull requests #1258 and #1260 for further information.
+
+* ADD #1081: New flag that allows disabling downloading dataset features.
+* ADD #1132: New flag that forces a redownload of cached data.
+* FIX #1244: Fixes a rare bug where task listing could fail when the server returned invalid data.
+* DOC #1229: Fixes a comment string for the main example.
+* DOC #1241: Fixes a comment in an example.
+* MAINT #1124: Improve naming of helper functions that govern the cache directories.
+* MAINT #1223, #1250: Update tools used in pre-commit to the latest versions (``black==23.30``, ``mypy==1.3.0``, ``flake8==6.0.0``).
+* MAINT #1253: Update the citation request to the JMLR paper.
+* MAINT #1246: Add a warning that warns the user that checking for duplicate runs on the server cannot be done without an API key.
+
 0.13.1
 ~~~~~~
 
-* ADD #1081 #1132: Add additional options for (not) downloading datasets ``openml.datasets.get_dataset`` and cache management.
-* ADD #1028: Add functions to delete runs, flows, datasets, and tasks (e.g., ``openml.datasets.delete_dataset``).
-* ADD #1144: Add locally computed results to the ``OpenMLRun`` object's representation if the run was created locally and not downloaded from the server.
-* ADD #1180: Improve the error message when the checksum of a downloaded dataset does not match the checksum provided by the API.
-* ADD #1201: Make ``OpenMLTraceIteration`` a dataclass.
-* DOC #1069: Add argument documentation for the ``OpenMLRun`` class.
-* DOC #1241 #1229 #1231: Minor documentation fixes and resolve documentation examples not working.
-* FIX #1197 #559 #1131: Fix the order of ground truth and predictions in the ``OpenMLRun`` object and in ``format_prediction``.
-* FIX #1198: Support numpy 1.24 and higher.
-* FIX #1216: Allow unknown task types on the server. This is only relevant when new task types are added to the test server.
-* FIX #1223: Fix mypy errors for implicit optional typing.
-* MAINT #1155: Add dependabot github action to automatically update other github actions.
-* MAINT #1199: Obtain pre-commit's flake8 from github.com instead of gitlab.com.
-* MAINT #1215: Support latest numpy version.
-* MAINT #1218: Test Python3.6 on Ubuntu 20.04 instead of the latest Ubuntu (which is 22.04).
-* MAINT #1221 #1212 #1206 #1211: Update github actions to the latest versions.
+* ADD #1081 #1132: Add additional options for (not) downloading datasets ``openml.datasets.get_dataset`` and cache management.
+* ADD #1028: Add functions to delete runs, flows, datasets, and tasks (e.g., ``openml.datasets.delete_dataset``).
+* ADD #1144: Add locally computed results to the ``OpenMLRun`` object's representation if the run was created locally and not downloaded from the server.
+* ADD #1180: Improve the error message when the checksum of a downloaded dataset does not match the checksum provided by the API.
+* ADD #1201: Make ``OpenMLTraceIteration`` a dataclass.
+* DOC #1069: Add argument documentation for the ``OpenMLRun`` class.
+* DOC #1241 #1229 #1231: Minor documentation fixes and resolve documentation examples not working.
+* FIX #1197 #559 #1131: Fix the order of ground truth and predictions in the ``OpenMLRun`` object and in ``format_prediction``.
+* FIX #1198: Support numpy 1.24 and higher.
+* FIX #1216: Allow unknown task types on the server. This is only relevant when new task types are added to the test server.
+* FIX #1223: Fix mypy errors for implicit optional typing.
+* MAINT #1155: Add dependabot github action to automatically update other github actions.
+* MAINT #1199: Obtain pre-commit's flake8 from github.com instead of gitlab.com.
+* MAINT #1215: Support latest numpy version.
+* MAINT #1218: Test Python3.6 on Ubuntu 20.04 instead of the latest Ubuntu (which is 22.04).
+* MAINT #1221 #1212 #1206 #1211: Update github actions to the latest versions.
 
 0.13.0
 ~~~~~~
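
The 0.14.0 entry above describes a deprecation pattern: the old return format still works but emits a warning, and explicitly requesting a DataFrame silences it. The sketch below illustrates that pattern with the standard library only. `list_items` and `count_warnings` are hypothetical stand-ins, not OpenML-Python functions, and the use of `FutureWarning` is an assumption rather than something this commit confirms. In OpenML-Python itself the changelog names `output_format="dataframe"` and `dataset_format="dataframe"` as the arguments to pass.

```python
import warnings


def list_items(output_format="dict"):
    """Hypothetical stand-in for a listing call; not part of OpenML-Python."""
    rows = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]
    if output_format == "dataframe":
        # A real listing call would build a pandas DataFrame here; a list of
        # row dicts keeps this sketch free of third-party imports.
        return rows
    if output_format == "dict":
        # Old behaviour: still supported, but flagged as going away.
        # FutureWarning is an assumed category for this illustration.
        warnings.warn(
            "output_format='dict' is deprecated; request 'dataframe' instead.",
            FutureWarning,
            stacklevel=2,
        )
        return {row["id"]: row for row in rows}
    raise ValueError(f"unknown output_format: {output_format!r}")


def count_warnings(**kwargs):
    """Call list_items and return how many warnings it emitted."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        list_items(**kwargs)
    return len(caught)


old_style = count_warnings()  # default format still works but warns
new_style = count_warnings(output_format="dataframe")  # explicit request is silent
print(old_style, new_style)
```

Recording warnings this way, or escalating them with `warnings.simplefilter("error")` in a test run, is one way to find call sites that still rely on the old default before upgrading.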

openml/__version__.py

Lines changed: 1 addition & 1 deletion
@@ -3,4 +3,4 @@
 # License: BSD 3-Clause
 
 # The following line *must* be the last in the module, exactly as formatted:
-__version__ = "0.14.0dev"
+__version__ = "0.14.0"

tests/test_utils/test_utils.py

Lines changed: 10 additions & 5 deletions
@@ -22,17 +22,22 @@ def test_list_all(self):
 
     def test_list_all_with_multiple_batches(self):
         res = openml.utils._list_all(
-            listing_call=openml.tasks.functions._list_tasks, output_format="dict", batch_size=2000
+            listing_call=openml.tasks.functions._list_tasks, output_format="dict", batch_size=1050
         )
         # Verify that test server state is still valid for this test to work as intended
-        # -> If the number of results is less than 2000, the test can not test the
-        # batching operation.
-        assert len(res) > 2000
+        # -> If the number of results is less than 1050, the test can not test the
+        # batching operation. By having more than 1050 results we know that batching
+        # was triggered. 1050 appears to be a number of tasks that is available on a fresh
+        # test server.
+        assert len(res) > 1050
         openml.utils._list_all(
             listing_call=openml.tasks.functions._list_tasks,
             output_format="dataframe",
-            batch_size=2000,
+            batch_size=1050,
         )
+        # Comparing the number of tasks is not possible as other unit tests running in
+        # parallel might be adding or removing tasks!
+        # assert len(res) <= len(res2)
 
     @unittest.mock.patch("openml._api_calls._perform_api_call", side_effect=mocked_perform_api_call)
     def test_list_all_few_results_available(self, _perform_api_call):
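
The comments in the test above rely on one fact: receiving more results than `batch_size` proves that more than one batch was requested. A minimal sketch of that reasoning follows. `list_all` and `FakeServer` are hypothetical and written for this illustration only; they are not the implementation of `openml.utils._list_all`, which may page differently.

```python
def list_all(listing_call, batch_size):
    """Collect every result by requesting consecutive batches."""
    results = []
    offset = 0
    while True:
        batch = listing_call(offset=offset, size=batch_size)
        results.extend(batch)
        if len(batch) < batch_size:
            # A short (or empty) batch means there is nothing more to fetch.
            break
        offset += batch_size
    return results


class FakeServer:
    """Hypothetical in-memory server that counts listing requests."""

    def __init__(self, n_items):
        self.items = list(range(n_items))
        self.calls = 0

    def list_tasks(self, offset, size):
        self.calls += 1
        return self.items[offset:offset + size]


server = FakeServer(n_items=1200)
res = list_all(server.list_tasks, batch_size=1050)

# More results than one batch can hold implies at least two requests.
assert len(res) > 1050
print(len(res), server.calls)
```

With fewer items than `batch_size`, the loop would finish after a single request, which is why the test first asserts that the server holds more than 1050 tasks.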
