Merged
Changes from 1 commit
Commits
54 commits
f16ba08
minor fixes to usage.rst (#1090)
mfeurer May 31, 2021
6717e66
Add Windows to Github Action CI matrix (#1095)
PGijsbers Jun 16, 2021
2984403
Add ChunkedError to list of retry exception (#1118)
PGijsbers Oct 27, 2021
a6c0576
Always ignore MaxRetryError but log with warning (#1119)
PGijsbers Oct 27, 2021
b4c868a
Fix/1110 (#1117)
PGijsbers Oct 28, 2021
aed5010
Add AttributeError as suspect for dependency issue (#1121)
PGijsbers Nov 3, 2021
db7bb9a
Add CITATION.cff (#1120)
PGijsbers Jan 11, 2022
493511a
Precommit update (#1129)
PGijsbers Apr 14, 2022
99a62f6
Predictions (#1128)
PGijsbers Apr 19, 2022
c911d6d
Use GET instead of POST for flow exist (#1147)
PGijsbers Jun 28, 2022
c6fab8e
pre-commit update (#1150)
PGijsbers Jul 11, 2022
a8d96d5
Replace removed file with new target for download test (#1158)
PGijsbers Aug 16, 2022
ccb3e8e
Fix outdated docstring for list_tasks function (#1149)
chadmarchand Oct 6, 2022
9ce2a6b
Improve the error message on out-of-sync flow ids (#1171)
PGijsbers Oct 7, 2022
2ed77db
Add scikit-learn 1.0 and 1.1 values for test (#1168)
PGijsbers Oct 7, 2022
2fde8d5
Update Pipeline description for >=1.0 (#1170)
PGijsbers Oct 7, 2022
2ddae0f
Update URL to reflect new endpoint (#1172)
PGijsbers Oct 7, 2022
c17704e
Remove tests which only test scikit-learn functionality (#1169)
PGijsbers Oct 7, 2022
953f84e
fix nonetype error during print for tasks without class labels (#1148)
willcmartin Oct 7, 2022
6da0aac
Flow exists GET is deprecated, use POST (#1173)
PGijsbers Oct 10, 2022
22ee9cd
Test `get_parquet` on production server (#1174)
PGijsbers Oct 11, 2022
5cd6973
Refactor out different test cases to separate tests (#1176)
PGijsbers Oct 18, 2022
e6250fa
Provide clearer error when server provides bad data description XML (…
PGijsbers Oct 24, 2022
75fed8a
Update more sklearn tests (#1175)
PGijsbers Oct 24, 2022
f37ebbe
Remove dtype checking for prediction comparison (#1177)
PGijsbers Nov 24, 2022
a909a0c
feat(minio): Allow for proxies (#1184)
eddiebergman Nov 25, 2022
1dfe398
Update __version__.py (#1189)
PGijsbers Nov 25, 2022
580b536
Download all files (#1188)
PGijsbers Nov 25, 2022
5eb84ce
Skip tests that use arff reading optimization for typecheck (#1185)
PGijsbers Nov 25, 2022
467f6eb
Update configs (#1199)
PGijsbers Feb 20, 2023
dd62f2b
Update tests for sklearn 1.2, server issue (#1200)
PGijsbers Feb 20, 2023
2a7ab17
Version bump to dev and add changelog stub (#1190)
PGijsbers Feb 20, 2023
5f72e2e
Add: dependabot checks for workflow versions (#1155)
eddiebergman Feb 20, 2023
7d069a9
Change the cached file to reflect new standard #1188 (#1203)
PGijsbers Feb 21, 2023
23755bf
Bump actions/checkout from 2 to 3 (#1206)
dependabot[bot] Feb 21, 2023
603fe60
Update docker actions (#1211)
mfeurer Feb 22, 2023
17ff086
Support new numpy (#1215)
mfeurer Feb 23, 2023
d9850be
Allow unknown task types on the server (#1216)
mfeurer Feb 23, 2023
a968288
Mark sklearn tests (#1202)
PGijsbers Feb 23, 2023
beb598c
Bump actions/setup-python from 2 to 4 (#1212)
dependabot[bot] Feb 24, 2023
c590b3a
Make OpenMLTraceIteration a dataclass (#1201)
PGijsbers Feb 24, 2023
bbf09b3
Fix: correctly order the ground truth and prediction for ARFF files i…
LennartPurucker Feb 24, 2023
b84536a
Fix documentation building (#1217)
mfeurer Feb 24, 2023
5730669
Fix CI Python 3.6 (#1218)
mfeurer Feb 24, 2023
5b2ac46
Bump docker/setup-buildx-action from 1 to 2 (#1221)
dependabot[bot] Feb 24, 2023
5dcb7a3
Update run.py (#1194)
v-parmar Feb 24, 2023
687a0f1
Refactor if-statements (#1219)
PGijsbers Mar 1, 2023
c0a75bd
Ci python 38 (#1220)
mfeurer Mar 1, 2023
ce82fd5
Add summary of locally computed metrics to representation of run (#…
LennartPurucker Mar 1, 2023
c177d39
Better Error for Checksum Mismatch (#1225)
LennartPurucker Mar 4, 2023
24cbc5e
Fix coverage (#1226)
PGijsbers Mar 4, 2023
3c00d7b
Issue 1028: public delete functions for run, task, flow and database …
Mirkazemi Mar 21, 2023
7127e9c
Update changelog and version number for new release (#1230)
mfeurer Mar 22, 2023
bb3793d
Merge pull request #1233 from openml/main
mfeurer Mar 22, 2023
Issue 1028: public delete functions for run, task, flow and database (#…
Mirkazemi authored Mar 21, 2023
commit 3c00d7b05b17d248d53db40d1b437808f86e1442
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -77,6 +77,7 @@ target/
# IDE
.idea
*.swp
.vscode

# MYPY
.mypy_cache
Expand Down
4 changes: 4 additions & 0 deletions doc/api.rst
Expand Up @@ -38,6 +38,7 @@ Dataset Functions
attributes_arff_from_df
check_datasets_active
create_dataset
delete_dataset
get_dataset
get_datasets
list_datasets
Expand Down Expand Up @@ -103,6 +104,7 @@ Flow Functions
:template: function.rst

assert_flows_equal
delete_flow
flow_exists
get_flow
list_flows
Expand Down Expand Up @@ -133,6 +135,7 @@ Run Functions
:toctree: generated/
:template: function.rst

delete_run
get_run
get_runs
get_run_trace
Expand Down Expand Up @@ -251,6 +254,7 @@ Task Functions
:template: function.rst

create_task
delete_task
get_task
get_tasks
list_tasks
Expand Down
4 changes: 2 additions & 2 deletions doc/progress.rst
Expand Up @@ -10,10 +10,10 @@ Changelog
~~~~~~

* Add new contributions here.
* ADD#1144: Add locally computed results to the ``OpenMLRun`` object's representation.
* ADD#1028: Add functions to delete runs, flows, datasets, and tasks (e.g., ``openml.datasets.delete_dataset``).
* ADD#1144: Add locally computed results to the ``OpenMLRun`` object's representation if the run was created locally and not downloaded from the server.
* FIX #1197 #559 #1131: Fix the order of ground truth and predictions in the ``OpenMLRun`` object and in ``format_prediction``.
* FIX #1198: Support numpy 1.24 and higher.
* ADD#1144: Add locally computed results to the ``OpenMLRun`` object's representation if the run was created locally and not downloaded from the server.

0.13.0
~~~~~~
Expand Down
10 changes: 6 additions & 4 deletions openml/_api_calls.py
Expand Up @@ -351,10 +351,12 @@ def _send_request(request_method, url, data, files=None, md5_checksum=None):
xml.parsers.expat.ExpatError,
OpenMLHashException,
) as e:
if isinstance(e, OpenMLServerException):
if e.code not in [107]:
# 107: database connection error
raise
if isinstance(e, OpenMLServerException) and e.code != 107:
# Propagate all server errors to the calling functions, except
# for 107 which represents a database connection error.
# These are typically caused by high server load,
# which means trying again might resolve the issue.
raise
elif isinstance(e, xml.parsers.expat.ExpatError):
if request_method != "get" or retry_counter >= n_retries:
raise OpenMLServerError(
Expand Down
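The hunk above propagates every server exception except code 107, which it treats as a transient database connection error worth retrying. A minimal sketch of that pattern follows; `ServerException` and `call_with_retry` are hypothetical stand-ins written for illustration, not the names or the retry loop used inside `openml._api_calls`.

```python
import time


class ServerException(Exception):
    """Hypothetical stand-in for OpenMLServerException."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.code = code


def call_with_retry(request, n_retries=3, delay=0.0):
    """Call `request`, retrying only on code 107 (database connection error)."""
    if n_retries < 1:
        raise ValueError("n_retries must be at least 1")
    for attempt in range(1, n_retries + 1):
        try:
            return request()
        except ServerException as e:
            # Any other server error is propagated to the caller immediately;
            # code 107 is re-raised only once the retry budget is used up.
            if e.code != 107 or attempt == n_retries:
                raise
            time.sleep(delay)
```

Keeping the "re-raise unless 107" test as a single condition mirrors the flattened `if` in the diff and makes the one retryable case easy to spot.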
2 changes: 2 additions & 0 deletions openml/datasets/__init__.py
Expand Up @@ -11,6 +11,7 @@
list_qualities,
edit_dataset,
fork_dataset,
delete_dataset,
)
from .dataset import OpenMLDataset
from .data_feature import OpenMLDataFeature
Expand All @@ -28,4 +29,5 @@
"list_qualities",
"edit_dataset",
"fork_dataset",
"delete_dataset",
]
19 changes: 19 additions & 0 deletions openml/datasets/functions.py
Expand Up @@ -1271,3 +1271,22 @@ def _get_online_dataset_format(dataset_id):
dataset_xml = openml._api_calls._perform_api_call("data/%d" % dataset_id, "get")
# build a dict from the xml and get the format from the dataset description
return xmltodict.parse(dataset_xml)["oml:data_set_description"]["oml:format"].lower()


def delete_dataset(dataset_id: int) -> bool:
"""Delete dataset with id `dataset_id` from the OpenML server.

This can only be done if you are the owner of the dataset and
no tasks are attached to the dataset.

Parameters
----------
dataset_id : int
OpenML id of the dataset

Returns
-------
bool
True if the deletion was successful. False otherwise.
"""
return openml.utils._delete_entity("data", dataset_id)
25 changes: 12 additions & 13 deletions openml/exceptions.py
Expand Up @@ -11,15 +11,14 @@ class OpenMLServerError(PyOpenMLError):
"""class for when something is really wrong on the server
(result did not parse to dict), contains unparsed error."""

def __init__(self, message: str):
super().__init__(message)
pass


class OpenMLServerException(OpenMLServerError):
"""exception for when the result of the server was
not 200 (e.g., listing call w/o results)."""

# Code needs to be optional to allow the exceptino to be picklable:
# Code needs to be optional to allow the exception to be picklable:
# https://stackoverflow.com/questions/16244923/how-to-make-a-custom-exception-class-with-multiple-init-args-pickleable # noqa: E501
def __init__(self, message: str, code: int = None, url: str = None):
self.message = message
Expand All @@ -28,24 +27,19 @@ def __init__(self, message: str, code: int = None, url: str = None):
super().__init__(message)

def __str__(self):
return "%s returned code %s: %s" % (
self.url,
self.code,
self.message,
)
return f"{self.url} returned code {self.code}: {self.message}"


class OpenMLServerNoResult(OpenMLServerException):
"""exception for when the result of the server is empty."""
"""Exception for when the result of the server is empty."""

pass


class OpenMLCacheException(PyOpenMLError):
"""Dataset / task etc not found in cache"""

def __init__(self, message: str):
super().__init__(message)
pass


class OpenMLHashException(PyOpenMLError):
Expand All @@ -57,8 +51,7 @@ class OpenMLHashException(PyOpenMLError):
class OpenMLPrivateDatasetError(PyOpenMLError):
"""Exception thrown when the user has no rights to access the dataset."""

def __init__(self, message: str):
super().__init__(message)
pass


class OpenMLRunsExistError(PyOpenMLError):
Expand All @@ -69,3 +62,9 @@ def __init__(self, run_ids: set, message: str):
raise ValueError("Set of run ids must be non-empty.")
self.run_ids = run_ids
super().__init__(message)


class OpenMLNotAuthorizedError(OpenMLServerError):
"""Indicates an authenticated user is not authorized to execute the requested action."""

pass
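The comment in `OpenMLServerException` says `code` must be optional for the exception to be picklable. The reason is that unpickling an exception calls the class again with only the arguments that were passed to `super().__init__`, then restores the instance attributes. The sketch below illustrates this with two hypothetical classes (`RequiredArgsError`, `OptionalArgsError`) that are not part of openml.

```python
import pickle


class RequiredArgsError(Exception):
    """Hypothetical: `code` is required, so unpickling cannot rebuild it."""

    def __init__(self, message, code):
        self.code = code
        super().__init__(message)  # self.args == (message,)


class OptionalArgsError(Exception):
    """Hypothetical: `code` defaults to None, so cls(message) succeeds."""

    def __init__(self, message, code=None):
        self.code = code
        super().__init__(message)


def roundtrip(exc):
    """Pickle and unpickle an exception instance."""
    return pickle.loads(pickle.dumps(exc))
```

Calling `roundtrip(RequiredArgsError("boom", 107))` raises a `TypeError` while loading, because the class is called with the message alone. With `OptionalArgsError` the call succeeds and `code` is restored afterwards from the pickled instance state.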
10 changes: 9 additions & 1 deletion openml/flows/__init__.py
Expand Up @@ -2,7 +2,14 @@

from .flow import OpenMLFlow

from .functions import get_flow, list_flows, flow_exists, get_flow_id, assert_flows_equal
from .functions import (
get_flow,
list_flows,
flow_exists,
get_flow_id,
assert_flows_equal,
delete_flow,
)

__all__ = [
"OpenMLFlow",
Expand All @@ -11,4 +18,5 @@
"get_flow_id",
"flow_exists",
"assert_flows_equal",
"delete_flow",
]
19 changes: 19 additions & 0 deletions openml/flows/functions.py
Expand Up @@ -544,3 +544,22 @@ def _create_flow_from_xml(flow_xml: str) -> OpenMLFlow:
"""

return OpenMLFlow._from_dict(xmltodict.parse(flow_xml))


def delete_flow(flow_id: int) -> bool:
"""Delete flow with id `flow_id` from the OpenML server.

You can only delete flows which you uploaded and which
are not linked to runs.

Parameters
----------
flow_id : int
OpenML id of the flow

Returns
-------
bool
True if the deletion was successful. False otherwise.
"""
return openml.utils._delete_entity("flow", flow_id)
2 changes: 2 additions & 0 deletions openml/runs/__init__.py
Expand Up @@ -12,6 +12,7 @@
run_exists,
initialize_model_from_run,
initialize_model_from_trace,
delete_run,
)

__all__ = [
Expand All @@ -27,4 +28,5 @@
"run_exists",
"initialize_model_from_run",
"initialize_model_from_trace",
"delete_run",
]
18 changes: 18 additions & 0 deletions openml/runs/functions.py
Expand Up @@ -1209,3 +1209,21 @@ def format_prediction(
return [repeat, fold, index, prediction, truth]
else:
raise NotImplementedError(f"Formatting for {type(task)} is not supported.")


def delete_run(run_id: int) -> bool:
"""Delete run with id `run_id` from the OpenML server.

You can only delete runs which you uploaded.

Parameters
----------
run_id : int
OpenML id of the run

Returns
-------
bool
True if the deletion was successful. False otherwise.
"""
return openml.utils._delete_entity("run", run_id)
2 changes: 2 additions & 0 deletions openml/tasks/__init__.py
Expand Up @@ -15,6 +15,7 @@
get_task,
get_tasks,
list_tasks,
delete_task,
)

__all__ = [
Expand All @@ -30,4 +31,5 @@
"list_tasks",
"OpenMLSplit",
"TaskType",
"delete_task",
]
19 changes: 19 additions & 0 deletions openml/tasks/functions.py
Expand Up @@ -545,3 +545,22 @@ def create_task(
evaluation_measure=evaluation_measure,
**kwargs,
)


def delete_task(task_id: int) -> bool:
"""Delete task with id `task_id` from the OpenML server.

You can only delete tasks which you created and which have
no runs associated with them.

Parameters
----------
task_id : int
OpenML id of the task

Returns
-------
bool
True if the deletion was successful. False otherwise.
"""
return openml.utils._delete_entity("task", task_id)
22 changes: 21 additions & 1 deletion openml/testing.py
Expand Up @@ -3,12 +3,14 @@
import hashlib
import inspect
import os
import pathlib
import shutil
import sys
import time
from typing import Dict, Union, cast
import unittest
import pandas as pd
import requests

import openml
from openml.tasks import TaskType
Expand Down Expand Up @@ -306,4 +308,22 @@ class CustomImputer(SimpleImputer):
pass


__all__ = ["TestBase", "SimpleImputer", "CustomImputer", "check_task_existence"]
def create_request_response(
*, status_code: int, content_filepath: pathlib.Path
) -> requests.Response:
with open(content_filepath, "r") as xml_response:
response_body = xml_response.read()

response = requests.Response()
response.status_code = status_code
response._content = response_body.encode()
return response


__all__ = [
"TestBase",
"SimpleImputer",
"CustomImputer",
"check_task_existence",
"create_request_response",
]
39 changes: 36 additions & 3 deletions openml/utils.py
Expand Up @@ -172,9 +172,42 @@ def _delete_entity(entity_type, entity_id):
raise ValueError("Can't delete a %s" % entity_type)

url_suffix = "%s/%d" % (entity_type, entity_id)
result_xml = openml._api_calls._perform_api_call(url_suffix, "delete")
result = xmltodict.parse(result_xml)
return "oml:%s_delete" % entity_type in result
try:
result_xml = openml._api_calls._perform_api_call(url_suffix, "delete")
result = xmltodict.parse(result_xml)
return f"oml:{entity_type}_delete" in result
except openml.exceptions.OpenMLServerException as e:
# https://github.com/openml/OpenML/blob/21f6188d08ac24fcd2df06ab94cf421c946971b0/openml_OS/views/pages/api_new/v1/xml/pre.php
# Most exceptions are descriptive enough to be raised as their standard
# OpenMLServerException, however there are two cases where we add information:
# - a generic "failed" message, we direct them to the right issue board
# - when the user successfully authenticates with the server,
# but user is not allowed to take the requested action,
# in which case we raise an OpenMLNotAuthorizedError.
by_other_user = [323, 353, 393, 453, 594]
has_dependent_entities = [324, 326, 327, 328, 354, 454, 464, 595]
unknown_reason = [325, 355, 394, 455, 593]
if e.code in by_other_user:
raise openml.exceptions.OpenMLNotAuthorizedError(
message=(
f"The {entity_type} can not be deleted because it was not uploaded by you."
),
) from e
if e.code in has_dependent_entities:
raise openml.exceptions.OpenMLNotAuthorizedError(
message=(
f"The {entity_type} can not be deleted because "
f"it still has associated entities: {e.message}"
)
) from e
if e.code in unknown_reason:
raise openml.exceptions.OpenMLServerError(
message=(
f"The {entity_type} can not be deleted for unknown reason,"
" please open an issue at: https://github.com/openml/openml/issues/new"
),
) from e
raise


def _list_all(listing_call, output_format="dict", *args, **filters):
Expand Down
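The new `_delete_entity` body translates selected server error codes into more specific exceptions and re-raises everything else unchanged. The sketch below isolates that mapping so it can be read and tested without a server. The code lists are copied from the diff; the exception classes and the `translate_delete_error` helper are hypothetical stand-ins, and the messages are shortened.

```python
class ServerException(Exception):
    """Hypothetical stand-in for OpenMLServerException."""

    def __init__(self, message, code=None):
        super().__init__(message)
        self.message = message
        self.code = code


class NotAuthorizedError(Exception):
    """Hypothetical stand-in for OpenMLNotAuthorizedError."""


class ServerError(Exception):
    """Hypothetical stand-in for OpenMLServerError."""


# Code groups as listed in the diff of openml/utils.py.
BY_OTHER_USER = {323, 353, 393, 453, 594}
HAS_DEPENDENT_ENTITIES = {324, 326, 327, 328, 354, 454, 464, 595}
UNKNOWN_REASON = {325, 355, 394, 455, 593}


def translate_delete_error(entity_type, error):
    """Return a more specific exception for a failed delete, else the original."""
    if error.code in BY_OTHER_USER:
        return NotAuthorizedError(
            f"The {entity_type} can not be deleted because it was not uploaded by you."
        )
    if error.code in HAS_DEPENDENT_ENTITIES:
        return NotAuthorizedError(
            f"The {entity_type} can not be deleted because "
            f"it still has associated entities: {error.message}"
        )
    if error.code in UNKNOWN_REASON:
        return ServerError(f"The {entity_type} can not be deleted for unknown reason.")
    # Codes outside the three groups (e.g. 352, "does not exist") pass through.
    return error
```

A caller would write `raise translate_delete_error("data", e) from e` inside the `except` block, which keeps the original server exception attached as the cause.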
10 changes: 10 additions & 0 deletions tests/conftest.py
Expand Up @@ -185,3 +185,13 @@ def pytest_addoption(parser):
@pytest.fixture(scope="class")
def long_version(request):
request.cls.long_version = request.config.getoption("--long")


@pytest.fixture
def test_files_directory() -> pathlib.Path:
return pathlib.Path(__file__).parent / "files"


@pytest.fixture()
def test_api_key() -> str:
return "c0c42819af31e706efe1f4b88c23c6c1"
4 changes: 4 additions & 0 deletions tests/files/mock_responses/datasets/data_delete_has_tasks.xml
@@ -0,0 +1,4 @@
<oml:error xmlns:oml="http://openml.org/openml">
<oml:code>354</oml:code>
<oml:message>Dataset is in use by other content. Can not be deleted</oml:message>
</oml:error>
4 changes: 4 additions & 0 deletions tests/files/mock_responses/datasets/data_delete_not_exist.xml
@@ -0,0 +1,4 @@
<oml:error xmlns:oml="http://openml.org/openml">
<oml:code>352</oml:code>
<oml:message>Dataset does not exist</oml:message>
</oml:error>
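The mock responses above carry an error code and a message inside the `oml` namespace. As a rough illustration of how such a body can be read, the sketch below extracts both fields with the standard library's `xml.etree.ElementTree`. This is a swapped-in technique for the example only, since the library code shown in this diff parses responses with `xmltodict`. `parse_error` and `MOCK_RESPONSE` are hypothetical names.

```python
import xml.etree.ElementTree as ET

# Namespace prefix map matching the xmlns:oml declaration in the fixtures.
OML_NS = {"oml": "http://openml.org/openml"}

# Copy of the data_delete_not_exist.xml fixture shown above.
MOCK_RESPONSE = """<oml:error xmlns:oml="http://openml.org/openml">
  <oml:code>352</oml:code>
  <oml:message>Dataset does not exist</oml:message>
</oml:error>"""


def parse_error(xml_text):
    """Return (code, message) from an OpenML-style error document."""
    root = ET.fromstring(xml_text)
    code = int(root.findtext("oml:code", namespaces=OML_NS))
    message = root.findtext("oml:message", namespaces=OML_NS)
    return code, message
```

For the fixture above, `parse_error(MOCK_RESPONSE)` returns `(352, "Dataset does not exist")`.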