ENH HTML displays structure change to be vertical #33678

Open
DeaMariaLeon wants to merge 14 commits into scikit-learn:main from DeaMariaLeon:structure-borders2

Conversation

@DeaMariaLeon
Member

DeaMariaLeon commented Apr 3, 2026

Reference Issues/PRs

Towards #26595

What does this implement/fix? Explain your changes.

The objective here is the same as in #33647 but using a different method. Here I use borders to create the tree lines.
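For readers unfamiliar with the technique, the idea of drawing tree lines with borders can be sketched roughly as below. This is not the code in this PR: the CSS rules, the `render_tree` helper, and the nested-tuple input are all hypothetical and only illustrate how a left border (vertical line) plus a bottom border on a pseudo-element (horizontal connector) can produce a vertical tree.

```python
from html import escape

# Hypothetical CSS, not taken from the PR. Each <li> draws the vertical
# line with its left border; its ::before box draws the horizontal
# connector with a bottom border. The last child hides its own left
# border and lets ::before draw a short vertical stub instead, so the
# line stops at the last entry.
CSS = """
.tree, .tree ul { list-style: none; margin: 0; padding-left: 1.5em; }
.tree li { position: relative; padding: 0.2em 0 0.2em 1em;
           border-left: 1px solid #888; }
.tree li::before { content: ""; position: absolute; left: 0; top: 0;
                   width: 0.8em; height: 1em;
                   border-bottom: 1px solid #888; }
.tree li:last-child { border-left-color: transparent; }
.tree li:last-child::before { left: -1px; border-left: 1px solid #888; }
"""


def render_tree(node):
    """Render a (label, children) tuple as nested <li>/<ul> markup."""
    label, children = node
    inner = "".join(render_tree(child) for child in children)
    sub = f"<ul>{inner}</ul>" if children else ""
    return f"<li>{escape(label)}{sub}</li>"


# Illustrative structure mirroring a Pipeline with a ColumnTransformer.
pipeline = (
    "Pipeline",
    [
        (
            "preprocessor: ColumnTransformer",
            [
                (
                    "num: Pipeline",
                    [
                        ("imputer: SimpleImputer", []),
                        ("scaler: StandardScaler", []),
                    ],
                ),
                (
                    "cat: Pipeline",
                    [
                        ("encoder: OneHotEncoder", []),
                        ("selector: SelectPercentile", []),
                    ],
                ),
            ],
        ),
        ("classifier: LogisticRegression", []),
    ],
)

page = f"<style>{CSS}</style><ul class='tree'>{render_tree(pipeline)}</ul>"
```

Saving `page` to an `.html` file and opening it in a browser shows the nesting as a vertical tree. The exact offsets would need tuning for real use.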

Issue can be found here:
Reorganize the HTML diagram to have a more condensed view or a more vertical view

This is what I understood:

Screenshot 2026-03-27 at 11 08 40

AI usage disclosure

I used AI assistance for:

  • Code generation (e.g., when writing an implementation or fixing a bug)
  • Test/benchmark generation
  • Documentation (including examples)
  • Research and understanding

Used Copilot to fix a few bugs

Any other comments?

@DeaMariaLeon DeaMariaLeon moved this to In progress in Labs Apr 10, 2026
@DeaMariaLeon
Member Author

DeaMariaLeon commented Apr 10, 2026

Example here: https://output.circle-artifacts.com/output/job/2d65d02e-fe02-4fce-b90c-4d866b6cd7bb/artifacts/0/doc/auto_examples/compose/plot_column_transformer_mixed_types.html

EDIT:
At this point the code needs to be cleaned up, but I would like to know if this is what people had in mind:

With this example:

import numpy as np

from sklearn.compose import ColumnTransformer
from sklearn.datasets import fetch_openml
from sklearn.feature_selection import SelectPercentile, chi2
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

np.random.seed(0)
X, y = fetch_openml("titanic", version=1, as_frame=True, return_X_y=True)

numeric_features = ["age", "fare"]
numeric_transformer = Pipeline(
    steps=[("imputer", SimpleImputer(strategy="median")), ("scaler", StandardScaler())]
)

categorical_features = ["embarked", "sex", "pclass"]
categorical_transformer = Pipeline(
    steps=[
        ("encoder", OneHotEncoder(handle_unknown="ignore")),
        ("selector", SelectPercentile(chi2, percentile=50)),
    ]
)
preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, numeric_features),
        ("cat", categorical_transformer, categorical_features),
    ]
)
clf = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression())]
)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

clf.fit(X_train, y_train)
clf

The first display will be:
Screenshot 2026-04-10 at 10 01 22

Second:

Screenshot 2026-04-10 at 10 01 49

Opening ColumnTransformer:
Screenshot 2026-04-10 at 10 07 13

Opening "num":
Screenshot 2026-04-10 at 10 07 54

And so on.
Output features will need to be added once the output features PR is merged.
EDIT: I don't know where the total output features should be placed so that it's clear where they come from.
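For anyone who wants to look at the diagram outside a notebook while giving UX feedback, a minimal sketch follows. It assumes a scikit-learn installation that includes the branch under review (otherwise the current layout is rendered). It uses the public `sklearn.utils.estimator_html_repr` helper on an unfitted, slightly simplified version of the pipeline above, so no dataset download is needed. The output file name is an arbitrary choice.

```python
import tempfile
from pathlib import Path

from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.utils import estimator_html_repr

# Simplified variant of the example pipeline; the diagram can be
# rendered without fitting, so no data is loaded here.
numeric_transformer = Pipeline(
    steps=[("imputer", SimpleImputer(strategy="median")), ("scaler", StandardScaler())]
)
preprocessor = ColumnTransformer(
    transformers=[
        ("num", numeric_transformer, ["age", "fare"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["embarked", "sex", "pclass"]),
    ]
)
clf = Pipeline(
    steps=[("preprocessor", preprocessor), ("classifier", LogisticRegression())]
)

# Build the HTML string and write it to a file that a browser can open.
html = estimator_html_repr(clf)
out_path = Path(tempfile.gettempdir()) / "pipeline_diagram.html"
out_path.write_text(html, encoding="utf-8")
print(f"Diagram written to {out_path}")
```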

@DeaMariaLeon DeaMariaLeon marked this pull request as ready for review April 10, 2026 08:02
@DeaMariaLeon DeaMariaLeon changed the title POC II - HTML displays structure change to be vertical ENH HTML displays structure change to be vertical Apr 10, 2026
@DeaMariaLeon DeaMariaLeon moved this from In progress to Blocked in Labs Apr 13, 2026
@jeremiedbb jeremiedbb moved this from Blocked to In progress in Labs Apr 13, 2026
@AnneBeyer
Contributor

Thank you for preparing a draft for this @DeaMariaLeon!

I think it could make sense to play around with some designs outside the HTML implementation first to iterate on how the features could be integrated best. I'm not sure what the best tool for that would be. How did you create the image in the description above?
Here is a draft using Miro (which would allow collaboration, but was a bit of a pain to create, so I'm happy for any other suggestions) and the cyclic_cossin_linear_pipeline from examples/applications/plot_cyclical_feature_engineering.py. This will definitely need some more refinement, but WDYT?
image

@DeaMariaLeon
Member Author

DeaMariaLeon commented Apr 14, 2026

> How did you create the image in the description above?

PowerPoint.

Your drawing looks good, thanks.
EDIT: @AnneBeyer, FYI, Guillaume and I had already spent time working on a drawing. The end result was larger than what I show in the description. And yes, we were missing the output features.

@DeaMariaLeon
Member Author

DeaMariaLeon commented Apr 15, 2026

Hi @jeremiedbb, I wonder why this PR was moved from "blocked" to "in progress". As Adrin explained it, a PR should be under "blocked" if I am waiting on something before it can advance.

I set this PR as blocked because I am waiting for Olivier, who was going to check the UX only. It will also stay blocked until the output features PR is merged, because that will change things again.

I assume it was removed from blocked because it won't be in the release or because it's not urgent (which is normal and OK). In that case, maybe we need another label? Or am I missing something? I ask just to understand the "procedure".

@jeremiedbb jeremiedbb moved this from In progress to Blocked in Labs Apr 15, 2026
@DeaMariaLeon
Member Author

Just to be clear: at this point the feedback I'm looking for is from the user point of view only.
The code of this PR is a work in progress and not ready to be reviewed.
