Skip to content
Closed
Show file tree
Hide file tree
Changes from 3 commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
71 changes: 67 additions & 4 deletions lib/matplotlib/axes/_axes.py
Original file line number Diff line number Diff line change
Expand Up @@ -3050,7 +3050,7 @@ def broken_barh(self, xranges, yrange, align="bottom", **kwargs):
@_docstring.interpd
def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing=0,
tick_labels=None, labels=None, orientation="vertical", colors=None,
**kwargs):
hatch=None,**kwargs):
Comment thread
ilakkmanoharan marked this conversation as resolved.
Outdated
"""
Make a grouped bar plot.

Expand Down Expand Up @@ -3190,6 +3190,21 @@ def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing

If not specified, the colors from the Axes property cycle will be used.

hatch : sequence of str or None, optional
Hatch pattern(s) to apply per dataset.

- If ``None`` (default), no hatching is applied.
- If a sequence of strings is provided (e.g., ``['//', 'xx', '..']``),
the patterns are cycled across datasets.
- If the sequence contains a single element (e.g., ``['//']``),
the same pattern is repeated for all datasets.
- Single string values (e.g., ``'//'``) are **not supported**.
Comment thread
timhoffm marked this conversation as resolved.
Outdated

Raises
------
ValueError
If ``hatch`` is a single string or a non-iterable value.

Comment thread
timhoffm marked this conversation as resolved.
Outdated
**kwargs : `.Rectangle` properties

%(Rectangle:kwdoc)s
Expand Down Expand Up @@ -3318,6 +3333,45 @@ def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing
# TODO: do we want to be more restrictive and check lengths?
colors = itertools.cycle(colors)

if hatch is None:
# No hatch specified: disable hatching entirely by cycling [None].
hatches = itertools.cycle([None])
# TODO: Discussion —
# Should grouped_bar() apply a default hatch pattern (e.g., '//')
# when none is provided ?
Comment thread
timhoffm marked this conversation as resolved.
Outdated

elif isinstance(hatch, str):
raise ValueError("'hatch' must be a sequence of strings "
"(e.g., ['//']) or None; "
"a single string like '//' is not allowed."
)

else:
try:
hatch_list = list(hatch)
except TypeError:
raise ValueError("'hatch' must be a sequence of strings"
"(e.g., ['//']) or None") from None

if not hatch_list:
# Empty sequence → treat as no hatch.
hatches = itertools.cycle([None])
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's raise instead. Semantically, hatch=[] does not make sense. Users should use None for "no hatch".

Copy link
Copy Markdown
Contributor Author

@ilakkmanoharan ilakkmanoharan Nov 7, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes sense semantically — None is the canonical way to disable hatching.
However, in practice, there are scenarios where hatch can be constructed programmatically (e.g., through loops, Giles, or data pipelines). In such cases, it may naturally or unintentionally evaluate to an empty list — for example, via something like config.get("hatch", []). Raising an error in these situations could make otherwise valid automation brittle.
Would it be acceptable to issue a warning and treat an empty list as None instead, to preserve robustness? This would align with Matplotlib’s general philosophy of being tolerant and user-friendly toward benign input while still informing users of the preferred approach.

if not hatch_list:
    _api.warn_external(
        "'hatch' was an empty sequence; treating as no hatch. "
        "Use None explicitly to suppress this warning."
    )
    hatches = itertools.cycle([None])

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would rather argument, that it's likely an error in the programmatic code if it returns an empty list. I don't see a practical case in which one would naturally end up with an empty list if one has a finite set of groups and want to generate hatches. config.get("hatch", []) seems contrieved. Why would one want to write this instead of config.get("hatch") which will work correctly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, in practice, empty sequences can occur naturally in data-driven workflows. For instance, configuration files often deserialize [] for optional arrays (e.g., JSON/YAML style templates), and plotting utilities may use defensive defaults like config.get("hatch", []) to avoid KeyError. Similarly, dynamically generating hatches from metadata — e.g. [meta.get("hatch") for meta in datasets] — can yield an empty list when all datasets omit that field.
In such cases, treating [] as “no hatch” (with a warning) keeps things robust and user-friendly for automated or programmatic pipelines, while still guiding users to prefer None explicitly.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@timhoffm Waiting for your response. What did you decide regarding this?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @timhoffm — I appreciate your detailed perspective. I’d like to clarify why I still believe that gracefully handling an empty sequence (with a warning) can be beneficial in the long run, both pragmatically and philosophically.

1. Defensive and user-friendly design

Matplotlib’s strength has always been its tolerance toward benign irregularities. Users rely on it as a stable visualization backend even when inputs are indirectly produced — from templates, configs, or pipelines — not always hand-written by developers.

Treating [] as equivalent to None (with a warning) preserves this spirit of robustness and continuity. It avoids brittle failures in higher-level tools that integrate Matplotlib as a rendering engine, especially when such inputs originate from sources beyond direct user control.

2. Real-world automation and serialization

Even though empty lists may seem unlikely in manual usage, they appear quite often in automated systems:
YAML/JSON serialization — missing or optional fields frequently deserialize to empty arrays rather than None.
Dynamic workflows — templated plotting pipelines or report generators (e.g., Airflow, Prefect, or CI visualizations) often construct keyword dictionaries dynamically. When an aesthetic parameter is unused, it may remain as an empty list placeholder.

Interoperability layers — frameworks like seaborn, pandas, or holoviews may forward arguments programmatically, sometimes defaulting to empty sequences when certain aesthetics are unused.
These cases are not errors per se — they are side effects of flexible, data-driven design where “no hatch” may naturally translate to an empty sequence.

3. Design philosophy — protecting the fragile parts

From a design standpoint, the absence of current breakage does not mean the absence of potential breakage.
If a construct can occur naturally in evolving ecosystems, it’s prudent to make the library resilient to it — especially when doing so is low-risk and consistent with existing semantics (None already means “no hatch”).
Designing defensively means protecting things that could break, even without hard proof that they already do. In software ecosystems, “possible” often becomes “inevitable” over time as integration layers grow.

4. Minimal cost, maximal stability

Issuing a warning — rather than raising — strikes a balanced compromise:
It informs users of the preferred canonical input (None).
It preserves existing behavior for programmatic systems that may unintentionally produce [].
It avoids introducing regressions in automated pipelines where errors may not be easy to trace back to Matplotlib.

So, while I agree that hatch=[] might not be common in hand-written code, its graceful handling can future-proof Matplotlib against subtle integration issues in complex or automated visualization systems — where inputs are often constructed indirectly rather than typed explicitly.

In essence, this isn’t about accommodating an artificial case — it’s about maintaining the robustness and user-centered flexibility that have always defined Matplotlib’s design philosophy.

I’d like to open a separate issue to discuss this more broadly, since the handling of empty sequences (like hatch=[]) may have implications beyond this specific function.

This could be an opportunity to evaluate whether Matplotlib should adopt a consistent, system-wide approach toward benign empty inputs — for example, treating them as “no-op” equivalents with a warning, rather than raising outright.

Such a discussion could help clarify the library’s general stance on defensive input handling versus strict validation, especially in the context of data-driven or programmatic pipelines that interface with Matplotlib.

Personally, as a user, it feels more natural and intuitive to assign hatch = [], since it allows me to dynamically add or remove patterns later

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I recognize you are trying to make your point. However, you are not responding to my points in #30726 (comment). In particular, please stop posting AI-generated content. A generic AI argument falls short of considering the nuances and context of API design here. I am not spending time on rebutting AI gibberish. (I'm sorry if my AI assumption is wrong, but the text feels very AI-generated and scanners say with 99% likelihood it is.)

These principles found my decision:

  • In the face of ambiguity, refuse the temptation to guess.
  • Errors should never pass silently.
  • Consistency - compare stackplot(..., hatch)
  • If in doubt, keep the API minimal. - It's much simpler to later extend the API than to narrow it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just wanted to point out a small consistency angle that might be worth considering. If None is accepted to mean “no hatch,” then treating an empty list [] the same way actually keeps the API simple rather than expanding it — both essentially express “no pattern.”

For example:

config = {"color": ["red", "blue"], "hatch": []}
ax.bar(x, y, **config)

In a lot of configuration-driven setups (like JSON or YAML), optional arrays often deserialize as empty lists by default instead of None. Handling [] like None would make the behavior more predictable and tolerant in those cases, without changing the meaning or adding any new semantics.

This isn’t about guessing what the user wants — it’s more about keeping the two representations of “no hatch” consistent in data-driven workflows, where either form might appear naturally.

For instance, in dynamic plotting code, users might deliberately pass [] when they’re building visual attributes incrementally or conditionally:

hatches = []
if use_patterns:
    hatches.append('//')
ax.bar(x, y, hatch=hatches)

In this kind of logic, hatches starts as an empty list by design — it’s not an error or a missing value, just a placeholder that may or may not be populated later depending on conditions. Passing [] deliberately keeps the code clean and avoids special-case checks like:

ax.bar(x, y, hatch=None if not hatches else hatches)

So, supporting [] as equivalent to None helps in situations where users or frameworks construct arguments dynamically, keeping code simpler and more consistent without altering semantics.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There can be - IMHO rare - cases where this is convenient. But that’s not enough reason for me to be more permissive, given it potentially hides unintended behavior and is not in line with how stackplot operates (our most prominent existing multi-dataset plotting function).

I wield my decision authority as the project API lead here to end the discussion so that we can move forward.

It is possible to reconsider later, and broaden the API if strong user becomes apparent. That would be a global API decision and affect stackplot and possibly Collections as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, Tim — I appreciate the clear decision and your time in walking through the reasoning. Understood on keeping the API strict for now and prioritizing consistency with stackplot. I won’t push the point further, and I’ll make the required changes to the code soon.

elif not all(isinstance(h, str) for h in hatch_list):
raise TypeError("All entries in 'hatch' must be strings")
Comment thread
timhoffm marked this conversation as resolved.
Outdated
else:
# Sequence of hatch patterns: cycle through them as needed.
# Example: hatch=['//', 'xx', '..'] → patterns repeat across datasets.
hatches = itertools.cycle(hatch)

# TODO: Discussion —
# We may later introduce optional strict validation:
# if len(hatch) != num_datasets:
# raise ValueError(
# f"Expected {num_datasets} hatches, got {len(hatch)}"
# )
# This would enforce a strict 1:1 correspondence between
# datasets and provided hatches, preventing silent cycling.
Comment thread
timhoffm marked this conversation as resolved.
Outdated

Comment on lines +3332 to +3361
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you condense the whitespace please?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, why not just follow the pattern in hist and allow a string (since we're allowing 1d list) and therefore simplify the error checking - basically bar handles the hatch validation instead of grouped bar.

https://github.com/matplotlib/matplotlib/blob/dedfe9be48ad82cade86766ef89410844ff09b31/lib/matplotlib/axes/_axes.py#L7560C8-L7560C76

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When hatch is given as a single string (e.g. "//"), we raise a ValueError. This prevents Matplotlib from incorrectly iterating over individual characters ('/', '/') instead of treating the hatch as a single pattern.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@story645 @timhoffm

https://github.com/matplotlib/matplotlib/blob/dedfe9be48ad82cade86766ef89410844ff09b31/lib/matplotlib/axes/_axes.py#L7560C8-L7560C76 --->>>>>> the implementation cycles hatch patterns per patch, which breaks grouped bar semantics because each dataset generates multiple patches. This causes the hatch sequence to repeat incorrectly within the same dataset.
I am wondering if we need to replace the patch-level cycling with a dataset-level normalization instead? Specifically:
determine the number of datasets (num_datasets = len(heights)),
normalize hatch to one pattern per dataset, and
apply that hatch consistently to all patches for that dataset.
This would align hatch behavior with how color and label are already broadcast in grouped_bar, and it also resolves the test failures caused by per-patch cycling.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically:
determine the number of datasets (num_datasets = len(heights)),
normalize hatch to one pattern per dataset, and
apply that hatch consistently to all patches for that dataset.

Sounds good to me, I think would also then allow for easier expansion into nesting (hatch per patch per dataset) if we wanted to go that direction.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[MNT]: Discussion : Normalize hatch patterns per dataset instead of per patch in grouped_bar #30789

I’ve created a new issue to address the hatch-normalization behavior in grouped_bar (i.e., switching from per-patch hatch cycling to per-dataset hatch assignment). This allows us to track that behavioral change separately from PR #30726, which is already large and focused on adding hatch support. Splitting this out avoids scope creep, keeps the current PR reviewable, and ensures the semantic change to hatch handling receives dedicated discussion and review.

@timhoffm @story645

Copy link
Copy Markdown
Member

@story645 story645 Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ilakkmanoharan I closed that issue b/c that behavioral discussion is necessary in this PR to move this PR forward - it's not scope creep b/c it's inherent to how you're handling the hatch argument and which errors to throw. If the current behavior of grouped hatch is one color per dataset, than the behavior of hatch should also be one color per dataset. This is also consistent with stackplot. Also, that's the behavior you've documented:

Image

Until you get better intuition for what would make a good issue, I strongly recommend you ask maintainers if a discussion warrants a stand alone issue before opening a new one.

bar_width = (group_distance /
(num_datasets + (num_datasets - 1) * bar_spacing + group_spacing))
bar_spacing_abs = bar_spacing * bar_width
Expand All @@ -3331,15 +3385,24 @@ def grouped_bar(self, heights, *, positions=None, group_spacing=1.5, bar_spacing
# place the bars, but only use numerical positions, categorical tick labels
# are handled separately below
bar_containers = []
for i, (hs, label, color) in enumerate(zip(heights, labels, colors)):

# Both colors and hatches are cycled indefinitely using itertools.cycle.
# heights and labels, however, are finite (length = num_datasets).
# Because zip() stops at the shortest iterable, this loop executes exactly
# num_datasets times even though colors and hatches are infinite.
# This ensures one (color, hatch) pair per dataset
# without explicit length checks.
Comment thread
ilakkmanoharan marked this conversation as resolved.
Outdated
for i, (hs, label, color, hatch_pattern) in enumerate(
zip(heights, labels, colors, hatches)
):
lefts = (group_centers - 0.5 * group_distance + margin_abs
+ i * (bar_width + bar_spacing_abs))
if orientation == "vertical":
bc = self.bar(lefts, hs, width=bar_width, align="edge",
label=label, color=color, **kwargs)
label=label, color=color, hatch=hatch_pattern, **kwargs)
else:
bc = self.barh(lefts, hs, height=bar_width, align="edge",
label=label, color=color, **kwargs)
label=label, color=color, hatch=hatch_pattern,**kwargs)
bar_containers.append(bc)

if tick_labels is not None:
Expand Down
1 change: 1 addition & 0 deletions lib/matplotlib/axes/_axes.pyi
Original file line number Diff line number Diff line change
Expand Up @@ -287,6 +287,7 @@ class Axes(_AxesBase):
bar_spacing: float | None = ...,
orientation: Literal["vertical", "horizontal"] = ...,
colors: Iterable[ColorType] | None = ...,
hatch: Iterable[str] | None = ...,
**kwargs
) -> list[BarContainer]: ...
def stem(
Expand Down
2 changes: 2 additions & 0 deletions lib/matplotlib/pyplot.py
Original file line number Diff line number Diff line change
Expand Up @@ -3536,6 +3536,7 @@ def grouped_bar(
labels: Sequence[str] | None = None,
orientation: Literal["vertical", "horizontal"] = "vertical",
colors: Iterable[ColorType] | None = None,
hatch: Iterable[str] | None = None,
**kwargs,
) -> list[BarContainer]:
return gca().grouped_bar(
Expand All @@ -3547,6 +3548,7 @@ def grouped_bar(
labels=labels,
orientation=orientation,
colors=colors,
hatch=hatch,
**kwargs,
)

Expand Down
115 changes: 115 additions & 0 deletions lib/matplotlib/tests/test_axes.py
Original file line number Diff line number Diff line change
Expand Up @@ -2267,6 +2267,121 @@ def test_grouped_bar_return_value():
assert bc not in ax.containers


def test_grouped_bar_single_hatch_str_raises():
"""Passing a single string for hatch should raise a ValueError."""
fig, ax = plt.subplots()
x = np.arange(3)
heights = [np.array([1, 2, 3]), np.array([2, 1, 2])]
with pytest.raises(ValueError, match="must be a sequence of strings"):
ax.grouped_bar(heights, positions=x, hatch='//')


def test_grouped_bar_hatch_non_iterable_raises():
"""Non-iterable hatch values should raise a ValueError."""
fig, ax = plt.subplots()
heights = [np.array([1, 2]), np.array([2, 3])]
with pytest.raises(ValueError, match="must be a sequence of strings"):
ax.grouped_bar(heights, hatch=123) # invalid non-iterable


def test_grouped_bar_hatch_sequence():
"""Each dataset should receive its own hatch pattern when a sequence is passed."""
fig, ax = plt.subplots()
x = np.arange(2)
heights = [np.array([1, 2]), np.array([2, 3]), np.array([3, 4])]
hatches = ['//', 'xx', '..']
containers = ax.grouped_bar(heights, positions=x, hatch=hatches)

# Verify each dataset gets the corresponding hatch
for hatch, c in zip(hatches, containers.bar_containers):
for rect in c:
assert rect.get_hatch() == hatch


def test_grouped_bar_hatch_cycles_when_shorter_than_datasets():
"""When the hatch list is shorter than the number of datasets,
patterns should cycle.
"""

fig, ax = plt.subplots()
x = np.arange(2)
heights = [
np.array([1, 2]),
np.array([2, 3]),
np.array([3, 4]),
]
hatches = ['//', 'xx'] # shorter than number of datasets → should cycle
containers = ax.grouped_bar(heights, positions=x, hatch=hatches)

expected_hatches = ['//', 'xx', '//'] # cycle repeats
for gi, c in enumerate(containers.bar_containers):
for rect in c:
assert rect.get_hatch() == expected_hatches[gi]


def test_grouped_bar_hatch_none():
"""Passing hatch=None should result in bars with no hatch."""
fig, ax = plt.subplots()
x = np.arange(2)
heights = [np.array([1, 2]), np.array([2, 3])]
containers = ax.grouped_bar(heights, positions=x, hatch=None)

# All bars should have no hatch applied
for c in containers.bar_containers:
for rect in c:
assert rect.get_hatch() in (None, ''), \
f"Expected no hatch, got {rect.get_hatch()!r}"


def test_grouped_bar_hatch_mixed_orientation():
"""Ensure hatch works correctly for both vertical and horizontal orientations."""
fig, (ax1, ax2) = plt.subplots(1, 2)
x = np.arange(3)
heights = [np.array([1, 2, 3]), np.array([2, 1, 2])]
hatches = ['//', 'xx']

containers_v = ax1.grouped_bar(
heights, positions=x, hatch=hatches, orientation="vertical")
containers_h = ax2.grouped_bar(
heights, positions=x, hatch=hatches, orientation="horizontal")

for gi, (cv, ch) in enumerate(
zip(containers_v.bar_containers, containers_h.bar_containers)):
for rect in cv:
assert rect.get_hatch() == hatches[gi]
for rect in ch:
assert rect.get_hatch() == hatches[gi]


Comment thread
timhoffm marked this conversation as resolved.
Outdated
def test_grouped_bar_empty_string_disables_hatch():
"""An empty string in the hatch list should result in no hatch for that dataset."""
fig, ax = plt.subplots()
x = np.arange(3)
heights = [np.array([1, 2, 3]), np.array([2, 1, 2])]
hatches = ["", "xx"]
containers = ax.grouped_bar(heights, positions=x, hatch=hatches)
counts = [[rect.get_hatch() for rect in bc] for bc in containers.bar_containers]
assert all(h == '' or h is None for h in counts[0]) # first dataset: no hatch
assert all(h == 'xx' for h in counts[1]) # second dataset: hatched


def test_grouped_bar_dict_with_labels_forbidden():
"""Passing labels along with dict input should raise an error."""
fig, ax = plt.subplots()
data = {"a": [1, 2], "b": [2, 1]}
with pytest.raises(ValueError, match="cannot be used if 'heights' is a mapping"):
ax.grouped_bar(data, labels=["x", "y"])


def test_grouped_bar_positions_not_equidistant():
"""Passing non-equidistant positions should raise an error."""
fig, ax = plt.subplots()
x = np.array([0, 1, 3])
heights = [np.array([1, 2, 3]), np.array([2, 1, 2])]
with pytest.raises(ValueError, match="must be equidistant"):
ax.grouped_bar(heights, positions=x)


def test_boxplot_dates_pandas(pd):
# smoke test for boxplot and dates in pandas
data = np.random.rand(5, 2)
Expand Down
Loading