Hi, thanks for the great benchmark!
I have a question regarding the evaluation setup for LiveCodeBench v6. For context, I have been referring to:
- https://www.emergentmind.com/topics/livecodebench-v5-v6-pro
- https://livecodebench.github.io/leaderboard.html
From the documentation, it seems that:
- LiveCodeBench v6 contains 454 problems collected from Aug 2024 to May 2025.
However, in practice I observed that:
- The test_v6 split on HuggingFace contains only 175 problems.
- The difficulty distribution appears to be 75 Easy / 75 Medium / 25 Hard, which matches the commonly reported evaluation setup (see the snippet below for how I checked).
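For reproducibility, here is a minimal sketch of how I counted the problems. The dataset path (`livecodebench/code_generation_lite`), the split name `test_v6`, and the `difficulty` field are assumptions based on my local setup, so please correct me if the official protocol loads the data differently:

```python
from collections import Counter
from datasets import load_dataset

# Assumed dataset path and split name -- this is how I loaded the data
# locally; please point me to the official loading path if it differs.
ds = load_dataset(
    "livecodebench/code_generation_lite",
    split="test_v6",
    trust_remote_code=True,
)

print(len(ds))                    # 175 in my run
print(Counter(ds["difficulty"]))  # 75 easy / 75 medium / 25 hard in my run
```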
Could you clarify the intended evaluation protocol?
- Is 454 the total dataset size, while 175 is the standard evaluation subset?
- Should experiments reported in papers follow the 175-problem split?
- Is there an official list defining this evaluation subset?
I want to make sure my evaluation setup is consistent with the intended benchmark protocol.
Thanks again for releasing LiveCodeBench!