playground

Autoresearch Research Showcase

This folder collects small, reproducible research-style demos used to showcase omx autoresearch on harder optimization problems.

Design goals:

deterministic or seed-controlled evaluations
small code footprint
no large datasets checked into git
no heavyweight runtime artifacts committed
evaluator-driven keep/discard loops that are easy to inspect under .omx/logs/autoresearch/

Layout

playground/* — demo code and benchmark logic
missions/* — autoresearch mission contracts used by the showcase
scripts/eval-* — focused evaluator entrypoints
scripts/run-autoresearch-showcase.sh — convenience launcher for the bundled showcase missions

Naming convention:

playground demo dirs use *_demo when they are standalone benchmark implementations
mission dirs use problem/task slugs under missions/
evaluator scripts follow scripts/eval-<mission-slug>.(js|py)
the showcase runner maps short names (for example bayesopt, latent, sorting) to mission dirs

Index

Showcase	Mission	Evaluator	Status	Representative result
OMX self-optimization	`missions/in-action-cat-shellout-demo/`	`scripts/eval-in-action-cat-shellout-demo.js`	✅ completed	kept commit `99ebf16` / cherry-picked as `8478261` removing the autoresearch manifest `cat` shell-out
Kaggle-style tabular ML	`missions/ml-kaggle-model-optimization/`	`scripts/eval-ml-kaggle-model-optimization.py`	✅ completed	ROC AUC improved from `0.9458071278825997` to `0.9976939203354298`
Noisy high-dimensional Bayes-opt	`missions/noisy-bayesopt-highdim/`	`scripts/eval-noisy-bayesopt-highdim.py`	✅ completed	score improved from `2.833048700169374` to `4.75978993804531`
Latent subspace discovery	`missions/noisy-latent-subspace-discovery/`	`scripts/eval-noisy-latent-subspace-discovery.py`	✅ completed	score improved from `3.7019658949006504` to `4.176124116152444` with a compact `cem_search` strategy
Adaptive sorting optimization	`missions/adaptive-sort-optimization/`	`scripts/eval-adaptive-sort-optimization.py`	✅ completed	score improved from `2.1198297352756628` to `9.411498969440865` by switching counting sort to the observed value span

Use scripts/run-autoresearch-showcase.sh --list to see the bundled launch targets, or run one or more showcases directly with the wrapper script.

Results matrix

Showcase	Baseline	Kept / best documented result	Delta
OMX self-optimization	n/a	behavior-preserving cleanup	n/a
Kaggle-style tabular ML	0.9458071278825997 AUC	0.9976939203354298 AUC	+0.0518867924528301
Noisy high-dimensional Bayes-opt	2.833048700169374	4.75978993804531	+1.926741237875936
Latent subspace discovery	3.7019658949006504	4.176124116152444	+0.47415822125179353
Adaptive sorting optimization	2.1198297352756628	9.411498969440865	+7.291669234165202

Demos

1. OMX self-optimization

Mission: missions/in-action-cat-shellout-demo/
Evaluator: scripts/eval-in-action-cat-shellout-demo.js
What it demonstrates: a tiny self-hosted code optimization loop where autoresearch removes an unnecessary shell-out from OMX itself.

2. Kaggle-style tabular ML optimization

Demo code: playground/ml_kaggle_demo/
Mission: missions/ml-kaggle-model-optimization/
Evaluator: scripts/eval-ml-kaggle-model-optimization.py
What it demonstrates: model-family / hyperparameter search on a deterministic tabular classification benchmark with a score-improvement keep policy.

3. Noisy high-dimensional Bayes-opt demo

Demo code: playground/bayesopt_highdim_demo/
Mission: missions/noisy-bayesopt-highdim/
Evaluator: scripts/eval-noisy-bayesopt-highdim.py
What it demonstrates: a harder black-box optimization task with noise, limited evaluation budget, and curse-of-dimensionality pressure. The successful autoresearch run switched from random search to a subspace-aware fixed-kernel GP with denoised incumbent selection.

4. Latent subspace discovery demo

Demo code: playground/bayesopt_latent_discovery_demo/
Mission: missions/noisy-latent-subspace-discovery/
Evaluator: scripts/eval-noisy-latent-subspace-discovery.py
What it demonstrates: a follow-on harder variant where useful structure is mixed through latent directions rather than directly exposed as obvious coordinates.

5. Adaptive sorting optimization demo

Demo code: playground/adaptive_sort_demo/
Mission: missions/adaptive-sort-optimization/
Evaluator: scripts/eval-adaptive-sort-optimization.py
What it demonstrates: algorithm-engineering optimization over a deterministic mixed-distribution sorting benchmark using weighted comparison/move cost rather than noisy wall-clock timing.

Running the showcase

Example:

omx autoresearch missions/noisy-bayesopt-highdim

Then inspect:

RUN_ID=$(find .omx/logs/autoresearch -maxdepth 1 -mindepth 1 -type d -printf '%f\n' | sort | tail -n 1)
cat .omx/logs/autoresearch/$RUN_ID/manifest.json
cat .omx/logs/autoresearch/$RUN_ID/candidate.json
cat .omx/logs/autoresearch/$RUN_ID/iteration-ledger.json

You can also run evaluators directly without the supervisor:

node scripts/eval-in-action-cat-shellout-demo.js
python3 scripts/eval-ml-kaggle-model-optimization.py
python3 scripts/eval-noisy-bayesopt-highdim.py
python3 scripts/eval-noisy-latent-subspace-discovery.py
python3 scripts/eval-adaptive-sort-optimization.py

Repository hygiene

These showcases are meant to stay lightweight.

Please avoid committing:

downloaded datasets
large model artifacts
benchmark output dumps
generated caches like __pycache__/
runtime autoresearch logs under .omx/logs/

Keep research state in code, configs, missions, and evaluator scripts; keep bulky runtime outputs local.

Name		Name	Last commit message	Last commit date
parent directory ..
adaptive_sort_demo		adaptive_sort_demo
bayesopt_highdim_demo		bayesopt_highdim_demo
bayesopt_latent_discovery_demo		bayesopt_latent_discovery_demo
ml_kaggle_demo		ml_kaggle_demo
.gitignore		.gitignore
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Autoresearch Research Showcase

Layout

Index

Results matrix

Demos

1. OMX self-optimization

2. Kaggle-style tabular ML optimization

3. Noisy high-dimensional Bayes-opt demo

4. Latent subspace discovery demo

5. Adaptive sorting optimization demo

Running the showcase

Repository hygiene

FilesExpand file tree

playground

Directory actions

More options

Directory actions

More options

Latest commit

History

playground

Folders and files

parent directory

README.md

Autoresearch Research Showcase

Layout

Index

Results matrix

Demos

1. OMX self-optimization

2. Kaggle-style tabular ML optimization

3. Noisy high-dimensional Bayes-opt demo

4. Latent subspace discovery demo

5. Adaptive sorting optimization demo

Running the showcase

Repository hygiene