# self-consistency

Here are 24 public repositories matching this topic...

Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.

  • Updated Dec 7, 2024
  • Jupyter Notebook

An evaluation of prompting techniques (Zero-Shot CoT, Few-Shot, Self-Consistency) on the Mistral-7B model for mathematical reasoning. This project systematically benchmarks 7 distinct methods on the GSM8K dataset.

  • Updated Nov 2, 2025
  • Python
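For readers unfamiliar with the technique named in the entry above, self-consistency samples several reasoning chains for the same question and keeps the most frequent final answer. The following is a minimal sketch of that voting step only, not code from the listed repository. The function names `extract_final_number` and `self_consistency_vote` are hypothetical, and treating the last number in a completion as its answer is an assumption that suits GSM8K-style numeric answers.

```python
import re
from collections import Counter
from typing import Iterable, Optional


def extract_final_number(completion: str) -> Optional[str]:
    """Return the last number found in a completion, or None if there is none.

    Assumption: the final number in the text is the model's answer.
    Thousands separators are dropped so "1,000" and "1000" compare equal.
    """
    matches = re.findall(r"-?\d+(?:\.\d+)?", completion.replace(",", ""))
    return matches[-1] if matches else None


def self_consistency_vote(completions: Iterable[str]) -> Optional[str]:
    """Majority vote over the answers extracted from sampled completions."""
    answers = [a for a in map(extract_final_number, completions) if a is not None]
    if not answers:
        return None
    # most_common(1) returns the answer with the highest count;
    # ties resolve to the answer that appeared first.
    return Counter(answers).most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical sampled completions for one question.
    samples = [
        "She sells 9 eggs at 2 dollars each, so the answer is 18",
        "After breakfast and baking she has 9 left. I get 18.",
        "The total is 20",
        "I am not sure.",
    ]
    print(self_consistency_vote(samples))
```

In practice the completions would come from sampling the model several times at a non-zero temperature; that part is omitted here because it depends on the serving stack.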

Tactical next-action + reasoning prediction on 348 football match contexts (Shipd Project Eris). 4-component ensemble with task-coupling: DeBERTa-v3-base / large, cross-encoder MCQ scorer, zero-shot NLI, and a three-pass Qwen3.5-35B-A3B-Int4 + Gemma-4-26B-A4B-it MoE fusion with PRM rerank. W&B-instrumented. Target: combined score ≥ 0.80.

  • Updated Apr 21, 2026
  • Python

Evaluation framework for self-hosted LLMs. Systematic prompt ablation (baseline, CoT, few-shot, self-consistency voting) on Llama 3.1 8B via lm-evaluation-harness, with Wilson confidence-interval analysis, determinism validation, and load testing under concurrency. Found that chain-of-thought prompting degrades accuracy by 25 percentage points at small scale.

  • Updated Mar 9, 2026
  • Python
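The entry above mentions Wilson confidence intervals for comparing prompt variants. As a hedged illustration of what such an interval computes, and not the listed repository's implementation, the sketch below evaluates the standard Wilson score interval for a binomial proportion. The function name `wilson_interval`, the default `z = 1.96` (roughly a 95% interval), and the example counts are all assumptions made for illustration.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    successes: number of correct answers
    trials:    number of evaluated questions (must be positive)
    z:         normal quantile; 1.96 corresponds to about 95% coverage
    """
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")

    p_hat = successes / trials
    z2 = z * z
    denom = 1.0 + z2 / trials
    # The interval is centred on a shrunken estimate rather than on p_hat.
    center = (p_hat + z2 / (2.0 * trials)) / denom
    half_width = (
        z * math.sqrt(p_hat * (1.0 - p_hat) / trials + z2 / (4.0 * trials * trials)) / denom
    )
    return center - half_width, center + half_width


if __name__ == "__main__":
    # Hypothetical counts: 50 correct answers out of 100 questions.
    low, high = wilson_interval(50, 100)
    print(f"accuracy 0.50, interval [{low:.3f}, {high:.3f}]")
```

Unlike the simple normal-approximation interval, the Wilson interval stays within [0, 1] and behaves sensibly for small samples or accuracies near 0 or 1, which is one reason it is a common choice for benchmark accuracy.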

Research: Does multilingual self-consistency improve LLM reasoning beyond math? Empirical study across 6 benchmarks (commonsense, ethics, NLI, knowledge) and 10 languages using Qwen2.5-32B and Aya Expanse 32B on Apple Silicon (MLX). Chain-of-thought + cross-lingual prompting.

  • Updated Mar 19, 2026
  • Jupyter Notebook
