Sumanth077/Hands-On-AI-Engineering

🚀 Hands-On AI Engineering

License: MIT PRs Welcome

A curated collection of practical, production-ready AI projects across multiple modalities, including language models, multimodal models, OCR systems, RAG pipelines, and AI agents. Each project is designed to help you learn, experiment, and build real-world AI applications.

🎯 Why This Repository?

  • Learn by Doing: Each project includes complete code, setup instructions, and documentation
  • Production-Ready: Projects follow best practices and are ready to be adapted for real-world use
  • Diverse Use Cases: From RAG systems to multi-agent workflows and specialized applications
  • Multiple Model Providers: Projects use OpenAI, Anthropic, Google, and open-source models
  • Active Community: Regular updates and new project additions

🗂️ Project Categories

🤖 AI Agents

Intelligent AI agents for various automation tasks.

  • Multi-Agent Financial Analyst — Team of specialized agents for comprehensive financial analysis
  • FinAgent — Financial assistant agent for stock market analysis and insights
  • Daily AI News Digest — Automated daily digest from 92 Karpathy-curated tech blogs, delivered to Telegram at 8 AM every morning. MiniMax M2.7 scores every article fetched in the last 24 hours and picks the 3 most significant stories.
  • Agentic Form Filler — Powerful agentic form-filling application using Landing AI for layout parsing and MiniMax M2.7 for multi-turn conversational data gathering.
  • AI Travel Planning Agent — Multi-agent travel planner that turns a single natural language request into a complete trip plan with flights, hotels, and a day-by-day itinerary.
  • Competitive Intelligence Agent — Multi-agent AI system that generates strategic sales battlecards by analyzing competitors through the unique lens of your own business context.
  • Multi-Agent Research Assistant (AG2) — Production-grade multi-agent research pipeline using AG2 (formerly AutoGen). Three specialists collaborate under GroupChat with LLM-driven speaker selection to research any topic and produce a structured Markdown report.
  • Self-Reflective Agentic RAG — LangGraph-driven RAG system that grades retrieved context for relevance and sufficiency, rewrites the query if needed, and only generates an answer once the context passes validation — reducing hallucinations through an iterative retrieval loop.
  • Agentic SQL Search — Natural language to SQL agent powered by Gemma 4. Ask plain-English questions about an e-commerce database and the agent writes, executes, and explains the SQL query — with full reasoning transparency in the Streamlit UI.
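The retrieval loop described for Self-Reflective Agentic RAG (retrieve, grade the context, rewrite the query if needed, generate only after the context passes) can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the project's LangGraph implementation: the function name `self_reflective_answer`, its callable parameters, `max_rounds`, and the fallback message are all hypothetical, and in a real system the grader, rewriter, and generator would be LLM calls.

```python
from typing import Callable, List

# Hypothetical fallback returned when no retrieval round passes grading.
FALLBACK = "I could not find enough supporting context to answer."


def self_reflective_answer(
    question: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, List[str]], bool],
    rewrite: Callable[[str], str],
    generate: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Answer only once the retrieved context passes the grader."""
    query = question
    for _ in range(max_rounds):
        docs = retrieve(query)
        # Grade against the original question, not the rewritten query.
        if grade(question, docs):
            return generate(question, docs)
        # Context was irrelevant or insufficient: rewrite and retry.
        query = rewrite(query)
    return FALLBACK


if __name__ == "__main__":
    # Toy stand-ins for the retriever and the LLM-backed steps.
    corpus = {"refund policy": ["Refunds are issued within 14 days."]}
    answer = self_reflective_answer(
        "how do I get my money back",
        retrieve=lambda q: corpus.get(q, []),
        grade=lambda question, docs: len(docs) > 0,
        rewrite=lambda q: "refund policy",
        generate=lambda question, docs: f"Based on context: {docs[0]}",
    )
    print(answer)  # prints "Based on context: Refunds are issued within 14 days."
```

Capping the number of rounds keeps the loop from rewriting forever when the knowledge base simply has no answer. Returning an explicit fallback instead of generating from weak context is what limits hallucinations.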

📸 OCR

Extracting structure and meaning from visual data and documents.

  • Image-to-Structured-Data Extractor — High-fidelity visual OCR using Mistral Large 3 and Instructor to convert images into validated, structured JSON.
  • LaTeX Formula OCR - Local vision-language OCR that extracts math formulas from images/PDFs into LaTeX and renders them instantly with KaTeX.

🎬 Multimodal

Projects combining vision, video, and language models.

  • GLM-OCR Pro — High-performance, local-first Streamlit application for structured document extraction. It uses the GLM-OCR model via Ollama to transform images and PDFs into cleanly formatted Markdown in real time.
  • Video Understanding Agent — Paste a YouTube URL and get an AI-powered chapter summary, key takeaways, and action items powered by Gemini Flash.

🔧 OpenClaw

Projects using the OpenClaw framework.

  • Eagle Eye — AI-powered GitHub PR review agent using OpenClaw

📚 RAG Applications

Retrieval-Augmented Generation systems for knowledge-enhanced AI applications.


🤝 Contributing

We welcome contributions! Whether you're adding new projects, improving existing ones, or fixing bugs, your help makes this repository better for everyone.

How to Contribute

  1. Read the guidelines: Check CONTRIBUTING.md for detailed instructions
  2. Create an issue: Propose your project or improvement
  3. Follow the structure: Use the appropriate category folder
  4. Submit a PR: One project per pull request

Project Structure Requirements

  • Each project must be in its own folder within the appropriate category
  • Must include a comprehensive README.md (use our template)
  • Must include requirements.txt or pyproject.toml
  • Must include .env.example for required API keys
  • Follow snake_case naming convention
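Before opening a PR, the requirements above can be checked locally with a short script. This is a sketch, not an official tool from this repository: the `check_project` helper and the example path are hypothetical, and it assumes the snake_case rule applies to the project folder name. It does not check whether the README follows the template.

```python
import re
from pathlib import Path
from typing import List

# Assumption: snake_case means lowercase letters and digits joined by underscores.
SNAKE_CASE = re.compile(r"^[a-z0-9]+(_[a-z0-9]+)*$")


def check_project(project_dir: Path) -> List[str]:
    """Return a list of unmet structure requirements (empty if all pass)."""
    problems = []
    if not SNAKE_CASE.match(project_dir.name):
        problems.append(f"folder name '{project_dir.name}' is not snake_case")
    if not (project_dir / "README.md").is_file():
        problems.append("missing README.md")
    # Either dependency file satisfies the requirement.
    if not any(
        (project_dir / name).is_file()
        for name in ("requirements.txt", "pyproject.toml")
    ):
        problems.append("missing requirements.txt or pyproject.toml")
    if not (project_dir / ".env.example").is_file():
        problems.append("missing .env.example")
    return problems


if __name__ == "__main__":
    # Hypothetical path; replace with your own category and project folder.
    for problem in check_project(Path("ai_agents/my_new_agent")):
        print(f"- {problem}")
```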

📜 License

This repository is licensed under the MIT License. See the LICENSE file for details.


🙏 Acknowledgments

Thank you to all contributors who have helped build this collection of AI engineering projects!


Built with ❤️ by the AI Engineering Community
