Gemini 3.1 Flash-Lite
Stay organized with collections
Save and categorize content based on your preferences.
Gemini 3.1 Flash-Lite is our most cost-efficient Gemini model,
optimized for low latency use cases for high-volume, cost-sensitive LLM traffic.
It provides a significant quality increase over
Gemini 2.0 Flash-Lite and Gemini 2.5 Flash-Lite
models, matching Gemini 2.5 Flash performance across key capability areas:
Improved response quality: Aims to match 2.5 Flash performance.
Improved instruction following: Targeted improvements to serve as a reliable
migration path for complex chatbot and instruction-heavy workflows.
Improved audio input: Improved audio-input quality for tasks like Automated Speech
Recognition (ASR).
Expanded thinking support: You can control how much reasoning the model
performs by choosing from minimal, low, medium, or high
thinking levels. This feature
lets you balance response quality and speed for your specific use case.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2026-04-03 UTC."],[],[]]