Commit 3d27645
Add LightOnOCR model implementation (#41621)
* Add LightOnOCR model implementation
* fix modular docstring error
* Improve LightOnOCR documentation and exports
* Rename LightOnOCR multi-modal projector to vision projection and add tests
* fix loading checkpoints without lm_head weights in safetensors format
* temp
* Refactor LightOnOCR config to use sub_configs pattern
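The `sub_configs` pattern mentioned above is how composite transformers configs declare their nested vision/text configurations. A minimal sketch of the idea, assuming illustrative class and attribute names rather than the actual LightOnOCR definitions:

```python
# Sketch of the composite-config (sub_configs) pattern: the top-level
# config builds its vision and text sub-configs from nested dicts.
# Class names and defaults here are illustrative assumptions.

class VisionConfig:
    def __init__(self, hidden_size=1024, patch_size=16):
        self.hidden_size = hidden_size
        self.patch_size = patch_size


class TextConfig:
    def __init__(self, hidden_size=2048, vocab_size=151936):
        self.hidden_size = hidden_size
        self.vocab_size = vocab_size


class CompositeConfig:
    # Maps sub-config attribute names to their classes, mirroring the
    # `sub_configs` mapping used by composite transformers configs.
    sub_configs = {"vision_config": VisionConfig, "text_config": TextConfig}

    def __init__(self, vision_config=None, text_config=None):
        # Each sub-config may arrive as a plain dict (e.g. parsed from
        # config.json) or be absent, in which case defaults apply.
        self.vision_config = self.sub_configs["vision_config"](**(vision_config or {}))
        self.text_config = self.sub_configs["text_config"](**(text_config or {}))


config = CompositeConfig(text_config={"vocab_size": 32000})
print(config.text_config.vocab_size)   # 32000
print(config.vision_config.patch_size) # 16
```

The benefit of the declarative `sub_configs` mapping is that generic tooling can discover and serialize nested configs without model-specific code.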
* rename processor kwargs
* Refactor LightOnOCR processor to use effective patch size
Calculate effective_patch_size during initialization and use it throughout
the processor. Update ProcessorKwargs defaults to include patch_size in
images_kwargs. Remove redundant model_input_names property.
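An "effective patch size" in vision processors of this kind typically combines the raw patch size with a spatial merge factor; the number of image tokens then follows from the image dimensions. A hedged sketch of that relationship (the formula and the `spatial_merge_size` default are assumptions, not taken from the LightOnOCR source):

```python
import math

def num_image_tokens(height, width, patch_size=16, spatial_merge_size=2):
    # Assumed relationship: spatial merging groups adjacent patches, so
    # the pixel stride per output token is patch_size * spatial_merge_size.
    effective_patch_size = patch_size * spatial_merge_size
    # Ceil-divide so a partial patch row/column at the border still
    # produces a token.
    rows = math.ceil(height / effective_patch_size)
    cols = math.ceil(width / effective_patch_size)
    return rows * cols

print(num_image_tokens(1024, 768))  # 32 rows * 24 cols = 768
```

Computing `effective_patch_size` once at processor initialization, as the commit describes, avoids re-deriving it for every image.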
* Improve LightOnOCR generation support with proper KV cache handling
* add modeling tests and compile modular
* Clean up LightOnOCR code and remove unused variables
Remove unused image_features variable and model_input_names property
* Add LightOnOCR documentation and test improvements
Add model documentation page with config and class references. Update toctree to include LightOnOCR entry. Clean up test formatting and add vision/text models to private model exceptions.
* Refactor LightOnOCR to use standardized RopeParameters and consolidate shared components
* Rename LightOnOCR model classes and fix config parameter naming
- Rename LightOnOCRText -> LightOnOCRTextModel and LightOnOCRVision -> LightOnOCRVisionModel
- Fix parameter naming: image_token_index -> image_token_id
- Set tie_word_embeddings default to False
- Add special case for inherited Qwen3Config attributes in LightOnOCRTextConfig
* Add missing parameter documentation for LightOnOCR config
* Simplify LightOnOCR forward methods with decorators and fix loss function call
* Reorganize LightOnOCR components to place vision before text and remove debug print
* fixup
* Fix image token expansion logic in Processor
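Image token expansion in vision-language processors generally means replacing each image placeholder in the prompt with as many image tokens as the vision tower will emit for that image. A minimal sketch of the pattern (function and token names are illustrative assumptions):

```python
def expand_image_tokens(text, image_token="<image>", tokens_per_image=(64,)):
    # Replace each placeholder occurrence, in order, with the token
    # count computed for the corresponding image.
    pieces = text.split(image_token)
    out = [pieces[0]]
    for n, rest in zip(tokens_per_image, pieces[1:]):
        out.append(image_token * n)
        out.append(rest)
    return "".join(out)

print(expand_image_tokens("Read: <image> done", tokens_per_image=(3,)))
# Read: <image><image><image> done
```

Keeping this expansion in the processor (rather than the model) means the tokenized sequence length already accounts for image features before the forward pass.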
* Copy pixtral attention to have both pixtral and qwen eager attention forward
* remove LightOnOCRTextPreTrainedModel from modular to be able to return attention
* Support both tensor and list formats for image_sizes parameter
* Update tests/models/lightonocr/test_processor_lightonocr.py
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Update docs/source/en/model_doc/lightonocr.md
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
* Move image_sizes tensor conversion from model to processor
* Simplify weight initialization to use uniform text_config initializer_range
* rename 1 letter vars
* Get image special tokens from tokenizer attributes in processor
* Return BaseModelOutputWithPast from LightOnOCRModel forward
* Add chat template to LightOnOCR processor test setup
* rm get_output_embeddings from LightOnOCRForConditionalGeneration (not needed)
* Add OCR integration test for LightOnOCR model
Tests model can perform OCR on real receipt image and extract expected text
* Fix device/dtype handling in LightOnOCR vision processing
* Add TransformersKwargs type hints to LightOnOCR forward methods
* Make torch imports conditional and use _from_config for LightOnOCR sub-models
* Set patch_size at runtime instead of modifying class defaults in LightOnOCR processor
* type kwargs
* Remove LightOnOCR forward comments
* Add vocab_size property and fix image_token_id in LightOnOCR
- Add vocab_size property to LightOnOCRConfig that delegates to text_config
- Fix test parameter name from image_token_index to image_token_id
- Add Unpack type hint to processor __call__ kwargs
- Remove unnecessary comments from modeling forward method
* Add vocab_size setter to LightOnOCR configuration
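The property/setter pair described in the two items above delegates `vocab_size` on the composite config to the text sub-config, so existing callers keep working. A minimal sketch of the delegation pattern, with illustrative stand-in classes:

```python
class TextConfig:
    def __init__(self, vocab_size=151936):
        self.vocab_size = vocab_size


class CompositeConfig:
    def __init__(self, text_config=None):
        self.text_config = text_config or TextConfig()

    @property
    def vocab_size(self):
        # Delegate to the text sub-config so callers can keep reading
        # config.vocab_size on the composite config.
        return self.text_config.vocab_size

    @vocab_size.setter
    def vocab_size(self, value):
        # Writes go through to the same underlying attribute.
        self.text_config.vocab_size = value


cfg = CompositeConfig()
cfg.vocab_size = 32000
print(cfg.text_config.vocab_size)  # 32000
```

Without the setter, code that resizes embeddings and then writes `config.vocab_size` would raise `AttributeError` on a property-only attribute.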
* Fix device mismatch in vision rotary embeddings and optimize test image sizes
* Improve LightOnOCR integration test with similarity-based output validation
* Enable flex attention
* Enable flex attention
* LightOnOCR description with blog post link
* redundant tie_word_embeddings
* remove architecture from default config
* vocab_size accessors
* remove useless tensor conversion
* remove useless conversion
* move dtype conversion to after image feature extraction
* remove useless stuff
* fixup
* export text and vision config classes
* refactor(lightonocr): remove unused weight initialization and fix tied weights mapping #0
- Remove custom _init_weights methods (handled by base class)
- Update _tied_weights_keys to dict format with explicit mapping
- Update documentation date
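The dict format for `_tied_weights_keys` makes the tie direction explicit: each tied parameter maps to the parameter it shares storage with, instead of a flat list that only names the tied keys. A sketch of the two shapes (the exact key paths below are illustrative, not copied from the LightOnOCR source):

```python
# Old-style: a flat list only says which keys are tied, not to what.
tied_weights_keys_list = ["lm_head.weight"]

# New-style: an explicit mapping from the tied parameter to its source,
# so weight-tying code does not have to guess the counterpart.
tied_weights_keys = {
    "lm_head.weight": "model.language_model.embed_tokens.weight",
}

for target, source in tied_weights_keys.items():
    print(f"{target} is tied to {source}")
```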
* fix(lightonocr): fix test failures for vocab_size access and device placement #0
- Use config.text_config.vocab_size instead of config.vocab_size for composite config
- Remove explicit device placement from attention_mask and image_sizes tensors
- Allow device_map='auto' to handle device placement in model parallelism tests
* ruff
* rebase 8/12/2025
* rebase 09/12/2025
* review zucchini
* review zucchini
* rebase 10/12/2025
* refactor(lighton_ocr): fix naming conventions to use snake_case and proper CamelCase #0
- Rename model identifier from 'lightonocr' to 'lighton_ocr' (snake_case)
- Update class names from 'LightOnOCR*' to 'LightOnOcr*' (proper CamelCase)
- Update all auto mappings, tests, and documentation accordingly
* style(lighton_ocr): remove unnecessary import guards for torch and vision #0
* style(lighton_ocr): remove unnecessary pass statement from LightOnOcrVisionConfig #0
* refactor(lighton_ocr): consolidate RMSNorm classes and use PixtralRMSNorm base #0
* refactor(lighton_ocr): import rotary pos emb functions from pixtral instead of redefining #0
- Remove duplicate vision_rotate_half and vision_apply_rotary_pos_emb functions
- Import apply_rotary_pos_emb from pixtral modeling
- Consolidate rotate_half/apply_rotary_pos_emb in generated modeling file
* refactor(lighton_ocr): remove unused LightOnOcrVisionPreTrainedModel class #0
- Remove redundant VisionPreTrainedModel class that was not used
- Add LightOnOcrVisionAttentionLayer to _no_split_modules in main PreTrainedModel
* refactor(lighton_ocr): simplify LightOnOcrAttention and clarify docstring #0
- Remove redundant __init__ that only called super()
- Update docstring to explain why class exists (avoids eager_attention_forward collision with Qwen3)
* test(lighton_ocr): remove unnecessary skipped test methods #0
* refactor(lighton_ocr): remove use_sliding_window and max_window_layers from config #0
- Use del in __init__ to explicitly remove inherited attrs from Qwen3Config
- Remove LightOnOCRTextConfig from check_config_attributes.py exception list
- Fix rms_norm_eps type annotation from int to float
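The `del`-in-`__init__` approach above removes attributes that a derived config inherits from its parent but does not actually use, so they never leak into the serialized config. A minimal sketch with stand-in classes (attribute names mirror the ones named in the commit):

```python
class BaseConfig:
    def __init__(self):
        self.hidden_size = 2048
        # Attributes that only make sense for the parent architecture.
        self.use_sliding_window = False
        self.max_window_layers = 28


class DerivedConfig(BaseConfig):
    def __init__(self):
        super().__init__()
        # Explicitly drop inherited attributes that do not apply to this
        # model, so they never appear in the saved config.
        del self.use_sliding_window
        del self.max_window_layers


cfg = DerivedConfig()
print(hasattr(cfg, "use_sliding_window"))  # False
print(cfg.hidden_size)                     # 2048
```

Deleting the attributes (rather than leaving them at defaults) is also what lets the config drop out of `check_config_attributes.py`'s exception list, since no unused attributes remain to flag.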
* fix make fixup
* docs(lighton_ocr): add docstring to LightOnOcrTextConfig and clean up check_repo #0
- Add configuration docstring with all parameters to LightOnOcrTextConfig
- Consolidate duplicate comments in PRIVATE_MODELS
- Remove redundant entries from IGNORE_NON_TESTED and IGNORE_NON_AUTO_CONFIGURED
* chore(lighton_ocr): update copyright headers to LightOn Team #0
* refactor(lighton_ocr): clean up model files and add license headers #0
- Add Apache 2.0 license headers to generated files
- Remove unused embedding getter/setter methods from ForConditionalGeneration
- Clean up LightOnOcrTextConfig docstring and remove Qwen references
* refactor(lighton_ocr): simplify processor token access and test setup #0
- Access special tokens directly from tokenizer attributes instead of getattr with defaults
- Simplify test setup to use model_id and inherited ProcessorTesterMixin methods
- Fix return types test to handle fast image processor limitations
* refactor(lighton_ocr): unify attention functions and fix buffer registration #0
- Remove duplicate vision_eager_attention_forward, reuse eager_attention_forward from Qwen3
- Add num_key_value_groups attribute for GQA compatibility
- Register original_inv_freq as buffer instead of plain attribute
* refactor(lighton_ocr): remove vision_model property alias #0
* docs(lighton_ocr): add usage example and update release date #0
* rebase 12/01/26
* Update docs/source/en/model_doc/lighton_ocr.md
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
* review cyril
* review cyril
* review cyril
* Remove test.py from version control
* apply modular
* update years everywhere it was not updated
* fix date
* remove Attention forward implem
* Fix all Vision prefixes instead of no prefix
* move tying to main config
* fix
* add to all
* immensely simplify
* fix test
* revert check_repo
---------
Co-authored-by: Said Taghadouini <taghadouinisaid@gmail.com>
Co-authored-by: Pablo Montalvo <39954772+molbap@users.noreply.github.com>
Co-authored-by: Cyril Vallez <cyril.vallez@huggingface.co>
Co-authored-by: Cyril Vallez <cyril.vallez@gmail.com>
1 parent 77146cc · commit 3d27645
17 files changed
Lines changed: 2138 additions & 4 deletions
File tree
- docs/source/en
- model_doc
- src/transformers/models
- auto
- lighton_ocr
- mistral3
- tests/models/lighton_ocr