Commit be1425b
fix: support non-Latin text in InMemoryMemoryService search
Merge #5504
Fixes #5501
### Root Cause
`_extract_words_lower` uses `re.findall(r'[A-Za-z]+', text)`, which matches only ASCII letters. All non-Latin characters (Japanese, Chinese, Korean, Cyrillic, etc.) are silently discarded, so `search_memory` cannot match any non-Latin text.
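The failure described above can be reproduced in isolation. The following is a minimal sketch, not the actual ADK code: `extract_words_lower_ascii` is a hypothetical stand-in for the pre-fix helper, and the lowercasing and set return type are assumptions about how the tokens are used.

```python
import re


def extract_words_lower_ascii(text: str) -> set[str]:
    """Hypothetical stand-in for the pre-fix helper (ASCII letters only)."""
    # [A-Za-z]+ matches runs of ASCII letters; everything else is skipped.
    return {word.lower() for word in re.findall(r'[A-Za-z]+', text)}


english = extract_words_lower_ascii("Hello World")
japanese = extract_words_lower_ascii("東京は日本の首都です")
russian = extract_words_lower_ascii("Привет мир")
mixed = extract_words_lower_ascii("ADK メモリ search")

print(english)   # the two English words survive
print(japanese)  # empty set: every character was discarded
print(russian)   # empty set
print(mixed)     # only the ASCII words survive
```

Because the non-Latin inputs tokenize to an empty set, any matching built on these tokens has nothing to compare for such text.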
### Fix
Change the regex from `[A-Za-z]+` to `\w+` with the `re.UNICODE` flag, which matches Unicode word characters (letters, digits, underscore) across all scripts.
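The effect of the new pattern can be sketched the same way. This is an illustration under assumptions, not the committed code: `extract_words_lower_unicode` is a hypothetical name, and the lowercasing and set return type are assumed.

```python
import re


def extract_words_lower_unicode(text: str) -> set[str]:
    """Hypothetical stand-in for the post-fix helper (Unicode word chars)."""
    # \w+ matches runs of letters, digits, and underscore in any script.
    return {word.lower() for word in re.findall(r'\w+', text, re.UNICODE)}


russian = extract_words_lower_unicode("Привет мир")
korean = extract_words_lower_unicode("안녕하세요 세계")
japanese_spaced = extract_words_lower_unicode("東京 大阪")
japanese_unspaced = extract_words_lower_unicode("東京は日本の首都です")
identifiers = extract_words_lower_unicode("user_id 42")

print(russian)
print(korean)
print(japanese_spaced)
print(japanese_unspaced)  # a single token: \w+ does not segment words
print(identifiers)        # digits and underscores are now kept
```

Two details are worth noting. In Python 3, `str` patterns are Unicode-aware by default, so `re.UNICODE` is redundant but harmless. And because `\w+` splits only on non-word characters, text written without spaces (as is common in Japanese and Chinese) becomes one long token, so whole-token comparison would still miss a shorter query such as `東京` inside that sentence.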
Co-authored-by: George Weale <gweale@google.com>
PiperOrigin-RevId: 9308138081
Parent: ef395c7
2 files changed
Lines changed: 30 additions & 1 deletion
File tree
- src/google/adk/memory
- tests/unittests/memory
| Diff | Lines | Change |
|---|---|---|
| 1 | 42 | 1 deletion, 1 addition |
| 2 | 330-358 | 29 additions |