feat: add existence filter for optional fields by ozanarmagan · Pull Request #2777 · typesense/typesense

ozanarmagan · 2026-02-14T01:57:33Z

Change Summary

Add support for filtering documents by whether optional fields are missing
using field: _missing and field: !_missing syntax.

This is enabled per field via the new track_missing_values schema property.

Usage

1. Create a collection with an optional indexed field that has
track_missing_values enabled:

{
  "name": "products",
  "fields": [
    {"name": "title", "type": "string"},
    {"name": "color", "type": "string", "optional": true, "track_missing_values": true},
    {"name": "points", "type": "int32"}
  ]
}

2a. Filter for documents where the field is missing:

GET /collections/products/documents/search?q=*&filter_by=color: _missing

2b. Filter for documents where the field is present:

GET /collections/products/documents/search?q=*&filter_by=color: !_missing

2c. Combine with other filters:

filter_by=color: _missing && points: >10
filter_by=color: !_missing || rating: _missing

PR Checklist

I have read and signed the Contributor License Agreement.

ozanarmagan · 2026-02-14T02:03:49Z

Fixes #790

happy-san · 2026-02-23T22:51:21Z

+            }
+        } else {
+            // _exists: all docs minus the missing list
+            // Get the complement of missing ids to get the existing ids.


@ozanarmagan We should also implement the iterative logic for the enable_lazy_filter case, the same way we evaluate integer filters.

You can refer to this test for details. The crux of the iterative logic is to return the seq_ids that fall between the iterator's actual matches. So if the iterator matches 0, 2, 5, ..., the matches for not-equals will be 1, 3, 4, ...
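The "ids between the matches" idea can be sketched as a small standalone iterator. This is only an illustration under stated assumptions: `complement_iterator`, `collect_complement`, and the `max_seq_id` bound are hypothetical names, not part of Typesense's `filter_result_iterator_t`, and the input list is assumed to be sorted in ascending order.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not Typesense code): walks every seq_id in
// [0, max_seq_id] that does NOT appear in a sorted list of matches.
class complement_iterator {
public:
    complement_iterator(std::vector<uint32_t> sorted_matches, uint32_t max_seq_id)
        : matches_(std::move(sorted_matches)), max_seq_id_(max_seq_id) {
        skip_matched();
    }

    bool valid() const { return current_ <= max_seq_id_; }
    uint32_t seq_id() const { return static_cast<uint32_t>(current_); }

    void next() {
        ++current_;
        skip_matched();
    }

private:
    // Advance current_ past any ids that are present in matches_.
    void skip_matched() {
        while (valid() && idx_ < matches_.size()) {
            if (matches_[idx_] < current_) {
                ++idx_;
            } else if (matches_[idx_] == current_) {
                ++current_;
                ++idx_;
            } else {
                break;
            }
        }
    }

    std::vector<uint32_t> matches_;
    uint32_t max_seq_id_;
    uint64_t current_ = 0;  // wider than uint32_t so ++ cannot wrap around
    std::size_t idx_ = 0;
};

// Drains the iterator into a vector, mainly for testing.
std::vector<uint32_t> collect_complement(std::vector<uint32_t> sorted_matches,
                                         uint32_t max_seq_id) {
    std::vector<uint32_t> out;
    for (complement_iterator it(std::move(sorted_matches), max_seq_id); it.valid(); it.next()) {
        out.push_back(it.seq_id());
    }
    return out;
}
```

With matches 0, 2, 5 and an assumed upper bound of 6, the sketch yields 1, 3, 4, 6 without materializing the full complement up front, which is the property a lazy filter relies on.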

Ready to review again

alangmartini · 2026-02-27T19:33:57Z

Any chance we can expand this for arrays, so we can filter for empty/non-empty arrays? Or should we create a separate issue for the next iteration?

ozanarmagan · 2026-03-02T02:42:26Z

Any chance we can expand this for arrays, so we can filter for empty/non-empty arrays? Or should we create a separate issue for the next iteration?

@alangmartini Could you create another issue for this? I will address that in another PR.

happy-san

Rest looks good!

happy-san · 2026-03-27T03:13:27Z

@kishorenc PR is ready for your review.

kishorenc · 2026-03-27T06:50:05Z

This was a mix of manual and automated review, but I have gone through each issue carefully and verified that the proposed fixes are logical. I have attached a patch. @happy-san please review and confirm.

2777_review_patch.patch

Issues found

optional_index metadata was updated from partial update payloads instead of the final merged document state. In mixed update batches this could incorrectly mark an unchanged optional field as missing, breaking field: _exists / field: !_exists.
optional_index: true was accepted on index: false fields. That schema produced no backing existence index, so _exists / !_exists could return incorrect results.
_exists parsing used substring matching, so ordinary string filters containing _exists (for example title: pre_exists_post) were misparsed as existence filters.
field::field_from_json() did not preserve optional_index, so JSON-to-field reconstruction silently dropped the flag.
Lazy _exists iterators could not be reset or materialized correctly. compute_iterators() and related paths treated the missing-id list as the final result set, so lazy _exists searches could return wrong hits.

Fix summary

Existence bookkeeping now uses each record's final document state (new_doc for updates, doc for inserts) and updates the missing-field index in one place.
Schema validation now rejects optional_index unless the field is both optional: true and index: true.
Existence parsing now only triggers on exact _exists / !_exists tokens, while normal string filters continue to work.
field::field_from_json() now carries optional_index through correctly.
Lazy _exists iteration now has dedicated reset and materialization logic, so _exists uses the complement of the missing-id list consistently in iterator and search paths.
Added focused regression tests for mixed-batch updates, string literals containing _exists, invalid optional_index schemas, JSON field reconstruction, lazy iterator materialization, and lazy search hits.
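Two of the fixes above (exact-token parsing and the schema check) are simple enough to sketch in isolation. The code below is a hedged illustration, not the patch itself: `existence_op`, `parse_existence_token`, `field_options`, and `is_valid_optional_index` are hypothetical names, and trimming surrounding whitespace before comparing is an assumption about how the filter value arrives.

```cpp
#include <string>

// Hypothetical names for illustration; not Typesense's parser or schema types.
enum class existence_op { none, exists, not_exists };

// Strips leading and trailing spaces and tabs.
std::string trim(const std::string& s) {
    const auto first = s.find_first_not_of(" \t");
    if (first == std::string::npos) {
        return "";
    }
    const auto last = s.find_last_not_of(" \t");
    return s.substr(first, last - first + 1);
}

// Treats the value as an existence filter only when the whole token is
// exactly "_exists" or "!_exists"; substrings such as "pre_exists_post"
// fall through as ordinary string filter values.
existence_op parse_existence_token(const std::string& filter_value) {
    const std::string value = trim(filter_value);
    if (value == "_exists") {
        return existence_op::exists;
    }
    if (value == "!_exists") {
        return existence_op::not_exists;
    }
    return existence_op::none;
}

// Minimal model of the three schema flags involved in the validation rule.
struct field_options {
    bool optional;
    bool index;
    bool optional_index;
};

// optional_index is only acceptable when the field is both optional and indexed.
bool is_valid_optional_index(const field_options& f) {
    return !f.optional_index || (f.optional && f.index);
}
```

Comparing the whole trimmed token, rather than searching for a substring, is what keeps a filter like title: pre_exists_post on the normal string path.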

happy-san · 2026-03-27T14:43:40Z

@ozanarmagan I have listed the tests that will surface each issue:

1. Have a schema like,

{
  "fields": [
    {
      "name": "field",
      "type": "string"
    },
    {
      "name": "optional_field",
      "type": "string",
      "optional": true,
      "optional_index": true
    }
  ]
}

Add a document like,

{
  "field": "foo",
  "optional_field": "bar"
}

Check that filter_by: optional_field: !_exists matches no document.
Update the document with:

{
  "field": "baz"
}

filter_by: optional_field: !_exists should still match no document.
2. A simple test that expects creation to fail for a field with optional_index: true and index: false should suffice.
3. Using the schema,

{
  "fields": [
    {
      "name": "field_exists",
      "type": "string"
    },
    {
      "name": "field",
      "type": "string"
    }
  ]
}

try passing a filter like field_exists: foo or field: value_exists.
4. A test like https://github.com/typesense/typesense/blob/v31/test/collection_join_test.cpp#L6013-L6021 will surface this issue. We must extend this test whenever a new option is added to the field.
5. Add

    iter_exists.reset();
    ASSERT_EQ(filter_result_iterator_t::valid, iter_exists.validity);

    for (uint32_t i = 0; i < validate_ids.size(); i++) {
        ASSERT_EQ(filter_result_iterator_t::valid, iter_exists.validity);
        ASSERT_EQ(expected[i], iter_exists.is_valid(validate_ids[i]));

        if (expected[i] == 1) {
            iter_exists.next();
        }
        ASSERT_EQ(seq_ids[i], iter_exists.seq_id);
    }
    ASSERT_EQ(filter_result_iterator_t::invalid, iter_exists.validity);

after this line
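The first scenario in the list above (a partial update that leaves optional_field untouched) can be modelled in a few lines. This is a minimal sketch under stated assumptions: `doc_t`, `merge_update`, `track_missing`, and `missing_after_update` are hypothetical stand-ins, and documents are reduced to flat string maps rather than the JSON documents Typesense actually stores.

```cpp
#include <cstdint>
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins for illustration; not Typesense types.
using doc_t = std::map<std::string, std::string>;
using missing_ids_t = std::set<uint32_t>;

// Applies a partial update on top of the stored document.
doc_t merge_update(const doc_t& stored, const doc_t& partial) {
    doc_t merged = stored;
    for (const auto& kv : partial) {
        merged[kv.first] = kv.second;
    }
    return merged;
}

// Records whether `field` is absent from `doc` for the given seq_id.
void track_missing(missing_ids_t& missing, uint32_t seq_id,
                   const doc_t& doc, const std::string& field) {
    if (doc.count(field) == 0) {
        missing.insert(seq_id);
    } else {
        missing.erase(seq_id);
    }
}

// Replays scenario 1 and reports whether document 0 ends up marked as
// missing optional_field. Passing false feeds the raw partial payload to
// the bookkeeping (the reported bug); passing true feeds the merged state.
bool missing_after_update(bool use_merged_state) {
    missing_ids_t missing;
    const doc_t stored = {{"field", "foo"}, {"optional_field", "bar"}};
    track_missing(missing, 0, stored, "optional_field");

    const doc_t partial = {{"field", "baz"}};
    const doc_t source = use_merged_state ? merge_update(stored, partial) : partial;
    track_missing(missing, 0, source, "optional_field");

    return missing.count(0) > 0;
}
```

In this model, bookkeeping from the merged state keeps document 0 out of the missing set, while bookkeeping from the partial payload wrongly adds it, which is the behaviour the regression test is meant to catch.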

kishorenc · 2026-03-28T00:58:08Z

The file I have attached to my previous comment already has the patch that contains both the code and the tests for these issues.

happy-san

@ozanarmagan I've left some comments on the lazy evaluation path of the missing filter. For the _missing lazy filter, the logic should be similar to id field evaluation; for the !_missing lazy filter, it should be similar to the != lazy numeric filter.

Let me know if I can clarify anything further.

happy-san · 2026-03-31T09:02:53Z

+    /// Resets the iterator state from the given id list.
+    void reset_from_id_list(id_list_t* source);
+
+    /// Computes the full result from the given id list.
+    void compute_result_from_id_list(id_list_t* source);
+
+    /// Resets the iterator state for missing filters.
+    void reset_missing_iterator();
+
+    /// Advances the iterator state for missing filters.
+    void advance_missing_iterator();
+
+    /// Computes the full result for missing filters.
+    void compute_missing_result();


Let's remove these methods. The logic for the missing filter iterator will be similar to that of the id filter iterator.

feat: add existence filter for optional fields

9a72cb9

feat: add optional_index field to collection tests

01f45c5

happy-san reviewed Feb 18, 2026

ozanarmagan force-pushed the v31-optional-index branch from 35f8efe to 01f45c5 Compare February 19, 2026 22:09

ozanarmagan added 2 commits February 19, 2026 22:17

refactor: use only missing index for existence check and update related tests

59dc336

docs: clarify comment on _exists logic in filter_result_iterator

a678c15

ozanarmagan requested review from happy-san and kishorenc February 19, 2026 22:50

happy-san reviewed Feb 23, 2026

feat: add lazy evaluation for existence filter

f3f2c7d

ozanarmagan requested a review from happy-san March 2, 2026 02:36

happy-san reviewed Mar 10, 2026

Comment thread src/filter_result_iterator.cpp

ozanarmagan and others added 2 commits March 27, 2026 00:37

clarify logic for applying not equals in filter initialization

ddb5d5c

Merge branch 'v31' into v31-optional-index

21f293d

ozanarmagan added 5 commits March 31, 2026 00:33

Merge branch 'v31' of https://github.com/typesense/typesense into v31-optional-index

2822626

fix bugs and rename to track_missing_valuesexist

613cf00

Add tests for missing filter on dynamic fields

3db9b14

Merge branch 'v31-optional-index' of https://github.com/ozanarmagan/typesense into v31-optional-index

6fcb0a0

Add track_missing_values support for the fallback field

f268e98

happy-san reviewed Mar 31, 2026

ozanarmagan added 3 commits April 6, 2026 13:22

use id_list_t::iterator

047dfa6

Merge branch 'v31-optional-index' of https://github.com/ozanarmagan/typesense into v31-optional-index

7913e15

Refactor test by removing redundant validity checks

4a60a04

ozanarmagan requested a review from happy-san April 6, 2026 13:19

ozanarmagan and others added 4 commits April 6, 2026 14:25

Refactor filter_result_iterator by removing unused methods and simplifying logic

0b2b66b

Fix validity assignment in filter_result_iterator reset method

a86ed18

remove unncessary indent

d279460

Merge branch 'v31' into v31-optional-index

bc4097d

kishorenc approved these changes Apr 10, 2026

kishorenc merged commit e6fd315 into typesense:v31 Apr 10, 2026
2 checks passed
