Describe the bug
When multiple threads call add() in parallel, some nodes become unreachable via search() despite being correctly stored in the index. contains() and get() confirm the nodes exist with correct data, but graph traversal never visits them.
Steps to reproduce
#include <iostream>
#include <random>
#include <thread>
#include <vector>
#include <usearch/index_dense.hpp>
using namespace unum::usearch;
int main() {
constexpr auto num_sessions = 10;
constexpr auto vectors_per_session = 100;
constexpr auto total_vectors = num_sessions * vectors_per_session;
constexpr auto dimensions = 128;
auto index_result = index_dense_gt<>::make(metric_punned_t(dimensions, metric_kind_t::cos_k, scalar_kind_t::f32_k));
auto& index = index_result.index;
index.try_reserve({total_vectors, num_sessions});
// Generate vectors
auto all_vectors = std::vector<std::vector<float>>(total_vectors);
auto rng = std::mt19937(42);
auto dist = std::uniform_real_distribution<float>(-1.0f, 1.0f);
for (auto& v : all_vectors) {
v.resize(dimensions);
for (auto& x : v) x = dist(rng);
}
// Parallel insert — jthreads auto-join at scope exit
{
auto threads = std::vector<std::jthread>();
for (auto s = 0; s < num_sessions; s++) {
threads.emplace_back([&, s] {
auto start = s * vectors_per_session;
for (auto i = 0; i < vectors_per_session; i++)
index.add(start + i, all_vectors[start + i].data(), s);
});
}
}
// Verify: search for every vector requesting ALL results
auto misses = 0;
for (auto i = 0; i < total_vectors; i++) {
auto sr = index.search(all_vectors[i].data(), total_vectors);
auto keys = std::vector<std::uint64_t>(total_vectors);
auto distances = std::vector<float>(total_vectors);
auto found = sr.dump_to(keys.data(), distances.data());
auto found_self = false;
for (auto r = 0; r < found; r++)
if (keys[r] == i) { found_self = true; break; }
if (!found_self) misses++;
}
// Confirm all nodes are stored correctly
auto contains_ok = 0;
for (auto i = 0; i < total_vectors; i++)
if (index.contains(i)) contains_ok++;
std::cout << "Search misses: " << misses << " / " << total_vectors << "\n";
std::cout << "contains() confirms " << contains_ok << " / " << total_vectors << " present\n";
if (misses > 0)
std::cout << "BUG: " << misses << " nodes stored but unreachable via HNSW graph traversal\n";
else
std::cout << "PASS: all nodes reachable\n";
}
Expected behavior
Nodes to be reachable.
USearch version
v2.24.0
Operating System
Ubuntu 24.04
Hardware architecture
Arm
Which interface are you using?
C++ implementation
Contact Details
david.ivekovic@memgraph.io
Are you open to being tagged as a contributor?
Is there an existing issue for this?
Code of Conduct
Describe the bug
When multiple threads call
add()in parallel, some nodes become unreachable viasearch()despite being correctly stored in the index.contains()andget()confirm the nodes exist with correct data, but graph traversal never visits them.Steps to reproduce
Expected behavior
Nodes to be reachable.
USearch version
v2.24.0
Operating System
Ubuntu 24.04
Hardware architecture
Arm
Which interface are you using?
C++ implementation
Contact Details
david.ivekovic@memgraph.io
Are you open to being tagged as a contributor?
.githistory as a contributorIs there an existing issue for this?
Code of Conduct