Commit 8bb520a

Merge master and regenerate amalgamated files
Resolved conflicts by regenerating the amalgamated single-header files (simdjson.h, simdjson.cpp, and singleheader.zip) using the amalgamate.py script after merging latest changes from master. 🤖 Generated with [Claude Code](https://claude.ai/code) Co-Authored-By: Claude <noreply@anthropic.com>
2 parents ec8e6a6 + dd4d026 commit 8bb520a

51 files changed

Lines changed: 3934 additions & 1494 deletions

.gitignore

Lines changed: 2 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -38,7 +38,7 @@ cmake-build-release/
3838
.history/
3939

4040
# Visual Studio artifacts
41-
/VS/
41+
/.vs/
4242

4343
# C/C++ build outputs
4444
.build/
@@ -106,4 +106,4 @@ objs
106106
!.vscode/extensions.json
107107

108108
# clangd
109-
.cache
109+
.cache

CMakeLists.txt

Lines changed: 3 additions & 5 deletions
@@ -1,11 +1,9 @@
11
cmake_minimum_required(VERSION 3.14)
2-
cmake_policy(VERSION 3.5) # For doctest
3-
42

53
project(
64
simdjson
75
# The version number is modified by tools/release.py
8-
VERSION 3.13.0
6+
VERSION 4.0.0
97
DESCRIPTION "Parsing gigabytes of JSON per second"
108
HOMEPAGE_URL "https://simdjson.org/"
119
LANGUAGES CXX C
@@ -22,8 +20,8 @@ string(
2220
# ---- Options, variables ----
2321

2422
# These version numbers are modified by tools/release.py
25-
set(SIMDJSON_LIB_VERSION "26.0.0" CACHE STRING "simdjson library version")
26-
set(SIMDJSON_LIB_SOVERSION "26" CACHE STRING "simdjson library soversion")
23+
set(SIMDJSON_LIB_VERSION "28.0.0" CACHE STRING "simdjson library version")
24+
set(SIMDJSON_LIB_SOVERSION "28" CACHE STRING "simdjson library soversion")
2725

2826
option(SIMDJSON_BUILD_STATIC_LIB "Build simdjson_static library along with simdjson (only makes sense if BUILD_SHARED_LIBS=ON)" OFF)
2927
if(SIMDJSON_BUILD_STATIC_LIB AND NOT BUILD_SHARED_LIBS)

Doxyfile

Lines changed: 1 addition & 1 deletion
@@ -38,7 +38,7 @@ PROJECT_NAME = simdjson
3838
# could be handy for archiving the generated documentation or if some version
3939
# control system is used.
4040

41-
PROJECT_NUMBER = "3.13.0"
41+
PROJECT_NUMBER = "4.0.0"
4242

4343
# Using the PROJECT_BRIEF tag one can provide an optional one line description
4444
# for a project that appears at the top of each page and should give viewer a

doc/basics.md

Lines changed: 46 additions & 42 deletions
@@ -1,47 +1,50 @@
11
The Basics
22
==========
33

4-
An overview of what you need to know to use simdjson, with examples.
4+
An overview of what you need to know to use simdjson to parse JSON documents, with examples.
5+
[Our documentation regarding the generation (serialization) of JSON documents is in a
6+
separate document](https://github.com/simdjson/simdjson/blob/master/doc/builder.md).
57

68
- [The Basics](#the-basics)
7-
- [Requirements](#requirements)
8-
- [Including simdjson](#including-simdjson)
9-
- [Using simdjson with package managers](#using-simdjson-with-package-managers)
10-
- [Using simdjson as a CMake dependency](#using-simdjson-as-a-cmake-dependency)
11-
- [Versions](#versions)
12-
- [The basics: loading and parsing JSON documents](#the-basics-loading-and-parsing-json-documents)
13-
- [Documents are iterators](#documents-are-iterators)
14-
- [Parser, document and JSON scope](#parser-document-and-json-scope)
15-
- [string_view](#string_view)
16-
- [Avoiding pitfalls: enable development checks](#avoiding-pitfalls-enable-development-checks)
17-
- [Using the parsed JSON](#using-the-parsed-json)
18-
- [Using the parsed JSON: additional examples](#using-the-parsed-json-additional-examples)
19-
- [Adding support for custom types](#adding-support-for-custom-types)
20-
- [1. Specialize `simdjson::ondemand::value::get` to get custom types (pre-C++20)](#1-specialize-simdjsonondemandvalueget-to-get-custom-types-pre-c20)
21-
- [2. Use `tag_invoke` for custom types (C++20)](#2-use-tag_invoke-for-custom-types-c20)
22-
- [Minifying JSON strings without parsing](#minifying-json-strings-without-parsing)
23-
- [UTF-8 validation (alone)](#utf-8-validation-alone)
24-
- [JSON Pointer](#json-pointer)
25-
- [JSONPath](#jsonpath)
26-
- [Error handling](#error-handling)
27-
- [Error handling examples without exceptions](#error-handling-examples-without-exceptions)
28-
- [Disabling exceptions](#disabling-exceptions)
29-
- [Exceptions](#exceptions)
30-
- [Current location in document](#current-location-in-document)
31-
- [Checking for trailing content](#checking-for-trailing-content)
32-
- [Rewinding](#rewinding)
33-
- [Newline-Delimited JSON (ndjson) and JSON lines](#newline-delimited-json-ndjson-and-json-lines)
34-
- [Parsing numbers inside strings](#parsing-numbers-inside-strings)
35-
- [Dynamic Number Types](#dynamic-number-types)
36-
- [Raw strings from keys](#raw-strings-from-keys)
37-
- [General direct access to the raw JSON string](#general-direct-access-to-the-raw-json-string)
38-
- [Storing directly into an existing string instance](#storing-directly-into-an-existing-string-instance)
39-
- [Thread safety](#thread-safety)
40-
- [Standard compliance](#standard-compliance)
41-
- [Backwards compatibility](#backwards-compatibility)
42-
- [Examples](#examples)
43-
- [Performance tips](#performance-tips)
44-
- [Further reading](#further-reading)
9+
* [Requirements](#requirements)
10+
* [Including simdjson](#including-simdjson)
11+
* [Using simdjson with package managers](#using-simdjson-with-package-managers)
12+
* [Using simdjson as a CMake dependency](#using-simdjson-as-a-cmake-dependency)
13+
* [Versions](#versions)
14+
* [The basics: loading and parsing JSON documents](#the-basics--loading-and-parsing-json-documents)
15+
* [Documents are iterators](#documents-are-iterators)
16+
+ [Parser, document and JSON scope](#parser--document-and-json-scope)
17+
* [string_view](#string-view)
18+
* [Avoiding pitfalls: enable development checks](#avoiding-pitfalls--enable-development-checks)
19+
* [Using the parsed JSON](#using-the-parsed-json)
20+
+ [Using the parsed JSON: additional examples](#using-the-parsed-json--additional-examples)
21+
* [Adding support for custom types](#adding-support-for-custom-types)
22+
+ [1. Specialize `simdjson::ondemand::value::get` to get custom types (pre-C++20)](#1-specialize--simdjson--ondemand--value--get--to-get-custom-types--pre-c--20-)
23+
+ [2. Use `tag_invoke` for custom types (C++20)](#2-use--tag-invoke--for-custom-types--c--20-)
24+
+ [3. Using static reflection (C++26)](#3-using-static-reflection--c--26-)
25+
* [Minifying JSON strings without parsing](#minifying-json-strings-without-parsing)
26+
* [UTF-8 validation (alone)](#utf-8-validation--alone-)
27+
* [JSON Pointer](#json-pointer)
28+
* [JSONPath](#jsonpath)
29+
* [Error handling](#error-handling)
30+
+ [Error handling examples without exceptions](#error-handling-examples-without-exceptions)
31+
+ [Disabling exceptions](#disabling-exceptions)
32+
+ [Exceptions](#exceptions)
33+
+ [Current location in document](#current-location-in-document)
34+
+ [Checking for trailing content](#checking-for-trailing-content)
35+
* [Rewinding](#rewinding)
36+
* [Newline-Delimited JSON (ndjson) and JSON lines](#newline-delimited-json--ndjson--and-json-lines)
37+
* [Parsing numbers inside strings](#parsing-numbers-inside-strings)
38+
* [Dynamic Number Types](#dynamic-number-types)
39+
* [Raw strings from keys](#raw-strings-from-keys)
40+
* [General direct access to the raw JSON string](#general-direct-access-to-the-raw-json-string)
41+
* [Storing directly into an existing string instance](#storing-directly-into-an-existing-string-instance)
42+
* [Thread safety](#thread-safety)
43+
* [Standard compliance](#standard-compliance)
44+
* [Backwards compatibility](#backwards-compatibility)
45+
* [Examples](#examples)
46+
* [Performance tips](#performance-tips)
47+
* [Further reading](#further-reading)
4548

4649

4750
Requirements
@@ -826,7 +829,7 @@ There are 3 main ways provided by simdjson to deserialize a value into a custom
826829
1. Specialize `simdjson::ondemand::document::get` for the whole document
827830
2. Specialize `simdjson::ondemand::value::get` for each value
828831
2. Using `tag_invoke` *(the recommended way if your system supports C++20 or better)*
829-
3. Using static reflectioin (requires C++26 or better)
832+
3. Using static reflection (requires C++26 or better)
830833
831834
We describe all of them in the following sections. Most users who have systems compatible with
832835
C++20 or better should skip ahead to [using `tag_invoke` for custom types (C++20)](#2-use-tag_invoke-for-custom-types-c20) as it is more powerful and simpler.
@@ -1140,7 +1143,7 @@ struct Car {
11401143
};
11411144
```
11421145
1143-
Observe how we defined the class to use types that simdjson does not directly support (`float`, `int`).
1146+
Observe how we define the class to use types that simdjson does not directly support (`float`, `int`).
11441147
With C++20 support, the library grabs from the JSON the generic type (`double`, `int`) and then it
11451148
casts it automatically.
11461149
@@ -1355,11 +1358,12 @@ your code with the `SIMDJSON_STATIC_REFLECTION` macro set:
13551358
```
13561359
13571360
Then you can deserialize a type such as `Car` automatically:
1361+
13581362
```cpp
13591363
std::string json = R"( { "make": "Toyota", "model": "Camry", "year": 2018,
13601364
"tire_pressure": [ 40.1, 39.9 ] } )";
13611365
simdjson::ondemand::parser parser;
1362-
simdjson::ondemand::document doc = parser.iterate(simdjson::pad(json)).get(doc);
1366+
simdjson::ondemand::document doc = parser.iterate(simdjson::pad(json));
13631367
Car c = doc.get<Car>();
13641368
```
13651369

doc/builder.md

Lines changed: 119 additions & 49 deletions
@@ -4,16 +4,29 @@ Builder
44
Sometimes you want to generate JSON string outputs efficiently.
55
The simdjson library provides high-performance low-level facilities.
66
When using these low-level functionalities, you are responsible to
7-
define the structure of your JSON document. However, string escaping
8-
and UTF-8 validation is automated.
7+
define the structure of your JSON document. Our more advanced interface
8+
automates the process using C++26 static reflection: you get both high
9+
speed and high convenience.
10+
11+
- [Builder](#builder)
12+
* [Overview: string_builder](#overview--string-builder)
13+
* [Example: string_builder](#example--string-builder)
14+
* [C++26 static reflection](#c--26-static-reflection)
15+
+ [Without `string_builder` instance](#without--string-buffer--instance)
16+
+ [Without `string_builder` instance but with explicit error handling](#without--string-buffer--instance-but-with-explicit-error-handling)
917

1018
Overview: string_builder
1119
---------------------------
1220

1321
The string_builder class is a low-level utility for constructing JSON strings representing documents. It is optimized for performance, potentially leveraging kernel-specific features like SIMD instructions for tasks such as string escaping. This class supports atomic types (e.g., booleans, numbers, strings) but does not handle composed types directly (like arrays or objects).
22+
Note that JSON strings are always encoded as UTF-8.
1423

1524
A `string_builder` is created with an initial buffer capacity (e.g., 1kB). The memory
16-
is reallocated when needed. It has the following methods to add content to the string:
25+
is reallocated when needed.
26+
The efficiency of `string_builder` stems from its internal use of a resizable array or buffer. When you append data, it adds the characters to this buffer, resizing it only when necessary, typically in a way that minimizes reallocations. This approach contrasts with regular string concatenation, where each operation creates a new string, copying all previous content, leading to quadratic time complexity for repeated concatenations.
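
The amortized-growth idea described above can be sketched with a small standalone class. This is only an illustration under stated assumptions: `growable_buffer`, its doubling policy, and the `reallocations()` counter are hypothetical names invented here, and the sketch does not claim to reflect how `string_builder` is actually implemented.

```cpp
#include <cstddef>
#include <cstring>
#include <memory>
#include <string_view>
#include <utility>

// Hypothetical growable buffer, for illustration only
// (this is NOT simdjson's string_builder).
class growable_buffer {
public:
  explicit growable_buffer(std::size_t initial_capacity = 16)
      : data_(new char[initial_capacity]), capacity_(initial_capacity) {}

  // Appends the bytes of s, growing the storage only when it is full.
  void append(std::string_view s) {
    if (s.empty()) { return; }
    if (size_ + s.size() > capacity_) { grow(size_ + s.size()); }
    std::memcpy(data_.get() + size_, s.data(), s.size());
    size_ += s.size();
  }

  // The view is valid only until the next reallocation or destruction.
  std::string_view view() const { return {data_.get(), size_}; }

  // Number of times the storage had to be reallocated so far.
  std::size_t reallocations() const { return reallocations_; }

private:
  // Doubles the capacity until it can hold `needed` bytes (assumed policy).
  void grow(std::size_t needed) {
    std::size_t new_capacity = capacity_ == 0 ? 1 : capacity_;
    while (new_capacity < needed) { new_capacity *= 2; }
    std::unique_ptr<char[]> bigger(new char[new_capacity]);
    std::memcpy(bigger.get(), data_.get(), size_);
    data_ = std::move(bigger);
    capacity_ = new_capacity;
    ++reallocations_;
  }

  std::unique_ptr<char[]> data_;
  std::size_t capacity_;
  std::size_t size_ = 0;
  std::size_t reallocations_ = 0;
};
```

With a doubling policy like this one, appending a total of n bytes triggers a number of reallocations that grows with log(n) rather than with the number of `append` calls, which is why repeated appends stay cheap on average.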
27+
28+
29+
It has the following methods to add content to the string:
1730

1831

1932
- `append(number_type v)`: Appends a number (including booleans) to the JSON buffer. Booleans are converted to the strings "false" or "true". Numbers are formatted according to the JSON standard, with floating-point numbers using the shortest representation that accurately reflects the value.
@@ -32,6 +45,9 @@ After writting the content, if you have reasons to believe that the content migh
3245

3346
- `validate_unicode()`: Checks if the content in the JSON buffer is valid UTF-8. Returns: true if the content is valid UTF-8, false otherwise.
3447

48+
You might need to do Unicode validation if you have strings in your data structures containing
49+
malformed UTF-8.
50+
3551
Once you are satisfied, you can recover the string as follows:
3652

3753
- `operator std::string()`: Converts the JSON buffer to an std::string. (Might throw if an error occurred.)
@@ -44,55 +60,75 @@ Example: string_builder
4460
---------------------------
4561

4662
```C++
47-
48-
void serialize_car(const Car& car, simdjson::builder::string_builder& builder) {
49-
// start of JSON
50-
builder.start_object();
51-
52-
// "make"
53-
builder.append_key_value("make", car.make);
54-
builder.append_comma();
55-
56-
// "model"
57-
builder.append_key_value("model", car.model);
58-
builder.append_comma();
59-
60-
// "year"
61-
builder.append_key_value("year", car.year);
62-
builder.append_comma();
63-
64-
// "tire_pressure"
65-
builder.escape_and_append_with_quotes("tire_pressure");
66-
builder.append_colon();
67-
builder.start_array();
68-
// vector tire_pressure
69-
for (size_t i = 0; i < car.tire_pressure.size(); ++i) {
70-
builder.append(car.tire_pressure[i]);
71-
if (i < car.tire_pressure.size() - 1) {
72-
builder.append_comma();
73-
}
63+
struct Car {
64+
std::string make;
65+
std::string model;
66+
int64_t year;
67+
std::vector<double> tire_pressure;
68+
};
69+
70+
void serialize_car(const Car& car, simdjson::builder::string_builder& builder) {
71+
// start of JSON
72+
builder.start_object();
73+
74+
// "make"
75+
builder.append_key_value("make", car.make);
76+
builder.append_comma();
77+
78+
// "model"
79+
builder.append_key_value("model", car.model);
80+
builder.append_comma();
81+
82+
// "year"
83+
builder.append_key_value("year", car.year);
84+
builder.append_comma();
85+
86+
// "tire_pressure"
87+
builder.escape_and_append_with_quotes("tire_pressure");
88+
builder.append_colon();
89+
builder.start_array();
90+
// vector tire_pressure
91+
for (size_t i = 0; i < car.tire_pressure.size(); ++i) {
92+
builder.append(car.tire_pressure[i]);
93+
if (i < car.tire_pressure.size() - 1) {
94+
builder.append_comma();
7495
}
75-
builder.end_array();
76-
builder.end_object();
7796
}
97+
builder.end_array();
98+
builder.end_object();
99+
}
100+
101+
bool car_test() {
102+
simdjson::builder::string_builder sb;
103+
Car c = {"Toyota", "Corolla", 2017, {30.0,30.2,30.513,30.79}};
104+
serialize_car(c, sb);
105+
std::string_view p{sb};
106+
// p holds the JSON:
107+
// "{\"make\":\"Toyota\",\"model\":\"Corolla\",\"year\":2017,\"tire_pressure\":[30.0,30.2,30.513,30.79]}"
108+
return true;
109+
}
110+
```
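
The comma handling in the array loop above follows a common pattern: emit a separator between elements but not after the last one. A minimal standalone sketch of an equivalent formulation, which writes the comma before every element except the first and so avoids subtracting from an unsigned size, might look as follows. The helper `to_json_array` is a hypothetical name used for illustration and is not part of simdjson.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Illustrative helper (hypothetical; not part of simdjson): writes the
// elements of `values` as a JSON array of integers.
std::string to_json_array(const std::vector<long long>& values) {
  std::string out = "[";
  for (std::size_t i = 0; i < values.size(); ++i) {
    // Separator goes before every element except the first one.
    if (i != 0) { out += ','; }
    out += std::to_string(values[i]);
  }
  out += ']';
  return out;
}
```

Both formulations produce the same output; this one simply keeps the condition free of `size() - 1`.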
78111

79-
bool car_test() {
80-
simdjson::builder::string_builder sb;
81-
Car c = {"Toyota", "Corolla", 2017, {30.0,30.2,30.513,30.79}};
82-
serialize_car(c, sb);
83-
std::string_view p;
84-
if(sb.view().get(p)) {
85-
return false; // there was an error
86-
}
87-
// p holds the JSON:
88-
// "{\"make\":\"Toyota\",\"model\":\"Corolla\",\"year\":2017,\"tire_pressure\":[30.0,30.2,30.513,30.79]}"
89-
return true;
112+
The `string_builder` constructor takes an optional parameter which specifies the initial
113+
memory allocation in bytes. If you know approximately the size of your JSON output, you can
114+
pass this value as a parameter (e.g., `simdjson::builder::string_builder sb{1233213}`).
115+
116+
The `string_builder` might throw an exception in case of error when you cast it to `std::string_view`. If you wish to avoid exceptions, you can use the following programming pattern:
117+
118+
```cpp
119+
std::string_view p;
120+
if(sb.view().get(p)) {
121+
return false; // there was an error
90122
}
91123
```
92124
125+
In all cases, the `std::string_view` instance depends on the corresponding `string_builder` instance: the view is only valid while that instance is alive.
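
To make the non-throwing pattern more concrete, here is a self-contained sketch of a result type whose `get(out)` method writes the value through a reference and returns an error code. Every type in it (`error_code`, `view_result`, `tiny_builder`) is a hypothetical stand-in written for this illustration; none of them are simdjson's own definitions, and the real library types may differ in detail.

```cpp
#include <string>
#include <string_view>

// Hypothetical stand-ins for illustration; these are not simdjson's types.
enum class error_code { success = 0, out_of_memory };

struct view_result {
  std::string_view value;
  error_code error;
  // Writes the value into `out` on success; returns the error code either way.
  error_code get(std::string_view& out) const {
    if (error == error_code::success) { out = value; }
    return error;
  }
};

struct tiny_builder {
  std::string buffer;
  bool failed = false;
  // Returns a view of the buffer together with an error code.
  view_result view() const {
    if (failed) { return {std::string_view{}, error_code::out_of_memory}; }
    return {std::string_view{buffer}, error_code::success};
  }
};

// Mirrors the pattern above: check the returned code, never throw.
bool read_json(const tiny_builder& sb, std::string_view& p) {
  if (sb.view().get(p) != error_code::success) {
    return false; // there was an error; p is left untouched
  }
  return true;
}
```

Because `get` returns an error code rather than the value, its return value belongs in a condition, not in the initializer of the output variable. In this sketch, as in the text above, the resulting view refers to storage owned by the builder object.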
126+
93127
C++26 static reflection
94128
------------------------
95129
130+
Static reflection (or compile-time reflection) in C++26 introduces a powerful compile-time mechanism that allows a program to inspect and manipulate its own structure, such as types, variables, functions, and other program elements, during compilation. Unlike runtime reflection in languages like Java or Python, C++26’s static reflection operates entirely at compile time, aligning with C++’s emphasis on zero-overhead abstractions and high performance. It means
131+
that you can delegate much of the work to the library.
96132
If you have a compiler with support for C++26 static reflection, you can compile
97133
your code with the `SIMDJSON_STATIC_REFLECTION` macro set:
98134
@@ -106,21 +142,55 @@ And then you can append your data structures to a `string_builder` instance
106142
automatically. In most cases, it should work automatically:
107143

108144
```cpp
145+
struct Car {
146+
std::string make;
147+
std::string model;
148+
int64_t year;
149+
std::vector<double> tire_pressure;
150+
};
151+
109152
bool car_test() {
110153
simdjson::builder::string_builder sb;
111154
Car c = {"Toyota", "Corolla", 2017, {30.0,30.2,30.513,30.79}};
112-
append(sb, c);
113-
std::string_view p;
114-
if(sb.view().get(p)) {
115-
return false; // there was an error
116-
}
155+
sb << c;
156+
std::string_view p{sb};
117157
// p holds the JSON:
118158
// "{\"make\":\"Toyota\",\"model\":\"Corolla\",\"year\":2017,\"tire_pressure\":[30.0,30.2,30.513,30.79]}"
119159
return true;
120160
}
121161
```
122162

123-
If you prefer, you can also create a string directly:
163+
164+
### Without `string_builder` instance
165+
166+
In some instances, you might want to create a string directly from your own data type.
167+
You can create a string directly, without an explicit `string_builder` instance
168+
with the `simdjson::builder::to_json_string` function.
169+
(Under the hood a `string_builder` instance may still be created.)
170+
171+
```cpp
172+
struct Car {
173+
std::string make;
174+
std::string model;
175+
int64_t year;
176+
std::vector<double> tire_pressure;
177+
};
178+
179+
void f() {
180+
Car c = {"Toyota", "Corolla", 2017, {30.0,30.2,30.513,30.79}};
181+
std::string json = simdjson::builder::to_json_string(c);
182+
}
183+
```
184+
185+
If you know the output size, in bytes, of your JSON string, you may
186+
pass it as a second parameter (e.g., `simdjson::builder::to_json_string(c, 31123)`).
187+
188+
189+
190+
### Without `string_builder` instance but with explicit error handling
191+
192+
If you prefer a version without exceptions and with explicit error handling, you can use the following
193+
pattern:
124194

125195
```cpp
126196
std::string json;
