README.md: 55 additions & 81 deletions
Original file line number
Diff line number
Diff line change
@@ -103,7 +103,7 @@ be concerned with computed gotos.
103
103
104
104
## Thread safety
105
105
106
-
The simdjson library is mostly single-threaded. Thread safety is the responsibility of the caller: it is unsafe to reuse a ParsedJson object between different threads.
106
+
The simdjson library is mostly single-threaded. Thread safety is the responsibility of the caller: it is unsafe to reuse a document::parser object between different threads.
107
107
108
108
If you are on an x64 processor, the runtime dispatching assigns the right code path the first time that parsing is attempted. The runtime dispatching is thread-safe.
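One common way to respect the "no sharing between threads" rule is to give each thread its own parser instance. The sketch below is only an illustration of that pattern: `ToyParser`, `local_parser` and `count_arrays_on_this_thread` are hypothetical names that are not part of simdjson, and the toy parser merely counts `[` characters so the example stays self-contained.

```c++
#include <cstddef>
#include <string>

// Hypothetical stand-in for a parser with mutable internal state.
// It is NOT a simdjson type; it only counts '[' characters.
struct ToyParser {
  std::size_t last_count = 0;  // state overwritten by every call to parse()
  std::size_t parse(const std::string &json) {
    last_count = 0;
    for (char c : json) {
      if (c == '[') { ++last_count; }
    }
    return last_count;
  }
};

// Each thread that calls this function gets its own ToyParser, created on
// first use, so parser state is never shared between threads.
ToyParser &local_parser() {
  thread_local ToyParser parser;
  return parser;
}

// Parse with the calling thread's private parser.
std::size_t count_arrays_on_this_thread(const std::string &json) {
  return local_parser().parse(json);
}
```

With this layout the parser is still reused across documents within one thread, which keeps allocation costs low, while no locking is needed because no two threads ever touch the same instance.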
109
109
@@ -117,89 +117,63 @@ You will get best performance with large or huge pages. Under Linux, you can ena
117
117
118
118
Another strategy is to reuse pre-allocated buffers. That is, you avoid reallocating memory. You just allocate memory once and reuse the blocks of memory.
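As a rough illustration of the buffer-reuse strategy, the sketch below reserves one block of memory up front and copies each document into it. `ReusableBuffer` is a hypothetical helper written for this example, not a simdjson class, and it relies only on the standard library.

```c++
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a simdjson type): owns one block of memory and
// reuses it for every document instead of allocating a fresh buffer each time.
class ReusableBuffer {
public:
  explicit ReusableBuffer(std::size_t capacity) { storage_.reserve(capacity); }

  // Copy a document into the buffer. No reallocation is needed as long as
  // the document fits within the capacity reserved up front.
  const char *load(const std::string &document) {
    storage_.assign(document.begin(), document.end());
    return storage_.data();
  }

  std::size_t size() const { return storage_.size(); }
  std::size_t capacity() const { return storage_.capacity(); }

private:
  std::vector<char> storage_;
};
```

Loading two documents in a row through the same `ReusableBuffer` returns the same address both times, provided each document fits in the reserved capacity.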
119
119
120
-
## Code usage and example
120
+
## Including simdjson
121
121
122
-
The main API involves populating a `ParsedJson` object which hosts a fully navigable document-object-model (DOM) view of the JSON document. The DOM can be accessed using [JSON Pointer](https://tools.ietf.org/html/rfc6901) paths, for example. The main function is `json_parse` which takes a string containing the JSON document as well as a reference to pre-allocated `ParsedJson` object (which can be reused multiple times). Once you have populated the `ParsedJson` object you can navigate through the DOM with an iterator (e.g., created by `ParsedJson::Iterator pjh(pj)`, see 'Navigating the parsed document').
123
122
124
-
// Samples:
125
-
// Load a document from a file
126
-
// Read a particular key / value from the document
127
-
// Iterate over an array of things
123
+
## Code usage and example
128
124
129
-
```c++
130
-
#include "simdjson.h"
131
-
auto doc = simdjson::document::load("myfile.json");
132
-
cout << doc;
133
-
for (auto i=doc.begin(); i<doc.end(); i++) {
134
-
cout << doc[i];
135
-
}
136
-
```
125
+
The main API involves allocating a `document::parser`, and calling `parser.parse()` to create a fully navigable document-object-model (DOM) view of a JSON document. The DOM can be accessed via [JSON Pointer](https://tools.ietf.org/html/rfc6901) paths, or as an iterator (`document::iterator(doc)`). See 'Navigating the parsed document' for more.
137
126
138
-
A slightly simpler API is available if you don't mind having the overhead
139
-
of memory allocation with each new JSON document:
127
+
All examples below use `#include "simdjson.h"`, `#include "simdjson.cpp"` and `using namespace simdjson;`.
140
128
141
-
```C
142
-
#include "simdjson/jsonparser.h"
143
-
using namespace simdjson;
129
+
The simplest API to get started is `document::parse()`, which allocates a new parser, parses a string, and returns the DOM. This is less efficient if you're going to read multiple documents, but as long as you're only parsing a single document, this will do just fine.
144
130
145
-
document doc = document::parse("myfile.json");
146
-
cout << doc;
147
-
/...
148
-
149
-
const char * filename = ... //
150
-
padded_string p = get_corpus(filename);
151
-
ParsedJson pj = build_parsed_json(p); // do the parsing
152
-
if( ! pj.is_valid() ) {
153
-
// something went wrong
154
-
std::cout << pj.get_error_message() << std::endl;
155
-
}
131
+
```c++
132
+
auto [doc, error] = document::parse(string("[ 1, 2, 3 ]"));
Though the `padded_string` class is recommended for best performance, you can call `json_parse` and `build_parsed_json`, passing a standard `std::string` object.
137
+
If you're using exceptions, it gets even simpler (simdjson won't use exceptions internally, so you'll only pay the performance cost of exceptions in your own calling code):
simdjson requires SIMDJSON_PADDING extra bytes at the end of a string (it doesn't matter if the bytes are initialized). The `padded_string` class is an easy way to ensure this is accomplished up front and prevent the extra allocation:
164
145
165
-
/...
166
-
std::string mystring = ... //
167
-
ParsedJson pj;
168
-
pj.allocate_capacity(mystring.size()); // allocate memory for parsing up to p.size() bytes
169
-
// std::string may not overallocate so a copy will be needed
170
-
const int res = json_parse(mystring, pj); // do the parsing, return 0 on success
171
-
// parsing is done!
172
-
if (res != 0) {
173
-
// You can use the "simdjson/simdjson.h" header to access the error message
If you're using simdjson to parse multiple documents, or in a loop, you should allocate a parser once and reuse it (allocation is slow, do it as little as possible!):
186
159
187
-
std::string mystring = ... //
188
-
// std::string may not overallocate so a copy will be needed
189
-
ParsedJson pj = build_parsed_json(mystring); // do the parsing
190
-
if( ! pj.is_valid() ) {
191
-
// something went wrong
192
-
std::cout << pj.get_error_message() << std::endl;
160
+
```c++
161
+
// Allocate a parser big enough for all files
162
+
document::parser parser;
163
+
if (!parser.allocate_capacity(1024*1024)) { exit(1); }
As needed, the `json_parse` and `build_parsed_json` functions copy the input data to a temporary buffer readable up to SIMDJSON_PADDING bytes beyond the end of the data.
197
-
198
175
## Newline-Delimited JSON (ndjson) and JSON lines
199
176
200
-
201
-
202
-
203
177
The simdjson library also supports multithreaded JSON streaming through a large file containing many smaller JSON documents in either [ndjson](http://ndjson.org) or [JSON lines](http://jsonlines.org) format. We support files larger than 4GB.
204
178
205
179
**API and detailed documentation found [here](doc/JsonStream.md).**
@@ -212,14 +186,14 @@ Here is a simple example, using single header simdjson:
212
186
213
187
int parse_file(const char *filename) {
214
188
simdjson::padded_string p = simdjson::get_corpus(filename);
215
-
simdjson::ParsedJson pj;
189
+
simdjson::document::parser parser;
216
190
simdjson::JsonStream js{p};
217
191
int parse_res = simdjson::SUCCESS_AND_HAS_MORE;
218
192
219
193
while (parse_res == simdjson::SUCCESS_AND_HAS_MORE) {
220
-
parse_res = js.json_parse(pj);
194
+
parse_res = js.json_parse(parser);
221
195
222
-
//Do something with pj...
196
+
//Do something with parser...
223
197
}
224
198
}
225
199
```
@@ -230,18 +204,18 @@ See the "singleheader" repository for a single header version. See the included
230
204
file "amalgamation_demo.cpp" for usage. This requires no specific build system: just
231
205
copy the files in your project in your include path. You can then include them quite simply:
232
206
233
-
```C
207
+
```c++
234
208
#include <iostream>
235
209
#include "simdjson.h"
236
210
#include "simdjson.cpp"
237
211
using namespace simdjson;
238
212
int main(int argc, char *argv[]) {
239
213
const char * filename = argv[1];
240
214
padded_string p = get_corpus(filename);
241
-
ParsedJson pj = build_parsed_json(p); // do the parsing
242
-
if( ! pj.is_valid() ) {
215
+
document::parser parser = build_parsed_json(p); // do the parsing
In C++, given a `ParsedJson`, we can move to a node with the `move_to` method, passing a `std::string` representing the JSON Pointer query.
404
+
In C++, given a `document::parser`, we can move to a node with the `move_to` method, passing a `std::string` representing the JSON Pointer query.
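To make the JSON Pointer syntax concrete, here is a minimal sketch of how a pointer string breaks down into reference tokens under RFC 6901, where `~1` stands for `/` and `~0` stands for `~`. The function names `unescape_token` and `split_json_pointer` are hypothetical and are not part of simdjson; this is not a description of how `move_to` is implemented.

```c++
#include <cstddef>
#include <string>
#include <vector>

// Decode one RFC 6901 reference token: "~1" becomes "/" and "~0" becomes "~".
std::string unescape_token(const std::string &token) {
  std::string out;
  for (std::size_t i = 0; i < token.size(); ++i) {
    if (token[i] == '~' && i + 1 < token.size() && token[i + 1] == '1') {
      out += '/';
      ++i;  // skip the '1'
    } else if (token[i] == '~' && i + 1 < token.size() && token[i + 1] == '0') {
      out += '~';
      ++i;  // skip the '0'
    } else {
      out += token[i];
    }
  }
  return out;
}

// Split a JSON Pointer such as "/foo/0/a~1b" into {"foo", "0", "a/b"}.
// The empty pointer "" refers to the whole document and yields no tokens.
// In this sketch, input that does not start with '/' also yields no tokens.
std::vector<std::string> split_json_pointer(const std::string &pointer) {
  std::vector<std::string> tokens;
  if (pointer.empty() || pointer[0] != '/') { return tokens; }
  std::size_t start = 1;
  while (true) {
    std::size_t slash = pointer.find('/', start);
    if (slash == std::string::npos) {
      tokens.push_back(unescape_token(pointer.substr(start)));
      break;
    }
    tokens.push_back(unescape_token(pointer.substr(start, slash - start)));
    start = slash + 1;
  }
  return tokens;
}
```

Each resulting token is then either an object key or, when the current node is an array, a decimal index.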
431
405
432
406
## Navigating the parsed document
433
407
434
408
435
409
436
-
From a `simdjson::ParsedJson` instance, you can create an iterator (of type `simdjson::ParsedJson::Iterator` which is in fact `simdjson::ParsedJson::BasicIterator<DEFAULT_MAX_DEPTH>` ) via a constructor:
410
+
From a `simdjson::document::parser` instance, you can create an iterator (of type `simdjson::document::parser::Iterator` which is in fact `simdjson::document::parser::BasicIterator<DEFAULT_MAX_DEPTH>` ) via a constructor:
437
411
438
412
```
439
-
ParsedJson::Iterator pjh(pj); // pj is a ParsedJSON
413
+
document::parser::Iterator pjh(parser); // parser is a document::parser
440
414
```
441
415
442
-
You then have access to the following methods on the resulting `simdjson::ParsedJson::Iterator` instance:
416
+
You then have access to the following methods on the resulting `simdjson::document::parser::Iterator` instance:
443
417
444
-
* `bool is_ok() const`: whether you have a valid iterator; it is false if the parent `ParsedJson` does not hold valid JSON.
418
+
* `bool is_ok() const`: whether you have a valid iterator; it is false if the parent `document::parser` does not hold valid JSON.
445
419
* `size_t get_depth() const`: returns the current depth (starts at 1, with 0 reserved for the fictitious root node)
446
420
* `int8_t get_scope_type() const`: a scope is a series of nodes at the same depth; typically it is either an object (`{`) or an array (`[`). The root node has type 'r'.
447
421
* `bool move_forward()`: move forward in document order
@@ -482,17 +456,17 @@ You then have access to the following methods on the resulting `simdjson::Parsed
482
456
483
457
Here is a code sample to dump back the parsed JSON to a string:
484
458
485
-
```c
486
-
ParsedJson::Iterator pjh(pj);
459
+
```c++
460
+
document::parser::Iterator pjh(parser);
487
461
if (!pjh.is_ok()) {
488
462
std::cerr << " Could not iterate parsed result. " << std::endl;