Conversation
…son into jkeiser/structural-iterator
|
Performance results (first line for each file is master, second is this PR): |
|
It's more accurate and speeds it up? Nice! |
|
@jkeiser Oh. Not really. I think. You merged your PR in the meantime. Let me update that. |
|
@jkeiser New numbers after updating my master to account for your merging... Same deal, first line for each file is for master, second line is this PR... We take a hit for canada.json, mesh.json, mesh.pretty.json, and numbers.json. |
|
@michaeleisel This might be the PR you have been asking for! |
jkeiser
left a comment
There was a problem hiding this comment.
It'll take me a longer while to grok the whole thing, and you and Michael have already been reviewing each other for accuracy here. On that score, it seems good as far as I can tell and I know you've done a ton of testing on the actual values. The comments are largely pretty grokkable, too. And compute_float_64's fast path is a thing of beauty!
Consider comments, but this looks good :)
| // 10^FASTFLOAT_LARGEST_POWER (inclusively). The mantissa is truncated, and | ||
| // never rounded up. | ||
| // Uses about 10KB. | ||
| static const components power_of_ten_components[] = { |
There was a problem hiding this comment.
Maybe the compiler will already do this, but we're compiling this once for each architecture. Will all these tables show up more than once in the executable? Does it matter? I could imagine putting them into a common top level namespace.
|
Looks like it's a wash (some you win some you lose), which is amazing for adding exact parsing as a feature. |
It is. I was highly skeptical. I will fully take into account your comments before merging, and there are few small things (in addition to what you flag) that I want to fix. |
|
|
i wonder about unit test coverage. it would be cool to have really high unit test coverage, does anyone have experience doing unit testing on gpus? i've heard of others enumerating 2^64 cases for their program using gpus |
|
I am now happy with this PR. I just want feedback from @michaeleisel about his idea how how to possibly improve the code routine where we have a good indication that we might be right in-between two floats. I think something clever can be done there... |
|
But I also think we should merge this sooner rather than later. |
|
@michaeleisel Can you take a last look at your earliest convenience? |
| // A complement from power_of_ten_components | ||
| // complete to a 128-bit mantissa. | ||
| const uint64_t mantissa_128[] = {0x419ea3bd35385e2d, | ||
| 0x52064cac828675b9, |
There was a problem hiding this comment.
it looks like we can use // clang-format on and // clang-format off to disable formatting for a section, in case you'd be interested in doing that
|
so in |
|
are we going to merge before adding unit tests from other libraries? |
|
to add to that, i don't care either way if those cases are disambiguated. the perf win doesn't seem huge |
Pull requests with more in-depth testing are invited. This being said, this PR is under as much or more testing than the existing code (in master). |
|
fair enough |
Ah. Now I am getting it. We should split up the cases into the exact ones and the rounded down ones. When it is rounded down, you always want to round up when hitting exactly the middle. When we have the exact mantissa, then it is easy. Ok. This won't change the performance but I will implement it. |
I am not claiming that there is no bug (there always are, in a sense), but we have reviewed the code thoroughly, as a team, and we have lots of tests. |
|
i'm fine with merging |
|
I'll merge and come back later with the change we discussed, as it is basically irrelevant from a performance point of view. |
| // yet we may want to be able to parse subnormal values. | ||
| // However, we do not want to tolerate NAN or infinite values. | ||
| // | ||
| // Values like infinity or NaN are not allowed in the JSON specification. |
There was a problem hiding this comment.
RapidJSON supports non-standard values like NaN and Inf. Similarly, environments such as MATLAB, Octave, Scilab, and Nelson, which use C++ libraries for JSON parsing also accept these values. For example: jsondecode('NaN') works as expected. While this behavior is not part of the official JSON standard, it is commonly accepted in legacy code and widely supported in practice...
There was a problem hiding this comment.
There was a problem hiding this comment.
It could be quite useful to support additional parsing options, even if they're not strictly standard, similar to fast_float::parse_options options { fast_float::chars_format::json_or_infnan }; :)


This PR introduces accurate number parsing. We have ULP of zero.
Fixes #242
To do:
i!=0check