There's a new paper making the rounds on the topic of converting floats to their decimal string representations that claims to be both simpler and faster than prior algorithms: https://pldi18.sigplan.org/event/pldi-2018-papers-ry-fast-float-to-string-conversion . I'm particularly interested in the simplicity aspect, since I recall some old conversations regarding our current machinery for this being somewhat subtle and complicated (for speed purposes, I imagine). If we could drastically simplify the implementation without sacrificing speed, that might be a win. Good student or intern project, methinks.
(Apologies for how wishlisty this issue is, I've no clue who might be the presiding expert on our current float->string implementation.)
cc @dtolnay in case this is of interest for serde
cc @lifthrasiir @rkruppe
The repository can be found at https://github.com/ulfjack/ryu and the paper at https://dl.acm.org/citation.cfm?id=3192369.
I did a line-by-line translation of the C implementation to unsafe Rust in https://github.com/dtolnay/ryu. It performs better than dtoa on some inputs but the output is less human readable. For example dtoa and serde_json prefer to print 0.3 as 0.3
, while Ry奴 currently prints it as 3E-1
.
| Input f64 | dtoa | Ry奴 |
|---|---|---|
| 0.1234 | 47 ns | 66 ns |
| 2.718281828459045 | 64 ns | 40 ns |
| 1.7976931348623157e308 | 73 ns | 42 ns |
Benchmark commit: https://github.com/dtolnay/dtoa/commit/655152bc0fa7ccbd9a403c78f83cc234fac7129f
@dtolnay : I would imagine that changing the output format would be trivial, no?
On the other hand, the fact that it performs worse on 0.1234
(+40%) is a tad concerning. I know that in some application domains the numbers are decimals by nature, and only floating point representation for performance reasons. If Ryu performs worse on numbers with a small number of digits, then dtoa could be preferable.
As I understand it, the fact that short numbers perform worse is an inherent aspect of the algorithm. Dtoa goes left to right and avoids generating too many noise digits, so the performance is better on short numbers as reflected in the table. Ry奴 generates an exact representation and then goes right to left removing noise digits, so the performance is better on the longest numbers.
The Ry奴 benchmarks all focus on a random distribution of numbers, and those have on average a lot of digits.
As I understand it, the fact that short numbers perform worse is an inherent aspect of the algorithm. Dtoa goes left to right and avoids generating too many noise digits, so the performance is better on short numbers as reflected in the table. Ry奴 generates an exact representation and then goes right to left removing noise digits, so the performance is better on the longest numbers.
Thank you for confirming my suspicion.
The Ry奴 benchmarks all focus on a random distribution of numbers, and those have on average a lot of digits.
I would guess that whether the number of digits is random, or not, varies by application. This makes it quite difficult to pick one algorithm or the other.
Ry奴 is seems to be slower in a few case versus dtoa
, but it seems to always be faster than the std
implementation (which use grisu3 and dragon as a fallback if I'm not mistaken).
What trade-off does dtoa
makes to be faster than std
? And wouldn't it be fairer to compare Ry奴 with the std
implementation ? As far as I understand Ry奴 should produce the same output as the std
implementation.
prefer to print 0.3 as 0.3, while Ry奴 currently prints it as 3E-1
From what I understand, both Ry奴 and the std
implementation compute the digits to print and the 10 exponent. And from that you can produce a string representation. It just happen that the implementation of the second part provided in the Ry奴 make a different choice, but using what we have now for that is very likely to be trivial.
Retagging as T-libs, T-compiler hardly has anything to do with libstd matters.
I switched from dtoa
(Grisu2) to the ryu
crate in serde_json for a nice improvement in benchmarks.
Triage: not aware of any changes here
Note: at CppCon19, STL presented the culmination of his work on C++17 <charconv>
for MSVC, resulting in Ryu being used for to_chars
. He also mentioned that he had submitted a number of small performance improvement to Ulf's repository, and thought there were still possibilities to fine-tune the implementation.
Most helpful comment
I switched from
dtoa
(Grisu2) to theryu
crate in serde_json for a nice improvement in benchmarks.