Just a question: It seems an ABI breaking release is on the (distant) horizon and I was wondering if you'll use the opportunity to overhaul std::regex or if you have practically given up on it due to the vicious cycle of small user base and bad performance.
Also tracked by Microsoft-internal VSO-110128 / AB#110128 and VSO-177627 / AB#177627.
vNext note: Resolving this issue will require breaking binary compatibility. We won't be able to accept pull requests for this issue until the vNext branch is available. See #169 for more information.
We want to overhaul regex because this will be our only opportunity to fix its longstanding correctness and performance issues for the next N years, but first we need to decide how to do so. We could:
Glad to hear it. I can't really give an informed opinion on which strategy would be the best. I'm just a user.
The C++ standard derives its default regex grammar from the ECMAScript standard, so adapting the regex code from a JavaScript engine (like v8 or Chakra) and extending it to cover the C++ additions and the additional grammars seems like a viable option.
Note that licensing is important; we can consume both Boost and Apache License v2.0 with LLVM Exception. Before attempting to consume code from a JavaScript engine, we would need to investigate its license for viability.
v8 is BSD-licensed and Chakra is MIT-licensed (Chakra being a Microsoft project).
The following Microsoft-internal bugs are associated with this issue:
Did I hear correctly that the committee wants to deprecate std::regex? In that case it probably doesn't make sense to invest too much time into regex beyond fixing bugs.
Even if <regex> ends up deprecated in the standard, there's still a lot of code out there using it that doesn't deserve to be bitten by (1) our multiline nonconformance, (2) our 100X+ perf penalties, (3) other bugs.
The C++ standard derives the ECMAScript standard for the default regex specification, so it seems like adapting the regex code from a JavaScript engine (like v8 or Chakra) for the additions and additional grammars supported sounds like a viable option.
Sadly ECMAScript is but one of the 7 (yes, 7) grammars supported in the standard. :(
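For readers who have only ever used the default grammar, here is a minimal sketch of why the extra grammars matter for any borrowed engine: the same pattern string means different things depending on the grammar flag. The helper name `full_match` is made up for illustration and is not part of <regex>; only `std::regex`, `std::regex_match`, and the `std::regex_constants` flags are standard.

```cpp
#include <regex>
#include <string>

// Hypothetical helper (not from the thread or the STL): returns true when
// `pattern`, compiled under the given grammar flag, matches all of `text`.
// The default grammar is ECMAScript, as it is for std::regex itself.
bool full_match(const std::string& pattern, const std::string& text,
                std::regex_constants::syntax_option_type grammar =
                    std::regex_constants::ECMAScript) {
    const std::regex re(pattern, grammar);
    return std::regex_match(text, re);
}
```

With the default ECMAScript grammar, `full_match("a+", "aaa")` should be true because `+` is a quantifier. With `std::regex_constants::basic` (POSIX basic), `+` is an ordinary character, so the same pattern should match only the literal text "a+". An engine taken from a JavaScript implementation would handle the first case only.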