It looks the same as: https://github.com/playframework/play-json/issues/180
Reproduced by the following commit: https://github.com/plokhotnyuk/jsoniter-scala/pull/153/commits/0d53faf5093b492867b550f2cec55ff0b5cc62de
The security bug is in InstantDeserializer and DurationDeserializer of the jackson-datatype-jsr310 artifact:
protected T _fromDecimal(DeserializationContext context, BigDecimal value)
{
long seconds = value.longValue(); // <- hangs in case of 10000000e100000000
int nanoseconds = DecimalUtils.extractNanosecondDecimal(value, seconds);
return fromNanoseconds.apply(new FromDecimalArguments(
seconds, nanoseconds, getZone(context)));
}
W/A is to use custom serializers for all types that are parsed with InstantDeserializer and DurationDeserializer by registering them after (or instead of) registration of the JavaTimeModule module.
Possible solution (proposed by @h8 ) would be to use bounding check for the BigDecimal value before the longValue() call.
As example, for InstantDeserializer that bounds can be derived from Instant.MIN and Instant.MAX values: new java.math.BigDecimal(java.time.Instant.MIN.getEpochSecond()) and new java.math.BigDecimal(java.time.Instant.MAX.getEpochSecond()).
@cowtowncoder it is low-bandwidth DoS vulnerability and affects all products that uses Jackson's InstantDeserializer and DurationDeserializer for parsing of java.time.Duration, java.time.Instant, java.time.OffsetDateTime, and java.time.ZonedDateTime types so please fix it ASAP or let us know if any help is wanted.
I made a best guess. I just am throwing a parse exception if it is out of bounds. I can clean up the error message however or perform a pull request if it looks good.
Not sure if this needs to be back ported to other versions as well.
@abracadv8 would longValueExact be enough? if we want long anyway then it might be good idea to throw exception if number is too large/low, and this will do everything for you.
Also @plokhotnyuk note that similar operation is inside DecimalUtils:
https://github.com/FasterXML/jackson-modules-java8/blob/b45e632dbf4911c49cf33d2e8da5eb31113d1d75/datetime/src/main/java/com/fasterxml/jackson/datatype/jsr310/DecimalUtils.java#L101
And this will hang too.
@plokhotnyuk Yes, I would appreciate help here. At first I was thinking text-limit might help, but then realized this is not likely to work well across formats, as although it'd be possible to access numeric values as strings with JSON (and in general textual types), it would cause issues for binary formats.
I'd be happy to merge a patch for datetime datatype module, branch 2.9 (or if anyone wants to, 2.8).
@GotoFinal - No, the check for whether the BigDecimal is too big to fit into an Instant or Duration likely needs to be done before longValue is even performed. The seconds portion of an Instant.MAX is a long, but its max value is slightly smaller than Long.MAX. So if you know it is not bigger than an Instant.Max, then you know it is definitely not larger than Long.MAX and can safely be converted via longValue().
Also, for the DecimalUtils portion you described here, that extraction happens after longValue() is performed (and after I check to make sure it is no larger than an Instant.MAX).
The check needs to look something like this -- https://github.com/abracadv8/jackson-modules-java8/blob/master/datetime/src/main/java/com/fasterxml/jackson/datatype/jsr310/deser/InstantDeserializer.java#L307-#L312
@cowtowncoder ,
I can work on a pull for one of those versions later tonight if it looks good.
What about really long strings, should we be wary of those as well?
Example: https://github.com/abracadv8/jackson-modules-java8/blob/master/datetime/src/main/java/com/fasterxml/jackson/datatype/jsr310/deser/InstantDeserializer.java#L183-#L195
@cowtowncoder ,
Pull request for 2.9 - https://github.com/FasterXML/jackson-modules-java8/pull/84
If it looks good, I'll work on 2.8.
This failure mode isn't tied to time, it's tied to conversion of BigDecimal to integers, so I'm concerned that there may be other cases that are not being addressed by the proposed fix.
The root cause appears to be BigDecimal.setScale() which is used for all such conversions. This program hangs:
public static void main(String[] args)
{
BigDecimal dec = new BigDecimal("10000000e100000000");
BigDecimal scaled = dec.setScale(0, BigDecimal.ROUND_DOWN);
}
Are there other places in Jackson where decimals are converted to integers? I suspect those components are going to be equally susceptible to this problem.
Agreed -- anywhere that can accept a string to be converted to a BigDecimal where:
@abracadv8 The DoS described here isn't caused by converting a string to BigDecimal, but in converting a high-exponent BigDecimal to an integer, which results in effectively unbounded CPU consumption. That effect can be triggered via a small number of characters on the input; even 1e100000000 is enough to consume my CPU for ~longer than I'm willing to wait~ two minutes.
Quick question: does this only affect BigDecimal or also BigInteger? Asking to make sure if both decimal notation AND integer notation need to be checked, or just former.
@toddjonker Good point on real root cause; this was my assumption. But from this, instead of guarding textual notation length (which is bit suboptimal in other ways), would it be possible to use magnitude checking methods of BigDecimal to detect cases where magnitude of integral portion would be beyond range usable for conversion?
@cowtowncoder yes, parsing of BigDecimal and BigInt are also affected. Internal implementation of parsing from String has O(n^2) complexity for numbers that have 10K digits or more.
Calling longValue() on large negative exponents is also expensive, e.g. new BigDecimal("1e-100000000").longValue() took two minutes on my machine. So only bound checking a BigDecimal is not sufficient
After bound checking you can use BigDecimal#divideToIntegralValue(BigDecimal.ONE) to remove any fractional component in a cheap way
I wasn't able to reproduce that issue. I don't think the parser is set up to handle negative exponents via the parser.getDecimalValue() method.
When i try 1e1000000000 it it properly throws a parse exception from my check.
When i try 1e-1000000000, it throws an exception from parser.getDecimalValue(): https://github.com/FasterXML/jackson-modules-java8/blob/6ac19f6fc052fdc3072fe751d8f0e37b1413daf0/datetime/src/main/java/com/fasterxml/jackson/datatype/jsr310/deser/InstantDeserializer.java#L179
I added a test case to check and a parse exception is being thrown but not by me - https://github.com/FasterXML/jackson-modules-java8/pull/84/commits/6ac19f6fc052fdc3072fe751d8f0e37b1413daf0
@abracadv8 sounds good, the test proves we don't need to worry about negative exponents.
FYI I didn't test through Jackson only through instantiating the BigDecimal directly
@plokhotnyuk Hmmh. This is odd since _parsing_ (simple decoding) from textual base-10 into base-10 numbers like BigDecimal and BigInteger should not (it seems to me) be expensive. Conversion into base-2 (both double/float and potentially long/int), yes, I can see that.
From my limited testing:
new BigDecimal() via exponential notation is fast. new BigDecimal(), parser.getDecimalValue(), or parser.getLongValue() from a really long String of 100....0000 is slow. bd.longValue() from exponential notation is slow. bd.longValue() from a really long String of 100...0000 is fast.For example,
Relative Speed | new BigDecimal()
parser.getDecimalValue()
parser.getLongValue()| bd.longValue() | case | status
-- | -- | -- | -- | --
Scientific notation
(1e100000) | fast | slow | JsonTokenId.ID_STRING | added MIN/MAX checks
Really Long String
(1000.....00000) | slow | fast | JsonTokenId.ID_NUMBER_INT
JsonTokenId.ID_NUMBER_FLOAT | not covered yet
The tests and check in the pull request I made covers "100000e1000000", but does not cover a really long string of "10000......0000". I'm not sure if that case is reachable from the parser, but, I did have a failing test case for that.
I attempted to fix those as well, but folks said I was conflagrating issues, so I removed them.
For InstantDeserializer and DurationDeserializer:
case JsonTokenId.ID_STRING: @cowtowncoder Current implementations of java.math.BigInteger and java.math.BigDecimal in Java 8+ use binary (base-2) representation and during parsing translate provided decimal (base-10) representation to binary one:
Below are results of benchmarks for different JSON parsers for Scala (including Jackson-module-scala) which are parametrized by the size parameter that specifies number of significant digits in scala.math.BigInt (which is just a wrapper over java.math.BigInteger):
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark (size) Mode Cnt Score Error Units
[info] BigIntBenchmark.readAVSystemGenCodec 1 thrpt 5 14890392.293 ± 590175.129 ops/s
[info] BigIntBenchmark.readAVSystemGenCodec 10 thrpt 5 6771799.312 ± 258856.674 ops/s
[info] BigIntBenchmark.readAVSystemGenCodec 100 thrpt 5 1507693.582 ± 35178.285 ops/s
[info] BigIntBenchmark.readAVSystemGenCodec 1000 thrpt 5 56379.543 ± 1754.682 ops/s
[info] BigIntBenchmark.readAVSystemGenCodec 10000 thrpt 5 721.463 ± 22.906 ops/s
[info] BigIntBenchmark.readAVSystemGenCodec 100000 thrpt 5 7.452 ± 0.043 ops/s
[info] BigIntBenchmark.readAVSystemGenCodec 1000000 thrpt 5 0.078 ± 0.001 ops/s
[info] BigIntBenchmark.readCirce 1 thrpt 5 9128874.145 ± 206652.546 ops/s
[info] BigIntBenchmark.readCirce 10 thrpt 5 7261855.082 ± 357869.830 ops/s
[info] BigIntBenchmark.readCirce 100 thrpt 5 446646.616 ± 14351.238 ops/s
[info] BigIntBenchmark.readCirce 1000 thrpt 5 20715.338 ± 303.567 ops/s
[info] BigIntBenchmark.readCirce 10000 thrpt 5 488.270 ± 32.506 ops/s
[info] BigIntBenchmark.readCirce 100000 thrpt 5 6.606 ± 0.487 ops/s
[info] BigIntBenchmark.readJacksonScala 1 thrpt 5 5203959.437 ± 288410.893 ops/s
[info] BigIntBenchmark.readJacksonScala 10 thrpt 5 2754737.373 ± 137166.533 ops/s
[info] BigIntBenchmark.readJacksonScala 100 thrpt 5 923723.753 ± 32844.855 ops/s
[info] BigIntBenchmark.readJacksonScala 1000 thrpt 5 47777.094 ± 2031.990 ops/s
[info] BigIntBenchmark.readJacksonScala 10000 thrpt 5 670.702 ± 27.796 ops/s
[info] BigIntBenchmark.readJacksonScala 100000 thrpt 5 7.170 ± 0.159 ops/s
[info] BigIntBenchmark.readJacksonScala 1000000 thrpt 5 0.075 ± 0.001 ops/s
[info] BigIntBenchmark.readJsoniterScala 1 thrpt 5 60184573.095 ± 1329210.574 ops/s
[info] BigIntBenchmark.readJsoniterScala 10 thrpt 5 30459292.738 ± 751977.834 ops/s
[info] BigIntBenchmark.readJsoniterScala 100 thrpt 5 1384704.308 ± 11719.557 ops/s
[info] BigIntBenchmark.readNaiveScala 1 thrpt 5 29224139.625 ± 2260565.546 ops/s
[info] BigIntBenchmark.readNaiveScala 10 thrpt 5 11295523.994 ± 204324.161 ops/s
[info] BigIntBenchmark.readNaiveScala 100 thrpt 5 2024505.047 ± 17078.738 ops/s
[info] BigIntBenchmark.readNaiveScala 1000 thrpt 5 62508.755 ± 1824.931 ops/s
[info] BigIntBenchmark.readNaiveScala 10000 thrpt 5 765.462 ± 5.213 ops/s
[info] BigIntBenchmark.readNaiveScala 100000 thrpt 5 7.770 ± 0.191 ops/s
[info] BigIntBenchmark.readNaiveScala 1000000 thrpt 5 0.077 ± 0.001 ops/s
[info] BigIntBenchmark.readPlayJson 1 thrpt 5 9070015.382 ± 671907.068 ops/s
[info] BigIntBenchmark.readPlayJson 10 thrpt 5 7680490.729 ± 193593.896 ops/s
[info] BigIntBenchmark.readPlayJson 100 thrpt 5 1177518.121 ± 96448.882 ops/s
[info] BigIntBenchmark.readPlayJson 1000 thrpt 5 56525.825 ± 1110.685 ops/s
[info] BigIntBenchmark.readPlayJson 10000 thrpt 5 746.515 ± 9.304 ops/s
[info] BigIntBenchmark.readPlayJson 100000 thrpt 5 7.724 ± 0.077 ops/s
[info] BigIntBenchmark.readPlayJson 1000000 thrpt 5 0.072 ± 0.012 ops/s
To run them on your JDK:
sbt and/or ensure that it already installed properly: sbt about
jsoniter-scala repo: git clone https://github.com/plokhotnyuk/jsoniter-scala.git
cd jsoniter-scala
sbt -no-colors 'jsoniter-scala-benchmark/jmh:run -jvm /usr/lib/jvm/jdk-11/bin/java -wi 5 -i 5 .*BigIntBench.*read.*'
@plokhotnyuk First of all, thank you for extensive research here. Second: crap. I wish translation only occurred at a later point, not during parsing, since that would have made our work here easier.
So. It seems to me like we need to consider two-part (at least) approach, if I understand situation correctly.
For scientific notation the problem is coercion from very large magnitude scale BigDecimal or BigInteger into integral numbers. This can be handled by either truncating small values to zero, capping large values to maximum, or throwing exception. Discussion may be needed to define which one to do -- seems like returning zero makes sense for super small values, but I am not sure what makes most sense for large.
I do not think we need to retain overflow from large to negative, however; although existing behavior I can't see it being useful one.
Regardless, this part could be handled from datatype module(s): at this point this means jsr310 one.
The trickier part would then be decoding large (long) decimal numbers. If we can limit this to non-scientific numbers that is probably good, as heuristics may be easier. Another good thing is that we will typically use String as intermediate form anyway. Further, checks are needed by dataformat modules and need not support from higher levels, I think.
Check itself may not be very tricky, except for one thing: unlike with coercion, where we can figure out bounds, how do we define "reasonable" limits here? We may have to start with something rather big.
And here, again, we need to figure out what to do with too-big/too-small cases: truncate or exception?
And finally... should some of this be configurable? I don't think we can do much for 2.9, but for Jackson 2.10 we may be able to actually allow configurable handler for BigDecimal (and, I assume BigInteger). Come to think of this, we could actually implement handler itself with 2.9, but could not expose configurability (both due to patch-cant-add-to-API and since only 2.10 has Builder-based alternative construction).
Anyway: with 2.10 it will be possible to construct JsonFactory builder style, and this makes it much easier to extend configurability from simple features to safely registering handlers.
So the idea would be that there is the default BigNumberConverter (or whatever we call it), but one that may be replaced with different one for those who wanted stricter/looser limits, or different logic altogether (exception <-> truncation; maybe logging).
@cowtowncoder More over, it seems that during parsing of any JSON object it is possible to DoS the Jackson parser by just adding a field with the big number.
Here is a PR which initially reproduced it for Play-JSON parser:
https://github.com/plokhotnyuk/jsoniter-scala/pull/168/files
Below are results of parametrized benchmarks where the size parameter specifies a number of digits in the value of that additional field:
[info] REMEMBER: The numbers below are just data. To gain reusable insights, you need to follow up on
[info] why the numbers are the way they are. Use profilers (see -prof, -lprof), design factorial
[info] experiments, perform baseline and negative tests that provide experimental control, make sure
[info] the benchmarking environment is safe on JVM/OS/HW level, ask for reviews from the domain experts.
[info] Do not assume the numbers tell you what you want them to tell.
[info] Benchmark (size) Mode Cnt Score Error Units
[info] ExtractFieldsBenchmark.readAVSystemGenCodec 1 thrpt 5 3415309.675 ± 358732.424 ops/s
[info] ExtractFieldsBenchmark.readAVSystemGenCodec 10 thrpt 5 3429891.871 ± 135960.100 ops/s
[info] ExtractFieldsBenchmark.readAVSystemGenCodec 100 thrpt 5 686235.263 ± 832769.728 ops/s
[info] ExtractFieldsBenchmark.readAVSystemGenCodec 1000 thrpt 5 194588.852 ± 8165.939 ops/s
[info] ExtractFieldsBenchmark.readAVSystemGenCodec 10000 thrpt 5 28335.193 ± 911.581 ops/s
[info] ExtractFieldsBenchmark.readAVSystemGenCodec 100000 thrpt 5 2948.038 ± 128.163 ops/s
[info] ExtractFieldsBenchmark.readAVSystemGenCodec 1000000 thrpt 5 649.088 ± 199.346 ops/s
[info] ExtractFieldsBenchmark.readCirce 1 thrpt 5 1181495.148 ± 302987.993 ops/s
[info] ExtractFieldsBenchmark.readCirce 10 thrpt 5 1277915.025 ± 179880.016 ops/s
[info] ExtractFieldsBenchmark.readCirce 100 thrpt 5 1277950.564 ± 256709.663 ops/s
[info] ExtractFieldsBenchmark.readCirce 1000 thrpt 5 816515.741 ± 15529.900 ops/s
[info] ExtractFieldsBenchmark.readCirce 10000 thrpt 5 146038.446 ± 2585.134 ops/s
[info] ExtractFieldsBenchmark.readCirce 100000 thrpt 5 16825.855 ± 468.669 ops/s
[info] ExtractFieldsBenchmark.readCirce 1000000 thrpt 5 1693.840 ± 59.649 ops/s
[info] ExtractFieldsBenchmark.readDslJsonJava 1 thrpt 5 11703558.471 ± 196574.764 ops/s
[info] ExtractFieldsBenchmark.readDslJsonJava 10 thrpt 5 10418348.204 ± 125349.933 ops/s
[info] ExtractFieldsBenchmark.readDslJsonJava 100 thrpt 5 4854474.847 ± 335999.431 ops/s
[info] ExtractFieldsBenchmark.readDslJsonJava 1000 thrpt 5 833174.664 ± 22787.464 ops/s
[info] ExtractFieldsBenchmark.readDslJsonJava 10000 thrpt 5 88047.329 ± 894.533 ops/s
[info] ExtractFieldsBenchmark.readDslJsonJava 100000 thrpt 5 9037.421 ± 97.407 ops/s
[info] ExtractFieldsBenchmark.readDslJsonJava 1000000 thrpt 5 918.420 ± 13.473 ops/s
[info] ExtractFieldsBenchmark.readJacksonScala 1 thrpt 5 2533509.752 ± 75854.375 ops/s
[info] ExtractFieldsBenchmark.readJacksonScala 10 thrpt 5 2521299.318 ± 37344.857 ops/s
[info] ExtractFieldsBenchmark.readJacksonScala 100 thrpt 5 868736.640 ± 19590.367 ops/s
[info] ExtractFieldsBenchmark.readJacksonScala 1000 thrpt 5 54637.764 ± 1922.976 ops/s
[info] ExtractFieldsBenchmark.readJacksonScala 10000 thrpt 5 723.644 ± 14.444 ops/s
[info] ExtractFieldsBenchmark.readJacksonScala 100000 thrpt 5 7.254 ± 0.414 ops/s
[info] ExtractFieldsBenchmark.readJacksonScala 1000000 thrpt 5 0.077 ± 0.001 ops/s
[info] ExtractFieldsBenchmark.readJsoniterScala 1 thrpt 5 17357927.186 ± 105168.663 ops/s
[info] ExtractFieldsBenchmark.readJsoniterScala 10 thrpt 5 15509884.192 ± 599007.176 ops/s
[info] ExtractFieldsBenchmark.readJsoniterScala 100 thrpt 5 10557719.687 ± 82797.425 ops/s
[info] ExtractFieldsBenchmark.readJsoniterScala 1000 thrpt 5 2306588.382 ± 15014.663 ops/s
[info] ExtractFieldsBenchmark.readJsoniterScala 10000 thrpt 5 252999.473 ± 2013.190 ops/s
[info] ExtractFieldsBenchmark.readJsoniterScala 100000 thrpt 5 24022.123 ± 490.780 ops/s
[info] ExtractFieldsBenchmark.readJsoniterScala 1000000 thrpt 5 2042.339 ± 118.757 ops/s
[info] ExtractFieldsBenchmark.readPlayJson 1 thrpt 5 928062.700 ± 35964.755 ops/s
[info] ExtractFieldsBenchmark.readPlayJson 10 thrpt 5 908324.771 ± 41278.052 ops/s
[info] ExtractFieldsBenchmark.readPlayJson 100 thrpt 5 538588.245 ± 58035.196 ops/s
[info] ExtractFieldsBenchmark.readPlayJson 1000 thrpt 5 52739.058 ± 5124.015 ops/s
[info] ExtractFieldsBenchmark.readPlayJson 10000 thrpt 5 743.426 ± 6.226 ops/s
[info] ExtractFieldsBenchmark.readPlayJson 100000 thrpt 5 7.351 ± 0.030 ops/s
[info] ExtractFieldsBenchmark.readPlayJson 1000000 thrpt 5 0.067 ± 0.018 ops/s
[info] ExtractFieldsBenchmark.readUPickle 1 thrpt 5 3340922.046 ± 246892.139 ops/s
[info] ExtractFieldsBenchmark.readUPickle 10 thrpt 5 3483490.433 ± 39971.435 ops/s
[info] ExtractFieldsBenchmark.readUPickle 100 thrpt 5 2494567.445 ± 71404.382 ops/s
[info] ExtractFieldsBenchmark.readUPickle 1000 thrpt 5 814753.180 ± 30787.779 ops/s
[info] ExtractFieldsBenchmark.readUPickle 10000 thrpt 5 101384.553 ± 1049.347 ops/s
[info] ExtractFieldsBenchmark.readUPickle 100000 thrpt 5 10380.287 ± 43.464 ops/s
[info] ExtractFieldsBenchmark.readUPickle 1000000 thrpt 5 991.119 ± 60.797 ops/s
Step to reproduce are same as before, except the names of branch and benchmark:
sbt and/or ensure that it already installed properly: sbt about
jsoniter-scala repo: git clone https://github.com/plokhotnyuk/jsoniter-scala.git
play-json-DoS-using-big-number branch:cd jsoniter-scala
chackout play-json-DoS-using-big-number
sbt -no-colors 'jsoniter-scala-benchmark/jmh:run -jvm /usr/lib/jvm/jdk-11/bin/java -wi 5 -i 5 .*ExtractFieldBench.*read.*'
It gets better, there's also #2157. So I think problem does need to be resolved at streaming parser level.
My main concern here is simply that:
BigDecimal / BigInteger are only used temporarily due to magnitude of the number, butBigIntegers) are used, and should remain usable.@cowtowncoder More over, it seems that during parsing of any JSON object it is possible to DoS the Jackson parser by just adding a field with the big number.
What does "big number" mean? Does it a value with a large magnitude, or JSON text with many characters? Those are two independent, and very different, concerns. This ticket is expressly about the former, and the latter should be handled separately so as not to confuse the situation further.
IMO we should focus on the large-exponent case, and patch that problem ASAP. We can do so without changing any semantics or error cases. I submitted a PR last week with test coverage for the relevant edges, but it's not garnered any response.
i post a test case about "big number" in this page:
https://groups.google.com/forum/#!topic/jackson-user/2n5FrQ6NYSw
static String input = Strings.repeat("1", 100_0000);
@wujimin That's a different failure mode than is described in this issue.
Very long input text presents a well-known DoS vector for any library or application, and is generally mitigated by limiting the overall length of input documents. That's an entirely different problem than a very short input text that results in unbounded processing time, as is the case with this issue and the short text 10000000e100000000.
@toddjonker yes, "big number" problem traced in https://github.com/FasterXML/jackson-databind/issues/2157
agree to you, better to limit the overall length.
but our customer save picture bytes in a POJO field, and the POJO has other fields like int/long and so on, for compatible reason they can not change the model definition; so attackers can use "big number" to attack.
Most helpful comment
@abracadv8 would
longValueExactbe enough? if we want long anyway then it might be good idea to throw exception if number is too large/low, and this will do everything for you.Also @plokhotnyuk note that similar operation is inside DecimalUtils:
https://github.com/FasterXML/jackson-modules-java8/blob/b45e632dbf4911c49cf33d2e8da5eb31113d1d75/datetime/src/main/java/com/fasterxml/jackson/datatype/jsr310/DecimalUtils.java#L101
And this will hang too.