Openj9: NativeChaCha20 decrypt crashes with large data inputs

Created on 12 May 2020  路  8Comments  路  Source: eclipse/openj9

Originally described by @keithc-ca in #9429 comment.

NativeChaCha20Cipher.EngineAEADDec appears to capture the entire input in a ByteArrayOutputStream: if the input is large (think gigabytes) this is not reasonable.

To elaborate:

  • When a client calls cipher.engineUpdate or cipher.engineFinal with a ByteBuffer as input, bufferCrypt will be called (as shown here and here).
  • In case of ChaCha20, bufferCrypt calls one of the engineUpdate and engineDoFinal functions in NativeChaCha20Cipher.java.
  • Those functions call the underlying doUpdate call. In case of AEAD decryption, the doUpdate copies the input to a buffer
  • Hence calling cipher.engineUpdate continuously with large input sizes (gigabytes) will cause a crash.
openssl

Most helpful comment

limit the size of ByteArrayOutputStream. If the buffer overflows, doFinal is invoked no matter doFinal is called or not

In case the buffer overflows, you would need to call doUpdate and not doFinal. This is because doFinal will finalize the operation so you can't call doUpdate afterwards. So for example, consider a case when the internal buffer size is 10KB and the user is trying to encrypt 1MB of data (divided into 1KB chunks). The buffer will overflow after few calls for doUpdate. When the buffer overflows, you should not call doFinal because if you do, the cipher will be finalized and it can't encrypt the rest of the data.

As we discussed with @ashbm5 today, a better approach for now is to encrypt the data as soon as the doUpdate receives them. No need to buffer them.

However, it does nothing about inputing a large size of data at one time.

Yes that is still a valid concern.

All 8 comments

looked into the code.

  • doUpdate returns 0 or a 0 byte array. doFinal returns cumulated input size or a byte array with encrypted output.
  • Currently, an input data is stored in a ByteArrayOutputStream buffer when invoke doUpdate, and encrypt all inputs until calling doFinal.
  • have to store/cumulate the input data somewhere before calling doFinal, which means no mater what container, the same issue is reproduced when input is large.

ideas about to solve it

  • limit the size of ByteArrayOutputStream. If the buffer overflows, doFinal is invoked no matter doFinal is called or not.
  • Return a non-zero size or a encrypted output to acknowledge user that previous input has been encrypted and outputted.

The idea should be helpful for a cumulated large input, such as calling doUpdate multiple times with small inputs. However, it may have no help about inputing a large size of data at one time.

May i have any suggestions, thanks.

limit the size of ByteArrayOutputStream. If the buffer overflows, doFinal is invoked no matter doFinal is called or not

In case the buffer overflows, you would need to call doUpdate and not doFinal. This is because doFinal will finalize the operation so you can't call doUpdate afterwards. So for example, consider a case when the internal buffer size is 10KB and the user is trying to encrypt 1MB of data (divided into 1KB chunks). The buffer will overflow after few calls for doUpdate. When the buffer overflows, you should not call doFinal because if you do, the cipher will be finalized and it can't encrypt the rest of the data.

As we discussed with @ashbm5 today, a better approach for now is to encrypt the data as soon as the doUpdate receives them. No need to buffer them.

However, it does nothing about inputing a large size of data at one time.

Yes that is still a valid concern.

We're coming up on the v0.21.0 code split on June 7. Will this be ready and merged before then?

@DanHeidinga @keithc-ca We actually have the change ready for review. However, the implementation problem pointed out by Keith - that the input data is stored in a buffer and not processed immediately, which can cause out of memory problems, is also shared with OpenJDK java (non-OpenSSL) implementation.
https://github.com/openjdk/jdk/blob/master/src/java.base/share/classes/com/sun/crypto/provider/ChaCha20Cipher.java#L1339

This behavior is expected in a ChaCha20 unit test so we will also need to modify a ChaCha20. Strictly speaking, it should be fine to modify since the crypto API allows for both types of behaviors (immediate return or storing in a buffer and returning at some other point in time). However, since this is a change in external API behavior, it may break something unexpected at some future point.

This behavior is expected in a ChaCha20 unit test

So both OpenJDK's implementation and our implementation have the same limitation of needing to buffer the input until it's complete? Sounds like something we shouldn't be changing unless there's a compelling reason to, unless I've missed something

Or we should make the change in OpenJDK and contribute it first.

This is an enhancement. Since it's not delivered before the 0.21 branch, deferring to the next release.

Moving to the backlog.

Was this page helpful?
0 / 5 - 0 ratings

Related issues

sophia-guo picture sophia-guo  路  52Comments

dcendents picture dcendents  路  53Comments

Thihup picture Thihup  路  51Comments

AlenBadel picture AlenBadel  路  106Comments

fjeremic picture fjeremic  路  62Comments