Dubbo: DO NOT USE org.apache.dubbo.common.utils.IOUtils.write() for your own project

Created on 1 Feb 2019  ·  4Comments  ·  Source: apache/dubbo

  • [x] I have searched the issues of this repository and believe that this is not a duplicate.
  • [x] I have checked the FAQ of this repository and believe that this is not a duplicate.

Environment

  • Dubbo version: 2.7.1-SNAPSHOT

org.apache.dubbo.common.utils.IOUtils 这个类看起来像是一个公共类,但实际上里面有坑,不要在你自己的项目里面使用这个类!

问题描述

该类有一个 write() 方法,内容如下:

public static long write(InputStream is, OutputStream os, int bufferSize) throws IOException {
    int read;
    long total = 0;
    byte[] buff = new byte[bufferSize];
    while (is.available() > 0) {   // !!!
        read = is.read(buff, 0, buff.length);
        if (read > 0) {
            os.write(buff, 0, read);
            total += read;
        }
    }
    return total;
}

其中的 while 循环使用了 InputStream.available() 的返回值来作为结束判断条件,这个方法在网络环境当中是不能作为读取结束的依据的,因为当网络传输出现延迟时,该方法立刻会返回 0,导致循环结束。测试代码如下:

@Test
public void testWriteURL() throws Exception {
    String url = "https://pan.baidu.com/static/images/16new/bg2.jpg";
    long expectedSize = 77_949L;

    InputStream inputStream = new URL(url).openStream();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    long writeSize = org.apache.dubbo.common.utils.IOUtils.write(inputStream, bos);
    assertThat(writeSize, equalTo(expectedSize)); // org.hamcrest.MatcherAssert
}

这个测试通常会失败,write() 方法在读取了部分内容后就结束循环并返回了。

正确的写法请参考 org.apache.commons.io.IOUtils#copy(java.io.InputStream, java.io.OutputStream) 的实现,对应的测试方法如下:

public static void main(String[] args) throws IOException {
    String url = "https://pan.baidu.com/static/images/16new/bg2.jpg";
    long expectedSize = 77_949L;

    InputStream inputStream = new URL(url).openStream();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    long writeSize = org.apache.commons.io.IOUtils.copy(inputStream, bos);
    assertThat(writeSize, equalTo(expectedSize)); // org.hamcrest.MatcherAssert
}

上面这个方法就不会运行失败。

为什么这个问题一直没有暴露出来呢?因为这个方法在整个 dubbo 项目中只有一个地方用到了:org.apache.dubbo.common.io.Bytes#unzip(),而这个方法只用来处理内存中的字节串,而与网络无关。万一 dubbo 项目将来在某个地方用它来读取网络内容,问题就会暴露出来。

Most helpful comment

  • [x] I have searched the issues of this repository and believe that this is not a duplicate.
  • [x] I have checked the FAQ of this repository and believe that this is not a duplicate.

Environment

  • Dubbo version: 2.7.1-SNAPSHOT

org.apache.dubbo.common.utils.IOUtils it seems that this is a class for common use but there is a pitfall in it. DO NOT USE IT IN YOUR OWN PROJECT!

Problem Description

There is a method called write() in the class:

public static long write(InputStream is, OutputStream os, int bufferSize) throws IOException {
    int read;
    long total = 0;
    byte[] buff = new byte[bufferSize];
    while (is.available() > 0) {   // !!!
        read = is.read(buff, 0, buff.length);
        if (read > 0) {
            os.write(buff, 0, read);
            total += read;
        }
    }
    return total;
}

which uses the returned value of InputStream.available() as the break condition of the while loop, in network environment, it returns 0 immediately in case of network latency and thus is not applicable to work as the EOF mark of the stream. Test code as follows:

@Test
public void testWriteURL() throws Exception {
    String url = "https://pan.baidu.com/static/images/16new/bg2.jpg";
    long expectedSize = 77_949L;

    InputStream inputStream = new URL(url).openStream();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    long writeSize = org.apache.dubbo.common.utils.IOUtils.write(inputStream, bos);
    assertThat(writeSize, equalTo(expectedSize)); // org.hamcrest.MatcherAssert
}

The test will usually fail, method write() returns after reading part of the content and the loop breaks.

for a correct implementation please refer to org.apache.commons.io.IOUtils#copy(java.io.InputStream, java.io.OutputStream), and corresponding test code is:

public static void main(String[] args) throws IOException {
    String url = "https://pan.baidu.com/static/images/16new/bg2.jpg";
    long expectedSize = 77_949L;

    InputStream inputStream = new URL(url).openStream();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    long writeSize = org.apache.commons.io.IOUtils.copy(inputStream, bos);
    assertThat(writeSize, equalTo(expectedSize)); // org.hamcrest.MatcherAssert
}

the test above passes

why this issue didn't expose up to now? Because it is only used once in org.apache.dubbo.common.io.Bytes#unzip() within the whole project, and is only used to deal with data in memory. If we use it to read data from network in the future in some places, problem occurs.
@khanimteyaz

All 4 comments

Thanks for the issue, would you please submit a pr ?

@tswstarplanet OK I'll create a PR.

could some one translate this for me in english. I tried but failed to it. Thanks

  • [x] I have searched the issues of this repository and believe that this is not a duplicate.
  • [x] I have checked the FAQ of this repository and believe that this is not a duplicate.

Environment

  • Dubbo version: 2.7.1-SNAPSHOT

org.apache.dubbo.common.utils.IOUtils it seems that this is a class for common use but there is a pitfall in it. DO NOT USE IT IN YOUR OWN PROJECT!

Problem Description

There is a method called write() in the class:

public static long write(InputStream is, OutputStream os, int bufferSize) throws IOException {
    int read;
    long total = 0;
    byte[] buff = new byte[bufferSize];
    while (is.available() > 0) {   // !!!
        read = is.read(buff, 0, buff.length);
        if (read > 0) {
            os.write(buff, 0, read);
            total += read;
        }
    }
    return total;
}

which uses the returned value of InputStream.available() as the break condition of the while loop, in network environment, it returns 0 immediately in case of network latency and thus is not applicable to work as the EOF mark of the stream. Test code as follows:

@Test
public void testWriteURL() throws Exception {
    String url = "https://pan.baidu.com/static/images/16new/bg2.jpg";
    long expectedSize = 77_949L;

    InputStream inputStream = new URL(url).openStream();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    long writeSize = org.apache.dubbo.common.utils.IOUtils.write(inputStream, bos);
    assertThat(writeSize, equalTo(expectedSize)); // org.hamcrest.MatcherAssert
}

The test will usually fail, method write() returns after reading part of the content and the loop breaks.

for a correct implementation please refer to org.apache.commons.io.IOUtils#copy(java.io.InputStream, java.io.OutputStream), and corresponding test code is:

public static void main(String[] args) throws IOException {
    String url = "https://pan.baidu.com/static/images/16new/bg2.jpg";
    long expectedSize = 77_949L;

    InputStream inputStream = new URL(url).openStream();
    ByteArrayOutputStream bos = new ByteArrayOutputStream();
    long writeSize = org.apache.commons.io.IOUtils.copy(inputStream, bos);
    assertThat(writeSize, equalTo(expectedSize)); // org.hamcrest.MatcherAssert
}

the test above passes

why this issue didn't expose up to now? Because it is only used once in org.apache.dubbo.common.io.Bytes#unzip() within the whole project, and is only used to deal with data in memory. If we use it to read data from network in the future in some places, problem occurs.
@khanimteyaz

Was this page helpful?
0 / 5 - 0 ratings

Related issues

yangpancode picture yangpancode  ·  4Comments

1061235166 picture 1061235166  ·  3Comments

caijunjun picture caijunjun  ·  3Comments

GeekDaniel picture GeekDaniel  ·  4Comments

jiangmin168168 picture jiangmin168168  ·  3Comments