Eclipse: Version: 2019-09 R (4.13.0), Build id: 20190917-1200
EclEmma 3.1.2.201903112331 org.eclipse.eclemma.feature.feature.group Eclipse EclEmma
JUnit 4.13
Windows 10 x64
Java 1.8u241
I've run into an odd error while running coverage on an unit test having some methods use Google Guava's CacheBuilder in Eclipse. The coverage fails with the error code 5013 and error report indicating java.net.SocketException: Connection reset. If I remove these methods, the coverage succeeds.
If I revert back to JUnit 4.12, both the test file and the individual test succeeds in producing the coverage data.
This is where the cache is created:
https://github.com/ReactiveX/RxJava/blob/5f6aafcbaa97f330ad2007f4ad02c5e33db8914b/src/test/java/io/reactivex/rxjava3/internal/operators/flowable/FlowableGroupByTest.java#L2194
Please advise.
Could you please provide a stacktrace and/or Gradle build scan?
There is no JVM crash file or failure stacktrace. Only an Eclipse error about losing connection (java.net.SocketException: Connection). I don't have any issues when I run the test(s) in Gradle from command line so not sure how that would help.
In that case, I suspect an Eclipse or EclEmma issue. Could you please bring it up with them?
Okay. One last question. Could it be that the JUnit 4.13 integration has some trouble with Eclipse?
Well, there鈥榮 always a chance we broke something. But we didn鈥榯 change any APIs use by integrations as far as I know.
When I searched for the error message on StackOverflow I found https://stackoverflow.com/a/42904311/95725 which indicates that it might be a memory problem. Maybe with the upgrade to 4.13 you cross some kind of threshold memory-wise.
There is also always the possibility that you have multiple versions of JUnit on your classpath.
Yes I tried that before posting to no avail. Also it is quite peculiar to fail only at those cache-methods and not for the other 12000+ tests. I've yet to hear back from the EclEmma people.
Hi. I got some feedback from the EclEmma people: https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/jacoco/SHw3R6cvZSs/yaFgd7XUAQAJ
Looks like shutdown hooks are not executed the same way in 4.13 as it used to happen in 4.12 and thus the coverage failure.
Here is a JUnit demonstrating the issue (similar to how EclEmma would do it) that sometimes prints "hook executed", sometimes doesn't with 4.13:
import java.util.concurrent.TimeUnit;
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;
import com.google.common.cache.CacheBuilder;
public class ExampleTest {
static {
System.err.println("registering hook");
Thread thread = new Thread(new Runnable() {
@Override
public void run() {
System.err.println("hook executed");
}
});
Runtime.getRuntime().addShutdownHook(thread);
System.err.println(thread.getThreadGroup());
}
@Rule
public Timeout timeout = new Timeout(5, TimeUnit.MINUTES);
@Test
public void test() {
CacheBuilder.newBuilder() //
.<Integer, Object>build();
}
}
Hi guys 馃憢
@akarnokd thank you for forwarding here my responses from JaCoCo Mailing List.
@marcphilipp I think that the following happens in the case of the above example (https://github.com/junit-team/junit4/issues/1652#issuecomment-593559646).
After execution of static block there is one shutdown hook in java.lang.ApplicationShutdownHooks.hooks
{Thread@1552} "Thread[Thread-0,5,main]" -> {Thread@1552} "Thread[Thread-0,5,main]"
And after execution of CacheBuilder.newBuilder()
there are two
{LogManager$Cleaner@1551} "Thread[Logging-Cleaner,5,FailOnTimeoutGroup]" -> {LogManager$Cleaner@1551} "Thread[Logging-Cleaner,5,FailOnTimeoutGroup]"
{Thread@1552} "Thread[Thread-0,5,main]" -> {Thread@1552} "Thread[Thread-0,5,main]"
The LogManager$Cleaner@1551
is created by java.util.logging.LogManager
(https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/util/logging/LogManager.java#L253) because com.google.common.cache.CacheBuilder
uses java.util.logging
(https://github.com/google/guava/blob/v28.2/guava/src/com/google/common/cache/CacheBuilder.java#L226).
Note that it also belongs to FailOnTimeoutGroup
created by org.junit.rules.Timeout
(https://github.com/junit-team/junit4/blob/r4.13/src/main/java/org/junit/internal/runners/statements/FailOnTimeout.java#L123).
So start of this hook after the test throws IllegalThreadStateException
(at https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/Thread.java#L709), because the group is already destroyed (https://github.com/junit-team/junit4/blob/r4.13/src/main/java/org/junit/internal/runners/statements/FailOnTimeout.java#L140).
This exception stops the execution of subsequent hooks (https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/ApplicationShutdownHooks.java#L101-L103) and silently ignored (https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/Shutdown.java#L115-L129).
The sequence in which hooks start depends on keySet
(https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/ApplicationShutdownHooks.java#L97) of IdentityHashMap
(https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/ApplicationShutdownHooks.java#L39) and that's why varies from run to run.
This doesn't happen in case of JUnit 4.12, because destroy of FailOnTimeoutGroup
was added in JUnit 4.13 (https://github.com/junit-team/junit4/commit/63fd27710ee2bab6fdf00965f41ef12b79120c70). And this change also explains the difference between 4.12 and 4.13 observed during investigations (https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/jacoco/SHw3R6cvZSs/yaFgd7XUAQAJ) for the following example:
import org.junit.Rule;
import org.junit.Test;
import org.junit.rules.Timeout;
import java.util.concurrent.TimeUnit;
public class ExampleTest {
@Rule
public Timeout globalTimeout = new Timeout(5, TimeUnit.MINUTES);
@Test
public final void test() {
Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
@Override
public void run() {
System.err.println("hook executed");
}
}));
}
}
The same can be demonstrated without JUnit as following
import java.util.logging.LogManager;
public class Example {
public static void main(String[] args) throws Exception {
Runtime.getRuntime().addShutdownHook(new Thread(new Runnable() {
@Override
public void run() {
System.err.println("example hook");
}
}));
ThreadGroup group = new ThreadGroup("example group");
Thread thread = new Thread(group, new Runnable() {
@Override
public void run() {
LogManager.getLogManager();
}
}, "example thread");
thread.start();
thread.join();
group.destroy();
}
}
So I was also thinking about creation of a ticket in OpenJDK, and while thinking about what can be suggested in such ticket, found already existing https://bugs.openjdk.java.net/browse/JDK-6704979
Nevertheless wondering if there is a chance to do something about this in JUnit.
@Godin Thanks for the detailed analysis! 馃憤
@kcooney Do you know/remember what the intention of using a ThreadGroup
here was?
@marcphilipp my guess is this breakage was caused by https://github.com/junit-team/junit4/commit/63fd27710 which was proposed by @elrodro83
The ThreadGroup was added in https://github.com/junit-team/junit4/commit/1768d08d0
@akarnokd As a workaround, have you tried initializing JUL early, e.g. in a static initializer of a base test class?
No. I rarely run the full project coverage locally and so far, Travis CI managed to produce coverage data. I worked around by commenting out those CacheBuilder tests while I needed to work with the class being tested.
caused by 63fd277
I.e., #1517, for the backlink.
Hey, everyone. @jglick suggested in #1517 that we roll back #1517 or make it opt-in.
My experience with the ThreadGroup.destroy()
of PR #1517 isn't positive as well.
In Eclipse Californium it causes unit tests to fail randomly .
The randomness is caused by a "race" of terminating threads and the call of ThreadGroup.destroy()
.
And there is also a not to obvious order condition between the Timeout rule and static ThreadGroups. If the group is instantiated outside the Timeout rule, it's preserved, otherwise destroyed.
In my opinion, if such a feature is required, please
As temporary workaround I try to ensure, that all "static" thread-groups are initialized before any Timeout rule.
But that's not too easy to see.
I think an opt-out flag as a system property might suffice given the relatively low number of reports of this problem.
@kcooney What do you suggest we do?
The Timeout
rule has already a Builder
. FMPOV it's easier to use, if a parameter is added there.
It's possible there are low reports of this problem because of low usage of 4.13. I don't think we had many reports of the ThreadGroup leakages either, and JUnit as been using ThreadGroup since 4.12 (see https://github.com/junit-team/junit4/commit/1768d08).
If we hadn't released 4.13, I would suggest making the new behavior opt-in. If we make it opt-in now, then that seems like a visible enough change that we should make the new release 4.14.
If we make it opt in or opt out, I agree that we should use the Builder. JUnit 4.x rarely uses system properties to control behavior, and David Saff has objected to adding them in the past, so I'm not sure if that's a good option.
I wonder if instead of destroying the ThreadGroup we should just mark it as a daemon group. I think that would resolve the original issue (child ThreadGroups inherit the parent's daemon status) and we could do that change in a bug fix release.
I think making it a daemon thread group is a good idea. Does someone here have time to work on a PR?
Can we somehow vote for 4.13.2 to be released sooner because of this? This is somewhat blocking me because of the security issues < 4.13.1. Thanks much for the work here.
FYI my minimum reproducible example can be found on this stackoverflow post and this github repo.
@j256 Thanks for the feedback. We haven't discussed timelines yet. I've started associating things with the 4.13.2 milestone.
The following workaround works for me:
/** Workaround for the already fixed JUnit 4 issue #1652 (<https://github.com/junit-team/junit4/issues/1652>):
* Can be used in JUnit 4.13 and 4.13.1 in tests with a timeout to prevent the thread group from being destroyed
* and thus JaCoCo (used e.g. by EclEmma) loosing its collected data. */
public static void workaround_for_junit4_issue_1652() {
String version = junit.runner.Version.id();
if (!"4.13.1".equals(version) && !"4.13".equals(version))
fail("Workaround for JUnit 4 issue #1652 required for JUnit 4.13 and 4.13.1 only; actual version: "
+ version);
Thread thread = Thread.currentThread();
if (!"Time-limited test".equals(thread.getName())) fail("Workaround only required for a test with a timeout.");
new Thread(thread.getThreadGroup(), new Runnable() {
@Override
public void run() {
try {
while (!thread.isInterrupted()) Thread.sleep(100);
} catch (InterruptedException e) {}
}
}).start();
}
@Test(timeout = 42)
public void test() {
workaround_for_junit4_issue_1652();
// ...
}
I'm still waiting for 4.13.2 with this fix. Any update @kcooney? Again, because of the security problems before 4.13.1, I'm stuck with now with a lot of projects that aren't publishing CI unit test information because of this bug. Thanks.
@j256 4.13.2 is now released.
Thanks much @marcphilipp !
Most helpful comment
Hi guys 馃憢
@akarnokd thank you for forwarding here my responses from JaCoCo Mailing List.
@marcphilipp I think that the following happens in the case of the above example (https://github.com/junit-team/junit4/issues/1652#issuecomment-593559646).
After execution of static block there is one shutdown hook in
java.lang.ApplicationShutdownHooks.hooks
And after execution of
CacheBuilder.newBuilder()
there are twoThe
LogManager$Cleaner@1551
is created byjava.util.logging.LogManager
(https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/util/logging/LogManager.java#L253) becausecom.google.common.cache.CacheBuilder
usesjava.util.logging
(https://github.com/google/guava/blob/v28.2/guava/src/com/google/common/cache/CacheBuilder.java#L226).Note that it also belongs to
FailOnTimeoutGroup
created byorg.junit.rules.Timeout
(https://github.com/junit-team/junit4/blob/r4.13/src/main/java/org/junit/internal/runners/statements/FailOnTimeout.java#L123).So start of this hook after the test throws
IllegalThreadStateException
(at https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/Thread.java#L709), because the group is already destroyed (https://github.com/junit-team/junit4/blob/r4.13/src/main/java/org/junit/internal/runners/statements/FailOnTimeout.java#L140).This exception stops the execution of subsequent hooks (https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/ApplicationShutdownHooks.java#L101-L103) and silently ignored (https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/Shutdown.java#L115-L129).
The sequence in which hooks start depends on
keySet
(https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/ApplicationShutdownHooks.java#L97) ofIdentityHashMap
(https://github.com/openjdk/jdk/blob/jdk8-b120/jdk/src/share/classes/java/lang/ApplicationShutdownHooks.java#L39) and that's why varies from run to run.This doesn't happen in case of JUnit 4.12, because destroy of
FailOnTimeoutGroup
was added in JUnit 4.13 (https://github.com/junit-team/junit4/commit/63fd27710ee2bab6fdf00965f41ef12b79120c70). And this change also explains the difference between 4.12 and 4.13 observed during investigations (https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!msg/jacoco/SHw3R6cvZSs/yaFgd7XUAQAJ) for the following example:The same can be demonstrated without JUnit as following
So I was also thinking about creation of a ticket in OpenJDK, and while thinking about what can be suggested in such ticket, found already existing https://bugs.openjdk.java.net/browse/JDK-6704979
Nevertheless wondering if there is a chance to do something about this in JUnit.