In some circumstances, too many threads are created and the process suffers from OOM. Here is the related issue: https://github.com/apache/incubator-dubbo/issues/1932
After analyzing the problem, I found that the root cause is that the consumer uses a cached thread pool, which has no limit on the number of threads. So users need an option to limit the maximum size of the consumer thread pool.
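To illustrate the difference between an unbounded cached pool and a bounded one, here is a minimal sketch using only the JDK. It is not Dubbo's code: the class and method names (`PoolGrowthDemo`, `peakThreads`, `cachedPeak`, `boundedPeak`) and the numbers are hypothetical, and the first pool is built by hand with the same settings that `Executors.newCachedThreadPool()` uses.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.SynchronousQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

public class PoolGrowthDemo {

    /** Submits {@code tasks} blocking tasks in one burst and returns the largest pool size reached. */
    static int peakThreads(ThreadPoolExecutor pool, int tasks) throws InterruptedException {
        CountDownLatch release = new CountDownLatch(1);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> {
                try {
                    // Keep the worker busy until the whole burst has been submitted.
                    release.await();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
        }
        release.countDown();
        pool.shutdown();                              // queued tasks still run after shutdown()
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return pool.getLargestPoolSize();
    }

    /** Same settings as Executors.newCachedThreadPool(): no core threads, no upper bound, direct hand-off. */
    static int cachedPeak(int tasks) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                0, Integer.MAX_VALUE, 60L, TimeUnit.SECONDS, new SynchronousQueue<>());
        return peakThreads(pool, tasks);
    }

    /** A pool capped at {@code maxThreads}; extra tasks wait in an unbounded queue. */
    static int boundedPeak(int tasks, int maxThreads) throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                maxThreads, maxThreads, 60L, TimeUnit.SECONDS, new LinkedBlockingQueue<>());
        return peakThreads(pool, tasks);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("cached pool peak threads:  " + cachedPeak(200));
        System.out.println("bounded pool peak threads: " + boundedPeak(200, 8));
    }
}
```

In this setup the cached pool ends up with one thread per pending task, while the bounded pool never grows beyond its cap. The cost of the cap is that the remaining tasks wait in the queue.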
Here is one case that reproduces the problem:
Consumer A calls a service on provider B at a very high TPS. Provider B sometimes responds slowly because of a pause, so the calls time out. Shortly afterwards provider B becomes fast again, and a large number of responses are sent back to consumer A at nearly the same time. In this case, many consumer threads are created on the consumer side because of the current design.
You can reproduce this issue with the following provider sample. It simulates a provider whose timed-out responses are all sent back to the consumer at the same moment, which makes the consumer create many threads.
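    import java.util.Calendar;

    // Logger and LoggerFactory are assumed to come from SLF4J
    // (the consumer sample uses "{}" placeholders).
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class MockImpl implements Mock {

        private static final Logger logger = LoggerFactory.getLogger(MockImpl.class);

        // Blocks the calling provider thread for the given number of milliseconds.
        public void sleep(long ms) {
            logger.info("begin to sleep " + ms + " ms");
            try {
                Thread.sleep(ms);
            } catch (InterruptedException e) {
                // Restore the interrupt flag instead of swallowing the interruption.
                Thread.currentThread().interrupt();
            }
            logger.info("after sleep " + ms + " ms");
        }

        // Sleeps until the next minute starts, so all pending calls return at almost the same time.
        public void sleepToNextMinute() {
            long span = computeNextMinuteSpan();
            sleep(span);
        }

        // Returns the number of milliseconds from now until the next minute boundary.
        public static long computeNextMinuteSpan() {
            long now = System.currentTimeMillis();
            Calendar cal = Calendar.getInstance();
            cal.setTimeInMillis(now);
            cal.add(Calendar.MINUTE, 1);
            cal.set(Calendar.SECOND, 0);
            cal.set(Calendar.MILLISECOND, 0);
            return cal.getTimeInMillis() - now;
        }
    }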
On the consumer side, call sleepToNextMinute from a single thread or from a thread pool. Here is a sample consumer:

    logger.info("sleeping till next minute ......");
    Thread.sleep(computeNextMinuteSpan());

    // Once per second, count the live threads whose names start with "DubboClientHandler".
    Executors.newScheduledThreadPool(1).scheduleAtFixedRate(new Runnable() {
        @Override
        public void run() {
            int i = 0;
            Set<Thread> threadSet = Thread.getAllStackTraces().keySet();
            for (Thread t : threadSet) {
                if (t.getName().startsWith("DubboClientHandler")) {
                    logger.info("dubbo thread: {}", t.getName());
                    i++;
                }
            }
            logger.info("=================================Dubbo Thread {}===================================", i);
        }
    }, 0, 1000, TimeUnit.MILLISECONDS);

    logger.info("mocking...");
    // Call the slow provider method from a single thread; timeouts are expected and ignored.
    for (int i = 0; i < 10000; i++) {
        try {
            mock.sleepToNextMinute();
        } catch (Exception e) {
            //logger.error(e.getMessage());
        }
    }
    logger.info("mocking ends");
You will see that the number of consumer threads grows rapidly in a very short time, even if you call the provider service from only one thread.
+1 for a limited thread pool on the consumer side.
So users need an option to limit the maximum size of the consumer thread pool.
Agreed, we should leave the choice to users, since the best thread pool policy may vary between scenarios. Would you mind enabling thread pool configuration on the consumer side and sending a PR?
From another perspective, you may also need to scale out your cluster on the consumer side to make sure it can cope with such a large QPS.
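As an illustration of why the best thread pool policy depends on the scenario, here is a minimal JDK-only sketch that sends the same burst to a bounded pool under two rejection policies. This is not Dubbo's implementation: `RejectionPolicyDemo`, `burst`, and `Result` are hypothetical names, and the pool and queue sizes are arbitrary.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.RejectedExecutionException;
import java.util.concurrent.RejectedExecutionHandler;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class RejectionPolicyDemo {

    /** Outcome of one burst: tasks run by pool threads, run by the submitting thread, or rejected. */
    static final class Result {
        final int onPool;
        final int onCaller;
        final int rejected;

        Result(int onPool, int onCaller, int rejected) {
            this.onPool = onPool;
            this.onCaller = onCaller;
            this.rejected = rejected;
        }
    }

    static Result burst(RejectedExecutionHandler policy, int threads, int queueSize, int tasks)
            throws InterruptedException {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                threads, threads, 60L, TimeUnit.SECONDS,
                new ArrayBlockingQueue<>(queueSize), policy);
        Thread caller = Thread.currentThread();
        CountDownLatch release = new CountDownLatch(1);
        AtomicInteger onPool = new AtomicInteger();
        AtomicInteger onCaller = new AtomicInteger();
        int rejected = 0;

        for (int i = 0; i < tasks; i++) {
            try {
                pool.execute(() -> {
                    if (Thread.currentThread() == caller) {
                        // CallerRunsPolicy handed the task back to the submitting thread.
                        onCaller.incrementAndGet();
                        return;
                    }
                    try {
                        // Pool threads stay busy until the whole burst has been submitted.
                        release.await();
                        onPool.incrementAndGet();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                });
            } catch (RejectedExecutionException e) {
                // AbortPolicy refuses the task once all threads are busy and the queue is full.
                rejected++;
            }
        }
        release.countDown();
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return new Result(onPool.get(), onCaller.get(), rejected);
    }

    public static void main(String[] args) throws InterruptedException {
        Result abort = burst(new ThreadPoolExecutor.AbortPolicy(), 4, 16, 200);
        Result callerRuns = burst(new ThreadPoolExecutor.CallerRunsPolicy(), 4, 16, 200);
        System.out.println("abort:       pool=" + abort.onPool + " rejected=" + abort.rejected);
        System.out.println("caller-runs: pool=" + callerRuns.onPool + " caller=" + callerRuns.onCaller);
    }
}
```

With `AbortPolicy` the overflow is rejected immediately, which protects memory but drops work. With `CallerRunsPolicy` nothing is dropped, but the submitting thread is slowed down by doing the work itself. Which trade-off is acceptable depends on the workload, which is why leaving the choice to users makes sense.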
When can this issue be implemented, @chickenlj?
I also hit this issue. Could you let me know when it can be fixed, please? @chickenlj @lovepoem
I want to solve the problem by adding a 'threadpool' tag, but which complex tag should I add it to?
@tswstarplanet Please feel free to go ahead and submit your pull request! I think the tag 'threadpool' is too fine-grained, which would result in too many tags.

I have roughly drawn the thread pool structure of Dubbo. I hope it helps you understand how it works.
@ralf0131 Sorry, I did not fully understand your reply. Do you mean that I should not add a "threadpool" tag to any complex tag, and that I should solve this problem in some other way instead?
@tswstarplanet Maybe I misunderstood you. I thought you were going to add a label called 'threadpool' :) So what do you actually mean?
@ralf0131 Yes, I think adding a label "threadpool" can solve the problem. But maybe my English is a little poor, and I did not understand the reply you posted this morning. I am not sure whether you are saying yes or no to my plan.
@chickenlj Which pull request fixes this bug? Has it been merged into master, and why is this issue closed now?
@Jaskey https://github.com/apache/incubator-dubbo/pull/2114, which is included in the 2.6.3 release. Please check whether it solves your problem.