1. Basic tests
After a workspace gets the unstable tag, it is no longer possible to enter the workspace to delete or restart it.

Occasionally a workspace fails to restart. We have not yet found a way to reproduce this (restarting only the broker or only the worker works fine).

When the pipeline contains a topic, the broker cannot be restarted.



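After a workspace gets the unstable tag, it is no longer possible to enter the workspace to delete or restart it.
This is the intended behavior; please see #5385.
When the pipeline contains a topic, the broker cannot be restarted.
This looks like exactly the same problem as #5402 🤔. Would you share the testing steps?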
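My testing steps:
When the pipeline contains a topic, the broker cannot be restarted.
I keep running into a similar problem: whenever a topic exists, Start topic always fails when restarting either the broker or the workspace :(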
What is the error message?
Testing steps:



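This looks a lot like what happens with the worker, doesn't it?
After Restart broker succeeds, refreshing the page leaves the pipeline toolbox unable to display properly.
Testing steps: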
worker container logs:
[2020-07-21 02:48:01,699] WARN [Consumer clientId=consumer-7755f77964cd41f0b57e5b6ef-1, groupId=7755f77964cd41f0b57e5b6ef] 1 partitions have leader brokers without a matching listener, including [connect.offset5f73c54bedef40a5938b3ced5-0] (org.apache.kafka.clients.NetworkClient:1063)
[2020-07-21 02:48:01,766] INFO AbstractConfig values:
(org.apache.kafka.common.config.AbstractConfig:347)
[2020-07-21 02:48:01,806] WARN [Consumer clientId=consumer-7755f77964cd41f0b57e5b6ef-1, groupId=7755f77964cd41f0b57e5b6ef] 1 partitions have leader brokers without a matching listener, including [connect.offset5f73c54bedef40a5938b3ced5-0] (org.apache.kafka.clients.NetworkClient:1063)
[2020-07-21 02:48:01,915] WARN [Consumer clientId=consumer-7755f77964cd41f0b57e5b6ef-1, groupId=7755f77964cd41f0b57e5b6ef] 1 partitions have leader brokers without a matching listener, including [connect.offset5f73c54bedef40a5938b3ced5-0] (org.apache.kafka.clients.NetworkClient:1063)
[2020-07-21 02:48:02,034] WARN [Consumer clientId=consumer-7755f77964cd41f0b57e5b6ef-1, groupId=7755f77964cd41f0b57e5b6ef] 1 partitions have leader brokers without a matching listener, including [connect.offset5f73c54bedef40a5938b3ced5-0] (org.apache.kafka.clients.NetworkClient:1063)
.... 30 seconds later
[2020-07-21 02:48:26,897] WARN [Consumer clientId=consumer-7755f77964cd41f0b57e5b6ef-1, groupId=7755f77964cd41f0b57e5b6ef] 1 partitions have leader brokers without a matching listener, including [connect.offset5f73c54bedef40a5938b3ced5-0] (org.apache.kafka.clients.NetworkClient:1063)
[2020-07-21 02:48:26,998] WARN [Consumer clientId=consumer-7755f77964cd41f0b57e5b6ef-1, groupId=7755f77964cd41f0b57e5b6ef] 1 partitions have leader brokers without a matching listener, including [connect.offset5f73c54bedef40a5938b3ced5-0] (org.apache.kafka.clients.NetworkClient:1063)
[2020-07-21 02:48:27,099] WARN [Consumer clientId=consumer-7755f77964cd41f0b57e5b6ef-1, groupId=7755f77964cd41f0b57e5b6ef] 1 partitions have leader brokers without a matching listener, including [connect.offset5f73c54bedef40a5938b3ced5-0] (org.apache.kafka.clients.NetworkClient:1063)
[2020-07-21 02:48:27,200] WARN [Consumer clientId=consumer-7755f77964cd41f0b57e5b6ef-1, groupId=7755f77964cd41f0b57e5b6ef] 1 partitions have leader brokers without a matching listener, including [connect.offset5f73c54bedef40a5938b3ced5-0] (org.apache.kafka.clients.NetworkClient:1063)
[2020-07-21 02:48:27,201] ERROR [Worker clientId=connect-1, groupId=7755f77964cd41f0b57e5b6ef] Uncaught exception in herder work thread, exiting: (org.apache.kafka.connect.runtime.distributed.DistributedHerder:297)
org.apache.kafka.common.errors.TimeoutException: Failed to get offsets by times in 30000ms
[2020-07-21 02:48:27,206] INFO Stopped http_41325@63538bb4{HTTP/1.1,[http/1.1]}{0.0.0.0:41325} (org.eclipse.jetty.server.AbstractConnector:380)
[2020-07-21 02:48:27,207] INFO node0 Stopped scavenging (org.eclipse.jetty.server.session:158)
[2020-07-21 02:48:27,214] INFO Kafka Connect stopping (org.apache.kafka.connect.runtime.Connect:67)
[2020-07-21 02:48:27,214] INFO Stopping REST server (org.apache.kafka.connect.runtime.rest.RestServer:321)
[2020-07-21 02:48:27,215] INFO REST server stopped (org.apache.kafka.connect.runtime.rest.RestServer:338)
[2020-07-21 02:48:27,215] INFO [Worker clientId=connect-1, groupId=7755f77964cd41f0b57e5b6ef] Herder stopping (org.apache.kafka.connect.runtime.distributed.DistributedHerder:616)
[2020-07-21 02:48:32,215] INFO [Worker clientId=connect-1, groupId=7755f77964cd41f0b57e5b6ef] Herder stopped (org.apache.kafka.connect.runtime.distributed.DistributedHerder:636)
[2020-07-21 02:48:32,215] INFO Kafka Connect stopped (org.apache.kafka.connect.runtime.Connect:72)
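The failure pattern in the log above (repeated listener warnings, then the herder thread timing out and Kafka Connect shutting down) can be checked for mechanically. This is a minimal sketch, assuming the worker log has been saved to a file; the function name `scan_worker_log` is hypothetical, and only the two message strings it searches for come from the log above.

```shell
# Hypothetical helper: summarize the restart symptom in a saved worker log.
# Usage: scan_worker_log /path/to/worker.log
scan_worker_log() {
  log_file="$1"
  # Count the repeated "no matching listener" warnings from the consumer.
  # grep -c prints 0 and exits non-zero when nothing matches, hence "|| true".
  warn_count=$(grep -c 'leader brokers without a matching listener' "$log_file" || true)
  echo "listener warnings: $warn_count"
  # Report whether the Connect herder thread gave up.
  if grep -q 'Uncaught exception in herder work thread' "$log_file"; then
    echo "herder: crashed"
  else
    echo "herder: ok"
  fi
}
```

One way to get such a file is `docker logs <worker-container> > worker.log 2>&1`, where the container name is whatever the worker container is called in your environment.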
After Delete workspace succeeds, the dialog should not be closed if "close after finish" is not checked.
Adding a JDBC connector without the JDBC connector plugin (that is, when the user forgot to add the plugin jar) leaves the pipeline dead.
No component (topic or connector) can be removed, and the pipeline cannot be deleted.
The connector should show the failed status (red) but still shows the stopped status (gray); the user has to refresh the page to see the real status.
The Workspace, Broker, and Worker cannot be restarted.
Tracked in #5501 and #5502
Thanks @chuntsekevin !
@oharastream/frontend Could anyone take a look?
It's a bug, I've opened a new issue here
Adding a JDBC connector without the JDBC connector plugin (that is, when the user forgot to add the plugin jar) leaves the pipeline dead.
No component (topic or connector) can be removed, and the pipeline cannot be deleted.
The connector should show the failed status (red) but still shows the stopped status (gray); the user has to refresh the page to see the real status.
The Workspace, Broker, and Worker cannot be restarted.
A couple of bugs we discovered while trying to reproduce this issue:
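Creating a workspace with a name that already exists causes the Source and Sink in the Pipeline Toolbox to not be displayed.
The testing steps are as follows:
After workspace1 is created, shut down the configurator and manager, but forget to delete the zookeeper, broker, and worker containers that were created for workspace1.
Start the configurator and manager a second time, then create workspace1 again.
Once workspace1 has been created, create pipeline1; opening pipeline1 shows the following screen: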

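The problem above is mainly caused by not persisting the data stored in the configurator container (rockdb) to the host, so all workspace information is lost when the configurator restarts. The following command keeps the configurator container's data on the host: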
docker run -d --rm -v /tmp/rockdb:/home/ohara/configurator -p 12345:12345 oharastream/configurator:0.11.0-SNAPSHOT --port 12345 --hostname ohara-jenkins-it-00 --folder /home/ohara/configurator
After Restart Broker, the worker container receives the following warn message:
[2020-07-29 06:22:01,852] WARN [Consumer clientId=consumer-ed51f67fb39846ebb2a483dfd-1, groupId=ed51f67fb39846ebb2a483dfd] 1 partitions have leader brokers without a matching listener, including [connect.offsetbbdbd649eacb41f1ae8031065-0] (org.apache.kafka.clients.NetworkClient:1063)
The testing steps are as follows:
This should be the same problem that @eechih described above.
[2020-07-29 06:22:01,852] WARN [Consumer clientId=consumer-ed51f67fb39846ebb2a483dfd-1, groupId=ed51f67fb39846ebb2a483dfd] 1 partitions have leader brokers without a matching listener, including [connect.offsetbbdbd649eacb41f1ae8031065-0] (org.apache.kafka.clients.NetworkClient:1063)
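The error above is confirmed to happen because the frontend does not yet integrate volumes, so when the broker container is restarted, the zookeeper container must also be deleted and restarted with it. This ensures the zookeeper container does not keep system data from the broker container that existed before the restart, which is what allows the broker and worker containers to run normally.
0.11.0 has been released, so I am closing this issue for now. We can reopen it if further problems are found.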
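The workaround described above can be sketched as a small shell function. This is only an illustration under assumptions: the function name and the container names passed to it are hypothetical, `DOCKER` is a variable introduced here so the commands can be dry-run, and recreating the containers afterwards (normally done through the UI or configurator) is not shown.

```shell
# Command used to talk to Docker; set DOCKER=echo for a dry run.
DOCKER="${DOCKER:-docker}"

# Hypothetical helper: force-remove the broker container together with the
# zookeeper container it registered with, so that no stale broker metadata
# survives into the restarted cluster.
# Usage: remove_broker_and_zookeeper <broker-container> <zookeeper-container>
remove_broker_and_zookeeper() {
  broker="$1"
  zookeeper="$2"
  "$DOCKER" rm -f "$broker"
  "$DOCKER" rm -f "$zookeeper"
}
```

Running it with `DOCKER=echo` first prints the two `rm -f` commands without touching any container, which is a cheap way to confirm the names before removing anything.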