Kubernetes-client: Major breakage detected in master (4.11-SNAPSHOT) due to #2292

Created on 18 Sep 2020 · 4 Comments · Source: fabric8io/kubernetes-client

The fix for #2292 (rev 2652139f) breaks our solution: Kubernetes does not reliably return HTTP 409. At times it responds instead with HTTP 500 and a message asking the client to try again. We have confirmed this with Kubernetes v1.16.7 up to v1.19.1.

2020-09-17 15:57:36.963+0200 |  | ::: | .175.80.197:6443/... |  WARN | .i.WatchConnectionManager |  | Exec Failure
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.175.80.197:6443/apis/networking.istio.io/v1beta1/namespaces/dx-system/gateways. Message: The POST operation against Gateway.networking.istio.io could not be completed at this time, please try again.. Received status: Status(apiVersion=v1, code=500, details=StatusDetails(causes=[], group=networking.istio.io, kind=Gateway, name=POST, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=The POST operation against Gateway.networking.istio.io could not be completed at this time, please try again., metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=ServerTimeout, status=Failure, additionalProperties={}).
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:589)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:528)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:492)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:252)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:841)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:332)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.createOrReplace(BaseOperation.java:402)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.createOrReplace(BaseOperation.java:82)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.createOrReplace(BaseOperation.java:396)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.createOrReplace(BaseOperation.java:82)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.EgressGatewayReconciler.mergePortsFromServiceEntry(EgressGatewayReconciler.java:156)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.EgressGatewayReconciler.lambda$onServiceEntryChanged$3(EgressGatewayReconciler.java:62)
    at java.base/java.util.Optional.ifPresentOrElse(Optional.java:201)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.EgressGatewayReconciler.findGatewayAndRunOrElse(EgressGatewayReconciler.java:102)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.EgressGatewayReconciler.onServiceEntryChanged(EgressGatewayReconciler.java:61)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.performSubordinateIstioUpdates(ServiceEntryReconciler.java:273)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.mergeToExistingServiceEntry(ServiceEntryReconciler.java:265)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.lambda$createNewOrMergeWithServiceEntry$2(ServiceEntryReconciler.java:159)
    at java.base/java.util.Optional.ifPresentOrElse(Optional.java:201)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.createNewOrMergeWithServiceEntry(ServiceEntryReconciler.java:155)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.lambda$realizeServiceEntryChanges$1(ServiceEntryReconciler.java:148)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.realizeServiceEntryChanges(ServiceEntryReconciler.java:144)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.onTICChange(ServiceEntryReconciler.java:100)
    at com.oracle.cx.verticals.dx4c.config.k8s.targetinstanceconfig.TICReconciler.onTICUpdated(TICReconciler.java:101)
    at com.oracle.cx.verticals.dx4c.config.k8s.OldBaseReconciler.lambda$onResourceAddedOrModified$5(OldBaseReconciler.java:87)
    at com.oracle.cx.verticals.dx4c.config.k8s.OldBaseReconciler.onEvent(OldBaseReconciler.java:41)
    at com.oracle.cx.verticals.dx4c.config.k8s.OldBaseReconciler.onResourceAddedOrModified(OldBaseReconciler.java:74)
    at com.oracle.cx.verticals.dx4c.config.k8s.targetinstanceconfig.TICReconciler.lambda$startWatch$1(TICReconciler.java:74)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler.handle(BaseResourceHandler.java:209)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler.handleActionWithCacheUpdate(BaseResourceHandler.java:169)
    at com.oracle.cx.verticals.dx4c.config.k8s.targetinstanceconfig.TICReconciler.lambda$startWatch$2(TICReconciler.java:68)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler$1.eventReceived(BaseResourceHandler.java:75)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler$1.eventReceived(BaseResourceHandler.java:69)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler$WrappedWatcher.eventReceived(BaseResourceHandler.java:250)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler$WrappedWatcher.eventReceived(BaseResourceHandler.java:235)
    at io.fabric8.kubernetes.client.utils.WatcherToggle.eventReceived(WatcherToggle.java:49)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:235)
    at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
    at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
    at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
    at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)

We will make a local change to unblock ourselves, but IMHO this needs a fix upstream.

bug


All 4 comments

Quick hack for lines 397 onwards in createOrReplace (as of 1832b75d), using Thread.sleep and no timeout:

    // Requires java.net.HttpURLConnection and java.util.concurrent.CompletableFuture.
    final CompletableFuture<T> future = new CompletableFuture<>();
    while (!future.isDone()) {
      try {
        // Create
        KubernetesResourceUtil.setResourceVersion(itemToCreateOrReplace, null);
        future.complete(create(itemToCreateOrReplace));
      } catch (KubernetesClientException exception) {
        final T itemFromServer;
        if (exception.getCode() == HttpURLConnection.HTTP_INTERNAL_ERROR) {
          // HTTP 500 ("please try again"): check whether the resource exists by now
          itemFromServer = fromServer().get();
          if (itemFromServer == null) {
            // Not there yet: wait briefly, then retry the create (no upper bound on retries)
            try {
              Thread.sleep(200);
            } catch (InterruptedException e) {
              // Restore the interrupt flag and give up instead of spinning in the loop
              Thread.currentThread().interrupt();
              throw exception;
            }
            continue;
          }
        } else if (exception.getCode() != HttpURLConnection.HTTP_CONFLICT) {
          // Any other error is not handled here
          throw exception;
        } else {
          itemFromServer = fromServer().get();
        }

        // Conflict; Do Replace
        KubernetesResourceUtil.setResourceVersion(itemToCreateOrReplace, KubernetesResourceUtil.getResourceVersion(itemFromServer));
        future.complete(replace(itemToCreateOrReplace));
      }
    }
    return future.join();
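
The hack above retries forever if the server keeps answering 500 and the resource never appears. A bounded variant could look roughly like the sketch below. It is deliberately independent of the client library: ApiException, BoundedCreateOrReplace, and the maxAttempts and delayMillis parameters are hypothetical names invented for illustration, and the create, fetch, and replace steps are passed in as functions rather than calling real client methods.

```java
import java.net.HttpURLConnection;
import java.util.function.Function;
import java.util.function.Supplier;

/** Hypothetical stand-in for KubernetesClientException; only the status code matters here. */
class ApiException extends RuntimeException {
  private final int code;

  ApiException(int code, String message) {
    super(message);
    this.code = code;
  }

  int getCode() {
    return code;
  }
}

/** Sketch of a create-or-replace loop with an upper bound on retries (assumed design). */
final class BoundedCreateOrReplace {
  private BoundedCreateOrReplace() {}

  static <T> T createOrReplace(Supplier<T> create, Supplier<T> fromServer,
                               Function<T, T> replace, int maxAttempts, long delayMillis) {
    if (maxAttempts < 1) {
      throw new IllegalArgumentException("maxAttempts must be >= 1");
    }
    ApiException last = null;
    for (int attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return create.get();
      } catch (ApiException e) {
        int code = e.getCode();
        if (code != HttpURLConnection.HTTP_CONFLICT
            && code != HttpURLConnection.HTTP_INTERNAL_ERROR) {
          throw e; // neither a conflict nor a "please try again" answer
        }
        T existing = fromServer.get();
        if (existing != null) {
          return replace.apply(existing); // the resource exists, so replace it
        }
        if (code == HttpURLConnection.HTTP_CONFLICT) {
          throw e; // conflict reported but nothing found; not retried in this sketch
        }
        last = e;
        if (attempt < maxAttempts) {
          try {
            Thread.sleep(delayMillis);
          } catch (InterruptedException ie) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            throw e;
          }
        }
      }
    }
    throw last; // every attempt ended with HTTP 500 and no resource on the server
  }
}
```

With this shape, a caller that sees 500 twice and then succeeds gets the created object back after three create attempts, while a server that never recovers surfaces the last exception after maxAttempts tries instead of blocking the watcher thread indefinitely.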

A lesser problem: this also breaks tests that use the KubernetesCrudDispatcher, because the dispatcher does not implement the behavior that the createOrReplace method now relies on. The dispatcher just blindly adds the object to its map, so which object a subsequent request returns is effectively random.
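
To illustrate what createOrReplace now expects from a server or mock, here is a minimal sketch of a store that answers a POST for an existing path with 409 instead of adding a second entry. ConflictAwareStore and its methods are hypothetical and are not the KubernetesCrudDispatcher code; the sketch only shows the create-versus-conflict semantics under that assumption.

```java
import java.net.HttpURLConnection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical in-memory store keyed by resource path (illustration only). */
final class ConflictAwareStore {
  private final Map<String, String> byPath = new ConcurrentHashMap<>();

  /** POST: stores the body and returns 201, or returns 409 if the path is already taken. */
  int post(String path, String body) {
    return byPath.putIfAbsent(path, body) == null
        ? HttpURLConnection.HTTP_CREATED
        : HttpURLConnection.HTTP_CONFLICT;
  }

  /** PUT: replaces an existing body and returns 200, or returns 404 if nothing is stored. */
  int put(String path, String body) {
    return byPath.replace(path, body) != null
        ? HttpURLConnection.HTTP_OK
        : HttpURLConnection.HTTP_NOT_FOUND;
  }

  /** GET: returns the stored body, or null if the path is unknown. */
  String get(String path) {
    return byPath.get(path);
  }
}
```

Because each path holds at most one object, a create followed by a replace always leaves exactly one, well-defined entry to read back.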

Since the PR sent by Florian (https://github.com/fabric8io/kubernetes-client/pull/2501) is merged and available in v4.13.0, can we consider this issue closed? Or is there something missing?

I’d consider it done

Rohan Kumar notifications@github.com wrote on Tue, 17 Nov 2020 at 10:23:

Since PR sent by Florian #2501
https://github.com/fabric8io/kubernetes-client/pull/2501 is merged and
available in v4.13.0, can we consider this issue closed? Or is there
something missing?

