The fix for #2292 (rev 2652139f) breaks our solution: Kubernetes does not reliably return HTTP 409; at times it responds with HTTP 500 and a message asking the client to retry. This is confirmed with Kubernetes v1.16.7 up to v1.19.1.
2020-09-17 15:57:36.963+0200 | | ::: | .175.80.197:6443/... | WARN | .i.WatchConnectionManager | | Exec Failure
io.fabric8.kubernetes.client.KubernetesClientException: Failure executing: POST at: https://10.175.80.197:6443/apis/networking.istio.io/v1beta1/namespaces/dx-system/gateways. Message: The POST operation against Gateway.networking.istio.io could not be completed at this time, please try again.. Received status: Status(apiVersion=v1, code=500, details=StatusDetails(causes=[], group=networking.istio.io, kind=Gateway, name=POST, retryAfterSeconds=null, uid=null, additionalProperties={}), kind=Status, message=The POST operation against Gateway.networking.istio.io could not be completed at this time, please try again., metadata=ListMeta(_continue=null, remainingItemCount=null, resourceVersion=null, selfLink=null, additionalProperties={}), reason=ServerTimeout, status=Failure, additionalProperties={}).
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.requestFailure(OperationSupport.java:589)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.assertResponseCode(OperationSupport.java:528)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:492)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleResponse(OperationSupport.java:451)
    at io.fabric8.kubernetes.client.dsl.base.OperationSupport.handleCreate(OperationSupport.java:252)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.handleCreate(BaseOperation.java:841)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.create(BaseOperation.java:332)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.createOrReplace(BaseOperation.java:402)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.createOrReplace(BaseOperation.java:82)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.createOrReplace(BaseOperation.java:396)
    at io.fabric8.kubernetes.client.dsl.base.BaseOperation.createOrReplace(BaseOperation.java:82)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.EgressGatewayReconciler.mergePortsFromServiceEntry(EgressGatewayReconciler.java:156)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.EgressGatewayReconciler.lambda$onServiceEntryChanged$3(EgressGatewayReconciler.java:62)
    at java.base/java.util.Optional.ifPresentOrElse(Optional.java:201)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.EgressGatewayReconciler.findGatewayAndRunOrElse(EgressGatewayReconciler.java:102)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.EgressGatewayReconciler.onServiceEntryChanged(EgressGatewayReconciler.java:61)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.performSubordinateIstioUpdates(ServiceEntryReconciler.java:273)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.mergeToExistingServiceEntry(ServiceEntryReconciler.java:265)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.lambda$createNewOrMergeWithServiceEntry$2(ServiceEntryReconciler.java:159)
    at java.base/java.util.Optional.ifPresentOrElse(Optional.java:201)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.createNewOrMergeWithServiceEntry(ServiceEntryReconciler.java:155)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.lambda$realizeServiceEntryChanges$1(ServiceEntryReconciler.java:148)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1540)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.realizeServiceEntryChanges(ServiceEntryReconciler.java:144)
    at com.oracle.cx.verticals.dx4c.config.k8s.istio.ServiceEntryReconciler.onTICChange(ServiceEntryReconciler.java:100)
    at com.oracle.cx.verticals.dx4c.config.k8s.targetinstanceconfig.TICReconciler.onTICUpdated(TICReconciler.java:101)
    at com.oracle.cx.verticals.dx4c.config.k8s.OldBaseReconciler.lambda$onResourceAddedOrModified$5(OldBaseReconciler.java:87)
    at com.oracle.cx.verticals.dx4c.config.k8s.OldBaseReconciler.onEvent(OldBaseReconciler.java:41)
    at com.oracle.cx.verticals.dx4c.config.k8s.OldBaseReconciler.onResourceAddedOrModified(OldBaseReconciler.java:74)
    at com.oracle.cx.verticals.dx4c.config.k8s.targetinstanceconfig.TICReconciler.lambda$startWatch$1(TICReconciler.java:74)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler.handle(BaseResourceHandler.java:209)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler.handleActionWithCacheUpdate(BaseResourceHandler.java:169)
    at com.oracle.cx.verticals.dx4c.config.k8s.targetinstanceconfig.TICReconciler.lambda$startWatch$2(TICReconciler.java:68)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler$1.eventReceived(BaseResourceHandler.java:75)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler$1.eventReceived(BaseResourceHandler.java:69)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler$WrappedWatcher.eventReceived(BaseResourceHandler.java:250)
    at com.oracle.cx.verticals.dx4c.config.k8s.resources.BaseResourceHandler$WrappedWatcher.eventReceived(BaseResourceHandler.java:235)
    at io.fabric8.kubernetes.client.utils.WatcherToggle.eventReceived(WatcherToggle.java:49)
    at io.fabric8.kubernetes.client.dsl.internal.WatchConnectionManager$1.onMessage(WatchConnectionManager.java:235)
    at okhttp3.internal.ws.RealWebSocket.onReadMessage(RealWebSocket.java:323)
    at okhttp3.internal.ws.WebSocketReader.readMessageFrame(WebSocketReader.java:219)
    at okhttp3.internal.ws.WebSocketReader.processNextFrame(WebSocketReader.java:105)
    at okhttp3.internal.ws.RealWebSocket.loopReader(RealWebSocket.java:274)
    at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:214)
    at okhttp3.RealCall$AsyncCall.execute(RealCall.java:203)
    at okhttp3.internal.NamedRunnable.run(NamedRunnable.java:32)
    at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
    at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
    at java.base/java.lang.Thread.run(Thread.java:834)
We will make a local change to unblock ourselves, but IMHO this needs a fix upstream.
Quick hack for lines 397 onwards (as of 1832b75d), with Thread.sleep and no timeout:
    final CompletableFuture<T> future = new CompletableFuture<>();
    // NOTE: no timeout. The loop only ends once a create or replace succeeds or throws.
    while (!future.isDone()) {
      try {
        // Create
        KubernetesResourceUtil.setResourceVersion(itemToCreateOrReplace, null);
        future.complete(create(itemToCreateOrReplace));
      } catch (KubernetesClientException exception) {
        final T itemFromServer;
        if (exception.getCode() == HttpURLConnection.HTTP_INTERNAL_ERROR) {
          // 500 "please try again": check whether the item exists before deciding what to do
          itemFromServer = fromServer().get();
          if (itemFromServer == null) {
            // Not on the server yet; pause briefly and retry the create
            try {
              Thread.sleep(200);
            } catch (InterruptedException e) {
              Thread.currentThread().interrupt();
            }
            continue;
          }
        } else if (exception.getCode() != HttpURLConnection.HTTP_CONFLICT) {
          // Neither 500 nor 409: not handled here
          throw exception;
        } else {
          itemFromServer = fromServer().get();
        }
        // Conflict; Do Replace
        KubernetesResourceUtil.setResourceVersion(itemToCreateOrReplace, KubernetesResourceUtil.getResourceVersion(itemFromServer));
        future.complete(replace(itemToCreateOrReplace));
      }
    }
    return future.join();
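Since the hack above has no upper bound on retries, here is a minimal sketch of the same loop with a deadline. It is written against plain Java functional interfaces rather than the fabric8 API, so everything in it is an assumption for illustration: `RetrySketch`, `ApiException` (a stand-in for `KubernetesClientException`), the `create`/`fromServer`/`replace` callbacks, and the `timeout` and `pause` parameters are all hypothetical names. Unlike the hack, it also retries on a 409 when the item cannot be read back yet.

```java
import java.net.HttpURLConnection;
import java.time.Duration;
import java.util.function.Supplier;
import java.util.function.UnaryOperator;

final class RetrySketch {

    /** Hypothetical stand-in for KubernetesClientException; it only carries the HTTP status code. */
    static final class ApiException extends RuntimeException {
        final int code;

        ApiException(int code) {
            super("HTTP " + code);
            this.code = code;
        }
    }

    /**
     * Tries create; on 409 or 500 looks the item up and replaces it if it exists,
     * otherwise pauses and retries until the deadline passes.
     */
    static <T> T createOrReplace(Supplier<T> create,
                                 Supplier<T> fromServer,
                                 UnaryOperator<T> replace,
                                 Duration timeout,
                                 Duration pause) {
        final long deadline = System.nanoTime() + timeout.toNanos();
        while (true) {
            try {
                return create.get();
            } catch (ApiException e) {
                // Anything other than 409 or 500 is not handled here.
                if (e.code != HttpURLConnection.HTTP_CONFLICT
                        && e.code != HttpURLConnection.HTTP_INTERNAL_ERROR) {
                    throw e;
                }
                T existing = fromServer.get();
                if (existing != null) {
                    // The item is on the server, so replace it.
                    return replace.apply(existing);
                }
                // Give up once the deadline has passed and surface the last error.
                if (System.nanoTime() - deadline >= 0) {
                    throw e;
                }
                try {
                    Thread.sleep(pause.toMillis());
                } catch (InterruptedException interrupted) {
                    // Restore the flag and stop retrying instead of spinning.
                    Thread.currentThread().interrupt();
                    throw e;
                }
            }
        }
    }
}
```

Throwing on interrupt avoids a busy loop: once the interrupt flag is set, every further `Thread.sleep` call would fail immediately.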
A lesser problem: this also breaks tests that use the KubernetesCrudDispatcher, because the dispatcher does not implement the behavior that the createOrReplace method now requires. It just blindly adds the object to its map, so which copy a subsequent request returns is random.
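To illustrate the behavior createOrReplace relies on, a test double could reject a create for an existing key with 409 instead of storing a second copy. The sketch below is not the KubernetesCrudDispatcher implementation; `InMemoryStore` and its `create`, `replace` and `get` methods are hypothetical, and resources are reduced to strings keyed by path.

```java
import java.net.HttpURLConnection;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical map-backed store that answers with HTTP status codes like an API server would. */
final class InMemoryStore {
    private final Map<String, String> items = new ConcurrentHashMap<>();

    /** Stores the body only if the path is free; a second create for the same path yields 409. */
    int create(String path, String body) {
        return items.putIfAbsent(path, body) == null
                ? HttpURLConnection.HTTP_CREATED
                : HttpURLConnection.HTTP_CONFLICT;
    }

    /** Overwrites an existing entry; replacing a missing path yields 404. */
    int replace(String path, String body) {
        return items.replace(path, body) != null
                ? HttpURLConnection.HTTP_OK
                : HttpURLConnection.HTTP_NOT_FOUND;
    }

    /** Returns the stored body, or null if the path is unknown. */
    String get(String path) {
        return items.get(path);
    }
}
```

With one entry per path, a later read is deterministic: it returns the first created body until a replace succeeds.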
Since the PR sent by Florian, https://github.com/fabric8io/kubernetes-client/pull/2501, is merged and available in v4.13.0, can we consider this issue closed? Or is there something missing?
I’d consider it done