Origin: Unable to access the haproxy stats web interface

Created on 24 Oct 2017 · 8 comments · Source: openshift/origin

I'm unable to access the haproxy stats web interface at any of these URLs:

http://192.168.42.208:1936/metrics
http://192.168.42.208:1936/haproxy?stats
http://192.168.42.208:1936

The error is the same for all of them:
Forbidden

Putting HTTP basic auth credentials in the URL doesn't work either:
http://admin:[email protected]:1936

Chrome also warns me that I'm sending credentials even though the web server did not request them.

However, if I use curl instead of a browser, it works:

$ curl -u admin:SECRET 192.168.42.208:1936/metrics|head
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0# HELP etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds Bucketed histogram of db compaction pause duration.
# TYPE etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds histogram
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="1"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="2"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="4"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="8"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="16"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="32"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="64"} 0
etcd_debugging_mvcc_db_compaction_pause_duration_milliseconds_bucket{le="128"} 0
Version
$ oc version
oc v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7
features: Basic-Auth GSSAPI Kerberos SPNEGO

Server https://192.168.42.208:8443
openshift v3.6.0+c4dd4cf
kubernetes v1.6.1+5115d708d7

docker@minishift:~$ docker images|grep openshift
openshift/origin-sti-builder            v3.6.0              45f63c22354a        11 weeks ago        974.3 MB
openshift/origin-deployer               v3.6.0              ad03ec44312c        11 weeks ago        974.3 MB
openshift/origin-docker-registry        v3.6.0              ec456625b2a0        11 weeks ago        1.062 GB
openshift/origin-haproxy-router         v3.6.0              75e805233369        11 weeks ago        995.3 MB
openshift/origin                        v3.6.0              25e1f260f8ae        11 weeks ago        974.3 MB
openshift/origin-pod                    v3.6.0              fb52c4c8f037        11 weeks ago        213.4 MB
Steps To Reproduce
  1. Start minishift; it deploys a standalone router. Additional info below.

Current Result

"Forbidden", unable to access

Expected Result

Expected to reach the standard haproxy stats page.

Additional Information

You'll find below some information on the deploymentconfig:

$ oc describe dc router
Name:       router
Namespace:  default
Created:    8 hours ago
Labels:     router=router
Annotations:    <none>
Latest Version: 11
Selector:   router=router
Replicas:   1
Triggers:   Config
Strategy:   Rolling
Template:
Pod Template:
  Labels:       router=router
  Service Account:  router
  Containers:
   router:
    Image:  openshift/origin-haproxy-router:v3.6.0
    Ports:  80/TCP, 443/TCP, 1936/TCP
    Requests:
      cpu:  100m
      memory:   256Mi
    Liveness:   http-get http://localhost:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Readiness:  http-get http://localhost:1936/healthz delay=10s timeout=1s period=10s #success=1 #failure=3
    Environment:
      DEFAULT_CERTIFICATE_DIR:          /etc/pki/tls/private
      DEFAULT_CERTIFICATE_PATH:         /etc/pki/tls/private/tls.crt
      ROUTER_CIPHERS:               
      ROUTER_EXTERNAL_HOST_HOSTNAME:        
      ROUTER_EXTERNAL_HOST_HTTPS_VSERVER:   
      ROUTER_EXTERNAL_HOST_HTTP_VSERVER:    
      ROUTER_EXTERNAL_HOST_INSECURE:        false
      ROUTER_EXTERNAL_HOST_INTERNAL_ADDRESS:    
      ROUTER_EXTERNAL_HOST_PARTITION_PATH:  
      ROUTER_EXTERNAL_HOST_PASSWORD:        
      ROUTER_EXTERNAL_HOST_PRIVKEY:     /etc/secret-volume/router.pem
      ROUTER_EXTERNAL_HOST_USERNAME:        
      ROUTER_EXTERNAL_HOST_VXLAN_GW_CIDR:   
      ROUTER_LISTEN_ADDR:           0.0.0.0:1936
      ROUTER_METRICS_TYPE:          haproxy
      ROUTER_SERVICE_HTTPS_PORT:        443
      ROUTER_SERVICE_HTTP_PORT:         80
      ROUTER_SERVICE_NAME:          router
      ROUTER_SERVICE_NAMESPACE:         default
      ROUTER_SUBDOMAIN:             
      STATS_PASSWORD:               ***********
      STATS_PORT:               1936
      STATS_USERNAME:               admin
    Mounts:
      /etc/pki/tls/private from server-certificate (ro)
  Volumes:
   server-certificate:
    Type:   Secret (a volume populated by a Secret)
    SecretName: router-certs
    Optional:   false

Deployment #11 (latest):
    Name:       router-11
    Created:    7 hours ago
    Status:     Complete
    Replicas:   1 current / 1 desired
    Selector:   deployment=router-11,deploymentconfig=router,router=router
    Labels:     openshift.io/deployment-config.name=router,router=router
    Pods Status:    1 Running / 0 Waiting / 0 Succeeded / 0 Failed
Deployment #10:
    Created:    8 hours ago
    Status:     Complete
    Replicas:   0 current / 0 desired
Deployment #9:
    Created:    8 hours ago
    Status:     Complete
    Replicas:   0 current / 0 desired

Events: <none>

Labels: component/routing, kind/question, priority/P3

All 8 comments

@alezzandro @knobunc I was having the same issue and was able to get HAProxy stats by removing the ROUTER_LISTEN_ADDR and ROUTER_METRICS_TYPE environment variables from the DeploymentConfig for the router. These were the two environment variables that existed in a fresh 3.6 install and did not exist in my cluster that I upgraded to 3.6 from 1.5.
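For anyone applying this workaround from the CLI, something like the following should do it. This is a sketch assuming the router DeploymentConfig is named `router` in the `default` namespace, as in this report; the trailing `-` tells `oc set env` to unset a variable:

```shell
# Unset the two variables on the router DC (a trailing "-" removes a variable).
# This triggers a new rollout of the router pod.
oc set env dc/router -n default ROUTER_LISTEN_ADDR- ROUTER_METRICS_TYPE-

# Confirm the remaining environment on the DC:
oc set env dc/router -n default --list
```

These commands need a live cluster and an authenticated `oc` session, so verify against your own environment.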

I can confirm what @kincl said! The issue was solved by removing the two env variables.

@knobunc any idea why we got this "regression"?

Thanks

We also tried the same solution proposed by @kincl and it solved the issue (openshift v3.6.173).
I'd be interested to understand the reason for this kind of regression. Any ideas?

Thanks

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

Removing the ROUTER_METRICS_TYPE env var from the deployment config was enough for me.

having the same issue in 3.7. The workaround works.
/remove-lifecycle stale

We changed the metrics so that they report in Prometheus format, so that we can integrate with the overall cluster metrics.

However, we did not regress the behavior: a config that worked before the change will continue to produce the old metrics from haproxy directly. As @ctron noted, you just need to remove ROUTER_METRICS_TYPE from the environment variables to get the old behavior.

For any 3.11 users (like myself), doing this leads to a pod startup failure with the error "Readiness probe failed: HTTP probe failed with statuscode: 401".

In that case, you also need to point the readiness check back at the old health endpoint:

oc patch dc router -p '{"spec": {"template": {"spec": {"containers": [{"name": "router","readinessProbe": {"httpGet": {"path": "/healthz"}}}]}}}}'
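As a quick sanity check after the patch, the following sketch waits for the rollout and probes the stats endpoint with basic auth. The address and credentials are reused from the original report and are placeholders for your own values:

```shell
# Wait for the patched router pod to become ready.
oc rollout status dc/router -n default

# Probe the stats page; a working setup should return 200 rather than 401/403.
curl -u admin:SECRET -o /dev/null -w '%{http_code}\n' \
  http://192.168.42.208:1936/haproxy?stats
```

Both commands require a live cluster, so treat the exact output as environment-dependent.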
