Version of your agent? 2.144.0/2.144.1/...
Agent name: 'Hosted Agent'
Agent machine name: 'fv-az712'
Current agent version: '2.164.8'
Current image version: '20200211.1'
Agent running as: 'vsts'
Prepare build directory.
Set build variables.
Download all required tasks.
dev.azure.com
https://dev.azure.com/manpremo
Please include error messages and screenshots.
A few days ago a previously working pipeline stopped working. My job uses a postgres service container, and the service name is no longer resolved at the network level.
I have created a simple pipeline that tries to connect to postgres, and I get this error:
psql: could not translate host name "postgres" to address: Name or service not known
resources:
  containers:
  - container: postgres
    image: postgres:latest
  - container: u18
    image: ubuntu:18.04
    options: '-v /usr/bin/sudo:/usr/bin/sudo -v /usr/lib/sudo/libsudo_util.so.0:/usr/lib/sudo/libsudo_util.so.0 -v /usr/lib/sudo/sudoers.so:/usr/lib/sudo/sudoers.so -v /etc/sudoers:/etc/sudoers'
stages:
- stage: xxx
  jobs:
  - job: yyy
    container: u18
    services:
      postgres: postgres
    steps:
    - script: |
        sudo apt update
        sudo apt install -y postgresql-client
        psql --host=postgres --username=postgres --command="SELECT 1;"
Hello, Can anybody help me here?
@gpad - I will dig in. Was this working previously? If you do not run the job on the host itself instead of in a container can you resolve the hostname 'postgres'?
It worked like a charm for months and then stopped working. What do you mean by “not run the job on the host itself”?
(Email reply to Tommy Petty's comment of Tue 3 Mar 2020, 21:04.)
@gpad - Sorry, I did not word that properly. What I was trying to ask is: if the job runs on the host and not inside a container, are you able to resolve the hostname 'postgres'? I am not aware of any change in the agent itself around how containers are set up that would cause a change in behavior, and I am wondering if the change was in Docker itself not making the host alias visible inside another container. Is this on a hosted or private agent?
@gpad - I was able to set up a repro. Let me dig in and find out what is going on.
Ok, I thought I had a repro, but I think I was doing something incorrectly.
I have this pipeline and it works fine for me. (I had to use node:10.17 instead of Ubuntu because I am on a Mac and needed an image with node pre-installed to work in the pipeline.)
resources:
  containers:
  - container: u18
    image: node:10.17
  - container: nginx
    image: nginx:1.17.6-alpine
    ports:
    - 80
stages:
- stage: xxx
  jobs:
  - job: yyy
    container: u18
    services:
      mynginx: nginx
    steps:
    - script: |
        curl mynginx:80
Is it possible that it's related to Postgres? I attached an example to the issue where Postgres doesn't work.
Can you substitute Postgres for nginx and try to ping it, just to see if the name resolves?
(Email reply to Tommy Petty's comment of Wed 4 Mar 2020, 22:11.)
Ok, I tried this:
resources:
  containers:
  - container: u18
    image: node:10.17
  - container: nginx
    image: nginx:1.17.6-alpine
    ports:
    - 80
  - container: postgres
    image: postgres:latest
    ports:
    - 5432
stages:
- stage: xxx
  jobs:
  - job: yyy
    container: u18
    services:
      mynginx2: nginx
      mypostgres: postgres
    steps:
    - script: |
        curl mynginx2:80
        ping -c 1 mynginx2
        ping -c 1 mypostgres
I got this result:
64 bytes from nginx_nginx1176alpine_b7b37e.vsts_network_a51b35fbbc9c4ffc96131039c7b1adef (172.31.0.3): icmp_seq=1 ttl=64 time=0.104 ms
--- mynginx2 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.104/0.104/0.104/0.000 ms
ping: mypostgres: Temporary failure in name resolution
OK, so it only fails with postgres?!
(Email reply to Tommy Petty's comment of Wed 4 Mar 2020, 22:35.)
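The ping output above mixes two questions: does the alias resolve at all, and is the host reachable. A minimal sketch for checking only the first one, assuming Python 3 is available in the job container. The function name resolve and the script name in the comment are hypothetical; the aliases are the ones used in the pipeline above.

```python
import socket
import sys


def resolve(hostname):
    """Return the sorted unique IP addresses for hostname, or [] if it does not resolve."""
    try:
        infos = socket.getaddrinfo(hostname, None)
    except socket.gaierror:
        # Same failure class as "Temporary failure in name resolution" from ping.
        return []
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})


if __name__ == "__main__":
    # Hypothetical usage: python3 resolve_check.py mynginx2 mypostgres
    for name in sys.argv[1:]:
        addresses = resolve(name)
        print(f"{name}: {', '.join(addresses) if addresses else 'does not resolve'}")
```

An empty result for mypostgres alongside an address for mynginx2 would point at the alias never being registered on the network, as opposed to Postgres not listening yet.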
I get the same issue here. It doesn't seem to matter whether I try to connect from the host or from another container: the alias doesn't get recognised. I ended up having to specify the IP address directly, which used to show in the pipeline output, but that's not a good solution.
@gpad - as a slight aside, ours stopped working on a recent image update and we had to specify the postgres password as an environment variable on the container:
- container: postgres
  image: postgres:latest
  ports:
  - 5432:5432
  env:
    POSTGRES_PASSWORD: 'postgres'
as per https://hub.docker.com/_/postgres
@jtpetty - any ideas why the alias doesn't work for the Postgres container?
@rnphilp - in my local dev environment, I had docker dump out all the info about the container and I can see the network alias is set properly (according to docker), so I do not think this is a bug with the agent. It is possible this is a bug with docker, but that would not explain why it works for nginx but not Postgres in the same pipeline.
Is it possible for you to use an older tag of the Postgres container (something not 'latest') that would point to a previously working version? That would tell us if the issue is related to something specific to the Postgres container.
Ours points at postgres:11. However, I've tried going back to the 11.5 release. Unlike with the latest build of 11, the IP now displays in the "Initialize containers" step again, under the command:
/usr/bin/docker port e477790f988520840af51be6841ce6a7e3d3239713ebc187d3562ec3177f5038
However, the alias still isn't recognised.
I also tried postgres:10.0 but got similar results.
I have also tried an old tag and it doesn't work.
I have used this configuration since last September and it worked until one month ago.
It stopped working without any apparent reason. I think it's related to something that changed inside the Azure platform, because nothing else changed.
(Email reply to Rob Philp's comment of Fri 6 Mar 2020, 18:12.)
I'm running a docker-compose inside a pipeline with a postgres image and have been suffering from a similar issue. Switching from postgres:11.7-alpine to postgres:11.5-alpine appears to have solved the issue, at least temporarily.
I was facing the same issue. For me, using the bitnami/postgresql image worked, but only if I started it from a bash step, waited a few seconds, and then did the rest of the work. I am using something like this as a workaround:
steps:
- bash: |
    docker stop test-db
    docker rm test-db
    docker run --name test-db -p 5432:5432 -d -e POSTGRES_PASSWORD=postgres -e POSTGRES_DB=some_db bitnami/postgresql@sha256:d9a5484c3fecd47525bd1a0775327497090295f5a31bc27b7066fac3b9cb8729
    sleep 10
- bash: |
    tests that need the postgres db
- bash: |
    docker stop test-db
    docker rm test-db
Not really a solution, but a workaround
@mjroghelia - Assigning to you for awareness. I think we have come to the conclusion that the issue lies with the Postgres container itself, but I think we should leave this issue open a little longer to collect more information.
This issue has had no activity in 180 days. Please comment if it is not actually stale
Wanted to let people who find this know that this seems to work now. The hostname was properly assigned to a postgis container (postgres + spatial types); however, we ran into a dependency issue, #3130.
I ran into the issue as well. I was able to get getent hosts pg12 working with this container definition:
- container: pg12
  image: postgres:12-alpine
  env:
    POSTGRES_DB: "xxxx"
    POSTGRES_USER: "xxxx"
    POSTGRES_PASSWORD: "xxxxx"
    PGDATA: "/data/postgres"
  options: --name pg12
My unit tests do not find it, though. I will investigate further.
It could be that the container is crashing immediately.
Anyhow, this is how I got it working:
trigger: none
resources:
  containers:
  - container: python38
    image: "python:3.8"
  - container: pg12
    image: "postgres:12"
    env:
      POSTGRES_USER: <snip>
      POSTGRES_PASSWORD: <snip>
      PGDATA: "/data/postgres"
    ports:
    - 5432
jobs:
- job: "Run_Test_Suite"
  pool:
    vmImage: 'ubuntu-20.04'
  services:
    pg12: pg12
  steps:
  - script: |
      printenv
    displayName: "print environment variables"
    target:
      container: python38
  # workaround for pg12 container not being visible when running tests. stop and start again
  - task: Docker@2
    inputs:
      command: 'stop'
      container: 'pg12'
    displayName: "stop pg12 container"
  - task: Docker@2
    inputs:
      command: 'start'
      container: 'pg12'
    displayName: "start pg12 container again"
  - script: |
      getent hosts pg12
    displayName: "show ips of docker service containers"
    target:
      container: python38
  # this is here just for debugging purposes if the tests fail because of name resolution errors
  - script: |
      docker logs pg12
    displayName: "fetch pg12 logs before test"
  - script: |
      <snip>
    displayName: "running tests"
    target:
      container: python38
This issue still persists with postgres:13. I would say it is more a problem of readiness: the Docker container is started only right before the first task (?) is executed and is not yet ready. Why does Azure DevOps not make sure that the host is registered in the network and listening on the exposed port before continuing?
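If readiness is the cause, one option is to poll from the first step of the job until the service accepts connections, instead of sleeping for a fixed time. A minimal sketch, assuming Python 3 is available in the job container. wait_for_port is a hypothetical helper name, not part of Azure Pipelines or Docker, and the timeout values are arbitrary.

```python
import socket
import sys
import time


def wait_for_port(host, port, timeout=30.0, interval=1.0):
    """Poll until a TCP connection to host:port succeeds.

    Returns True on success, or False once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Covers both "name does not resolve" (socket.gaierror) and
            # "connection refused"; either way the service is not ready yet.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Hypothetical usage: python3 wait_for_port.py pg12 5432
    if len(sys.argv) == 3:
        ready = wait_for_port(sys.argv[1], int(sys.argv[2]))
        sys.exit(0 if ready else 1)
```

An open TCP port only shows that something is listening. PostgreSQL's own pg_isready utility, where the client tools are installed, is a more specific check that the server is accepting connections.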