Temurin-build: make-adopt-build-farm API query fail

Created on 12 Jun 2020 · 11 Comments · Source: adoptium/temurin-build

Platform:
https://ci.adoptopenjdk.net/job/build-scripts/job/jobs/job/jdk/job/jdk-windows-x64-openj9/24/console

I'm hoping this is just a temporary glitch. If it isn't, this issue should cover both fixing the failure and printing a more helpful error. Otherwise, just the more helpful error is fine.

13:26:13  + ./build-farm/make-adopt-build-farm.sh
13:26:13  This appears to be JDK Head. Querying the Adopt API to get the JDK HEAD Number (https://api.adoptopenjdk.net/v3/info/available_releases)...
13:26:13  Failed to query or parse the adopt api
bug

All 11 comments

Manually curling that URL works for me.

The strange part is that curl didn't even run. The expected output should be:

13:26:13  + ./build-farm/make-adopt-build-farm.sh
13:26:13  This appears to be JDK Head. Querying the Adopt API to get the JDK HEAD Number (https://api.adoptopenjdk.net/v3/info/available_releases)...
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100   308  100   308    0     0    376      0 --:--:-- --:--:-- --:--:--   376
13:26:13  Failed to query or parse the adopt api

The curl progress numbers are missing from the failing log.

I've seen a few pipelines run without this problem over the weekend. I'm happy to close this under the assumption that it was a glitch, and we now have a more useful error message if it reoccurs.

Reoccurred on https://ci.adoptopenjdk.net/view/Failing%20Builds/job/build-scripts/job/jobs/job/jdk/job/jdk-linux-s390x-hotspot/153/console

14:30:59  This appears to be JDK Head. Querying the Adopt API to get the JDK HEAD Number (https://api.adoptopenjdk.net/v3/info/available_releases)...
14:30:59    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
14:30:59                                   Dload  Upload   Total   Spent    Left  Speed
14:31:00  
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0
100  2909    0  2909    0     0   7571      0 --:--:-- --:--:-- --:--:--  7575
14:31:00  Failed to query or parse the adopt api. Dumping headers via curl -v https://api.adoptopenjdk.net/v3/info/available_releases...
14:31:00    % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
14:31:00                                   Dload  Upload   Total   Spent    Left  Speed
14:31:00  
  0     0    0     0    0     0      0      0 --:--:-- --:--:-- --:--:--     0* About to connect() to api.adoptopenjdk.net port 443 (#0)
14:31:00  *   Trying 104.17.159.60...
14:31:00  * Connected to api.adoptopenjdk.net (104.17.159.60) port 443 (#0)
14:31:00  * Initializing NSS with certpath: sql:/etc/pki/nssdb
14:31:00  *   CAfile: /etc/pki/tls/certs/ca-bundle.crt
14:31:00    CApath: none
14:31:00  * SSL connection using TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256
14:31:00  * Server certificate:
14:31:00  *     subject: CN=adoptopenjdk.net,O="Cloudflare, Inc.",L=San Francisco,ST=CA,C=US
14:31:00  *     start date: Mar 19 00:00:00 2020 GMT
14:31:00  *     expire date: Oct 09 12:00:00 2020 GMT
14:31:00  *     common name: adoptopenjdk.net
14:31:00  *     issuer: CN=CloudFlare Inc ECC CA-2,O="CloudFlare, Inc.",L=San Francisco,ST=CA,C=US
14:31:00  > GET /v3/info/available_releases HTTP/1.1
14:31:00  > User-Agent: curl/7.29.0
14:31:00  > Host: api.adoptopenjdk.net
14:31:00  > Accept: */*
14:31:00  > 
14:31:00  < HTTP/1.1 200 OK
14:31:00  < Date: Mon, 15 Jun 2020 13:31:00 GMT
14:31:00  < Content-Type: application/json
14:31:00  < Content-Length: 308
14:31:00  < Connection: keep-alive
14:31:00  < Set-Cookie: __cfduid=d5e9223cea9bcb827960882d376ba23881592227860; expires=Wed, 15-Jul-20 13:31:00 GMT; path=/; domain=.adoptopenjdk.net; HttpOnly; SameSite=Lax
14:31:00  < Access-Control-Allow-Origin: *
14:31:00  < Set-Cookie: b7b892882bae631693e1ea44963ef628=064339b8dcebf139e60c125241755208; path=/; HttpOnly; Secure
14:31:00  < Cache-control: private
14:31:00  < CF-Cache-Status: DYNAMIC
14:31:00  < cf-request-id: 0359c626550000c5fc84352200000001
14:31:00  < Expect-CT: max-age=604800, report-uri="https://report-uri.cloudflare.com/cdn-cgi/beacon/expect-ct"
14:31:00  < Server: cloudflare
14:31:00  < CF-RAY: 5a3ca61d593ac5fc-EWR
14:31:00  < alt-svc: h3-27=":443"; ma=86400
14:31:00  < 
14:31:00  { [data not shown]
14:31:00  
100   308  100   308    0     0    596      0 --:--:-- --:--:-- --:--:--   596
100   308  100   308    0     0    596      0 --:--:-- --:--:-- --:--:--   596
14:31:00  * Connection #0 to host api.adoptopenjdk.net left intact
14:31:00  
14:31:00  {
14:31:00      "available_lts_releases": [
14:31:00          8,
14:31:00          11
14:31:00      ],
14:31:00      "available_releases": [
14:31:00          8,
14:31:00          9,
14:31:00          10,
14:31:00          11,
14:31:00          12,
14:31:00          13,
14:31:00          14
14:31:00      ],
14:31:00      "most_recent_feature_release": 14,
14:31:00      "most_recent_feature_version": 15,
14:31:00      "most_recent_lts": 11,
14:31:00      "tip_version": 16
14:31:00  }
[Pipeline] }
[Pipeline] // withEnv
[Pipeline] }
[Pipeline] // stage
[Pipeline] }
[Pipeline] // node
[Pipeline] }
[Pipeline] // stage
[Pipeline] echo
14:31:00  Execution error: script returned exit code 1

I wonder if we're being rate-limited by the API, as this intermittent problem is becoming more frequent. @johnoliver Are there any rate-limit protections enabled on the API?

The API was down a few hours ago, so I suspect this was just bad luck again.

> The API was down a few hours ago, so I suspect this was just bad luck again.

I don't think so. I can view the API in a browser and locally using curl. This build was also only executed 10 minutes ago.

Hang on though, that call is returning a 200 OK, so is it the step after that's failing?

The step after is here https://github.com/AdoptOpenJDK/openjdk-build/blob/master/build-farm/make-adopt-build-farm.sh#L31-L42
The logic looks OK to me, but you may spot something I haven't.
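For readers without the linked script to hand, the parsing step can be sketched roughly as below. This is a minimal illustration and not the code from make-adopt-build-farm.sh: the function name `parse_tip_version` is hypothetical, and the choice of `sed` over a JSON tool such as `jq` is an assumption made so the sketch has no extra dependencies. It reads the API response on stdin and fails loudly when `tip_version` cannot be found.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not taken from the real build script): extract
# "tip_version" from the available_releases JSON supplied on stdin.
# Prints the number on success; on failure prints a diagnostic to stderr
# and returns 1.
parse_tip_version() {
  local body version
  body="$(cat)"
  # Flatten the JSON to one line, then capture the digits after "tip_version":
  version="$(printf '%s' "$body" | tr -d '\n' \
    | sed -n 's/.*"tip_version"[[:space:]]*:[[:space:]]*\([0-9][0-9]*\).*/\1/p')"
  if [ -z "$version" ]; then
    echo "Could not find tip_version in a ${#body}-character response" >&2
    return 1
  fi
  printf '%s\n' "$version"
}
```

Reporting the response length in the error message is deliberate: in the failing log the transfer was 2909 bytes rather than the expected 308, and a size mismatch like that is a quick hint that the body was not the JSON document the script expected.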

@karianna that log is from a new pipeline that I kicked off today. All platforms were kicked off together, and not all of them failed. My PR should help us diagnose exactly what went wrong with the call: it looks to me as though the amount of data received by the API operation (2909 bytes) exceeded what I would expect the API to return (308 bytes, based on running it just now).

My PR above should hopefully show us which status code the API call that actually retrieved the data came back with ...
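One way to surface the status code and body size of the original call, rather than of a second diagnostic call, is sketched below. It is an assumption-laden illustration and not the contents of the PR: `check_api_response` is a hypothetical function name, and the check for the `tip_version` key is just one possible sanity test. The only curl features relied on are `-o` to save the body and `-w '%{http_code}'` to print the status code.

```shell
#!/usr/bin/env bash
# Hypothetical check (not the project's implementation): report the HTTP
# status and body size of a saved response, and fail unless the status is
# 200 and the body contains the expected "tip_version" key.
check_api_response() {
  local status="$1" body_file="$2" size
  size="$(wc -c < "$body_file" | tr -d ' ')"
  echo "HTTP status: ${status}, body size: ${size} bytes"
  if [ "$status" != "200" ]; then
    echo "Unexpected HTTP status from the API" >&2
    return 1
  fi
  if ! grep -q '"tip_version"' "$body_file"; then
    echo "Body does not contain tip_version; first 200 bytes follow:" >&2
    head -c 200 "$body_file" >&2
    return 1
  fi
}

# Usage sketch (makes a real network call, so it is left commented out):
#   body_file="$(mktemp)"
#   status="$(curl -sS -o "$body_file" -w '%{http_code}' \
#     https://api.adoptopenjdk.net/v3/info/available_releases)"
#   check_api_response "$status" "$body_file"
```

Capturing the status and body from the same request matters here because, in the log above, the follow-up `curl -v` returned a healthy 200 with 308 bytes even though the first request had received 2909 bytes.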

Summary

Our bash scripts now use retryable API queries, but Groovy is proving to be an issue because the Jenkins helper is difficult to work with in this context. I have made a second attempt via PRs on both the helper side and the build script side; pending review, they can be merged tomorrow morning for testing.
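As a rough illustration of what a retryable query can look like in bash, here is a minimal sketch. It is not the code that was merged: the `retry` function, its argument order, and the fixed delay are all assumptions chosen for brevity.

```shell
#!/usr/bin/env bash
# Hypothetical retry wrapper (illustrative only).
# Usage: retry <max_attempts> <delay_seconds> <command> [args...]
# Runs the command until it succeeds or max_attempts is reached.
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "Giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt} failed; retrying in ${delay}s..." >&2
    sleep "$delay"
    attempt=$((attempt + 1))
  done
}

# Usage sketch (makes a real network call, so it is left commented out):
#   retry 3 5 curl -sSf -o releases.json \
#     https://api.adoptopenjdk.net/v3/info/available_releases
```

curl also has a built-in `--retry` option for transient errors. A shell-level wrapper like this one has the advantage that the retried command can include the parsing step, so a 200 response with an unusable body is retried as well.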

This looks to be fine now. I haven't seen API failures in at least a week. This can be reopened if that changes.
