For a "Deploy Manifest" stage in our CD pipelines, we define notifications to be sent to slack for stage start,fail, or complete. Recently, after some changes to the pipelines that did not modify notification settings, the stage start notification fires as expected, but when the stage completes, no notification is sent.
Google Cloud Platform
GKE, Kubernetes v2 Provider, Spinnaker 1.13.6
Feature Area (@spinnaker/ui-ux-team): Notifications, Pipelines
Expected behavior - all configured notifications for stage start, fail and complete fire when appropriate.
Actual behavior -
The issue appears to affect only the last stage in the pipeline, 'Deploy to Release', a Deploy Manifest stage. All notifications for the previous stages work as expected, as does the "Stage start" notification for this stage. The final stage also still notifies correctly in other pipelines built from the same base template; the only difference between the affected and unaffected pipelines is the recent addition of a deleteManifest stage and a checkPreconditions stage that execute prior to the affected stage. It is unclear whether this is a causal relationship or merely a coincidence.
Echo logs do not show a failure in sending the notification to Slack; rather, no attempt is made at all (or at least, none is logged).
To reproduce - trigger the pipeline manually or via a Pub/Sub message.
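For completeness, a rough sketch of the manual trigger body we POST to Gate for this pipeline is below. The "namespace-release" parameter key comes from the pipeline JSON further down; the user and parameter value are placeholders, and the exact trigger shape is from memory rather than copied from our setup:
{
  "type": "manual",
  "user": "someone@example.com",
  "parameters": {
    "namespace-release": "release"
  }
}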
JSON of the offending stage:
{
  "account": "gke-cluster",
  "cloudProvider": "kubernetes",
  "manifestArtifactAccount": "github-artifact-account",
  "manifestArtifactId": "service-deployment-manifest",
  "moniker": {
    "app": "service"
  },
  "name": "Deploy Release",
  "notifications": [
    {
      "address": "spinnaker",
      "level": "stage",
      "message": {
        "stage.failed": {
          "text": "Commit: somegithubdetailshere....}"
        },
        "stage.starting": {
          "text": "Github Diff: somemoregithubdetailshere}"
        }
      },
      "type": "slack",
      "when": [
        "stage.starting",
        "stage.complete",
        "stage.failed"
      ]
    }
  ],
  "requiredArtifactIds": [
    "service-image"
  ],
  "sendNotifications": true,
  "source": "artifact",
  "stage": "release",
  "target-namespace": "${parameters[\"namespace-release\"]}",
  "target-stage": "Production",
  "type": "deployManifest"
}
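One detail that stands out in the block above: "when" lists all three events, but "message" only defines custom text for "stage.failed" and "stage.starting"; there is no "stage.complete" entry. As a diagnostic (not something taken from our actual pipeline), the message map could be extended as sketched below, with placeholder text for "stage.complete", to rule out the missing custom message as a factor:
"message": {
  "stage.complete": {
    "text": "Deploy Release completed (placeholder text)"
  },
  "stage.failed": {
    "text": "Commit: somegithubdetailshere....}"
  },
  "stage.starting": {
    "text": "Github Diff: somemoregithubdetailshere}"
  }
}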
JSON of the recently added stages in the broken pipelines:
{
  "account": "gke-cluster",
  "cloudProvider": "kubernetes",
  "kinds": [
    "Job"
  ],
  "location": "${parameters[\"namespace-release\"]}",
  "manifestName": "Job ${#stage(\"Predeploy Job job-name\").outputs.artifacts.?[type == \"kubernetes/job\"][0].reference}",
  "name": "Cleanup Job job-name",
  "options": {
    "cascading": true
  },
  "refId": "service-pre-deploy-0-cleanup",
  "requisiteStageRefIds": [
    "service-pre-deploy-0"
  ],
  "type": "deleteManifest"
},
{
  "name": "Check job-name Success",
  "preconditions": [
    {
      "context": {
        "expression": "${#stage(\"Predeploy Job job-name\").status.toString() == 'SUCCEEDED'}"
      },
      "failPipeline": true,
      "type": "expression"
    }
  ],
  "refId": "service-pre-deploy-0-gate",
  "requisiteStageRefIds": [
    "service-pre-deploy-0"
  ],
  "type": "checkPreconditions"
},
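For context on how the affected stage relates to these new stages: in the broken pipelines the Deploy Manifest stage now executes downstream of them. Its refId and upstream links are not shown above, so the wiring below is only an illustrative sketch with an assumed refId ("service-deploy-release") and assumed requisiteStageRefIds:
{
  "name": "Deploy Release",
  "refId": "service-deploy-release",
  "requisiteStageRefIds": [
    "service-pre-deploy-0-cleanup",
    "service-pre-deploy-0-gate"
  ],
  "type": "deployManifest"
}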
Hi @noralutz, I have started working on this.
As discussed, please let me know whether you are still facing the issue with the latest version.
This issue hasn't been updated in 45 days, so we are tagging it as 'stale'. If you want to remove this label, comment:
@spinnakerbot remove-label stale