Airflow: Task exited with return code 1 without any warning/error message after reboot server and restart service

Created on 2 May 2020  路  2Comments  路  Source: apache/airflow

Apache Airflow version: 1.10.10

Kubernetes version (if you are using kubernetes) (use kubectl version): Not using Kubernetes or docker

Environment: CentOS Linux release 7.7.1908 (Core) Linux 3.10.0-1062.el7.x86_64

Python Version: 3.7.6

Executor: LocalExecutor

What happened:

I write a simple dag to clean airflow logs. Everything is OK when I use 'airflow test' command to test it, I also trigger it manually in WebUI which use 'airflow run' command to start my task, it is still OK.

But after I reboot my server and restart my webserver & scheduler service (in daemon mode), every time I trigger the exactly same dag, it still get scheduled like usual, but exit with code 1 immediately after start a new process to run task.

I also use 'airflow test' command again to check if there is something wrong with my code now, but everything seems OK when using 'airflow test', but exit silently when using 'airflow run', it is really weird.

Here's the task log when it's manually triggered in WebUI ( I've changed the log level to DEBUG, but still can't find anything useful), or you can read the attached log file: task error log.txt

Reading local file: /root/airflow/logs/airflow_log_cleanup/log_cleanup_worker_num_1/2020-04-29T13:51:44.071744+00:00/1.log
[2020-04-29 21:51:53,744] {base_task_runner.py:61} DEBUG - Planning to run as the user
[2020-04-29 21:51:53,750] {taskinstance.py:686} DEBUG - [2020-04-29 21:51:53,754] {taskinstance.py:686} DEBUG - [2020-04-29 21:51:53,754] {taskinstance.py:686} DEBUG - [2020-04-29 21:51:53,754] {taskinstance.py:669} INFO - Dependencies all met for [2020-04-29 21:51:53,757] {taskinstance.py:686} DEBUG - [2020-04-29 21:51:53,760] {taskinstance.py:686} DEBUG - [2020-04-29 21:51:53,766] {taskinstance.py:686} DEBUG - [2020-04-29 21:51:53,768] {taskinstance.py:686} DEBUG - [2020-04-29 21:51:53,768] {taskinstance.py:669} INFO - Dependencies all met for

[2020-04-29 21:51:53,768] {taskinstance.py:879} INFO -

[2020-04-29 21:51:53,768] {taskinstance.py:880} INFO - Starting attempt 1 of 2

[2020-04-29 21:51:53,768] {taskinstance.py:881} INFO -

[2020-04-29 21:51:53,779] {taskinstance.py:900} INFO - Executing on 2020-04-29T13:51:44.071744+00:00
[2020-04-29 21:51:53,781] {standard_task_runner.py:53} INFO - Started process 29718 to run task
[2020-04-29 21:51:53,805] {logging_mixin.py:112} INFO - [2020-04-29 21:51:53,805] {cli_action_loggers.py:68} DEBUG - Calling callbacks: []
[2020-04-29 21:51:53,818] {logging_mixin.py:112} INFO - [2020-04-29 21:51:53,817] {cli_action_loggers.py:86} DEBUG - Calling callbacks: []
[2020-04-29 21:51:58,759] {logging_mixin.py:112} INFO - [2020-04-29 21:51:58,759] {base_job.py:200} DEBUG - [heartbeat]
[2020-04-29 21:51:58,759] {logging_mixin.py:112} INFO - [2020-04-29 21:51:58,759] {local_task_job.py:124} DEBUG - Time since last heartbeat(0.01 s) < heartrate(5.0 s), sleeping for 4.98824 s
[2020-04-29 21:52:03,753] {logging_mixin.py:112} INFO - [2020-04-29 21:52:03,753] {local_task_job.py:103} INFO - Task exited with return code 1

How to reproduce it:

I really don't know how to reproduce it. because it happens suddenly, and seems like permanently??

Anything else we need to know:

I try to figure out the difference between 'airflow test' and 'airflow run', it might have something to do with process fork I guess?

What I've tried to solve this problem but all failed:

  • clear all dag/dag run/task instance info, remove all files under /root/airflow except for the config file, and restart my service

  • reboot my server again

  • uninstall airflow and install it again

bug

Most helpful comment

I finally figure out how to reproduce this bug.

When you config email in airflow.cfg and your dag contains email operator or use smtp serivce, if your smtp password contains character like "^", the first task of your dag will 100% exited with return code 1 without any error information, in my case the first task is merely a python operator.

Although I think it's my bad to mess up smtp service, there should be some reasonable hints, actually it takes me a whole week to debug this, I have to reset everything in my airflow environment and slowly change configuration to see when does this bug happens.

Hope this information is helpful

All 2 comments

Thanks for opening your first issue here! Be sure to follow the issue template!

I finally figure out how to reproduce this bug.

When you config email in airflow.cfg and your dag contains email operator or use smtp serivce, if your smtp password contains character like "^", the first task of your dag will 100% exited with return code 1 without any error information, in my case the first task is merely a python operator.

Although I think it's my bad to mess up smtp service, there should be some reasonable hints, actually it takes me a whole week to debug this, I have to reset everything in my airflow environment and slowly change configuration to see when does this bug happens.

Hope this information is helpful

Was this page helpful?
0 / 5 - 0 ratings