Hello there, everyone. :)
Apache Airflow version: 1.10.9, 1.10.10, trunk
What happened:
Password masking was added to SparkSubmitOperator (SparkSubmitHook, to be precise) in December 2019 (under AIRFLOW-6350; PR: #6917) - but it only masks passwords as long as they are in the --foo.password='value' form; i.e. it must be put in single-quotes and be joined with the argument's name via an equal sign.
What you expected to happen:
I would expect this mechanism to also cover the forms (a) with double quotes or with no quotes at all and (b) with whitespace instead of an equal sign, e.g.
--foo.password=value
--foo.password="value"
--foo.password 'value'
--foo.password value
--foo.password "value"
But I may be missing something. Is there any reason the initial version only covers the single-quoted-with-equal-sign form? The regular expression used in the masking code (1.10.9 version, trunk version) looks pretty intentional:
def _mask_cmd(self, connection_cmd):
    # Mask any password related fields in application args with key value pair
    # where key contains password (case insensitive), e.g. HivePassword='abc'
    connection_cmd_masked = re.sub(
        r"(\S*?(?:secret|password)\S*?\s*=\s*')[^']*(?=')",
        r'\1******', ' '.join(connection_cmd), flags=re.I)
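To make the gap concrete, here is a minimal standalone sketch that applies the pattern quoted above to each of the forms listed earlier. The names mask_current and FORMS are made up for this illustration and are not part of Airflow; only the regular expression itself is taken from the excerpt.

```python
import re

# Pattern copied verbatim from the _mask_cmd excerpt above.
CURRENT_PATTERN = r"(\S*?(?:secret|password)\S*?\s*=\s*')[^']*(?=')"


def mask_current(cmd):
    """Apply the quoted substitution to an already-joined command string.

    Hypothetical helper for illustration only; not an Airflow function.
    """
    return re.sub(CURRENT_PATTERN, r"\1******", cmd, flags=re.I)


# The single-quoted form that is covered, followed by the five that are not.
FORMS = [
    "--foo.password='value'",
    "--foo.password=value",
    '--foo.password="value"',
    "--foo.password 'value'",
    "--foo.password value",
    '--foo.password "value"',
]

if __name__ == "__main__":
    for form in FORMS:
        status = "masked" if mask_current(form) != form else "LEAKED"
        print(f"{status:6}  {form}")
```

Only the first form is rewritten; the other five come back unchanged, because the pattern requires both the equal sign and an opening single quote.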
How to reproduce it:
from airflow import DAG
from airflow.contrib.operators.spark_submit_operator import SparkSubmitOperator  # Airflow 1.10.9

dag = DAG(...)
SparkSubmitOperator(
    ...,
    conf={"spark.foo.password": "this_should_get_masked_but_it_doesnt"},
    dag=dag,
)
Running such a task will leak the password into Airflow logs.
Anything else we need to know:
Again, I may be missing something, e.g. something OS-specific. I'd be happy to learn something here. :)
In case all/part of the other forms I mentioned should also get the masking treatment, I have a change ready for opening a PR.
(Note there's no JIRA issue referenced in the commit messages: I cannot create issues in Airflow's Jira for some reason)
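For illustration only, one way a broader pattern could cover the forms listed earlier is sketched below. This is an assumption-laden sketch, not the change from the PR and not Airflow code: BROAD_PATTERN, _replace and mask_broad are hypothetical names, and the command string in the last line assumes the conf entry ends up on the command line as --conf key=value.

```python
import re

# Hypothetical pattern (not Airflow's): group 1 is the key plus its separator
# (an equal sign or whitespace), group 2 is a single-quoted, double-quoted
# or bare value.
BROAD_PATTERN = re.compile(
    r"(\S*?(?:secret|password)\S*?(?:\s*=\s*|\s+))"
    r"""('[^']*'|"[^"]*"|\S+)""",
    flags=re.I,
)


def _replace(match):
    """Swap the value for asterisks, keeping surrounding quotes if present."""
    value = match.group(2)
    quoted = len(value) >= 2 and value[0] in "'\"" and value[0] == value[-1]
    quote = value[0] if quoted else ""
    return f"{match.group(1)}{quote}******{quote}"


def mask_broad(cmd):
    """Mask password-like values in an already-joined command string."""
    return BROAD_PATTERN.sub(_replace, cmd)


if __name__ == "__main__":
    print(mask_broad("--foo.password=value"))
    print(mask_broad('--foo.password "value"'))
    print(mask_broad("--conf spark.foo.password=this_should_get_masked_but_it_doesnt"))
```

Allowing whitespace as a separator has a cost: a valueless flag whose name contains "password" or "secret" would cause the next argument to be masked. That may be an acceptable trade-off for log output, but it is worth deciding deliberately.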
Thanks for opening your first issue here! Be sure to follow the issue template!
@Unit03 pls raise the PR. My original PR that you mentioned was very crude (as you noticed!), but better than nothing :) It was done in that format because all my DAGs use that one specific format for sending conf.
All right, then, PR opened. :)
can this be closed?
Looks like!
FYI @Unit03. You can put Closes #ISSUE in the commit message and it will close related issue at merge :).