A serious vulnerability, CVE-2022-40954, has been discovered in the Spark Provider for Apache Airflow, a popular open-source platform for programmatically authoring, scheduling, and monitoring workflows. The flaw is an improper neutralization of special elements used in an OS command (also known as 'OS Command Injection'). This blog post analyzes the vulnerability, its potential impact, and the steps needed to mitigate it.

Background

Apache Airflow is widely used to manage complex workflows and data processing pipelines. One of its components, the Spark Provider, lets users run Apache Spark jobs from within their Airflow workflows. Because the provider does not properly neutralize certain special elements, an attacker can read arbitrary files in the task execution context without having write access to DAG files.

Exploit Details

The core issue lies in how the Spark Provider handles user-provided arguments when it constructs the Spark command. Because special elements in those arguments are not properly neutralized, an attacker can inject arbitrary OS commands that run in the task execution context.
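To make the mechanism concrete, here is a minimal, generic sketch of this class of bug. It is not the Spark Provider's actual code: `build_command_unsafe` and `build_command_safe` are hypothetical helpers, and the `spark-submit` flag and `app.py` are used only as a familiar illustration. Nothing is executed; the sketch only shows what a shell would be handed in each case.

```python
import shlex
from typing import List


def build_command_unsafe(executor_cores: str) -> str:
    """Hypothetical helper: interpolates user input into a shell string."""
    return f"spark-submit --executor-cores {executor_cores} app.py"


def build_command_safe(executor_cores: str) -> List[str]:
    """Hypothetical helper: keeps user input as one argv element."""
    return ["spark-submit", "--executor-cores", executor_cores, "app.py"]


# Attacker-controlled value: ';' ends the first command, '#' comments out the rest.
malicious = "2; cat /etc/passwd #"

unsafe = build_command_unsafe(malicious)
# If this string were passed to a shell, `cat /etc/passwd` would run
# as a second, separate command.
print(unsafe)

safe = build_command_safe(malicious)
# As an argument list (no shell involved), the payload stays a single,
# inert argument; shlex.join shows the equivalent quoted form.
print(shlex.join(safe))
```

The design point is that the safe variant never asks a shell to re-parse user input, so metacharacters such as `;` and `#` lose their special meaning.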

The following snippet shows how the issue can surface in a DAG that uses the Spark Provider:

# Assume we are using Apache Airflow with Spark Provider version >=1.. and <4..
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

spark_task = SparkSubmitOperator(
    task_id="spark_task",
    conn_id="spark_default",
    application="",
    # SparkSubmitOperator's parameter for extra files is named `files`
    files="",
    # Value taken from DAG params, which an attacker may be able to influence
    executor_cores="{{ params.executor_cores }}",
    # ... remaining arguments omitted
)

# The executor_cores argument is vulnerable to OS Command Injection

If an attacker can control the value of params.executor_cores, they could execute arbitrary OS commands in the context of the task. This is a serious security issue, as it can lead to unauthorized access to sensitive data or to remote code execution.
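As a defense-in-depth measure, values like this can be validated before they ever reach an operator. The sketch below is an assumption on my part rather than anything shipped with Airflow: `validate_executor_cores` is a hypothetical helper that accepts only a plain positive integer and rejects everything else. It does not replace the upgrade described below.

```python
import re

# Only ASCII digits are allowed: no spaces, separators, or shell metacharacters.
_DIGITS_ONLY = re.compile(r"[0-9]+")


def validate_executor_cores(value) -> int:
    """Hypothetical helper: return value as a positive int or raise ValueError."""
    text = str(value).strip()
    if not _DIGITS_ONLY.fullmatch(text):
        raise ValueError(f"executor_cores must be a positive integer, got {value!r}")
    cores = int(text)
    if cores < 1:
        raise ValueError(f"executor_cores must be at least 1, got {cores}")
    return cores


if __name__ == "__main__":
    print(validate_executor_cores("4"))
    try:
        validate_executor_cores("2; cat /etc/passwd #")
    except ValueError as exc:
        print(f"rejected: {exc}")
```

An allow-list check like this is generally more robust than trying to strip or escape dangerous characters, because anything that is not explicitly permitted is refused.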

Upgrade Instructions

1. Upgrade Apache Airflow to version 2.3. or later. The official Apache Airflow Upgrade Guide, listed under Original References below, covers the procedure.

2. Upgrade the Spark Provider to version 4.. or later. Upgrading Airflow alone is not enough: an Airflow 2.3.+ installation that still has an older Spark Provider installed remains vulnerable, so Spark Provider 4.. must be installed manually. You can run the following command:

pip install "apache-airflow-providers-apache-spark==4.."
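After upgrading, it can be worth confirming which provider version is actually installed in the environment that runs your tasks. The following is a small sketch using the standard library's `importlib.metadata`; `major_version` and `provider_is_patched` are hypothetical helper names, and the check only compares the major version against 4, in line with the "4.. or later" guidance above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

PACKAGE = "apache-airflow-providers-apache-spark"


def major_version(version_string: str) -> int:
    """Return the leading (major) component of a dotted version string."""
    return int(version_string.split(".")[0])


def provider_is_patched(min_major: int = 4) -> Optional[bool]:
    """Hypothetical check: True/False if installed, None if the package is absent."""
    try:
        installed = version(PACKAGE)
    except PackageNotFoundError:
        return None
    return major_version(installed) >= min_major


if __name__ == "__main__":
    status = provider_is_patched()
    if status is None:
        print(f"{PACKAGE} is not installed in this environment")
    elif status:
        print(f"{PACKAGE} is at major version 4 or later")
    else:
        print(f"{PACKAGE} is older than major version 4 and should be upgraded")
```

Run it with the same Python interpreter that your Airflow workers use, since a different virtual environment may have a different provider version installed.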

Original References

- CVE-2022-40954 - NVD
- Apache Airflow Security Bulletin
- Apache Airflow Upgrade Guide

Conclusion

CVE-2022-40954 is a serious vulnerability in the Apache Airflow Spark Provider, caused by improper neutralization of special elements in OS commands. Users should upgrade both the Spark Provider and Apache Airflow to patched versions immediately to mitigate this issue and protect their sensitive information and resources.

Timeline

Published on: 11/22/2022 10:15:00 UTC
Last modified on: 11/28/2022 17:51:00 UTC