Airflow is a platform to programmatically author, schedule, and monitor workflows (called directed acyclic graphs, or DAGs, in Airflow). When we first adopted Airflow in late 2015, it had very limited security features. Any user who gained access to the Airflow UI could query the metadata DB, modify globally shared objects like Connections and Variables, start or stop any DAG, and mark any failed TaskInstance as success or vice versa, just to name a few. This may not seem disconcerting when you are using Airflow to schedule your morning brew, but for mission-critical jobs, we certainly did not want any user to have the ability to tamper with them. So in order to use Airflow in production, we removed UI access from all end users by guarding it with an Nginx server.

Using Airflow without its UI is undoubtedly inconvenient, so the next thing we began to work on was role-based access control (RBAC) for Airflow. (During this time, several security improvements were made, including an action logging feature and a hard-coded, naive RBAC implementation. However, the action logging was passive rather than preemptive, and the naive RBAC implementation still allowed read and write access to DAGs for all roles, so neither addressed our security concerns.)

ACL vs RBAC: Security Model Showdown

There are two general security approaches when it comes to restricting user access: access control list (ACL) and role-based access control (RBAC). ACL requires explicitly defining a list of permissions for a user or object, and is commonly found in operating systems, networking, and some databases. For example, file system ACLs allow read/write/execute access to be given to specific users or groups for specific files; networking ACLs contain rules to filter traffic based on IP address and port. RBAC, on the other hand, creates an abstraction between users and permissions: users are assigned to roles, and roles are given permissions. Modifying the permissions of a single role propagates to all users assigned to that role. When granular permissions are needed, ACL makes sense. However, for platforms like Airflow, which are often centrally administered but widely used by many teams, RBAC is the better approach because it is easier to maintain. (There is another, newer security model, attribute-based access control (ABAC), which keeps the benefits of RBAC and adds more fine-grained control through constraints based on policies and attributes. Those benefits come at the cost of additional complexity, for features that we didn’t quite need at the time.)
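
To make the maintenance difference concrete, here is a minimal sketch of the two models in plain Python. The users, roles, and permission names are made up for illustration and have nothing to do with Airflow's actual permission set.

```python
# Hypothetical users, roles, and permission names, for illustration only.

# ACL: permissions are attached directly to each user.
acl = {
    "alice": {"read_dag", "edit_dag"},
    "bob": {"read_dag", "edit_dag"},
    "carol": {"read_dag"},
}


def acl_allows(user, permission):
    """Check the user's own permission list."""
    return permission in acl.get(user, set())


# RBAC: users map to roles, and roles map to permissions.
role_permissions = {
    "editor": {"read_dag", "edit_dag"},
    "viewer": {"read_dag"},
}
user_roles = {
    "alice": {"editor"},
    "bob": {"editor"},
    "carol": {"viewer"},
}


def rbac_allows(user, permission):
    """Check whether any of the user's roles carries the permission."""
    return any(
        permission in role_permissions.get(role, set())
        for role in user_roles.get(user, set())
    )


# Granting a new permission to everyone who edits DAGs:
# the ACL needs one update per affected user ...
for user in ("alice", "bob"):
    acl[user].add("trigger_dag")

# ... while RBAC needs a single update to the role.
role_permissions["editor"].add("trigger_dag")
```

With a handful of users the difference is small, but with many teams sharing one deployment, the single role update is what keeps administration manageable.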

My colleague Kalpesh Dharwadkar from our security team designed an RBAC proposal that describes two levels of access control for Airflow: view-level access control and DAG-level access control. The rest of this blog post discusses the implementation of view-level access control.

RBAC Implementation: So Many Choices, So Little Time

Airflow is written in Python and its webserver is built with Flask-Admin. To implement RBAC, we considered four potential approaches:

  • Migrate to Django: Django has been around much longer, and it comes with a built-in user authentication system and a more mature ecosystem of extensions. However, the amount of migration work would be substantial relative to the marginal gain.

  • Implement it from scratch: This approach allows us to build a customized RBAC system tailored to Airflow, and gives us the flexibility to evolve this feature over time. This comes at the cost of a huge duplication of effort since there are a lot of libraries and frameworks out there we could have leveraged. Given that the RBAC security model we envisioned was fairly generic, implementation from scratch was not that appealing.

  • Use a Flask extension for RBAC: A few Flask extensions have been written specifically to handle RBAC, such as Flask-RBAC and Flask-Principal. Both of these extensions require annotating permissions with predefined roles, but the ideal RBAC implementation should be configurable: it should allow admins to create new roles and modify permissions on existing roles according to their business needs.

  • Switch from Flask-Admin to Flask-AppBuilder: Flask-AppBuilder (FAB) is a micro-framework similar to Flask-Admin, but what makes FAB appealing is its built-in configurable RBAC system, session management, and integration with various authentication backends. FAB is currently used by Apache Superset with proven success.

Ultimately, we decided to go with FAB. FAB would handle all the heavy lifting, while we would focus on getting the feature integrated and shipped into Airflow.

FAB’s Security Manager to the Rescue

Under the hood, FAB has a built-in Security Manager. When you instantiate an app with FAB, the default Security Manager will initialize all security-related models. The relationships of these entities are shown below:

[Figure: FAB security model]

Given that both Flask-Admin and FAB were inspired by Django and follow similar conventions, the migration process was straightforward in theory. Models didn’t need to be touched at all, since both frameworks support SQLAlchemy as their ORM. Most of the work was around view migration. By swapping to FAB’s BaseView and ModelView, we got all the Permissions and ViewMenus created for us by the Security Manager.

The Security Manager inserts an entry in the ViewMenu table for every class that inherits from BaseView, and an entry in the Permission table for every URL routing method that is annotated with @has_access. Finally, it creates a row in the PermissionView table to represent the mapping between them.

# Example of migrating a class that inherits from BaseView

# Flask-Admin
from flask_admin import BaseView, expose

class Airflow(BaseView):
    @expose("/trigger")
    def trigger(self):
        # implementation details ...
        pass

# FAB
from flask_appbuilder import BaseView, expose, has_access

class Airflow(BaseView):
    @has_access  # creates a can_trigger permission for this view
    @expose("/trigger")
    def trigger(self):
        # implementation details ...
        pass
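
The bookkeeping that the Security Manager does for a BaseView can be modelled in a few lines of plain Python. This is only a sketch of the idea, not FAB's code: the has_access decorator and BaseView class below are stand-ins that enforce nothing, register_view is a hypothetical helper, and the three sets stand in for the ViewMenu, Permission, and PermissionView tables.

```python
def has_access(func):
    """Stand-in for FAB's decorator: it only tags the method."""
    func._permission_name = "can_" + func.__name__
    return func


class BaseView:
    """Stand-in base class; the real one lives in flask_appbuilder."""


class Airflow(BaseView):
    @has_access
    def trigger(self):
        return "triggered"

    @has_access
    def graph(self):
        return "graph"

    def _helper(self):  # not tagged, so no permission is recorded for it
        return None


def register_view(view_cls, view_menus, permissions, permission_views):
    """Hypothetical helper: record one ViewMenu per view class, one
    Permission per tagged method, and one PermissionView per pair."""
    view_menus.add(view_cls.__name__)
    for attr in vars(view_cls).values():
        name = getattr(attr, "_permission_name", None)
        if name is not None:
            permissions.add(name)
            permission_views.add((name, view_cls.__name__))


# Three sets standing in for the three database tables.
view_menus, permissions, permission_views = set(), set(), set()
register_view(Airflow, view_menus, permissions, permission_views)
```

After registration, permission_views holds one (permission, view) pair per tagged method, and those pairs are what an admin later assigns to roles.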

ModelView inherits from BaseView. Aside from the ViewMenu, it automatically creates a set of CRUD methods under the hood, along with their corresponding permissions: can_list, can_show, can_add, can_edit, can_delete, can_download.

# Example of migrating a class that inherits from ModelView

# Flask-Admin
from flask_admin.contrib.sqla import ModelView

class ConnectionView(ModelView):
    # implementation details ...
    pass

# FAB
from airflow import models
from flask_appbuilder import ModelView
from flask_appbuilder.models.sqla.interface import SQLAInterface

class ConnectionView(ModelView):
    # Tell FAB which SQLAlchemy model backs this view
    datamodel = SQLAInterface(models.Connection)
    # implementation details ...

Migration Hiccups

Aside from swapping the BaseViews and ModelViews, a few other things had to be taken care of:

  • Converting view attributes so they are compatible with FAB. Most of these conversions were trivial since they differed only by data type or naming convention. However, a few features were missing in FAB, such as the ability to prefill and process forms before submission. To reach feature parity with the existing UI, we ended up adding these features to FAB as a way of contributing back to both open-source communities.
  • Migrating all custom HTML templates so they inherit from FAB instead of Flask-Admin. Since each framework has its own set of templates and macros, we had to modify the HTML/CSS/JavaScript files to allow views to be rendered successfully.
  • Making sure views supported all existing Airflow models. During this process we discovered that support for some SQLAlchemy features was missing in FAB, including binary columns (e.g. XCom values are blobs), composite primary keys (e.g. TaskInstance in Airflow has a primary key of (dag_id, task_id, execution_date)), and SQLAlchemy custom types (e.g. Airflow uses a custom datetime type with timezone support). We made several patches to FAB in order to resolve these issues.

Once the patches were successfully committed and released in FAB, we had a working Airflow UI with RBAC support.

Providing Security (No Pun Intended) with Backward Compatibility

Note that up to this point, the code was all written in a separate repository as a proof of concept. We received a lot of valuable feedback from the community and resolved many issues and bugs in advance. Having a separate repository also allowed us to speed up our development cycle significantly. However, given how fast Airflow itself is evolving, maintaining this alternative UI in a separate repository would mean constantly chasing upstream changes in the long run. It would also add unnecessary friction to discoverability and adoption. We had to merge it back into Apache Airflow, but do so in a thoughtful and efficient manner.

Backward compatibility is a mantra of software development, and because Airflow is an open-source project, we cared about it even more. We wanted to give users sufficient time to migrate to the new UI, which ruled out swapping one framework for another in one shot. Instead, we wanted both UIs to be available in parallel for a short period before fully deprecating the legacy UI. All existing UI-related code is conveniently located in the www/ directory. To avoid cluttering the code base with boolean flags for whether RBAC is enabled, we made all UI-related changes in a new directory called www_rbac/ and created a separate Flask app dedicated to RBAC. (During this migration, data profiling features were removed to enhance Airflow security. There are better tools out there for database analytics, such as Apache Superset.)

Default Role-Permission Mapping for the Lazy

Once the migration was done, the next step was to decide what the default security setup should look like. FAB itself comes with two roles by default, Public and Admin: Public has no permissions at all, and Admin has the full set of permissions. We added three more roles between these two extremes:

  • Viewer: this is for users without DAG ownership. They have read access to DAGs, but cannot perform any action that could potentially change the state of the database.
  • User: this is for users with DAG ownership. They have both read and write access to DAGs, and can therefore perform actions such as starting/stopping scheduled DAGs, running DAGs ad hoc, clearing historical DagRuns, or marking DagRuns as success/failure.
  • Op: this is for DevOps engineers who handle Airflow deployment and maintain its uptime. They have access to Airflow configuration files via the UI, and can modify shared objects like Variables and Connections.

If your business needs are different, you can modify existing role-permission mappings or create new roles and assign them with custom permissions. This can be done via a custom script, or in the UI under the Security tab.
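
As a rough picture of what such a mapping looks like, here is a sketch in plain Python. Everything in it is an assumption made for illustration: the permission names are invented, the create_role helper is hypothetical, and the idea that each role is a strict superset of the previous one is a simplification rather than a statement about Airflow's actual defaults.

```python
# Hypothetical permission names; the real set in Airflow is much larger.
VIEWER_PERMS = {"read_dag", "read_task_log"}
USER_PERMS = VIEWER_PERMS | {"trigger_dag", "pause_dag", "clear_dag_run"}
OP_PERMS = USER_PERMS | {"read_config", "edit_variable", "edit_connection"}
ADMIN_PERMS = OP_PERMS | {"edit_role", "edit_user"}

# Assumption: each role builds on the one before it.
ROLES = {
    "Public": set(),
    "Viewer": set(VIEWER_PERMS),
    "User": set(USER_PERMS),
    "Op": set(OP_PERMS),
    "Admin": set(ADMIN_PERMS),
}


def create_role(roles, name, permissions, all_permissions):
    """Hypothetical helper: add a custom role after validating its permissions."""
    unknown = set(permissions) - all_permissions
    if unknown:
        raise ValueError("unknown permissions: %s" % sorted(unknown))
    roles[name] = set(permissions)
    return roles[name]


# A custom role that can look at DAGs and configuration but change nothing.
create_role(ROLES, "Auditor", {"read_dag", "read_config"}, ADMIN_PERMS)
```

A real custom script would go through the Security Manager rather than a dictionary, but the shape of the data is the same: role names on one side, sets of permissions on the other.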

Drumroll…It’s Merged!

We are happy to announce that the RBAC feature for Airflow has been merged into master, and is planned to be rolled out as part of the Airflow 1.10 release. Some of the authentication backend integrations that FAB supports are still being tested, and we want that testing done before we feel comfortable making this feature the default UI. If you are adventurous and want to test out RBAC, it is as simple as setting rbac = True in your airflow.cfg file and creating an admin user with the command airflow create_user. See UPDATING.md for advanced settings.
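
For readers who want to picture how a single flag can switch between the two UIs, here is a small sketch using Python's standard configparser. It is not Airflow's own code: the [webserver] section name is an assumption about where the flag lives in airflow.cfg, and select_ui_package is a hypothetical helper that simply returns the name of the directory holding the chosen UI.

```python
import configparser

# Assumed fragment of airflow.cfg; the section name is an assumption.
AIRFLOW_CFG = """
[webserver]
rbac = True
"""


def select_ui_package(cfg_text):
    """Hypothetical helper: pick the UI directory based on the rbac flag."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # Fall back to the legacy UI when the flag or the section is missing.
    rbac_enabled = parser.getboolean("webserver", "rbac", fallback=False)
    return "www_rbac" if rbac_enabled else "www"


selected = select_ui_package(AIRFLOW_CFG)
```

Keeping the decision in one place like this is what lets the legacy www/ code stay free of RBAC conditionals while both UIs coexist.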

What’s next?

Aside from battle-testing RBAC until it is production-ready, the main upcoming initiative is DAG-level access control. This feature will introduce an additional level of granularity, making each DAG available only to the roles specified in its DAG file. This has been a top feature request and is being worked on by the Airflow community. If you want to learn more or contribute, check us out on GitHub and join us on our dev mailing list.