
Apache AirFlow Quickstart v1.10.13

Cloudup | 1.10.13

Linux/Unix, Amazon Linux Amazon Linux 2 2.0 LTS - 64-bit Amazon Machine Image (AMI)

Reviews from AWS Marketplace

0 AWS reviews
  • 5 star: 0
  • 4 star: 0
  • 3 star: 0
  • 2 star: 0
  • 1 star: 0

External reviews

27 reviews
from G2

External reviews are not included in the AWS star rating for the product.


    Utilities

Best orchestrating data pipeline scheduler

  • August 16, 2021
  • Review verified by G2

What do you like best?
What I like best is that it supports Python, which makes the programmatic side straightforward to use.
SLAs and DAGs are easy to set up.
What do you dislike?
I think there is room for improvement in the web UI.
What problems are you solving with the product? What benefits have you realized?
We are building an automated data pipeline workflow in Python with SLAs. Previously we used Linux cron, but it became too hectic to manage and made SLAs difficult to define.
Recommendations to others considering the product:
Look for your use case


    Giridhar P.

Apache Airflow is an amazing data extraction, processing and scheduling tool!!!

  • August 06, 2021
  • Review verified by G2

What do you like best?
Apache Airflow is free software that executes workflows written as Python modules, called DAGs. DAGs need to be configured with appropriate connections to execute successfully, and both connections and variables are easy to configure. The visualization in the user interface is the tool's most attractive aspect: triggering DAGs, viewing execution status, and seeing success and failure marked in green and red are some of the best parts of Apache Airflow. Configuring variables as JSON files is easy and straightforward. The history of executions, along with the status of each DAG run, lets users trace the logs for any past instance, which makes the tool commendable. The function-based categorization of functionality is flexible.
What do you dislike?
Apache Airflow has extensive documentation that needs to be read and reviewed to ensure that the configuration works for your needs. Knowledge of Python is a prerequisite, and some familiarity with the JSON format is required. The tool may seem complex to users who have no background in Python. Formatting tasks, hooks, and so on can be a bit complex. Not many examples are available on open forums, and working out an end-to-end solution can be challenging for new users.
What problems are you solving with the product? What benefits have you realized?
My organization uses Apache Airflow to perform data integration in AWS S3. The tool has enabled me to connect to a relational database, extract data, and compile the accumulated extracts into multiple flat-file segments. DAG components are easy to build, and the dependencies between tasks within a DAG are easy to specify. We implemented end-to-end ETL functionality successfully using Apache Airflow. The tool provides the flexibility to trigger different DAGs at any time of day and ensures that the logs can be traced. It can be configured in either an AWS environment or a Docker container. We configured it successfully in both environments, and this flexibility is what makes it the best.
Recommendations to others considering the product:
If you want an ETL tool that requires some programming, this tool could be your first choice. Development in Airflow is not GUI-based and hence may not suit all ETL engineers. However, it could serve all your basic needs. The ability to connect to heterogeneous databases and compare across their datasets is a feature that has yet to arrive.
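
The review above mentions that dependencies between tasks within a DAG are easy to specify. As a rough illustration of what such a dependency graph implies for run order, here is a minimal sketch that uses only the Python standard library, not Airflow's own API. The task names are hypothetical and only loosely mirror the extract, flat-file, and S3 steps the reviewer describes.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical task names; each key maps to the set of tasks that must finish first.
dependencies = {
    "extract_from_database": set(),
    "write_flat_files": {"extract_from_database"},
    "upload_to_s3": {"write_flat_files"},
}


def execution_order(deps):
    """Return one valid run order in which every task follows its upstream tasks."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

A graph with a cycle has no valid order, which is why a workflow of this kind has to be acyclic; `TopologicalSorter` raises `graphlib.CycleError` in that case.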


    Computer Software

Amazing open source tool. Makes my work so much easier

  • May 03, 2021
  • Review verified by G2

What do you like best?
I like the ease of use and how easily I can make DAGs, and the scheduler is just amazing. I had a lot of Python scripts that I needed to run manually every day. Airflow just made it seamless.
What do you dislike?
The initial setup is a bit overwhelming to me.
What problems are you solving with the product? What benefits have you realized?
We have an Elasticsearch index and a Redis database.
We built an app to move data from the Elasticsearch index to the Redis database. We had to do this daily or weekly by deploying the container again and again.

With Airflow, we turned this exercise into a task and set up an Airflow scheduler for it. Now it runs automatically on our schedule.
Recommendations to others considering the product:
Go for it. Replace your cron jobs with this, you'll be fine.


    Matt P.

Stable, highly extensible platform to run DAGs from

  • September 22, 2020
  • Review verified by G2

What do you like best?
I enjoy the ease of installation and the robustness of the platform. The UI is clean, though it can suffer from performance issues depending on your workload. When internal errors occur in the code, Airflow makes it easy to track down the cause of the issue and re-run or cancel jobs as needed.
What do you dislike?
Depending on how you set it up, it is very easy to bite yourself hard down the line. Partnering with a company that configures Airflow for a living can ease this pain point. Airflow does not handle timezone changes gracefully, instead leaving them as an exercise for the user. Plan to use UTC, or build your jobs to individually handle daylight saving time if it occurs in your region.
What problems are you solving with the product? What benefits have you realized?
We are using Airflow to run over 1000 DAGs. These are used to feed our multi petabyte Datalake. By using the Airflow platform, we have empowered our devs to achieve very high levels of self-sufficiency and NoOps.
Recommendations to others considering the product:
I recommend you chart out your planned usage for the next year when setting up your initial configuration. While maintenance is relatively easy after initial setup, you will need to do your due diligence with regard to the platform you plan to deploy on (AWS EKS, GCP, on-prem, etc.).
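
The advice above to plan on UTC can be illustrated with a small sketch. It assumes a hypothetical job pinned to 06:00 wall-clock time in US Eastern time and uses fixed offsets (UTC-5 for standard time, UTC-4 for daylight saving time) rather than a timezone database, so it is a simplified model and not Airflow code.

```python
from datetime import datetime, timedelta, timezone

# Fixed offsets for US Eastern time, used here to avoid needing a tz database.
EST = timezone(timedelta(hours=-5), "EST")  # standard time
EDT = timezone(timedelta(hours=-4), "EDT")  # daylight saving time


def to_utc(local_dt):
    """Convert a timezone-aware local datetime to UTC."""
    return local_dt.astimezone(timezone.utc)


# The same 06:00 wall-clock run lands on different UTC hours across the year.
winter_run = to_utc(datetime(2021, 1, 15, 6, 0, tzinfo=EST))
summer_run = to_utc(datetime(2021, 7, 15, 6, 0, tzinfo=EDT))

print(winter_run.hour, summer_run.hour)
```

A schedule defined in UTC avoids this one-hour shift, at the cost of the job moving relative to local wall-clock time twice a year.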


    Pavan C.

DAGs can be frustrating

  • July 05, 2019
  • Review provided by G2

What do you like best?
Automation is good. No need to worry about keeping your laptop on or the order in which things should run. Slightly better than a cron job.
What do you dislike?
The irritating UI. The scheduler. Why would it not run the first time at the start date I specify? Why at the end of the interval? Why are pausing and enabling two different things? Why is the UI so 90s in the age of Material Design and whatnot? The scheduler's behavior is random: sometimes tasks are scheduled quickly and run in seconds, while other times I have to wait for hours.
What problems are you solving with the product? What benefits have you realized?
Scheduled runs of a specific program on daily, weekly, hourly etc. basis
Recommendations to others considering the product:
DAGs are gonna ruin your life, but it's the only way to make this thing work.
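
The scheduling behavior this reviewer questions (the first run firing at the end of the interval rather than at the start date) can be modeled in a few lines. This is a simplified sketch in plain Python with a hypothetical helper, `first_run`, and not Airflow's implementation: it assumes each run covers one full period and is triggered only once that period has ended.

```python
from datetime import datetime, timedelta


def first_run(start_date, interval):
    """Return (trigger_time, period_start, period_end) for the first scheduled run.

    Models the behavior described in the review: the run covering a period
    is triggered once that period has ended, not at start_date itself.
    """
    period_start = start_date
    period_end = start_date + interval
    return period_end, period_start, period_end


# Hypothetical daily schedule starting on the review's month.
trigger, begin, end = first_run(datetime(2019, 7, 1), timedelta(days=1))
print(trigger)
```

Under this model a daily job with a start date of July 1 first fires on July 2 and processes the data for July 1, which matches the reviewer's observation.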


    Ina C.

Does the job perfectly

  • February 26, 2019
  • Review provided by G2

What do you like best?
With Apache Airflow it's so easy to schedule and monitor workflows! It shows everything visually in a chart or graph, depending on your preference, which makes the workflow easy to analyze because you can see its progress. It is also versatile, and you can modify a lot of settings to fit your preferences.
What do you dislike?
It's pretty complex to implement initially but definitely worth it. Takes some time and patience and has a learning curve especially if you've never used similar software before.
What problems are you solving with the product? What benefits have you realized?
To schedule and monitor different workflows
Recommendations to others considering the product:
Absolutely worth it!


    Paul H.

Great program for streamlining data processing

  • February 21, 2019
  • Review provided by G2

What do you like best?
Easy to tailor workflow programs to the appropriate situation.
What do you dislike?
Potential pitfalls with data security are a concern.
What problems are you solving with the product? What benefits have you realized?
Research and development. Ease of developing workflows for incoming professionals.
Recommendations to others considering the product:
Apache Airflow will make workflow development much easier for new employees.


    Internet

excellent

  • February 21, 2019
  • Review provided by G2

What do you like best?
It's a fluid platform, easy to use and user-friendly.
What do you dislike?
Nothing to dislike. Easy to operate and runs smoothly
What problems are you solving with the product? What benefits have you realized?
Scheduling is easier


    Sarah H.

Good product for businesses

  • February 21, 2019
  • Review provided by G2

What do you like best?
I like that it makes my life more efficient and is easy to use.
What do you dislike?
I don't really dislike anything about it
What problems are you solving with the product? What benefits have you realized?
It allows us to serve our customers in a more timely and efficient manner.


    Seth E.

Apache Airflow easy to use, open source scheduler

  • February 21, 2019
  • Review provided by G2

What do you like best?
This program is hard to begin using, but after learning and using it, I can say it's very useful. It's easy to track information like job statuses and failures. The UI is simple and easy to follow.
What do you dislike?
There is no real security to this program, so that could be an issue. The learning curve could also be a downside because it takes a little while to learn and use.
What problems are you solving with the product? What benefits have you realized?
We use Apache Airflow, together with Python, to help organize our work. This program helps with organization.
Recommendations to others considering the product:
You need to make sure people know Python.