    Certified Apache NiFi - Calculated Systems

    Deployed on AWS
    Apache NiFi is a visually programmed software tool that automates the movement of data between systems. Easily capture and move your data into the cloud (S3, RDS, Elasticsearch, Kinesis, DynamoDB, Redshift, and more) with no coding experience necessary.

    Overview

    Apache NiFi is a visually programmed software tool that automates the movement and transformation of data between systems. It enables you to easily capture, move, enrich, and transform machine data, Internet of Things (IoT) data, and streaming data. Its drag-and-drop interface lets you build data pipelines from commercial data feeds, manufacturing equipment, IoT sensors, web servers, and business reporting systems, and move the data into a variety of systems such as S3, EMR, SQL databases, DynamoDB, Couchbase, MongoDB, HBase, Elasticsearch, Hive, Kinesis, Postgres, MySQL, and FTP servers, as well as tools such as Snowflake or BigQuery.

    Calculated Systems Apache NiFi in the Cloud is a one-click deployment that automatically launches NiFi in AWS quickly and securely without any coding or complex configuration. This out-of-the-box, optimized deployment of Apache NiFi helps protect you from common pitfalls associated with open source software such as Java virtual machine (JVM) issues and logging configuration by taking care of all initialization, configuration and perimeter security needed. No need to become an expert in big data cloud architecture to migrate or manage your data.

    To learn more about Apache NiFi, download our free ebook, Apache NiFi for Dummies, authored by several members of the Calculated Systems team: https://www.calculatedsystems.com/nifi-for-dummies

    Highlights

    • A visually programmed software tool that moves machine data/IoT data into the cloud
    • Drag and drop software to move your data into S3, EMR, SQL databases, DynamoDB, ElasticSearch, Kinesis, FTP Servers, Snowflake and more.
    • Easy, one-click installation of a fully functional Apache NiFi instance in AWS in minutes.

    Details

    Delivery method

    Delivery option
    64-bit (x86) Amazon Machine Image (AMI)

    Latest version

    Operating system
    Ubuntu 24.04

    Features and programs

    Financing for AWS Marketplace purchases

    AWS Marketplace now accepts line of credit payments through the PNC Vendor Finance program. This program is available to select AWS customers in the US, excluding NV, NC, ND, TN, & VT.

    Pricing

    Certified Apache NiFi - Calculated Systems

    Pricing is based on actual usage, with charges varying according to how much you consume. Subscriptions have no end date and may be canceled any time.
    Additional AWS infrastructure costs may apply. Use the AWS Pricing Calculator to estimate your infrastructure costs.

    Usage costs (8)

    Dimension                  Cost/hour
    m4.xlarge (Recommended)    $0.16
    t2.2xlarge                 $0.32
    r4.xlarge                  $0.16
    t2.large                   $0.08
    r4.2xlarge                 $0.32
    m4.2xlarge                 $0.32
    m4.large                   $0.08
    t2.xlarge                  $0.16
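
    To put the hourly rates in context, the following sketch estimates a monthly software charge for an instance that runs continuously. It assumes a 730-hour month (a common approximation) and covers only the Marketplace software rates listed above; AWS infrastructure charges are billed separately. The function name and the 730-hour figure are illustrative assumptions, not part of the listing.

```python
# Hourly software rates (USD) copied from the usage costs listed above.
HOURLY_RATE_USD = {
    "m4.large": 0.08,
    "m4.xlarge": 0.16,
    "m4.2xlarge": 0.32,
    "r4.xlarge": 0.16,
    "r4.2xlarge": 0.32,
    "t2.large": 0.08,
    "t2.xlarge": 0.16,
    "t2.2xlarge": 0.32,
}

# Assumption: an average month has roughly 730 hours (8,760 hours / 12).
HOURS_PER_MONTH = 730


def estimate_monthly_software_cost(instance_type: str, hours: float = HOURS_PER_MONTH) -> float:
    """Return the estimated software charge in USD, excluding AWS infrastructure costs."""
    rate = HOURLY_RATE_USD[instance_type]
    return round(rate * hours, 2)


if __name__ == "__main__":
    for name in sorted(HOURLY_RATE_USD):
        print(f"{name:<12} ${estimate_monthly_software_cost(name):>8.2f} per month")
```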

    Vendor refund policy

    For refund information, please read the EULA or contact Info@calculatedsystems.com.

    Legal

    Vendor terms and conditions

    Upon subscribing to this product, you must acknowledge and agree to the terms and conditions outlined in the vendor's End User License Agreement (EULA).

    Content disclaimer

    Vendors are responsible for their product descriptions and other product content. AWS does not warrant that vendors' product descriptions or other product content are accurate, complete, reliable, current, or error-free.

    Usage information

    Delivery details

    64-bit (x86) Amazon Machine Image (AMI)

    Amazon Machine Image (AMI)

    An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.

    Version release notes
    • This is the first release of NiFi 2.x
    • Fixed UI Elements
    • Migrated to Ubuntu

    Additional details

    Usage instructions

    A detailed launch guide is available at https://www.calculatedsystems.com/getting-started-aws

    For NiFi-specific usage beyond starting the AMI, please see our ebook, Apache NiFi for Dummies: https://www.calculatedsystems.com/nifi-for-dummies

    Support

    Vendor support

    Migration, implementation, and support services are available. Contact Info@calculatedsystems.com.

    AWS infrastructure support

    AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

    Product comparison

    Updated weekly

    Accolades

    Top 10 in Industrial IoT, Streaming solutions
    Top 50 in Data Governance
    Top 50 in Data Warehouses, ELT/ETL

    Overview

    AI generated from product descriptions
    Data Pipeline Automation: Visual drag-and-drop interface for creating complex data movement and transformation workflows without coding
    Multi-System Data Integration: Supports data transfer and transformation across diverse systems including cloud storage, databases, streaming platforms, and data warehouses
    IoT and Machine Data Processing: Specialized capabilities for capturing, moving, enriching, and transforming Internet of Things and machine-generated data streams
    Cloud Platform Deployment: One-click deployment mechanism for automated, secure initialization of data processing infrastructure in cloud environments
    Data Source Connectivity: Comprehensive connectivity options for ingesting data from commercial feeds, manufacturing equipment, web servers, sensors, and business reporting systems
    Data Flow Management: Advanced CLI and dashboard for configuring and monitoring Apache NiFi data workflows
    Security Infrastructure: Built-in SSL encryption and secure access controls for data protection
    Deployment Speed: Rapid deployment capability with configuration completed in under 5 minutes
    System Vulnerability Management: Frequent security updates and proactive vulnerability scanning
    Cloud Integration: Optimized deployment specifically designed for AWS EC2 cloud environments
    Data Integration Methodology: Supports both ETL and ELT data integration patterns with codeless visual development interface
    Cloud and On-Premises Connectivity: Enables connection to hundreds of cloud and on-premises data sources including AWS services, enterprise applications, and databases
    Parallel Data Processing: Utilizes highly scalable parallel data integration architecture for optimized data loading and processing
    Connector Ecosystem: Provides multi-tier connectors supporting file systems, databases, cloud storage, and enterprise applications across Tier B, C, and D categories
    FedRAMP Compliance: Offers FedRAMP-compliant integration services with specific security and regulatory requirements for government cloud environments

    Contract

    Standard contract
    No
    No

    Customer reviews

    Ratings and reviews

    Average rating: 5 (1 rating)
    5 star: 100%
    4 star: 0%
    3 star: 0%
    2 star: 0%
    1 star: 0%
    1 AWS review | 6 external reviews
    Star ratings include only reviews from verified AWS customers. External reviews can also include a star rating, but star ratings from external reviews are not averaged in with the AWS customer star ratings.
    Danuphan Suwanwong

    Visual workflow offers clarity and boosts data pipeline construction

    Reviewed on Apr 02, 2025
    Review provided by PeerSpot

    What is our primary use case?

    I am implementing the ETL workflow using Apache NiFi to prepare data and upload it to the cloud. Our use case involves importing data from on-premise and private servers to build a data hub and data mart. The data mart is then published on the cloud.

    How has it helped my organization?

    We primarily use Apache NiFi for data preparation tasks.

    What is most valuable?

    The visual workflow aspect of Apache NiFi is an invaluable feature as it operates on a no-code platform that allows for easy drag-and-drop pipeline construction. Compared to Airflow, which requires programming before visual representation, Apache NiFi offers clarity in pipeline activities. This feature greatly aids in understanding what the pipeline is doing.

    What needs improvement?

    The logging system of Apache NiFi needs improvement. It is difficult to debug compared to Airflow, where task details and issues are clear. With Apache NiFi, I have encountered processes that die without any traceable error, which might relate to the inadequate logging system.

    For how long have I used the solution?

    I have been working with Apache NiFi for about six months.

    What do I think about the stability of the solution?

    Sometimes, when I run Apache NiFi, processes crash without any clue, which might relate to the logging system. The process can die, and the logs do not show any detail to identify the problem, impacting stability.

    What do I think about the scalability of the solution?

    For scalability, I would rate it an eight. We can run parallel pipelines simultaneously without issues unless memory is full. Memory is the only constraint; otherwise, the processing capabilities allow us to handle many pipelines at once.

    How are customer service and support?

    The technical support from the official Apache team is rated a three out of ten. Issues often require self-resolution or community help, as the support isn't effectively managed.

    How would you rate customer service and support?

    Negative

    Which solution did I use previously and why did I switch?

    I have used Airflow before, which required programming first and then visual representation of the workflow.

    What about the implementation team?

    There is another team responsible for setting up Apache NiFi, so I'm not involved in the deployment process.

    What's my experience with pricing, setup cost, and licensing?

    Apache NiFi is open-source and free. Its integration with systems like Cloudera can be expensive, but Apache NiFi itself presents the best pricing as a standalone tool.

    Which other solutions did I evaluate?

    Prior to Apache NiFi, I used Airflow, which differed mainly in its approach to programming and workflow visualization.

    What other advice do I have?

    Overall, I rate Apache NiFi an eight out of ten. I am quite happy with it.

    Which deployment model are you using for this solution?

    On-premises
    Bharghava Raghavendra Beesa

    The tool enables effective data transformation and integration

    Reviewed on Jan 21, 2025
    Review provided by PeerSpot

    What is our primary use case?

    I use NiFi as a tool for ETL, which stands for extract, transform, and load. It is particularly effective for integration methodologies. 

    The tool is useful for designing ETL pipelines and is an open-source product. Data is often stored in different forms and locations. If I want to integrate and transform it, NiFi can help load data from one place to another while making transformations. 

    I can handle stream or batch data and identify various data types on different platforms. NiFi can integrate with tools like Slack and perform required transformations before loading to the desired downstream system.

    It is primarily a pipeline-building tool with a graphical UI; however, I can also write custom JARs for specific functions. NiFi is an open-source tool effective for data migration and transformations, helping improve data quality from various sources.

    What is most valuable?

    NiFi works on data and file levels, streamlining real-time data processes. It is highly effective for handling real-time data by working with APIs for immediate and continuous data extraction. For real-time data tasks, this front-end UI-based tool is superior to back-end platforms.

    What needs improvement?

    There are some areas for improvement, particularly with record-level tasks that take a bit of time. The quality of JSON data processing could be improved, as JSON workloads require manual conversions without a specific process. 

    Enhancing features related to alerting would be helpful, including mobile alerts for pipeline issues. Integration with mobile devices for error alerts would simplify information delivery.

    What do I think about the stability of the solution?

    The product is stable for simple tasks, like using databases that are not distributed. However, for distributed environments like Hadoop or HBase, some vulnerabilities exist. While these are not major issues, they should not be ignored.

    What do I think about the scalability of the solution?

    Scaling works well, allowing cluster expansion. However, I have never encountered very large clusters, so it's uncertain how well it supports extensive scaling.

    How was the initial setup?

    The initial setup is fast, especially for communication stabilization. Although the product is open source, it functions as a cluster. For single-node environments, installation is simple. For company-wide or enterprise-level clusters, the initial stages may present issues with authentication and access. Stabilization, such as port communication, may not be immediately effective.

    What other advice do I have?

    I recommend the product for its data privacy features. It allows secure data handling because the data is stored on my nodes. However, a skilled technician is necessary due to the reliance on Java, especially for back-end operations and error debugging. 

    Enterprise versions may offer easier troubleshooting. As an open-source solution, good support is crucial. 

    I rate the overall product as eight out of ten.

    Which deployment model are you using for this solution?

    On-premises
    Teodor Muraru

    Useful to transfer data from one service to another and is user-friendly

    Reviewed on May 22, 2024
    Review provided by PeerSpot

    What is our primary use case?

    We use the tool to transfer data from one service to another. It helps us to migrate data from one department to another. 

    What is most valuable?

    Apache NiFi is user-friendly. Its most valuable features for handling large volumes of data include its multitude of integrated endpoints and clients and the ability to create cron jobs to run tasks at regular intervals.

    What needs improvement?

    The tool should incorporate more tutorials for advanced use cases. It has tutorials for simple use cases. 

    What do I think about the stability of the solution?

    I rate the tool's stability an eight out of ten.

    How are customer service and support?

    I have relied on the documentation available on Apache NiFi's website for support. 

    How was the initial setup?

    I tried to install the tool on my work laptop, and while it worked initially, it started to run slowly after some time. The department that handles the company's databases uses Apache NiFi on proper servers. I tried using it on my laptop to see if it worked, but it ran very slowly and consumed many resources from my machine.

    What's my experience with pricing, setup cost, and licensing?

    I used the tool's free version.

    What other advice do I have?

    I rate Apache NiFi an eight out of ten. 

    Arjun Pandey

    Good monitoring, metrics capabilities and provides ability to design processors with a single click

    Reviewed on Oct 25, 2023
    Review provided by PeerSpot

    What is our primary use case?

    As a DevOps engineer, my day-to-day task is to move files from one location to another, doing some transformation along the way. For example, I might pull messages from Kafka and put them into S3 buckets. Or I might move data from a GCS bucket to another location. 

    NiFi is really good for this because it has very good monitoring and metrics capabilities. When I design a pipeline in NiFi, I can see how much data is being processed, where it is at each stage, and what the total throughput is.

    I can see all the metrics related to the complete pipeline. So, I personally like it very much.

    What is most valuable?

    The good thing about Apache NiFi is that it has a concept called a flow file, and there's something called a flow file processor. The processor is the building block of your entire job. They have close to 500 processors for each purpose. 

    For example, for reading from Kafka, NiFi has a processor called "ConsumeKafka". To write to S3, it has a processor called "PutS3Object". Now, if I read from Kafka and write my own application, I'd need to ensure the library I'm using tracks my messages. I'd also need to handle any failures by rereading messages and ensuring acknowledgment. But all this complexity is already handled by the NiFi processors.
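
    The bookkeeping described here (tracking messages, retrying failures, acknowledging only after a successful write) can be sketched in a few lines. This is a self-contained simulation, not Kafka or NiFi code: a plain list stands in for a topic, and `process_with_retries`, `write_to_store`, and `MAX_ATTEMPTS` are hypothetical names chosen for illustration.

```python
from typing import Callable, List

MAX_ATTEMPTS = 3  # hypothetical retry budget per message


def process_with_retries(
    messages: List[str],
    write_to_store: Callable[[str], None],
) -> int:
    """Deliver each message in order and return the committed offset.

    The offset only advances after a successful write, so a failure
    never skips a message (at-least-once delivery).
    """
    committed_offset = 0
    while committed_offset < len(messages):
        message = messages[committed_offset]
        for attempt in range(1, MAX_ATTEMPTS + 1):
            try:
                write_to_store(message)
                break  # write succeeded, stop retrying
            except IOError:
                if attempt == MAX_ATTEMPTS:
                    # Give up without acknowledging; the caller can
                    # resume from committed_offset later.
                    return committed_offset
        committed_offset += 1  # acknowledge only after success
    return committed_offset


if __name__ == "__main__":
    stored: List[str] = []
    failures = {"b": 1}  # message "b" fails once, then succeeds

    def flaky_write(message: str) -> None:
        if failures.get(message, 0) > 0:
            failures[message] -= 1
            raise IOError("simulated transient failure")
        stored.append(message)

    offset = process_with_retries(["a", "b", "c"], flaky_write)
    print(offset, stored)
```

    Even this toy version has to decide when to retry, when to give up, and where to resume, which is the kind of logic a prebuilt processor spares the pipeline author.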

    They have around 500 processors, with a community investing significant effort into developing them. I can design your processor with a single click, export the entire workflow, and import it. The format is actionable, so NiFi is immediately set up. 

    It's also distributed in nature so that I can scale it across nodes based on the workload. These nodes share their state. If one node goes down during processing, that data might be lost, but any subsequent data is safe. Such occurrences are rare. 

    In essence, if you want a quick solution, Apache NiFi is a strong contender. There are other solutions like Airflow and some paid pipeline options.

    Airflow is open-source but can be complicated. For ETL or ELT solutions, there are pricier options. But if I need a pipeline that I can monitor step by step, Apache NiFi is a good choice. It integrates with Prometheus metrics, allowing me to embed them in my workflow.

    There's also a processor for integration with Slack, and I can receive notifications when the workflow is completed or fails. 

    Another feature I appreciate is "back pressure," which NiFi handles automatically. It maintains its own queue and addresses back-pressure issues. If, for instance, a downstream component isn't fast enough, items get stored in a queue, managed internally by NiFi's back pressure algorithm.
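
    Back pressure can be pictured as a bounded queue between two steps: once the queue reaches its limit, the producing side is held back until the consuming side catches up. The sketch below illustrates the idea with Python's standard `queue.Queue`. It is not NiFi's implementation, and the threshold of 3 and the function names are assumptions for illustration.

```python
import queue

BACK_PRESSURE_THRESHOLD = 3  # hypothetical object-count limit for the connection


def offer(connection: "queue.Queue[str]", item: str) -> bool:
    """Try to enqueue an item; return False when back pressure applies."""
    try:
        connection.put_nowait(item)
        return True
    except queue.Full:
        # The queue is at its threshold, so the producing step must wait.
        return False


def simulate() -> list:
    """Fill the queue past its limit, drain one item, then retry the refused item."""
    connection: "queue.Queue[str]" = queue.Queue(maxsize=BACK_PRESSURE_THRESHOLD)
    events = []
    # A fast producer fills the queue, and the fourth item is refused.
    for n in range(4):
        events.append((f"item-{n}", offer(connection, f"item-{n}")))
    # The slower consumer takes one item, which frees a slot.
    connection.get_nowait()
    events.append(("item-3", offer(connection, "item-3")))
    return events


if __name__ == "__main__":
    for name, accepted in simulate():
        print(name, "accepted" if accepted else "held back")
```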

    What needs improvement?

    There is room for improvement in integration with SSO. For example, NiFi does not have any integration with SSO, and if I want to set up some kind of role-based access control across the organization, that is not possible.

    So I have to create a separate username and password and then share it with each individual team. That is the pain point at the enterprise level.

    For how long have I used the solution?

    I have been using it for one and a half years. 

    What do I think about the stability of the solution?

    I would rate the stability a seven out of ten because there are a lot of processes that need to be implemented.  

    What do I think about the scalability of the solution?

    It's scalable. It can easily scale on multiple nodes. Depending on the workload, it also handles that internally; like the workers, they coordinate with each other, and they share the workload with each other. So, it's pretty good in terms of scalability.

    How was the initial setup?

    The initial setup is very easy, especially for users who are familiar with ETL or ELT.

    NiFi is one of the easiest tools on the market to learn and use. It is also a quick-win solution, which is good for first-time users who are developing data pipelines for ETL. NiFi makes it easy to track and trace the status of your pipelines, so you can be sure that they are working properly.

    What other advice do I have?

    If I were to advise someone, I would ask the user what endpoints they want to touch. If I want to read something from Kafka and I want to put this thing on the S3 bucket, what is the alternative I have? 

    I have Kafka Connect, where I can connect Kafka with one Kafka, and I can put it into an S3 bucket. Is this scalable? No. Can it be monitored? No.

    We can't monitor it. We can't scale it. It's going to be a complete black box. The person who knows Kafka Connect, or Kafka, can understand what is happening there while using Kafka Connect. But if I compare it, I literally don't need to understand what Kafka is.

    I know, "Okay, this is Kafka. These are the endpoints, and this is the URL I have to point to." That's it. My job is done. I will create a complete flow pipeline within, let's say, thirty minutes or something without having any current knowledge. I can read, I can Google it, and I can just implement it.

    For people who are new to big data technologies like Kafka and BigQuery, I would give this solution an eight out of ten. 

    Let's say you need to build a solution to read from Kafka and write to an S3 bucket. You could use Kafka Connect, but if your requirements change and you need to start reading from a database instead, Kafka Connect will not work. With Apache NiFi, you can easily modify your flow pipeline to start reading from the database instead.

    Which deployment model are you using for this solution?

    Public Cloud
    SabinaZeynalova

    Allows the creation and use of custom functions to achieve desired functionality but limitation in handling monthly transactions due to a lack of partitioning for dates

    Reviewed on Sep 21, 2023
    Review provided by PeerSpot

    What is our primary use case?

    One example is how Apache NiFi has helped us to create data pipelines to migrate data from Oracle to Postgres, Oracle to Oracle, Oracle to Minio, or other databases, such as relational databases, NoSQL databases, or object storage. We create templates for these pipelines so that we can easily reuse them for different data migration projects.

    For example, we have a template for migrating data from Oracle to Postgres. This template uses an incremental load process. The template also checks the source and destination databases for compatibility and makes any necessary data transformations.
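
    An incremental load of this kind is often built around a high-water mark: each run copies only the rows whose key or timestamp is greater than the largest value already present in the destination. The sketch below shows the pattern with two in-memory SQLite databases standing in for Oracle and Postgres. The table name `orders`, its columns, and the functions `incremental_load` and `make_db` are hypothetical; the reviewer's actual template is not shown in the source.

```python
import sqlite3


def incremental_load(source: sqlite3.Connection, dest: sqlite3.Connection) -> int:
    """Copy rows newer than the destination's high-water mark; return rows copied."""
    # The largest id already loaded is the high-water mark (0 if the table is empty).
    (watermark,) = dest.execute("SELECT COALESCE(MAX(id), 0) FROM orders").fetchone()
    new_rows = source.execute(
        "SELECT id, amount FROM orders WHERE id > ? ORDER BY id", (watermark,)
    ).fetchall()
    dest.executemany("INSERT INTO orders (id, amount) VALUES (?, ?)", new_rows)
    dest.commit()
    return len(new_rows)


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with the hypothetical orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
    return conn


if __name__ == "__main__":
    source, dest = make_db(), make_db()
    source.executemany("INSERT INTO orders VALUES (?, ?)", [(1, 10.0), (2, 20.0)])
    print(incremental_load(source, dest))  # first run copies every existing row
    source.execute("INSERT INTO orders VALUES (?, ?)", (3, 30.0))
    print(incremental_load(source, dest))  # second run copies only the new row
```

    A monotonically increasing key keeps the example short; a last-modified timestamp would follow the same shape but also needs a policy for rows updated in place.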

    If our data is not more than ten terabytes, then NiFi is mostly used. But for a heavy table setup, I don't use NiFi for customers or enterprise solutions.

    What is most valuable?

    I use custom functions for specific features in Apache NiFi. I also use the processes available in NiFi. I can write custom functions to achieve the desired functionality, even if it is not explicitly available as a built-in NiFi feature.

    What needs improvement?

    Apache NiFi is slow to control and needs to be improved. I have to run many jobs and there are already large tables, which can make it difficult to control NiFi on time.

    There is no one to tell me when there is an incident and my server is down. When we manually start the NiFi process, it is not always started correctly. We can write scripts to run when a message is received from Airflow saying that the firewall is not running. This script will automatically start all servers, including the application servers. It will also check the status of all my NiFi processes and send a callback message with the results. I have written down all the processes that are monitored.

    We run many jobs, and there are already large tables. When we do not control NiFi on time, all reports fail for the day. So it's pretty slow to control, and it has to be improved.

    In future releases, there are extra features I’d like to add. For example, NiFi is not suitable for migration, and the replication in NiFi is really not good. Because when you process ten years of data, you can't manage all the transactions; it is not enough. Moreover, the handling of monthly transactions is not enough due to a lack of partitioning for dates. And, when we grade a monthly ticket, we must process all data then rerun our ETL jobs. If it's possible, enhancing the partitioning in NiFi for features would be beneficial.
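
    One common workaround for the date-partitioning gap described above is to split a long history into month-sized windows and run the same flow once per window, so that a correction to one month only requires reprocessing that month. The helper below is a minimal sketch of that splitting step in plain Python. `month_windows` is a hypothetical name, and nothing here is a NiFi feature.

```python
from datetime import date
from typing import Iterator, Tuple


def month_windows(start: date, end: date) -> Iterator[Tuple[date, date]]:
    """Yield half-open [window_start, window_end) ranges, one per calendar month."""
    current = start
    while current < end:
        # Compute the first day of the following month.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # The last window is clipped so it never runs past the end date.
        yield current, min(next_month, end)
        current = next_month


if __name__ == "__main__":
    for window_start, window_end in month_windows(date(2023, 11, 15), date(2024, 2, 1)):
        print(window_start, "to", window_end)
```

    Each window can then be passed as query parameters (for example, a lower and upper date bound) to whatever extraction step the pipeline uses.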

    For how long have I used the solution?

    I have been working with Apache NiFi for one year. 

    What do I think about the stability of the solution?

    I would rate the stability an eight out of ten. 

    What do I think about the scalability of the solution?

    I would rate the scalability a five out of ten because, in our experience, it doesn't scale correctly, especially if you don't use a Kubernetes system. 

    If you want it to be scalable, you must use Kubernetes, but our system runs on VMs with VM disks, both external and not external. Increasing disk space is a very hard process. NiFi is not easily scalable. You can increase, but decreasing is not possible. So, it is easy to scale up, but scaling down is difficult.

    There are around ten end users in our company. We plan to increase usage further.

    How was the initial setup?

    The initial setup is very easy. I would rate my experience with the initial setup a ten out of ten, where one point is difficult, and ten points are easy.

    But if you want its custom mode and control, it's five out of ten. 

    For the initial setup, if you configure to custom mode, it's five points. But if you use its single-mode configuration and installation, it's ten.

    What about the implementation team?

    The deployment takes one week due to network access and some VM installation. Then, we install NiFi and deploy it. But, if you have all the scripts written automatically, it’s five minutes for us.

    One person is enough for the deployment process. It's all about script writing in CAC, and it's one-button quick for deployment.

    What's my experience with pricing, setup cost, and licensing?

    I am using it open source, so it means it's free for me to use. 

    What other advice do I have?

    If the volume is manageable, I would recommend it. Overall, I would rate the solution a six out of ten. 

    Which deployment model are you using for this solution?

    On-premises