
Overview
Apache NiFi is a visually programmed software tool that automates the movement and transformation of data between systems. It lets you capture, move, enrich, and transform machine data, Internet of Things (IoT) data, and streaming data. With its drag-and-drop interface you can build data pipelines from commercial data feeds, manufacturing equipment, IoT sensors, web servers, and business reporting, and move the data into a variety of systems such as S3, EMR, SQL databases, DynamoDB, Couchbase, MongoDB, HBase, Elasticsearch, Hive, Kinesis, Postgres, MySQL, and FTP servers, as well as tools such as Snowflake or BigQuery.
Calculated Systems Apache NiFi in the Cloud is a one-click deployment that automatically launches NiFi in AWS quickly and securely without any coding or complex configuration. This out-of-the-box, optimized deployment of Apache NiFi helps protect you from common pitfalls associated with open source software such as Java virtual machine (JVM) issues and logging configuration by taking care of all initialization, configuration and perimeter security needed. No need to become an expert in big data cloud architecture to migrate or manage your data.
To learn more about Apache NiFi, download our free ebook, Apache NiFi for Dummies, authored by several members of the Calculated Systems team: https://www.calculatedsystems.com/nifi-for-dummies
Highlights
- A visually programmed software tool that moves machine data/IoT data into the cloud
- Drag-and-drop software to move your data into S3, EMR, SQL databases, DynamoDB, Elasticsearch, Kinesis, FTP servers, Snowflake, and more.
- Easy, one-click installation for a fully functional Apache NiFi instance in AWS in minutes.
Details
Pricing
| Dimension | Cost/hour |
|---|---|
| m4.xlarge (Recommended) | $0.16 |
| t2.2xlarge | $0.32 |
| r4.xlarge | $0.16 |
| t2.large | $0.08 |
| r4.2xlarge | $0.32 |
| m4.2xlarge | $0.32 |
| m4.large | $0.08 |
| t2.xlarge | $0.16 |
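To turn the hourly rates into a rough monthly figure, multiply the rate by the hours the instance runs. The Python sketch below does this for the rates listed in the table. It assumes about 730 hours in an average month (8,760 hours per year divided by 12) and covers only the hourly rates shown here; any charges not listed in the table are out of scope. The function name `monthly_cost` is illustrative.

```python
# Hourly rates copied from the pricing table (USD per hour).
HOURLY_RATE_USD = {
    "m4.xlarge": 0.16,
    "t2.2xlarge": 0.32,
    "r4.xlarge": 0.16,
    "t2.large": 0.08,
    "r4.2xlarge": 0.32,
    "m4.2xlarge": 0.32,
    "m4.large": 0.08,
    "t2.xlarge": 0.16,
}

# Assumption: an average month has about 730 hours (8760 / 12).
HOURS_PER_MONTH = 730


def monthly_cost(instance_type: str, hours: float = HOURS_PER_MONTH) -> float:
    """Estimate the charge in USD for running `instance_type` for `hours`."""
    return round(HOURLY_RATE_USD[instance_type] * hours, 2)


if __name__ == "__main__":
    for name in sorted(HOURLY_RATE_USD):
        print(f"{name}: ${monthly_cost(name):.2f} per month")
```

An instance that runs only part of the month can be estimated by passing its actual running hours as the second argument.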
Vendor refund policy
For refund information, please read the EULA or contact Info@calculatedsystems.com
Delivery details
64-bit (x86) Amazon Machine Image (AMI)
Amazon Machine Image (AMI)
An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.
Version release notes
- Updated NiFi to 2.6.0
Additional details
Usage instructions
A detailed launch guide is available at https://www.calculatedsystems.com/getting-started-aws
For NiFi-specific usage beyond starting the AMI, please see our ebook Apache NiFi for Dummies: https://www.calculatedsystems.com/nifi-for-dummies
Support
Vendor support
Migration, implementation, and support services are available. Info@calculatedsystems.com
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.

Standard contract
Customer reviews
Daily workflows have integrated diverse logs and have delivered flexible data orchestration
What is our primary use case?
I have been using Apache NiFi virtually daily, as it is part of my main responsibility in my current role.
My main use case for Apache NiFi involves integrating various data sources and performing transformations to load them into mostly our NoSQL database, Elasticsearch, but sometimes into other databases as well.
For integrating and transforming data, we receive a lot of logs generated by our AWS services. The company wants to collect them, particularly so that our security team can review them and confirm there is no abnormal behavior. We use Apache NiFi to capture the logs sent to many S3 buckets, decompress them, perform any necessary transformations, and send them to Elasticsearch so that end users, often from the network team or security team, can use Elasticsearch and Kibana for data analysis.
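The decompress-and-forward step described in this review can be sketched outside NiFi in a few lines of Python. This is only an illustration under stated assumptions, not the reviewer's flow: it assumes the logs are gzip-compressed with one JSON object per line, and the helper names `decompress_log` and `to_bulk_body` are hypothetical. The output follows the newline-delimited format that the Elasticsearch bulk API accepts, with one action line followed by one document line.

```python
import gzip
import json
from typing import Dict, List


def decompress_log(payload: bytes) -> List[Dict]:
    """Decompress a gzip payload and parse each non-empty line as JSON."""
    text = gzip.decompress(payload).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]


def to_bulk_body(records: List[Dict], index_name: str) -> str:
    """Build a newline-delimited bulk body: an action line, then the document."""
    lines = []
    for record in records:
        lines.append(json.dumps({"index": {"_index": index_name}}))
        lines.append(json.dumps(record))
    # Every line, including the last, ends with a newline.
    return "".join(line + "\n" for line in lines)


if __name__ == "__main__":
    # Hypothetical sample standing in for one compressed log object.
    sample = gzip.compress(b'{"event": "login"}\n{"event": "logout"}\n')
    print(to_bulk_body(decompress_log(sample), "aws-logs"), end="")
```

In NiFi itself this work would normally be done by processors on the canvas; the sketch only shows the shape of the data at each step.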
My advice for others considering Apache NiFi is that if you are willing to, you can use it on-premises; it offers great customizability. While it is specifically designed for streaming data, it can also accommodate batch data. Moreover, it is useful for various out-of-the-box solutions, including unique uses such as email notifications, showcasing flexibility in data orchestration, ETL, and other applications.
What is most valuable?
Apache NiFi offers great flexibility in whether you want to be a low-code user or a high-code user, especially if you are a Python or Java developer. Recent versions of Apache NiFi added custom-built processors, so you can write your own processors in Python or Java instead of relying only on the large selection of out-of-the-box processors that already do almost anything. If you are willing to put together a complex web of processors, you can do almost any data transformation you want, but the ability to write your own processors has been a huge benefit, both for what Apache NiFi is specifically made to do and for less conventional solutions such as an email notification system. This kind of use existed even before custom processors: you could run scripts, including Python, through the execute script methods. Now it works even better, with the latest version of Python rather than just a Jython type of hybrid. Those are some of the best things that it offers.
The flexibility of Apache NiFi has helped me in my daily work, especially because instead of utilizing a bunch of Apache NiFi processors, which we do use for most of our processes, it can be much easier to combine transformation logic within Python processors since the majority of our team prefers Python programming as our choice of language. This integration allows us to put it all in one place. We can integrate Apache NiFi with our Python processors that we host on a Git repository, which integrates very well, and we can manage the same scripts and make changes efficiently. It is great coming from a Python developer mindset shared amongst the team.
Apache NiFi has positively impacted my organization as it continually improves functionality and throughput with each iteration over the past three years. One of the big tradeoffs with open source is that how well it functions is largely dependent on the user, but that means you can adapt it to whatever custom use case you have. We have been able to consolidate several different authentication methods through just Microsoft, and Apache NiFi has been helpful in facilitating that. Additionally, due to its many ways of extracting data from different sources, we can develop specific solutions ourselves, allowing us to integrate various data sources. Thanks to the open-source customizability, we can adapt Apache NiFi to our built cluster, which has numerous benefits, particularly since we are managing many of our processes. This approach saves us significant costs compared to moving to something more managed or on the cloud, as managing open-source technologies ourselves ultimately reduces expenses.
Regarding cost savings, I do not have a strict idea of how much we have saved since the company was already using Apache NiFi when I joined, but I am certain comparisons have been made against other ETL or data orchestration tools that are popular among different cloud providers such as AWS or Azure. The cost savings must be significant, particularly given that we are handling terabytes and petabytes of data daily and need software that allows this in an affordable manner. It is clear that substantial savings exist, as long as we manage our own clusters and bugs effectively. The tradeoff with managed services is that they handle much of this, ensuring uninterrupted service, but these come at a cost. Conversely, with open-source software, we incur no licensing costs because we handle everything ourselves.
What needs improvement?
I believe Apache NiFi could be improved with easier, out-of-the-box provided monitoring solutions. While Apache NiFi has an API that generates logs, it would be beneficial to have simpler access to that data saved historically. It would assist in easily retrieving data for historical analysis and storing it elsewhere without the hassle of setting up APIs and delving into documentation. Just having a more streamlined approach to collecting this data would be greatly advantageous.
I would suggest continuous improvements regarding the custom developer-built processors, as many times the errors that arise are not useful. We often seem to struggle with a combination of implementing our own error handling or analyzing logs, as the information does not always align or proves unhelpful. Continuous enhancement in this area would be wonderful, so we do not need to decipher which error is more accurate or which report gets us nearer to the actual problem. For instance, I encountered a situation where flow files would not process; they were retried but returned to the queue before the Python processor due to ambiguous errors. It eventually turned out that the issue was the flow files' size being too large for the Python processor, which we only discovered by splitting the flow files, at which point the issue resolved. The initial error did not indicate it was related to memory or size limitations but appeared as a parsing error or something similar.
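The workaround in this example, splitting oversized content into smaller pieces before it reaches the Python step, can be illustrated with a short sketch. The review does not say how the flow files were split, so the line-based splitting and the name `split_lines` below are assumptions made for illustration.

```python
from typing import Iterator


def split_lines(content: str, max_lines: int) -> Iterator[str]:
    """Yield chunks of `content`, each holding at most `max_lines` lines."""
    if max_lines < 1:
        raise ValueError("max_lines must be at least 1")
    # keepends=True preserves line endings, so joining the chunks
    # reproduces the original content exactly.
    lines = content.splitlines(keepends=True)
    for start in range(0, len(lines), max_lines):
        yield "".join(lines[start:start + max_lines])


if __name__ == "__main__":
    for chunk in split_lines("a\nb\nc\n", 2):
        print(repr(chunk))
```

Because the chunks rejoin to the original content, the downstream step sees the same data, only in smaller units.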
For how long have I used the solution?
I have been working in my current field for about three years.
What do I think about the stability of the solution?
Apache NiFi is now more stable than before.
What do I think about the scalability of the solution?
Apache NiFi's scalability is good. You can scale it up as long as you have the machines and servers available. If you have room for more instances, scaling up is fairly straightforward, provided you manage configurations effectively.
How are customer service and support?
Apache NiFi's customer support is good.
How would you rate customer service and support?
What was our ROI?
I have definitely seen a return on investment through time savings. Working with Apache NiFi allows us to manage it more efficiently, transitioning from spending hours or days resolving issues to requiring much less intervention now. Thanks to improvements on both our side in how we run processes and enhancements to Apache NiFi, we have reduced the time commitment to almost not needing to interact with Apache NiFi except for minor queue-clearance tasks, allowing it to run smoothly. At this point, we have certainly saved hundreds of hours.
What other advice do I have?
The customizability of Apache NiFi helps even with unique use cases, as I mentioned before, given that Apache NiFi can be used in this capacity. While there are better applications or software options available, when you are trying to keep it simple and finding ways to utilize a couple of processors for a unique solution, you can do that in Apache NiFi. For example, we have several notification-type pipelines we have built in Apache NiFi, such as reading from a SQL database to identify users who have not completed training and then sending them an email reminder to complete that training. We have that running regularly, week by week. Another instance involves a processing data flow that scans for specific data found in logs, which triggers an email notification to the relevant team letting them know that a unique identifier has appeared, allowing them to handle the situation.
I encountered some odd cases such as increasing concurrent threads on a processor, which should work similarly to copying several processors, yet functional throughput varies. It seems that using a distributed processor yields better throughput than just increasing the concurrent threads on one processor, which has been odd but is a workaround we had to adopt to boost throughput. Resolving such quirks could elevate the rating further.
I rate Apache NiFi an eight out of ten. I choose eight because, as open-source software, there is always room for improvement, but the tradeoff between learning how to use the software and the savings it provides, along with its customizability, ranks it pretty high. It is effective for what it does and continues to improve, so it could score higher if there are significant enhancements in custom-built processors and ongoing improvements in functionality.
Low-code orchestration has bridged on-prem and cloud workflows and now streamlines data sharing
What is our primary use case?
A specific example of a workflow where Apache NiFi plays a key role is when there is an on-premises Hadoop system and a cloud component. Because of the company's firewall policy, services such as Kinesis or DMS cannot integrate directly with on-premises resources to move the data. Apache NiFi works as an orchestrator and a middle tool: it gets the parameters to trigger the job and then hands off to the cloud services. Another use case for Apache NiFi is once the data is created in S3: I can extract a subset of the data and send it over SFTP to outside recipients.
I have another scenario regarding my main use case with Apache NiFi; there is a use case for synthetic data, and we are using synthetic data generative AI software to synthesize data in the cloud environment. Now for users who are not on the cloud and want to access the synthetic data, Apache NiFi is used to pull the data back from the cloud.
What is most valuable?
The drag-and-drop interface of Apache NiFi has helped my team by cutting down on development time, allowing us to focus more on the mapping part and testing part of it.
In terms of features, if a job requires a predetermined mapping for the metadata, Apache NiFi comes in handy to map for the metadata, which is needed before an external file can be validated and pushed onto the cloud for processing. This is better than traditional systems where we would have to sit down and map individual attributes and their data types to create that metadata interface.
Apache NiFi has positively impacted my organization by definitely bridging the gap between the on-premises and cloud interaction until we find a solution to open the firewall for cloud components to directly interact with on-premises services.
What needs improvement?
About needed improvements, I think integration with other tools would really help in the age of AI. Apache NiFi should have APIs or connectors that can connect seamlessly to other external entities, whether in the cloud or on-premises, creating a plug-and-play mechanism.
For how long have I used the solution?
What do I think about the stability of the solution?
To troubleshoot or resolve the Apache NiFi crashes and data retrieval issues, my team primarily replicates the same scenario in development to see where the issue lies. On one occasion, the failure was linked to an authentication mechanism change at the enterprise level.
In my experience, Apache NiFi is stable.
Data ingestion has accelerated and now supports flexible API integration and custom transformations
What is our primary use case?
Apache NiFi is used to orchestrate ingestion processes. For example, Apache NiFi ingests data from external sources such as external databases or external APIs. Custom transformation is then applied, and data is written inside the data lake.
How has it helped my organization?
Apache NiFi speeds up ingestion pipelines development. Ingestion pipelines that usually took a week to develop can now be developed in a couple of days.
What is most valuable?
Apache NiFi has extensive integration capabilities and integrates with many sources. It supports custom transformations, making it a very flexible tool that can be leveraged to perform most computation needs.
For transformation with Apache NiFi, JSONs are processed and denormalized to map information onto different tables. For source integration, the most valuable aspect was the ingestion from external APIs.
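As a rough illustration of denormalizing JSON onto different tables, the Python sketch below turns one nested document into rows for two tables. The document shape (an order with a customer and a list of items), the table names, and the function name `denormalize` are all hypothetical; the review does not describe the actual data.

```python
from typing import Dict, List


def denormalize(order: Dict) -> Dict[str, List[Dict]]:
    """Map one nested order document onto rows for two flat tables."""
    # Parent row: scalar fields plus values pulled up from the nested customer.
    order_row = {
        "order_id": order["id"],
        "customer_name": order["customer"]["name"],
    }
    # Child rows: one per item, each carrying the parent key for joining.
    item_rows = [
        {"order_id": order["id"], "sku": item["sku"], "quantity": item["quantity"]}
        for item in order.get("items", [])
    ]
    return {"orders": [order_row], "order_items": item_rows}


if __name__ == "__main__":
    sample = {
        "id": 7,
        "customer": {"name": "Ada"},
        "items": [{"sku": "A1", "quantity": 2}, {"sku": "B2", "quantity": 1}],
    }
    for table, rows in denormalize(sample).items():
        print(table, rows)
```

Repeating the parent key in every child row is what lets the two tables be joined again after loading.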
What needs improvement?
Apache NiFi is a very good tool, but there is room for improvement.
For how long have I used the solution?
Apache NiFi has been used on different projects for a couple of years.
What other advice do I have?
Apache NiFi should be considered if a scalable and flexible tool is needed for building ETL pipelines and reducing time to production. This review has a rating of 8.
Data workflows have accelerated project delivery and reduced costs for analytics teams
What is our primary use case?
Apache NiFi is used for real-time and batch ingestion on data warehouse platforms. For example, Apache NiFi ingests all analytics from the e-commerce website into the data warehouse in the AWS Redshift database.
How has it helped my organization?
Speeding up projects with Apache NiFi has helped the organization by resulting in cost savings. A 30% reduction in cost was noticed as a specific metric regarding those savings.
What is most valuable?
The best feature of Apache NiFi is the simplicity of the tools because it is a drag-and-drop tool. The simplicity of Apache NiFi's tools helps by speeding up all the implementation process. Apache NiFi is also used to speed up projects in order to gain more projects in less time.
What needs improvement?
Apache NiFi is a good product as it is currently.
For how long have I used the solution?
Apache NiFi has been used for a long time, five years in different projects.
What do I think about the stability of the solution?
Apache NiFi is stable.
What do I think about the scalability of the solution?
The scalability of Apache NiFi is good because it is simple to scale up the resources.
How are customer service and support?
The customer support for Apache NiFi is fine. I would rate the customer support of Apache NiFi a 10 on a scale of 1 to 10.
How would you rate customer service and support?
Positive
Which solution did I use previously and why did I switch?
A custom solution implemented with Python was previously used before switching to Apache NiFi. The decision was made to switch from the custom Python solution to Apache NiFi to simplify all the deployment.
How was the initial setup?
Apache NiFi was purchased through the AWS Marketplace.
What was our ROI?
A return on investment has not been observed, and it is not possible to share these metrics.
What's my experience with pricing, setup cost, and licensing?
The experience with pricing, setup cost, and licensing was fine, as the integration with the AWS Marketplace was very good. The pricing in Italy is considered a little bit high, but the product is worth it.
Which other solutions did I evaluate?
Other options were not evaluated before choosing Apache NiFi.
What other advice do I have?
Apache NiFi receives a rating of 9 out of 10, chosen because of the documentation and the support of the product. The advice for others looking into using Apache NiFi is to test the solution with a POC and then move to production quickly.
Visual workflow offers clarity and boosts data pipeline construction
How would you rate customer service and support?
Negative
