Secure Data and Analytics with Talend Data Fabric and Stitch Using AWS PrivateLink
By Tamara Astakhova, Sr. Partner Solution Architect – AWS
By Jean-Claude Kuo, Principal Product Manager Cloud Security – Talend
By Darragh O Flanagan, Partner Solution Architect – AWS
The most valuable data for many companies is often the most sensitive and most regulated, such as electronic health records, financial data, or Personally Identifiable Information (PII).
Too often, cybersecurity risk considerations inhibit companies from unlocking the full value of their data by adopting modern software-as-a-service (SaaS)-based solutions. This often locks customers into on-premises infrastructure or requires them to buy, develop, or manage solutions that are not core to their businesses.
Removing these obstacles is a key enabler for customers, allowing them to focus on core operations and opening the door to consumption-based pricing and new licensing models delivered by SaaS vendors.
Talend Data Fabric combines data integration, data integrity, and data governance in a single, unified platform that makes it easy to collect, transform, clean, govern, and share your data. Talend Stitch is a fully managed, scalable service that helps replicate data into a cloud data warehouse and quickly access analytics to make better, faster decisions.
Integration with AWS PrivateLink helps protect communication between a customer’s Amazon Virtual Private Cloud (Amazon VPC) and Talend Cloud as well as Talend Stitch. This support allows customers to send data without using the public internet, so neither their own services nor the services in Talend Cloud and Talend Stitch are exposed publicly. This also significantly reduces their exposure risk.
In this post, we’ll describe how integrating Talend Data Fabric and Talend Stitch with AWS PrivateLink can help organizations accelerate digital transformation while complying with strict security and regulatory requirements, ensuring the data never enters the public internet.
The SaaS Public Internet Challenge
Talend Data Fabric offers flexible deployment models: SaaS, hybrid, or on-premises. AWS PrivateLink provides private connectivity between Amazon Simple Storage Service (Amazon S3) and on-premises resources using private IPs from your virtual network.
By supporting AWS PrivateLink, Talend offers a path for organizations willing to adopt or expand their use cases with Talend in a hybrid or SaaS model, while helping them meet strict security and regulatory compliance requirements by keeping their data safe inside their trusted perimeter.
The diagram in Figure 1 represents a typical hybrid architecture from a trusted corporate perimeter perspective. Put simply, the boundary is defined by the non-exposure of endpoints to the public internet, which is deemed at risk, and it is protected by security controls such as firewalls.
However, with external SaaS applications that require deep integration into your organization’s information system, many challenges remain. Internet-facing applications are treated as an exception in such a security model, leading to additional complexity, higher cost, and a longer time to value.
That’s the SaaS public internet challenge.
Figure 1 – No-internet hybrid architecture.
Talend Integration with AWS PrivateLink
Because of the unique Talend hybrid architecture, the runtime (Talend Remote Engine) can be deployed in a customer’s preferred location, whether on premises or in the customer’s VPC, closest to where the data resides.
With support for AWS PrivateLink, it’s no longer necessary to open public internet-facing outbound ports and connections to pair Talend Remote Engine with the Talend Cloud control plane.
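One way to sanity-check this from the machine that hosts the engine is to confirm that the control plane hostname resolves only to private addresses. The following is a minimal sketch using the Python standard library. The function names are illustrative, and the hostname to test is an assumption that depends on your own endpoint and DNS setup; it is not a documented Talend or AWS value.

```python
import ipaddress
import socket


def resolve_addresses(hostname, port=443):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    return sorted({info[4][0] for info in infos})


def all_private(addresses):
    """True only if the list is non-empty and every address is private."""
    if not addresses:
        return False
    # Drop any IPv6 scope suffix (for example "%eth0") before parsing.
    return all(
        ipaddress.ip_address(addr.split("%")[0]).is_private
        for addr in addresses
    )
```

For example, `all_private(resolve_addresses(host))`, where `host` is the name your VPC endpoint is reachable under, should return `True` when traffic stays on private addresses. A `False` result suggests the name still resolves to a public address.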
Figure 2 – Talend Data Fabric with AWS PrivateLink integration.
In the above illustration, Talend Remote Engine can ingest data from Amazon S3 or an on-premises database, and then apply transformation and data quality rules before storing the cleansed, trusted data in a cloud data warehouse destination such as Amazon Redshift or Snowflake.
In this journey, data and metadata flow through the AWS secure and private network. This prevents data and service endpoints from being exposed to the unpredictable public internet, from source location to destination storage.
The following diagram shows the common use case for establishing private connectivity for data flow with Talend Stitch and AWS PrivateLink.
Figure 3 – Talend Stitch with AWS PrivateLink integration.
In both use cases, this prevents data and service endpoints from being exposed to the unpredictable public internet, from source location to destination storage.
Accelerating Time to Value
Since AWS PrivateLink operates at the network level, there are no significant changes to the end user experience. Talend Data Fabric through AWS PrivateLink works transparently for end users, including Talend applications and API endpoints.
Deploying AWS PrivateLink is an order of magnitude faster than setting up VPC peering or a site-to-site VPN, because there are no overlapping network ranges to resolve. This reduces dependencies on infrastructure teams.
Finally, you can get faster clearance from security and compliance teams thanks to the combination of private network isolation and Talend’s IP access control, which ensures no one can access a tenant from outside your private network.
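To illustrate the network range point: VPC peering cannot be established between VPCs whose CIDR blocks overlap, so address plans have to be compared before peering is even possible. A minimal sketch of that check with the Python standard library follows. The network names and CIDR ranges are made-up examples, not values from any real deployment.

```python
import ipaddress


def overlapping_pairs(cidrs):
    """Return (name_a, name_b) pairs whose CIDR ranges overlap.

    `cidrs` maps a network name to its CIDR block as a string.
    """
    nets = {name: ipaddress.ip_network(cidr) for name, cidr in cidrs.items()}
    names = sorted(nets)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if nets[a].overlaps(nets[b])
    ]


# Hypothetical address plan used only for illustration.
example_plan = {
    "customer-vpc": "10.0.0.0/16",
    "vendor-vpc": "10.0.128.0/20",
    "on-prem": "192.168.0.0/24",
}
conflicts = overlapping_pairs(example_plan)
```

Any pair reported here would block a peering connection until one side is renumbered. With AWS PrivateLink the consumer reaches the service through an endpoint inside its own VPC, so this kind of overlap is not an obstacle.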
For more information about setting up the IP allowlist policy to restrict user access, see the Talend Cloud User Guide.
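Conceptually, an IP allowlist admits a request only when its source address falls inside one of the configured ranges. The sketch below shows that idea with the Python standard library. It is an illustration of the concept only, not Talend’s implementation; the function name and the sample ranges are assumptions.

```python
import ipaddress


def is_allowed(client_ip, allowlist):
    """True if client_ip falls inside any CIDR range in allowlist."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in ipaddress.ip_network(cidr) for cidr in allowlist)


# Hypothetical allowlist covering a private corporate range.
corporate_ranges = ["10.20.0.0/16"]
inside = is_allowed("10.20.4.7", corporate_ranges)
outside = is_allowed("203.0.113.9", corporate_ranges)
```

Here the address from the private range is accepted while the public one is rejected, which mirrors the goal of restricting tenant access to your own network.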
By taking the security efforts out of working with data, organizations don’t have to compromise between data protection and delivering trusted business outcomes.
Talend Remote Engine is available for purchase through AWS Marketplace. Please speak to your Talend account representative for custom purchase options through AWS Marketplace Private Offer. For any additional information, please contact your Talend business partner.
Talend – AWS Partner Spotlight
Talend is an AWS Competency Partner that provides a data integration platform enabling companies to accelerate migrations to cloud data lakes and warehouses on AWS.