AWS for Industries
Collaborative AI Model Training with Rhino Federated Computing on AWS
Introduction
Across the healthcare and life sciences (HCLS) industry, organizations are generating large volumes of data with the potential to improve patient care. However, transforming this data into meaningful insights can be challenging for many institutions. Privacy regulations such as the Health Insurance Portability and Accountability Act (HIPAA) and the EU General Data Protection Regulation (GDPR), together with requirements for data sovereignty and provenance, necessitate careful coordination of data sharing between organizations to ensure security and compliance. Even within a single institution, limited datasets can restrict the development of effective AI models, particularly for rare conditions that demand extensive, diverse data for accurate predictions.
Federated computing (FC) offers an innovative solution to these challenges. This approach combines federated learning and federated statistics, allowing organizations to perform computations on their local data while sharing only aggregated results, eliminating the need to centralize sensitive data. It supports both simple operations (e.g., cross-site summary statistics) and advanced analytics (e.g., training AI models on distributed and private data). By keeping raw data within its original environment, FC addresses key concerns around data privacy, security, and sovereignty. It ensures full control and ownership of sensitive information while enabling compliant and secure data utilization across organizational boundaries.
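To make the idea of federated statistics concrete, here is a minimal sketch of a cross-site mean and standard deviation computed from per-site aggregates only. It is an illustration under our own assumptions, not Rhino FCP code: the names `SiteSummary`, `summarize_locally`, and `combine` are hypothetical, and a real deployment would add privacy controls on top of the aggregates.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass(frozen=True)
class SiteSummary:
    """Aggregates a site shares instead of its row-level values (hypothetical type)."""
    count: int
    total: float
    sum_of_squares: float


def summarize_locally(values: list[float]) -> SiteSummary:
    """Runs inside each site's environment; the raw values are never returned."""
    return SiteSummary(
        count=len(values),
        total=sum(values),
        sum_of_squares=sum(v * v for v in values),
    )


def combine(summaries: list[SiteSummary]) -> tuple[float, float]:
    """Runs at the coordinator, which sees only the per-site aggregates."""
    n = sum(s.count for s in summaries)
    if n < 2:
        raise ValueError("need at least two observations across sites")
    mean = sum(s.total for s in summaries) / n
    # Sample variance from pooled sums: (sum(x^2) - n * mean^2) / (n - 1)
    pooled_squares = sum(s.sum_of_squares for s in summaries)
    variance = (pooled_squares - n * mean * mean) / (n - 1)
    return mean, sqrt(max(variance, 0.0))


# Two sites with made-up measurements; only the summaries cross the boundary.
site_a = summarize_locally([4.0, 6.0, 8.0])
site_b = summarize_locally([10.0, 12.0])
pooled_mean, pooled_std = combine([site_a, site_b])
```

The coordinator obtains the same mean and standard deviation it would get from the pooled data, while each site discloses only three numbers.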
In this post, we introduce the Rhino Federated Computing Platform (Rhino FCP) and show how it integrates with AWS to mitigate data collaboration challenges in HCLS. We also explore how organizations can accelerate research and AI model development through collaborative workflows while maintaining strict security standards.
Rhino Federated Computing Platform
Overview
Rhino FCP is a platform for cross-organizational data collaboration, with a focus on HCLS data privacy. Initially centered on federated learning, the platform has expanded to include broader FC capabilities. It supports both cloud and on-premises environments, offering end-to-end data management workflows for multimodal data, customizable privacy controls, and deployment of custom code and third-party applications. The platform is used by healthcare and biopharma organizations worldwide, including AbbVie, Amgen, AstraZeneca, Johnson & Johnson, and UCB as part of the FAITE Consortium, as well as by Eli Lilly’s TuneLab.
Rhino FCP Core Architecture
Figure 1 illustrates the core components of the Rhino FCP architecture. Users interact with the system via a Web UI, Python SDK, or API. The Rhino Orchestration layer, reached over HTTPS, coordinates distributed extract, transform, and load (ETL) jobs, annotation processes, and federated learning runs without direct access to raw data.
The Rhino Client runs in a virtual machine (VM) within each data custodian’s account, with deployment automated through Terraform scripts. It connects to designated storage buckets containing authorized data. Operating within the data custodian’s network boundary, the Rhino Client handles local storage and computation tasks while maintaining communication with the Rhino Orchestration layer. This architecture keeps row-level data within the data custodian’s environment during AI model training and analysis.
Figure 1: An overview of Rhino Federated Computing Platform architecture.
Security
The platform encrypts data at rest, in transit, and during processing, using AWS Key Management Service (AWS KMS) for key management and supporting confidential VMs that run workloads inside a trusted execution environment (TEE). It uses frameworks like NVIDIA FLARE for AI model training and validation, and applies privacy filters to outputs. It also includes tools for experiment management, model evaluation, and fairness analysis across distributed datasets.
Federated Data Discovery
Rhino FCP facilitates collaborative projects among connected partners. Within these shared projects, participants establish role-based permissions that govern data access, data subject privacy, and code execution rights. Once permissions are set, authorized users can deploy code on accessible datasets. They can analyze multiple datasets, build cohorts, pre-process data, train and fine-tune models, and deploy visualization tools. Participating partners can approve and implement new operational capabilities as their needs evolve.
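As a rough illustration of how role-based permissions can gate code execution in a shared project, consider the sketch below. It is not the Rhino FCP permission model or SDK: the `Permission` values, the `Project` class, and the role names are all hypothetical and chosen only to show the check itself.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class Permission(Enum):
    # Hypothetical permission names, for illustration only.
    VIEW_DATASET_METADATA = auto()
    RUN_CODE = auto()
    EXPORT_AGGREGATES = auto()


@dataclass
class Project:
    # Maps a role name to the permissions the project participants agreed on.
    role_permissions: dict[str, set[Permission]] = field(default_factory=dict)
    # Maps a user to the role they hold in this project.
    user_roles: dict[str, str] = field(default_factory=dict)

    def is_allowed(self, user: str, permission: Permission) -> bool:
        """Deny by default: unknown users and unknown roles get no permissions."""
        role = self.user_roles.get(user)
        return permission in self.role_permissions.get(role, set())


project = Project(
    role_permissions={
        "data_scientist": {Permission.VIEW_DATASET_METADATA, Permission.RUN_CODE},
        "reviewer": {Permission.VIEW_DATASET_METADATA},
    },
    user_roles={"alice": "data_scientist", "bob": "reviewer"},
)

can_alice_run = project.is_allowed("alice", Permission.RUN_CODE)
can_bob_run = project.is_allowed("bob", Permission.RUN_CODE)
```

The deny-by-default lookup means a user who was never assigned a role in the project cannot run code against any dataset.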
Key capabilities
In addition to federated data discovery, analytics, and AI, Rhino FCP supports these core capabilities:
Streamlined data harmonization and preparation: Uses generative AI to automatically anonymize and transform local data into common data models like Fast Healthcare Interoperability Resources (FHIR) and Observational Medical Outcomes Partnership (OMOP), with human-in-the-loop validation.
LLM-based interface: Offers a large language model (LLM)-based interface for executing complex workflows while preserving security and privacy. It guides users through cohort building, model configuration, and input validation, keeping context across steps. By translating natural language into queries and structured outputs, it reduces the technical expertise required. It enforces role-based permissions and privacy rules, enabling federated execution without exposing patient data.
Interactive containers and visualization: Provides secure, interactive sessions for data exploration, annotation, and visualization. This includes tools like Jupyter notebooks and DICOM viewers, with all activities logged and data remaining on-site.
Security and governance: Implements security standards with encryption (at rest, in transit, and during processing), role-based access control, and audit logging. The platform is HIPAA, GDPR, ISO 27001, and SOC 2 Type II compliant and can integrate with external monitoring tools.
Rhino Federated Computing Platform on AWS
As shown in Figure 2, the Rhino FCP architecture on AWS consists of an orchestration node that manages secure federated workflows across multiple client nodes. Users interact with the system through either the Rhino Web UI or SDK, connecting via HTTPS to the Rhino Orchestrator. The orchestration layer leverages key AWS services, including AWS KMS for cryptographic key management, Amazon Bedrock for domain-specific AI workflows including data harmonization and compliance checks, and Amazon Q for natural language querying of datasets and pipeline metadata.
Figure 2: Architecture of Rhino Federated Computing Platform on AWS.
Each client node runs on an Amazon Elastic Compute Cloud (Amazon EC2) instance with a Rhino Agent that interfaces with local data buckets via SFTP and manages local model training through Federated Learning (FL) clients. Communication between the orchestration and client nodes is secured with the Transport Layer Security (TLS) protocol, and only aggregated statistics and updated model weights are exchanged. The entire platform is monitored and audited through HIPAA-eligible services, including AWS CloudTrail, Amazon CloudWatch, and AWS Config.
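The exchange of updated model weights can be pictured with a small aggregation sketch. This post does not say which aggregation rule Rhino FCP applies, so the example assumes sample-weighted federated averaging (FedAvg), a common choice in federated learning. The function name `federated_average` and the weight layout (a dict of flat float lists) are hypothetical simplifications.

```python
def federated_average(
    client_weights: list[dict[str, list[float]]],
    client_sample_counts: list[int],
) -> dict[str, list[float]]:
    """Combine client model weights, weighting each client by its sample count.

    Assumes every client reports the same parameter names and shapes.
    """
    if not client_weights or len(client_weights) != len(client_sample_counts):
        raise ValueError("need one sample count per client update")
    total = sum(client_sample_counts)
    if total <= 0:
        raise ValueError("total sample count must be positive")

    averaged: dict[str, list[float]] = {}
    for name, reference in client_weights[0].items():
        averaged[name] = [
            sum(
                weights[name][i] * count
                for weights, count in zip(client_weights, client_sample_counts)
            )
            / total
            for i in range(len(reference))
        ]
    return averaged


# Two client updates with made-up values; no training data is involved here.
updates = [
    {"layer.weight": [1.0, 2.0], "layer.bias": [0.0]},
    {"layer.weight": [3.0, 6.0], "layer.bias": [1.0]},
]
global_weights = federated_average(updates, client_sample_counts=[100, 300])
```

Weighting by sample count lets a site with more local data influence the global model proportionally, while the server only ever receives weight vectors and counts.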
HCLS use cases
FC enables data analysis across organizations when centralization is not feasible due to privacy or cost concerns. In healthcare, providers implement FC for AI model development across hospital networks, analyzing patient journeys between institutions, and facilitating research collaborations while protecting sensitive data. Biopharma companies use FC to fine-tune AI models with biotech partners, manage real-world data from multiple sources, and enable remote annotation of proprietary datasets. Public health organizations leverage FC for disease surveillance across health systems and to create hybrid data repositories that allow secure access to distributed datasets while maintaining local control.
Federated AI for breast cancer research with the European Network of AI Excellence Centres (ELISE)
ELISE (European Network of AI Excellence Centres) is a pan-European initiative that brings together top AI research hubs, industry partners, and talent-mobility programs to drive trustworthy, multidisciplinary AI innovation across academia and business. Rhino’s work with ELISE demonstrates how Rhino FC on AWS accelerates diagnostic innovation in breast cancer by enabling secure, privacy-preserving AI collaboration across continents. Facing the dual challenges of limited data diversity and strict data protection regulations, the consortium, which included Rhino FC, Emory University in the United States, and two medical centers in Israel (Assuta Medical Center and the I-Medata AI Center at Sourasky Medical Center), used federated learning and edge computing for AI-driven pathology. The use of public training datasets at Emory, combined with validation on real-world institutional datasets in Tel Aviv, illustrates the platform’s ability to harmonize data and models without disclosing patient-level information. The project ensured compliance with HIPAA in the United States and Israeli privacy laws governing medical data. Rhino’s federated architecture is also designed to support compliance with GDPR and other international privacy regimes when projects extend into European jurisdictions.
Rhino FC’s architecture ensured that sensitive pathology image data remained securely within institutional firewalls. Emory developed and pre-trained a deep-learning model to detect mitotic features using the publicly available MIDOG++ dataset, achieving a mean average precision of 0.82 at a 0.5 threshold, which represented a new benchmark in mitosis detection. This Emory-trained model was validated on 500 annotated breast cancer cases from the Israeli partners without any transfer of raw image data. This demonstrated the model’s generalizability and clinical efficacy across diverse datasets and geographic boundaries.
Beyond technical performance, this project delivered tangible benefits in workflow efficiency, trust, and regulatory alignment. The federated approach enabled partners to combine their complementary strengths, including advanced algorithm development at Emory and high-quality institutional data at Israeli medical centers, while avoiding the complexity and legal risks associated with centralized datasets. It also showcased the power of FC to rapidly transform fragmented, siloed data into a robust shared asset for AI-driven diagnostics.
According to Dr. German Corredor Prada of Emory, this collaboration allowed him to validate and improve the generalizability of algorithms by securely leveraging diverse data from across the world while maximizing privacy.
Eli Lilly TuneLab
Eli Lilly recently announced the launch of Lilly TuneLab, an artificial intelligence and machine learning (AI/ML) platform that runs on Rhino FCP on AWS and provides biotech companies access to drug discovery models trained on over $1 billion worth of Lilly research data. TuneLab enables selected partners to tap into Lilly’s internal AI capabilities through a privacy-preserving federated learning approach, allowing companies to use high-performing models without exposing their own proprietary data or directly accessing Lilly’s.
“With TuneLab, we are offering a unique ‘give/get’ value proposition,” said Aliza Apple, Vice President of Lilly Catalyze360 AI/ML and Global Head of TuneLab. “We give biotech companies access to Lilly’s machine learning models trained on decades of proprietary drug discovery data. In return, we get valuable training data contributions from each participant through federated learning. This approach allows the models to continuously improve with everyone’s data while keeping all information private — a powerful way to accelerate discovery together.”
Conclusion
Rhino FCP on AWS enables collaborative AI model development in HCLS through its integration of FC capabilities with AWS’ security and scalability features. The platform enables organizations to unlock the value of distributed datasets while maintaining data privacy and regulatory compliance. This approach has been validated through implementations like the ELISE cross-continental breast cancer research initiative and Eli Lilly’s TuneLab drug discovery platform. As HCLS data continues to grow in volume and complexity across distributed environments, FC provides the framework for collaborative innovation within regulatory boundaries.
For additional technical documentation and implementation details, visit the AWS Marketplace product page or contact an AWS representative.

