Overview
This solution uses a fine-tuned Phi-3.5 small language model (SLM) to identify sensitive information and to support redaction decisions that are grounded in document semantics. The system produces a redacted document and a rule trace that supports privacy-compliance audits and validation by a reviewer. It takes two inputs from the user: the original document that needs sensitive-data redaction and a list of rules that specify what to redact. The document content is chunked into smaller paragraphs, and the rules are reasoned over each paragraph to redact the sensitive information. The solution outputs a final redacted document that combines all the chunks, together with an explanation of the redactions. It helps organizations protect sensitive information and adhere to regulatory frameworks such as GDPR and HIPAA.
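
The chunk, reason, and recombine flow described above can be sketched roughly as follows. This is a minimal illustration, not the vendor's implementation: `toy_detector` is a hypothetical regex stand-in for the fine-tuned SLM, and the names `chunk_paragraphs`, `redact_document`, and `TraceEntry` are assumptions introduced only for this example.

```python
import re
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class TraceEntry:
    """One line of the rule trace: which rule removed which text, and where."""
    chunk_index: int
    rule: str
    matched_text: str


def chunk_paragraphs(text: str) -> List[str]:
    # Assumption: chunks are paragraphs separated by blank lines.
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


# A detector receives (chunk, rule) and returns the exact strings to redact.
Detector = Callable[[str, str], List[str]]


def toy_detector(chunk: str, rule: str) -> List[str]:
    # Hypothetical stand-in for the model: maps a keyword in the
    # natural-language rule to a simple regular expression.
    patterns = {
        "email": r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+",
        "phone": r"\+?\d[\d -]{7,}\d",
    }
    found: List[str] = []
    for keyword, pattern in patterns.items():
        if keyword in rule.lower():
            found.extend(re.findall(pattern, chunk))
    return found


def redact_document(
    text: str, rules: List[str], detect: Detector = toy_detector
) -> Tuple[str, List[TraceEntry]]:
    redacted_chunks: List[str] = []
    trace: List[TraceEntry] = []
    for index, chunk in enumerate(chunk_paragraphs(text)):
        # Every rule is applied to every chunk.
        for rule in rules:
            for match in detect(chunk, rule):
                chunk = chunk.replace(match, "[REDACTED]")
                trace.append(TraceEntry(index, rule, match))
        redacted_chunks.append(chunk)
    # Recombine the chunks into a single redacted document.
    return "\n\n".join(redacted_chunks), trace
```

Calling `redact_document(text, rules)` returns the recombined text and a list of trace entries. Passing a different `detect` callable is where a real model call would plug in.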
Highlights
- A unique and easy-to-use solution for defining, identifying, and redacting any type of sensitive information in a document using a carefully fine-tuned Phi-3.5 model. It protects an organization's sensitive data and helps the organization adhere to regulatory frameworks such as GDPR and HIPAA. Metrics that evaluate redaction performance against privacy attacks are presented, enabling data officers to quantify data privacy.
- With the increasing digitization of personal and corporate communication, automatic sanitization of textual data has become a crucial component of ensuring data privacy and compliance at scale. Our fine-tuned SLM automates redaction and guards against privacy attacks by using rule-reasoning-based text sanitization. The solution requires minimal manual intervention and can handle high-volume data redaction at reduced cost.
- Mphasis DeepInsights is a cloud-based cognitive computing platform that offers data extraction and predictive analytics capabilities. Need customized deep learning and machine learning solutions? Get in touch!
Pricing
| Dimension | Description | Cost |
|---|---|---|
| ml.m5.large Inference (Batch), recommended | Model inference on the ml.m5.large instance type, batch mode | $0.00/host/hour |
| ml.g5.4xlarge Training, recommended | Algorithm training on the ml.g5.4xlarge instance type | $2.00/host/hour |
| ml.m5.xlarge Inference (Batch) | Model inference on the ml.m5.xlarge instance type, batch mode | $0.00/host/hour |
| ml.m5.2xlarge Inference (Batch) | Model inference on the ml.m5.2xlarge instance type, batch mode | $0.00/host/hour |
| ml.m5.4xlarge Inference (Batch) | Model inference on the ml.m5.4xlarge instance type, batch mode | $0.00/host/hour |
| ml.m5.12xlarge Inference (Batch) | Model inference on the ml.m5.12xlarge instance type, batch mode | $0.00/host/hour |
| ml.m5.24xlarge Inference (Batch) | Model inference on the ml.m5.24xlarge instance type, batch mode | $0.00/host/hour |
| ml.m4.xlarge Inference (Batch) | Model inference on the ml.m4.xlarge instance type, batch mode | $0.00/host/hour |
| ml.m4.2xlarge Inference (Batch) | Model inference on the ml.m4.2xlarge instance type, batch mode | $0.00/host/hour |
| ml.m4.4xlarge Inference (Batch) | Model inference on the ml.m4.4xlarge instance type, batch mode | $0.00/host/hour |
Vendor refund policy
We do not have a refund policy.
Delivery details
Amazon SageMaker algorithm
An Amazon SageMaker algorithm is a machine learning model that requires your training data to make predictions. Use the included training algorithm to generate your unique model artifact. Then deploy the model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
Version release notes
v1
Additional details
Inputs
- Summary
  The model input is a zip file that contains two PDFs:
  - The original document that needs to be redacted.
  - The rules file, which contains the natural-language rules according to which sensitive data is redacted.
- Limitations for input type
  - Do not enter more than 3 rules in one run.
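
As a rough sketch of preparing this input, the snippet below packages two PDFs into a zip file and checks the three-rule limit before the rules file is written. The helper names `check_rule_count` and `build_input_zip` are hypothetical, and the listing does not say which file names the algorithm expects inside the archive, so the original file names are kept as an assumption.

```python
import zipfile
from pathlib import Path
from typing import List

MAX_RULES = 3  # limit stated under "Limitations for input type"


def check_rule_count(rules: List[str]) -> None:
    """Raise ValueError if the rule list is empty or exceeds the stated limit."""
    if not rules:
        raise ValueError("at least one rule is required")
    if len(rules) > MAX_RULES:
        raise ValueError(f"{len(rules)} rules given; at most {MAX_RULES} per run")


def build_input_zip(document_pdf: Path, rules_pdf: Path, out_zip: Path) -> Path:
    """Bundle the document PDF and the rules PDF into one zip archive."""
    for path in (document_pdf, rules_pdf):
        if path.suffix.lower() != ".pdf":
            raise ValueError(f"{path.name} is not a PDF")
        if not path.is_file():
            raise FileNotFoundError(path)
    with zipfile.ZipFile(out_zip, "w", zipfile.ZIP_DEFLATED) as archive:
        # Assumption: members are stored flat, under their original names.
        archive.write(document_pdf, arcname=document_pdf.name)
        archive.write(rules_pdf, arcname=rules_pdf.name)
    return out_zip
```

Running `check_rule_count` on the rule list before generating the rules PDF catches an oversized rule set locally, before a batch job is submitted.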
Support
Vendor support
For any assistance reach out to us at:
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.