
Overview
This is an Extractive Question Answering model built upon a Text Embedding model from PyTorch Hub. It takes as input a pair of question-context strings and returns a sub-string of the context as the answer to the question. The Text Embedding model, which is pre-trained on English text, returns an embedding of the input question-context pair. PyTorch, the PyTorch logo and any related marks are trademarks of Facebook, Inc.
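For illustration, a minimal local sketch of the same extractive QA task using the Hugging Face transformers pipeline; the model name below is an assumption based on the product title, and the packaged artifact may differ:

```python
# Minimal sketch: extractive QA with a question-context pair.
# The model id is assumed from the product title, not taken from this listing.
from transformers import pipeline

qa = pipeline(
    "question-answering",
    model="bert-large-uncased-whole-word-masking-finetuned-squad",
)

result = qa(
    question="Where is the Eiffel Tower located?",
    context="The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
)
# The answer is a sub-string of the context, with its character span and a score,
# e.g. {'score': ..., 'start': ..., 'end': ..., 'answer': 'Paris, France'}
print(result)
```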
Highlights
- This is an Extractive Question Answering model from PyTorch Hub: https://pytorch.org/hub/huggingface_pytorch-transformers/
Details
Pricing
| Dimension | Description | Cost/host/hour |
|---|---|---|
| ml.g4dn.xlarge Inference (Real-Time), Recommended | Model inference on the ml.g4dn.xlarge instance type, real-time mode | $0.00 |
| ml.p2.xlarge Inference (Batch), Recommended | Model inference on the ml.p2.xlarge instance type, batch mode | $0.00 |
| ml.m5.large Inference (Real-Time) | Model inference on the ml.m5.large instance type, real-time mode | $0.00 |
| ml.m5.xlarge Inference (Real-Time) | Model inference on the ml.m5.xlarge instance type, real-time mode | $0.00 |
| ml.c5.xlarge Inference (Real-Time) | Model inference on the ml.c5.xlarge instance type, real-time mode | $0.00 |
| ml.c5.2xlarge Inference (Real-Time) | Model inference on the ml.c5.2xlarge instance type, real-time mode | $0.00 |
| ml.p2.xlarge Inference (Real-Time) | Model inference on the ml.p2.xlarge instance type, real-time mode | $0.00 |
| ml.p3.2xlarge Inference (Real-Time) | Model inference on the ml.p3.2xlarge instance type, real-time mode | $0.00 |
| ml.m5.large Inference (Batch) | Model inference on the ml.m5.large instance type, batch mode | $0.00 |
| ml.m5.xlarge Inference (Batch) | Model inference on the ml.m5.xlarge instance type, batch mode | $0.00 |
Vendor refund policy
None
Legal
Vendor terms and conditions
Content disclaimer
Delivery details
Amazon SageMaker model
An Amazon SageMaker model package is a pre-trained machine learning model ready to use without additional training. Use the model package to create a model on Amazon SageMaker for real-time inference or batch processing. Amazon SageMaker is a fully managed platform for building, training, and deploying machine learning models at scale.
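As a sketch only: subscribing to the listing yields a model package ARN, which the SageMaker Python SDK can deploy roughly as follows. The ARN and endpoint name below are placeholders, not values from this listing:

```python
# Hedged sketch: deploying a Marketplace model package with the SageMaker Python SDK.
import sagemaker
from sagemaker import ModelPackage

session = sagemaker.Session()
role = sagemaker.get_execution_role()  # assumes running inside SageMaker with an execution role

model = ModelPackage(
    role=role,
    # Placeholder ARN; use the one from your AWS Marketplace subscription.
    model_package_arn="arn:aws:sagemaker:<region>:<account>:model-package/<name>",
    sagemaker_session=session,
)

# Real-time endpoint on one of the instance types from the pricing table above.
predictor = model.deploy(
    initial_instance_count=1,
    instance_type="ml.g4dn.xlarge",
    endpoint_name="extractive-qa-endpoint",  # hypothetical name
)
```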
Version release notes
This GPU version supports running the model on GPU instance types.
Additional details
Inputs
- Summary: The input is a text paragraph and a question.
- Input MIME type: application/list-text
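A hedged invocation sketch for a deployed real-time endpoint follows. The exact payload schema for application/list-text is not specified on this page, so the JSON list of [question, context] strings and the endpoint name are assumptions:

```python
# Hedged sketch: invoking a deployed real-time endpoint with a question-context pair.
import json
import boto3

runtime = boto3.client("sagemaker-runtime")

# Assumed payload shape: a JSON list of the two input strings.
payload = [
    "Where is the Eiffel Tower located?",
    "The Eiffel Tower is a wrought-iron lattice tower in Paris, France.",
]

response = runtime.invoke_endpoint(
    EndpointName="extractive-qa-endpoint",  # hypothetical endpoint name
    ContentType="application/list-text",
    Body=json.dumps(payload),
)
print(response["Body"].read().decode("utf-8"))
```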
Resources
Vendor resources
Support
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
Customer reviews
Review of BERT Large Uncased Whole Word Masking SQuAD
1. **Whole Word Masking:** Unlike models that mask individual subword tokens during pre-training, this model masks whole words. If "running" is masked, for example, its WordPiece tokens "run" and "##ning" are masked together rather than separately, which improves the model's retention of context (see the sketch after this list).
2. **Large Model Capability:** With 24 layers and 340 million parameters, BERT Large Uncased can handle complex language structures and nuances that smaller models miss. This matters for QA tasks, where understanding subtle distinctions can make a big difference.
3. **SQuAD Fine-Tuning:** Fine-tuning on SQuAD 2.0, which includes both answerable and unanswerable questions, gives the model a strong ability to recognize when it does not know the answer. This makes it valuable for real-world applications where it is important to indicate that no answer exists in the context.
4. **Uncased Text Handling:** Because the model is uncased, it treats text in a case-insensitive manner, which simplifies tokenization and often speeds up training with little loss of useful information.
Overall, BERT Large Uncased Whole Word Masking is well suited to QA projects, mainly because it strikes a balance between understanding context and answering real-world questions, making it an attractive choice for both research and practical applications.
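To make the masking behavior in point 1 concrete, here is a hypothetical helper (not the model's actual training code) showing how WordPieces are grouped into whole words before masking:

```python
# Illustrative sketch of whole word masking: if any WordPiece of a word is
# chosen for masking, every piece of that word is masked together.
import random

random.seed(0)  # reproducible demonstration

def whole_word_mask(tokens, mask_prob=0.15, mask_token="[MASK]"):
    # Group WordPiece indices into whole words; a token starting with "##"
    # continues the previous word.
    words, current = [], []
    for i, tok in enumerate(tokens):
        if tok.startswith("##") and current:
            current.append(i)
        else:
            if current:
                words.append(current)
            current = [i]
    if current:
        words.append(current)

    masked = list(tokens)
    for word in words:
        if random.random() < mask_prob:
            for i in word:  # mask every piece of the word as a unit
                masked[i] = mask_token
    return masked

# "running" tokenized as "run" + "##ning", per the review's example
print(whole_word_mask(["the", "dog", "was", "run", "##ning"], mask_prob=0.5))
```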
1. **Computational Demands:** With 340 million parameters, the model requires substantial compute and memory. It can be difficult to run efficiently without high-performance hardware, making it expensive to use in real-time applications.
2. **Latency Problems:** Due to its size, BERT Large can be slow, especially where low-latency responses matter, such as customer support or conversational AI. When speed is paramount, this latency can hinder productivity.
3. **Limited to Fixed-Length Inputs:** BERT has a maximum input length (typically 512 tokens). This is a limitation for long documents, as it forces users to split the input into smaller chunks, which can lose context and hurt accuracy in QA tasks (see the chunking sketch after this list).
4. **Lack of Interpretability:** Like other Transformer models, BERT operates as a black box, so it is difficult to fully understand how it produces a specific answer; in cases where explainability is required, this opacity is a drawback.
5. **Uncased Model Limitations:** While being uncased simplifies processing, it can cause problems where capitalization carries meaning. For example, "Apple" the company vs. "apple" the fruit can sometimes be confused, affecting accuracy in specific cases.
6. **Pre-Trained Knowledge Limit:** Despite SQuAD fine-tuning, BERT has a knowledge cutoff, so it may struggle with questions about recent events or niche topics unless updated with new data, which requires additional resources.
In summary, while BERT Large Uncased Whole Word Masking SQuAD excels in accuracy, these computational and interpretability constraints should be weighed before deployment.
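As a sketch of the chunking workaround mentioned in point 3, assuming the Hugging Face tokenizer for this model: the context is split into overlapping windows so an answer near a chunk boundary is not lost.

```python
# Hedged sketch: working around the 512-token limit with overlapping chunks.
from transformers import AutoTokenizer

# Assumed model id based on the product title; any BERT tokenizer behaves the same way.
tok = AutoTokenizer.from_pretrained("bert-large-uncased-whole-word-masking-finetuned-squad")

question = "What is the capital of France?"
long_context = " ".join(["Paris is the capital and most populous city of France."] * 100)

encoded = tok(
    question,
    long_context,
    max_length=384,
    truncation="only_second",        # split the context, never the question
    stride=128,                      # tokens of overlap between consecutive chunks
    return_overflowing_tokens=True,  # emit one encoding per chunk
)
print(f"Context split into {len(encoded['input_ids'])} overlapping chunks")
```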
1. **Enhanced Information Retrieval:** SQuAD fine-tuning enables the model to extract accurate answers from large datasets or document collections. This makes it easier to quickly access specific information, saving time for users and staff who need quick, reliable answers without sifting through lengthy documents.
2. **Improved User Support:** Integrating BERT into internal customer support or help-desk services reduces the burden on support teams by answering common questions with accurate responses on demand. This speeds up response times and lets teams prioritize critical incidents.
3. **High Precision in QA Tasks:** The model's ability to understand context and distinguish answerable from unanswerable questions ensures accurate answers. This accuracy is valuable in situations such as compliance assessment or knowledge management, where misinterpretation can lead to errors or compliance risks.
4. **Answer Consistency:** Whole word masking helps BERT retain context, improving its consistency when answering similar questions. This ensures that responses to repeated requests are reliable and consistent, which benefits internal teams and users looking for information.
5. **Automation of Routine Queries:** BERT can handle routine or generic queries automatically, reducing the need for human intervention. This frees employees to focus on higher-value tasks, ultimately improving productivity and reducing the cost of manual query handling.