
Amazon SageMaker is a fully managed platform that enables developers and data scientists to quickly and easily build, train, and deploy machine learning models at any scale. With Amazon SageMaker, all the barriers and complexity that typically slow down developers who want to use machine learning are removed. The service includes modules that can be used together or independently to build, train, and deploy your machine learning models.
Product Overview
This algorithm produces similarity scores for a document or a line of text compared to documents in a corpus. The algorithm includes a tf-idf text featurizer to create n-gram features describing the text. It then uses the library scipy.spatial.distance to compute the cosine distance between the new document and each one in the corpus based on all n-gram features in the texts. The similarity index is then computed as (1 - cosine_distance). You can use this algorithm to look for similar texts and detect plagiarism in documents.
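The pipeline described above (n-gram tf-idf features, cosine distance, similarity as 1 - cosine_distance) can be illustrated with a minimal sketch in plain Python. This is not the vendor's implementation: the whitespace tokenization, the 1-to-2 word n-gram range, the smoothed idf formula, fitting idf on the corpus plus the new document, and all function names (ngrams, tfidf_vectors, cosine_distance, similarity_scores) are assumptions made for illustration. The cosine_distance helper follows the standard definition, 1 - (u·v)/(|u||v|), which is the quantity scipy.spatial.distance.cosine computes for dense vectors.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercased word n-grams of length 1..n_max (assumed featurization)."""
    words = text.lower().split()
    return [" ".join(words[i:i + n])
            for n in range(1, n_max + 1)
            for i in range(len(words) - n + 1)]


def tfidf_vectors(docs, n_max=2):
    """Return one sparse tf-idf vector (a dict of term -> weight) per document."""
    counts = [Counter(ngrams(d, n_max)) for d in docs]
    # Document frequency: number of documents containing each term.
    df = Counter(term for c in counts for term in c)
    n_docs = len(docs)
    # Smoothed idf; this particular weighting is an assumption.
    idf = {t: math.log((1 + n_docs) / (1 + d)) + 1 for t, d in df.items()}
    return [{t: tf * idf[t] for t, tf in c.items()} for c in counts]


def cosine_distance(u, v):
    """Cosine distance between two sparse vectors: 1 - (u.v) / (|u| |v|)."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        # Assumed convention for empty texts: treat them as maximally distant.
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def similarity_scores(query, corpus, n_max=2):
    """Similarity index (1 - cosine_distance) of query against each corpus text."""
    # Assumption: idf is computed over the corpus plus the new document.
    vectors = tfidf_vectors(list(corpus) + [query], n_max)
    query_vec = vectors[-1]
    return [1.0 - cosine_distance(query_vec, v) for v in vectors[:-1]]
```

Calling similarity_scores("the cat sat on the mat", corpus) returns one score per corpus document. A score near 1 indicates near-identical n-gram content, which is the signal a plagiarism check would look for, and a score of 0 indicates no shared n-grams.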
Key Data
Version | |
By | TIBCO Software, Inc. |
Categories | |
Type | Algorithm |
Fulfillment Methods | Amazon SageMaker |
Usage Information
Fulfillment Methods
Amazon SageMaker
Additional Resources
End User License Agreement
By subscribing to this product, you agree to the terms and conditions outlined in the product End User License Agreement (EULA).
Support Information
AWS Infrastructure
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.
Refund Policy
NA