Posted On: Oct 11, 2021

We are excited to announce event dataset storage for Amazon Fraud Detector. The new capability enables customers to easily send and store their production fraud data directly within Amazon Fraud Detector. Customers can use their event datasets to train machine learning (ML) models with higher predictive performance, since the models can apply historical context to new events by automatically calculating values such as account age and purchase frequency. Customers can also move faster by retraining models without needing to upload a new training dataset to Amazon S3, and they can close the feedback loop from offline fraud investigations by updating the fraud labels for stored events.

Prior to this launch, customers could only train models on data stored in Amazon S3. To retrain a model, customers had to manually update their dataset, upload the latest version to S3, and then point Amazon Fraud Detector to it. These data preparation steps made model retraining time-consuming, increasing the chances that a model could go “stale”.

Using the newly launched event datasets, customers can upload their historical event data directly into Amazon Fraud Detector for training models. The event dataset is also automatically updated with each new prediction, so there is no need to upload a new dataset for each model retraining. Event dataset metrics, such as the number of events and the size of the dataset, are updated automatically and can also be refreshed on demand. Customers can update event labels (e.g., fraud, legitimate) based on offline reviews to close the ML feedback loop. With their event dataset stored in Amazon Fraud Detector, customers can now train a new model or retrain an existing model in even fewer clicks.
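
As a rough illustration of closing the feedback loop, the sketch below prepares a label update for a stored event and passes it to the UpdateEventLabel operation through the AWS SDK for Python (boto3). The event ID, the event type name sample_transaction, and the label name fraud are hypothetical placeholders, not values from this announcement, and the helper functions are illustrative rather than part of any AWS library.

```python
from datetime import datetime, timezone


def build_label_update(event_id, event_type_name, label, reviewed_at):
    """Build keyword arguments for an UpdateEventLabel request.

    reviewed_at should be a timezone-aware datetime; it is converted to
    UTC and formatted as an ISO 8601 timestamp.
    """
    return {
        "eventId": event_id,
        "eventTypeName": event_type_name,
        "assignedLabel": label,
        "labelTimestamp": reviewed_at.astimezone(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%SZ"
        ),
    }


def apply_label_update(client, request):
    """Send the label update using a Fraud Detector client."""
    return client.update_event_label(**request)


# Hypothetical outcome of an offline investigation (all values are assumptions).
request = build_label_update(
    event_id="txn-0001",
    event_type_name="sample_transaction",
    label="fraud",
    reviewed_at=datetime(2021, 10, 11, 12, 0, tzinfo=timezone.utc),
)

# With AWS credentials configured, the request could be sent like this:
#   import boto3
#   apply_label_update(boto3.client("frauddetector"), request)
```

Keeping request construction separate from the API call makes it possible to check the payload without AWS credentials.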

To get started, create a new event type or select an existing one, and then navigate to the ‘Stored events’ tab in the Fraud Detector console. In this tab, you can enable real-time event storage for predictions. To store historical data, you can upload a CSV file of event data or use the new SendEvent API to stream the events to Amazon Fraud Detector. Once you have a stored dataset, you can quickly train or retrain model versions by selecting ‘stored events’ as your model training data source. Event data storage costs $0.10 per GB per month and is available in the US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Ireland), Asia Pacific (Singapore), and Asia Pacific (Sydney) Regions. For additional details about event data storage, see our documentation.
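
For readers who prefer the API route, here is a minimal sketch of streaming historical events with SendEvent through the AWS SDK for Python (boto3). It assumes an event type named sample_transaction, an entity type named customer, and two event variables (ip_address and order_price); all of these, along with the record layout and helper functions, are hypothetical and would need to match your own event type definition.

```python
from datetime import datetime, timezone


def to_utc_iso(ts):
    """Format a timezone-aware datetime as an ISO 8601 UTC timestamp."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def build_send_event(record, event_type_name, entity_type):
    """Build keyword arguments for one SendEvent request."""
    request = {
        "eventId": record["event_id"],
        "eventTypeName": event_type_name,
        "eventTimestamp": to_utc_iso(record["timestamp"]),
        # Event variable values are passed as strings.
        "eventVariables": {k: str(v) for k, v in record["variables"].items()},
        "entities": [
            {"entityType": entity_type, "entityId": record["entity_id"]}
        ],
    }
    # A label and its timestamp are included only when a label is known.
    if record.get("label") is not None:
        request["assignedLabel"] = record["label"]
        request["labelTimestamp"] = to_utc_iso(
            record.get("label_timestamp", record["timestamp"])
        )
    return request


def stream_events(client, records, event_type_name, entity_type):
    """Send each record with SendEvent and return how many were sent."""
    sent = 0
    for record in records:
        client.send_event(
            **build_send_event(record, event_type_name, entity_type)
        )
        sent += 1
    return sent


# Hypothetical historical record (all values are assumptions).
records = [
    {
        "event_id": "txn-0001",
        "entity_id": "customer-42",
        "timestamp": datetime(2021, 9, 30, 8, 15, tzinfo=timezone.utc),
        "variables": {"ip_address": "192.0.2.10", "order_price": 129.99},
        "label": "legitimate",
    }
]

# With AWS credentials configured, the records could be streamed like this:
#   import boto3
#   stream_events(boto3.client("frauddetector"), records,
#                 "sample_transaction", "customer")
```

For large backfills, uploading a CSV file may be more practical than sending one request per event; the sketch only shows the shape of a single SendEvent call.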