- Version v0.6.3
- By MK1
The MK1 Flywheel runtime offers an optimized endpoint for running real-time inference on large language models with superior throughput and low latency, even on single-GPU configurations. For those planning larger-scale deployments, we provide personalized support and information to help scale your...
Algorithm - Fulfilled on Amazon SageMaker