Overview
This is a repackaged open source software product wherein additional charges apply for cloudimg support services.
TensorFlow Serving is Google's flexible, high-performance serving system for machine learning models, designed for production environments. This image delivers TF Serving running under Docker behind an nginx basic-auth gateway, ready to serve TensorFlow SavedModels within minutes of launch.
Application Stack
The official tensorflow/serving CPU image runs as a Docker container managed by docker compose v2 and supervised by systemd. TF Serving exposes two endpoints: gRPC predict on port 8500 and REST predict on port 8501. An nginx reverse proxy on port 80 fronts the REST API and enforces HTTP Basic authentication, so requests that reach the model server through the proxy are always authenticated. Ports 8500 and 8501 are themselves published without authentication and should be restricted in the security group for production use.
Sample Model
Google's canonical half_plus_two SavedModel is bundled at /var/lib/tfserving/models/half_plus_two/1/ so the server has a working model to serve on first boot. Replace it with any TensorFlow SavedModel by dropping a new versioned directory under /var/lib/tfserving/models/.
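As a rough illustration of the versioned-directory layout, the sketch below picks the next numeric version directory under a model folder. It assumes the layout models/<model_name>/<version>/ shown by the bundled path; the helper name next_version_dir is hypothetical and not part of the image. By default TF Serving serves the highest numbered version it finds, but this listing does not say whether the image overrides that policy.

```python
from pathlib import Path


def next_version_dir(model_dir: Path) -> Path:
    """Return the path of the next numeric version directory under model_dir.

    Hypothetical helper: it only inspects directory names and assumes that
    versions are plain integers, as in .../half_plus_two/1/.
    """
    versions = [
        int(entry.name)
        for entry in model_dir.iterdir()
        if entry.is_dir() and entry.name.isdigit()
    ]
    # With no numeric versions present, start at 1.
    return model_dir / str(max(versions, default=0) + 1)
```

A new export could then be copied into place with, for example, shutil.copytree(exported_model_path, next_version_dir(Path("/var/lib/tfserving/models/half_plus_two"))), where exported_model_path is whatever directory your training job wrote.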
Secure First Boot
On first boot a one-shot systemd unit rotates the nginx basic-auth password (per-instance, openssl-generated), writes /etc/nginx/.htpasswd, and saves the credentials and a sample curl command to /root/tensorflow-serving-credentials.txt, readable only by root.
Ready To Use
Docker, the compose plugin, the tensorflow/serving image, the bundled sample model, nginx, the systemd units and the per-instance password are all configured. Browse to http://<instance-public-ip>/v1/models/half_plus_two with the cloudimg user and the per-instance password to confirm the model is AVAILABLE, then POST instances to /v1/models/half_plus_two:predict to score inputs.
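The predict call can also be made from code. The following is a minimal sketch using only the Python standard library; the function names build_predict_request and half_plus_two are illustrative, and the host and password are placeholders you would take from your own instance. The local half_plus_two function simply mirrors what the sample model computes (x / 2 + 2) so that a response can be sanity-checked.

```python
import base64
import json
import urllib.request


def build_predict_request(host, password, instances,
                          model="half_plus_two", user="cloudimg"):
    """Build an authenticated POST request for the REST predict endpoint."""
    url = f"http://{host}/v1/models/{model}:predict"
    # HTTP Basic auth: base64 of "user:password" in the Authorization header.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    body = json.dumps({"instances": instances}).encode()
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Basic {token}",
            "Content-Type": "application/json",
        },
    )


def half_plus_two(x):
    """Local reference for the sample model: half the input, plus two."""
    return x / 2 + 2


# Example (requires a running instance; values are placeholders):
# req = build_predict_request("<instance-public-ip>", "<password>", [1.0, 2.0, 5.0])
# with urllib.request.urlopen(req, timeout=10) as resp:
#     print(json.load(resp))
```

For the inputs 1.0, 2.0 and 5.0 the local reference gives 2.5, 3.0 and 4.5, which is what the "predictions" field of the response should contain for this model.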
cloudimg Support
cloudimg provides 24/7 technical support by email and chat, including help with TensorFlow Serving deployment, model upgrades, gRPC and REST integration, nginx hardening, and TLS termination.
Use Cases
- Low-latency online inference for TensorFlow SavedModels
- Multi-model serving via a single container
- A/B testing model versions
- Edge or regional model hosting with REST or gRPC clients
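For the A/B testing use case, one possible client-side approach is to assign each caller to a model version deterministically and address that version explicitly. The sketch below assumes two versions labelled "1" and "2" are loaded at the same time, which requires a version policy in the TF Serving model configuration that this listing does not describe. The function names and the 10% split are illustrative; the /versions/<n> path segment is part of the TF Serving REST API.

```python
import hashlib


def pick_version(user_id, versions=("1", "2"), share_of_second=0.1):
    """Deterministically assign a caller to one of two model versions.

    The same user_id always maps to the same version, so a caller does not
    flip between variants from one request to the next.
    """
    digest = hashlib.sha256(user_id.encode()).digest()
    # Map the first 4 bytes of the hash to a number in [0, 1).
    bucket = int.from_bytes(digest[:4], "big") / 2**32
    return versions[1] if bucket < share_of_second else versions[0]


def versioned_predict_url(host, version, model="half_plus_two"):
    """Build the REST predict URL that targets one specific model version."""
    return f"http://{host}/v1/models/{model}/versions/{version}:predict"
```

Hashing the caller ID rather than drawing a random number per request keeps assignments stable, which makes per-version metrics easier to compare.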
TensorFlow and the TensorFlow logo are trademarks of Google LLC. All other product and company names are trademarks or registered trademarks of their respective holders. Use of them does not imply any affiliation with or endorsement by them.
Highlights
- TensorFlow Serving preinstalled and ready, with the canonical half_plus_two sample SavedModel and no manual setup required
- Hardened first boot rotates the nginx basic-auth password on every instance and stores credentials in a root-only file
- Around-the-clock technical support from cloudimg, with expert assistance for TF Serving deployment, upgrades, model swaps and nginx hardening
Details
Pricing
| Dimension | Description | Cost/hour |
|---|---|---|
| c5.xlarge (Recommended) | c5.xlarge | $0.08 |
| t2.micro | t2.micro instance type | $0.04 |
| t3.micro | t3.micro instance type | $0.04 |
| d3.4xlarge | d3.4xlarge instance type | $0.24 |
| t3.small | t3.small instance type | $0.04 |
| m5ad.8xlarge | m5ad.8xlarge instance type | $0.24 |
| c8ine.8xlarge | c8ine.8xlarge instance type | $0.24 |
| g6.4xlarge | g6.4xlarge instance type | $0.24 |
| r5a.16xlarge | r5a.16xlarge instance type | $0.24 |
| m5a.xlarge | m5a.xlarge instance type | $0.12 |
Vendor refund policy
Refunds available on request.
Delivery details
64-bit (x86) Amazon Machine Image (AMI)
Amazon Machine Image (AMI)
An AMI is a virtual image that provides the information required to launch an instance. Amazon EC2 (Elastic Compute Cloud) instances are virtual servers on which you can run your applications and workloads, offering varying combinations of CPU, memory, storage, and networking resources. You can launch as many instances from as many different AMIs as you need.
Version release notes
Initial release of TensorFlow Serving 2 with nginx basic-auth gateway and bundled half_plus_two sample SavedModel on a dedicated 20 GiB model volume.
Additional details
Usage instructions
Connect via SSH on port 22 as the default login user for your operating system variant (the user guide lists it per variant). The basic-auth-gated REST API is served on port 80 at the /v1/ prefix.

Retrieve the generated credentials with:
sudo cat /root/tensorflow-serving-credentials.txt

Health check:
curl -u cloudimg:<password> http://<instance-public-ip>/v1/models/half_plus_two

Predict:
curl -u cloudimg:<password> -d '{"instances":[1.0,2.0,5.0]}' -H 'Content-Type: application/json' http://<instance-public-ip>/v1/models/half_plus_two:predict

The raw TF Serving REST port 8501 and gRPC port 8500 are also published but unauthenticated; restrict or remove those security group rules in production.
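A script that automates the health check needs to read the state out of the status response. The sketch below parses a response body of the shape TF Serving's model status endpoint returns; the SAMPLE_STATUS value is an illustrative body written for this example rather than output captured from this image, and available_versions is a hypothetical helper name.

```python
import json

# Illustrative status body in the shape returned by GET /v1/models/<name>.
SAMPLE_STATUS = """
{
  "model_version_status": [
    {
      "version": "1",
      "state": "AVAILABLE",
      "status": {"error_code": "OK", "error_message": ""}
    }
  ]
}
"""


def available_versions(status_body: str) -> list[str]:
    """Return the version labels whose state is AVAILABLE."""
    payload = json.loads(status_body)
    return [
        entry["version"]
        for entry in payload.get("model_version_status", [])
        if entry.get("state") == "AVAILABLE"
    ]


if __name__ == "__main__":
    print(available_versions(SAMPLE_STATUS))
```

An empty list means no version is ready to serve yet, which a deployment script could treat as a signal to wait and retry.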
Resources
Vendor resources
Support
Vendor support
cloudimg provides 24/7 technical support for this product by email and live chat. Our engineers help with deployment, configuration, updates, performance tuning and troubleshooting; critical issues receive a response within one hour on average. Contact support@cloudimg.co.uk.
AWS infrastructure support
AWS Support is a one-on-one, fast-response support channel that is staffed 24x7x365 with experienced and technical support engineers. The service helps customers of all sizes and technical abilities to successfully utilize the products and features provided by Amazon Web Services.