
TensorFlow BatchNorm issue but otherwise good

  • By Everett Berry
  • on 11/15/2017

This is a great instance for the CUDA versions and configuration, and once I fixed the issue below my training was very fast. HOWEVER, you should be very careful using TensorFlow on this instance. It is a Frankenstein's monster of bleeding-edge TensorFlow (1.4-rc0) plus some PRs that have not even been merged to master, pulled in to take advantage of the Voltas and CUDA 9.

My issue was:
'AttributeError: can't set attribute' while using the BatchNormalization layer in TensorFlow. It relates to this PR (https://github.com/tensorflow/tensorflow/pull/13388), where a 'dtype' argument is added to BatchNorm to allow for FP16 and FP32 operations. There is an extra line in the TensorFlow included in this AMI, in /usr/local/lib/python2.7/dist-packages/tensorflow/python/layers/normalization.py on line 145: 'self.dtype = dtype'. It causes the error above when using the normal BatchNorm API. Commenting this line out fixes the problem.
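In case it helps anyone triage, here is a minimal sketch of the kind of code that trips it. The exact graph doesn't matter, since any construction of the BatchNorm layer runs the constructor containing the offending assignment; the shape and variable names here are just placeholders I made up:

    import tensorflow as tf

    # Any use of the BatchNorm layer API constructs the layer, which on
    # this AMI executes the extra `self.dtype = dtype` line added in
    # normalization.py.
    x = tf.placeholder(tf.float32, shape=[None, 64])
    y = tf.layers.batch_normalization(x, training=True)
    # Raises: AttributeError: can't set attribute
    # (`dtype` appears to be a read-only property on the base Layer
    # class, so the plain attribute assignment fails.)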

Weirdly, this assignment on line 145 is not included in the PR (although the dates and authors match), so I think there must have been a rebase or something. Regardless, the line exists in the TensorFlow in this AMI and will cause you pain on almost any neural network, because almost all of them use BatchNorm. I couldn't figure out where to post this, because the code on GitHub does not have the problem.

Other than that, this is a fine AMI, and I'm grateful to AWS for providing it and for their continued advances in GPUs.

