Scaling through distributed training

Machine learning data sets and models continue to grow in size, bringing accuracy improvements in computer vision and natural language processing tasks. As a result, data scientists increasingly encounter situations where model training no longer fits on a single GPU instance. Distributed training enables scaling beyond the limits of one GPU, either through data parallelisation or model parallelisation. In this session, learn the basic concepts behind distributed training and understand how Amazon SageMaker can help you implement distributed training for your models faster.
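
To make the data-parallel idea concrete, the sketch below shows one way to launch a data-parallel training job with the SageMaker Python SDK's PyTorch estimator and the SageMaker distributed data parallel library. It is a minimal, illustrative configuration, not material from the session: the entry point script, IAM role, S3 path, and framework versions are placeholder assumptions you would replace with your own.

from sagemaker.pytorch import PyTorch

# Data-parallel training: each GPU works on a different shard of the
# data and gradients are synchronised across instances.
estimator = PyTorch(
    entry_point="train.py",                               # placeholder training script
    role="arn:aws:iam::111122223333:role/SageMakerRole",  # placeholder IAM role
    instance_count=2,                                     # scale out by adding instances
    instance_type="ml.p3.16xlarge",                       # multi-GPU instance type
    framework_version="1.8.1",                            # example version
    py_version="py36",
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
)

# Launch the job against training data in S3 (placeholder bucket and prefix).
estimator.fit({"training": "s3://your-bucket/your-prefix/"})

Model parallelisation follows the same launch pattern but splits the model itself across GPUs; in the SDK this is configured through the distribution parameter as well, with partitioning options specific to the model parallel library.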