Scaling through distributed training

Machine learning data sets and models continue to increase in size, bringing accuracy improvements in computer vision and natural language processing tasks. As a result, data scientists will increasingly encounter situations where a model or its training workload cannot fit on a single GPU instance. Distributed training enables scale beyond the limitations of one GPU, either through data parallelisation or model parallelisation. In this session, learn the basic concepts behind distributed training and understand how Amazon SageMaker can help you implement distributed training for your models faster.
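As a rough illustration of the data-parallel approach mentioned above, the sketch below shows how a SageMaker training job might be configured with the SageMaker Python SDK, enabling the SageMaker distributed data parallel library so the same training script runs across multiple GPU instances. This is a minimal, hedged example: the script name, IAM role, S3 path, instance type, and framework versions are placeholder assumptions, not values from this session.

```python
# Minimal sketch (assumptions: a train.py script adapted for data parallelism,
# an existing IAM role, and an instance type supported by SageMaker's
# distributed data parallel library, e.g. ml.p4d.24xlarge).
from sagemaker.pytorch import PyTorch

estimator = PyTorch(
    entry_point="train.py",                       # hypothetical training script
    role="arn:aws:iam::111122223333:role/SageMakerRole",  # placeholder IAM role
    framework_version="1.13",                     # example PyTorch version
    py_version="py39",
    instance_count=2,                             # scale out across 2 instances
    instance_type="ml.p4d.24xlarge",
    # Enable SageMaker's distributed data parallel library:
    distribution={"smdistributed": {"dataparallel": {"enabled": True}}},
)

# Launch the distributed training job; the channel name and S3 URI are illustrative.
estimator.fit({"training": "s3://example-bucket/train-data"})
```

Model parallelisation follows a similar pattern but partitions the model itself across devices instead of replicating it, which is typically configured through a different `distribution` setting and additional changes in the training script.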