Scaling through distributed training
AWS Summit Online Australia & New Zealand 2021
Machine learning datasets and models continue to grow in size, bringing accuracy improvements in computer vision and natural language processing tasks. As a result, data scientists will increasingly encounter situations where model training cannot fit on a single GPU instance. Distributed training enables scaling beyond the limits of one GPU, through either data parallelisation or model parallelisation. In this session, learn the basic concepts behind distributed training and understand how Amazon SageMaker can help you implement distributed training for your models faster.
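As a conceptual sketch of the data parallelisation idea mentioned above (illustrative only, not SageMaker's actual API; the function names are hypothetical): each worker computes gradients on its own shard of a batch, and the gradients are then averaged, which for equal-sized shards matches a single large-batch gradient step.

```python
# Conceptual sketch of data parallelism: each "worker" computes gradients
# on its shard of the batch, then the gradients are averaged (an all-reduce).
# Function names here are illustrative, not part of any library.

def grad_mse(w, shard):
    """Gradient of mean squared error for the model y = w * x on one shard."""
    n = len(shard)
    return sum(2 * (w * x - y) * x for x, y in shard) / n

def data_parallel_grad(w, batch, num_workers):
    """Split the batch across workers, then average (all-reduce) gradients."""
    shards = [batch[i::num_workers] for i in range(num_workers)]
    grads = [grad_mse(w, s) for s in shards]  # each runs on its own GPU in practice
    return sum(grads) / num_workers

batch = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
w = 0.5
# With equal-sized shards, the averaged gradient equals the full-batch gradient.
print(abs(data_parallel_grad(w, batch, 2) - grad_mse(w, batch)) < 1e-12)
```

Model parallelisation, by contrast, splits the model itself (its layers or parameters) across devices when the model alone is too large for one GPU's memory.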