Processing Big Data with Hadoop, Spark, and other frameworks in Amazon EMR (Level 300)

Amazon EMR provides a managed Hadoop framework that makes it easy, fast, and cost-effective to process vast amounts of data. You can also run other popular distributed frameworks such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi, and Presto. EMR Notebooks, based on the popular Jupyter Notebook, provide a development and collaboration environment for ad hoc querying and exploratory analysis. In this session, learn how data pipelines are evolving to become more elastic and scalable, and how Amazon EMR securely and reliably handles a broad set of big data use cases, including data transformations (ETL) and machine learning.

Speaker: Aneesh Chandra PN, Big Data Solutions Architect, AWS