IL - Azure Big Data and Analytics Boot Camp
The Azure Big Data and Analytics Boot Camp is designed to give students a clear architectural understanding of how big data patterns are applied in Azure. Students will participate in team-based architectural planning and hands-on implementation sessions. Students will be taught basic Lambda architecture patterns in Azure, leveraging the scalability and elasticity of Azure in Big Data and IoT solutions.
An introduction to data science techniques in Azure will also be covered. Individual case studies will focus on specific real-world problems that represent common big data patterns and practices. Students will also complete several hands-on labs that introduce some of the key services available.
- Duration: 5 Days
- Level: 300
Who this course is designed for
- Data Specialists
- Data Scientists
Course objectives
- Understand the key capabilities of several Azure Data, Storage, Analytics and Intelligence services
- Understand the core storage services including Data Lake Store, Blob Storage, HDFS, Event Hubs and IoT Hub
- Understand core processing services including HDInsight, Stream Analytics, SQL Data Warehouse and Data Lake Analytics
- Understand how to operationalize data pipelines with Data Factory
- Understand common architectures including Lambda and Kappa architectures
- Understand how to manage and secure the data solution
- Prepare for Exam 70-475: Designing and Implementing Big Data Analytics Solutions
MODULE 1: Overview of the Azure Analytics Platform
In this module, students will learn the basics of analytics pipeline terminology and where the Microsoft Azure services fit. This module introduces the Lambda Architecture, which is used as a reference architecture for building an analytics data pipeline.
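As a rough illustration of the Lambda Architecture named above, the sketch below keeps a batch view computed from the full dataset, a speed view computed from recent events only, and a serving step that merges the two. It is a minimal in-memory sketch, not course material: the function names and the event shape (a dict with a `device` key) are assumptions made for this example, and no Azure service is called.

```python
from collections import Counter

def batch_view(master_events):
    """Batch layer: recompute counts from the full, immutable master dataset."""
    return Counter(e["device"] for e in master_events)

def speed_view(recent_events):
    """Speed layer: count only events that arrived after the last batch run."""
    return Counter(e["device"] for e in recent_events)

def serve(batch, speed):
    """Serving layer: merge the batch and speed views to answer a query."""
    return batch + speed

# Hypothetical sample data: events already in batch storage vs. just ingested.
master = [{"device": "a"}, {"device": "a"}, {"device": "b"}]
recent = [{"device": "b"}, {"device": "c"}]

merged = serve(batch_view(master), speed_view(recent))
print(dict(merged))
```

In a real pipeline the batch view would be rebuilt periodically by a batch processing service and the speed view maintained by a stream processor; the merge step is what lets queries see both historical and fresh data.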
MODULE 2: Bulk and relational ingest
In this module, students will be introduced to the various tools and protocols available for loading data from bulk and relational sources into an Azure-based analytics pipeline.
MODULE 3: Ingest storage
In this module, students will be introduced to the Microsoft Azure services that support batch storage of ingested data: Azure Storage Blobs, Data Lake Store and HDFS.
MODULE 4: Batch Processing
In this module, students will be introduced to some of the services offered by Microsoft Azure that support the batch processing of data at scale. Topics include using HDInsight to perform batch processing with MapReduce, Tez and Spark. SQL Data Warehouse is also introduced to support processing of data held in batch storage.
MODULE 5: Interactive Processing & Querying
In this module, students will be introduced to the services that enable lower-latency, interactive querying of big data. Students will learn various options for querying data using SQL. Services covered include Azure SQL Data Warehouse, HDInsight with Spark SQL, HDInsight with HBase/Phoenix, and Data Lake Analytics with U-SQL.
MODULE 6: Real-Time Ingest & Storage
In this module, students will learn about the protocols for real-time ingest, including HTTP, AMQP and MQTT, and the storage of received data using queue-based services including Event Hubs and IoT Hub.
MODULE 7: Real-time Processing
In this module, the student will learn about different services and capabilities of Azure for processing ingested real-time data. Key concepts such as tuple-at-a-time and micro-batch processing are introduced. Services covered include HDInsight with Apache Storm, HDInsight with Storm/Trident, HDInsight with Spark Streaming, WebJobs, Azure Functions, and Stream Analytics.
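The difference between the two processing models mentioned above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the function names are hypothetical, the stream is an in-memory list, and batches are formed by a fixed event count, whereas real micro-batch engines usually cut batches by time interval.

```python
from itertools import islice

def tuple_at_a_time(stream, handle):
    """Call the handler once per event, as soon as the event arrives."""
    return [handle([event]) for event in stream]

def micro_batch(stream, handle, batch_size):
    """Group events into small batches and call the handler once per batch."""
    it = iter(stream)
    results = []
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            break
        results.append(handle(batch))
    return results

# Hypothetical sensor readings; the handler simply sums what it is given.
readings = [3, 5, 2, 8, 1]
per_event = tuple_at_a_time(readings, sum)
per_batch = micro_batch(readings, sum, batch_size=2)
print(per_event, per_batch)
```

Tuple-at-a-time processing gives the lowest per-event latency at the cost of one handler call per event; micro-batching trades a little latency for fewer, larger calls.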
MODULE 8: Intelligence & Machine Learning
In this module, students will learn the fundamentals of machine learning using Azure Machine Learning. Topics covered include ML Studio, Training Experiments, Predictive Experiments, and operationalizing experiments with Web Services and Cortana Intelligence components.
MODULE 9: Data Pipelines
This module will help the student pull all the pieces together into a pipeline managed under a single pane of glass by using Azure Data Factory.
MODULE 10: Security & Governance
In this concluding module, the student will look horizontally across the data pipeline to understand how to secure data at rest and in transit, as well as how to enable governance and discovery with services such as Azure Data Catalog.
3 Day Version
- 40% of the training will be taken online using self-paced videos and hands-on labs.
- 60% of the training will be taken with a live instructor using a virtual training environment and hands-on labs.
5 Day Version
- 100% instructor-led with hands-on labs and architecture design sessions