DP-100T01-A: Designing and implementing a data science solution on Azure

Prepare to pass the DP-100: Designing and Implementing a Data Science Solution on Azure Certification Exam

Course Description

Learn how to operate machine learning solutions at cloud scale using Azure Machine Learning. This course teaches you to leverage your existing knowledge of Python and machine learning to manage data ingestion and preparation, model training and deployment, and machine learning solution monitoring with Azure Machine Learning and MLflow.

Audience Profile

This course is designed for data scientists with existing knowledge of Python and machine learning frameworks such as scikit-learn, PyTorch, and TensorFlow who want to build and operate machine learning solutions in the cloud.

Course Outline

Skills at a glance

Design and prepare a machine learning solution (20–25%)
Explore data and train models (35–40%)
Prepare a model for deployment (20–25%)
Deploy and retrain a model (10–15%)

Design and prepare a machine learning solution (20–25%)

Design a machine learning solution

Determine the appropriate compute specifications for a training workload
Describe model deployment requirements
Select which development approach to use to build or train a model

Manage an Azure Machine Learning workspace

Create an Azure Machine Learning workspace
Manage a workspace by using developer tools for workspace interaction
Set up Git integration for source control
Create and manage registries

Manage data in an Azure Machine Learning workspace

Select Azure Storage resources
Register and maintain datastores
Create and manage data assets

Manage compute for experiments in Azure Machine Learning

Create compute targets for experiments and training
Select an environment for a machine learning use case
Configure attached compute resources, including Azure Synapse Spark pools and serverless Spark compute
Monitor compute utilization

Explore data and train models (35–40%)

Explore data by using data assets and datastores

Access and wrangle data during interactive development
Wrangle data interactively with attached Synapse Spark pools and serverless Spark compute

Create models by using the Azure Machine Learning designer

Create a training pipeline
Consume data assets from the designer
Use custom code components in the designer
Evaluate the model, including responsible AI guidelines

Use automated machine learning to explore optimal models

Use automated machine learning for tabular data
Use automated machine learning for computer vision
Use automated machine learning for natural language processing
Select and understand training options, including preprocessing and algorithms
Evaluate an automated machine learning run, including responsible AI guidelines

Use notebooks for custom model training

Develop code by using a compute instance
Track model training by using MLflow
Evaluate a model
Train a model by using the Python SDK v2
Use the terminal to configure a compute instance

Tune hyperparameters with Azure Machine Learning

Select a sampling method
Define the search space
Define the primary metric
Define early termination options
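
The four tuning decisions listed above can be sketched in plain Python to show how they fit together. This is a minimal illustration only and does not use the Azure Machine Learning SDK. The search space, the `train_one_epoch` stand-in, and the simplified median-style stopping rule are all assumptions made for the example, not Azure's implementation.

```python
import math
import random
import statistics

# Search space (hypothetical names and ranges, for illustration only).
SEARCH_SPACE = {
    "learning_rate": ("loguniform", 1e-4, 1e-1),
    "batch_size": ("choice", [16, 32, 64]),
}


def sample_config(rng):
    """Sampling method: random sampling, one draw per hyperparameter."""
    config = {}
    for name, spec in SEARCH_SPACE.items():
        if spec[0] == "choice":
            config[name] = rng.choice(spec[1])
        else:  # log-uniform draw between spec[1] and spec[2]
            low, high = math.log(spec[1]), math.log(spec[2])
            config[name] = math.exp(rng.uniform(low, high))
    return config


def train_one_epoch(config, epoch):
    """Stand-in for real training: returns a made-up validation accuracy.

    The toy score peaks when learning_rate is 0.01 and rises with each epoch.
    """
    penalty = abs(math.log10(config["learning_rate"]) + 2) / 4
    return max(0.0, min(1.0, (1 - penalty) * (1 - 0.5 ** (epoch + 1))))


def run_sweep(n_trials=8, n_epochs=5, seed=0):
    """Run trials one after another and return (best_trial, all_trials)."""
    rng = random.Random(seed)
    history = []  # per-trial list of accuracies, one entry per finished epoch
    results = []
    for _ in range(n_trials):
        config = sample_config(rng)
        metrics = []
        stopped_early = False
        for epoch in range(n_epochs):
            metrics.append(train_one_epoch(config, epoch))
            # Early termination (simplified median-style rule): stop a trial
            # whose accuracy is below the median of earlier trials at the
            # same epoch, once at least three earlier trials reached it.
            peers = [h[epoch] for h in history if len(h) > epoch]
            if len(peers) >= 3 and metrics[-1] < statistics.median(peers):
                stopped_early = True
                break
        history.append(metrics)
        results.append(
            {
                "config": config,
                "accuracy": metrics[-1],  # primary metric: last reported value
                "stopped_early": stopped_early,
            }
        )
    # Primary metric goal: maximize accuracy.
    best = max(results, key=lambda r: r["accuracy"])
    return best, results


if __name__ == "__main__":
    best, results = run_sweep()
    print("best config:", best["config"])
    print("trials stopped early:", sum(r["stopped_early"] for r in results))
```

In Azure Machine Learning the same choices are expressed declaratively when a sweep job is configured, so the loop above is only a mental model of what the service does on your behalf.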

Prepare a model for deployment (20–25%)

Run model training scripts

Configure job run settings for a script
Configure compute for a job run
Consume data from a data asset in a job
Run a script as a job by using Azure Machine Learning
Use MLflow to log metrics from a job run
Use logs to troubleshoot job run errors
Configure an environment for a job run
Define parameters for a job

Implement training pipelines

Create a pipeline
Pass data between steps in a pipeline
Run and schedule a pipeline
Monitor pipeline runs
Create custom components
Use component-based pipelines

Manage models in Azure Machine Learning

Describe MLflow model output
Identify an appropriate framework to package a model
Assess a model by using responsible AI principles

Deploy and retrain a model (10–15%)

Deploy a model

Configure settings for online deployment
Configure compute for a batch deployment
Deploy a model to an online endpoint
Deploy a model to a batch endpoint
Test an online deployed service
Invoke the batch endpoint to start a batch scoring job

Apply machine learning operations (MLOps) practices

Trigger an Azure Machine Learning job, including from Azure DevOps or GitHub
Automate model retraining based on new data additions or data changes
Define event-based retraining triggers