DP-100T01-A: Designing and implementing a data science solution on Azure
Prepare to pass the DP-100: Designing and Implementing a Data Science Solution on Azure Certification Exam
Course Description
Learn how to operate machine learning solutions at cloud scale using Azure Machine Learning. This course teaches you to leverage your existing knowledge of Python and machine learning to manage data ingestion and preparation, model training and deployment, and machine learning solution monitoring with Azure Machine Learning and MLflow.
Audience Profile
This course is designed for data scientists with existing knowledge of Python and machine learning frameworks like scikit-learn, PyTorch, and TensorFlow, who want to build and operate machine learning solutions in the cloud.
Course Outline
Skills at a glance
Design and prepare a machine learning solution (20–25%)
Explore data, and train models (35–40%)
Prepare a model for deployment (20–25%)
Deploy and retrain a model (10–15%)
Design and prepare a machine learning solution (20–25%)
Design a machine learning solution
Determine the appropriate compute specifications for a training workload
Describe model deployment requirements
Select which development approach to use to build or train a model
Manage an Azure Machine Learning workspace
Create an Azure Machine Learning workspace
Manage a workspace by using developer tools for workspace interaction
Set up Git integration for source control
Create and manage registries
Manage data in an Azure Machine Learning workspace
Select Azure Storage resources
Register and maintain datastores
Create and manage data assets
Manage compute for experiments in Azure Machine Learning
Create compute targets for experiments and training
Select an environment for a machine learning use case
Configure attached compute resources, including Azure Synapse Spark pools and serverless Spark compute
Monitor compute utilization
Explore data, and train models (35–40%)
Explore data by using data assets and datastores
Access and wrangle data during interactive development
Wrangle interactive data with attached Synapse Spark pools and serverless Spark compute
Create models by using the Azure Machine Learning designer
Create a training pipeline
Consume data assets from the designer
Use custom code components in designer
Evaluate the model, including responsible AI guidelines
Use automated machine learning to explore optimal models
Use automated machine learning for tabular data
Use automated machine learning for computer vision
Use automated machine learning for natural language processing
Select and understand training options, including preprocessing and algorithms
Evaluate an automated machine learning run, including responsible AI guidelines
Use notebooks for custom model training
Develop code by using a compute instance
Track model training by using MLflow
Evaluate a model
Train a model by using Python SDK v2
Use the terminal to configure a compute instance
Tune hyperparameters with Azure Machine Learning
Select a sampling method
Define the search space
Define the primary metric
Define early termination options
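To make the four tuning topics above concrete, here is a minimal, standard-library-only sketch of a random-sampling sweep. It is not the Azure Machine Learning API: the names SEARCH_SPACE, sample_config, train_step, and run_sweep are hypothetical, the "training" is a toy function, and the early termination rule is a simplified take on median stopping.

```python
import math
import random
import statistics

# Search space (hypothetical): one sampler per hyperparameter.
SEARCH_SPACE = {
    "learning_rate": lambda rng: 10 ** rng.uniform(-4, -1),   # log-uniform
    "batch_size": lambda rng: rng.choice([16, 32, 64, 128]),  # discrete choice
}


def sample_config(rng):
    """Sampling method: draw each hyperparameter independently at random."""
    return {name: draw(rng) for name, draw in SEARCH_SPACE.items()}


def train_step(config, step):
    """Toy stand-in for a training script; returns a fake validation accuracy."""
    # Quality peaks at learning_rate = 1e-2 and stays within [1/3, 1].
    quality = 1.0 - abs(math.log10(config["learning_rate"]) + 2) / 3
    # Accuracy rises toward `quality` as training progresses.
    return quality * (1 - 0.5 ** (step + 1))


def run_sweep(n_trials=8, n_steps=5, seed=0):
    """Run the sweep and return (best trial, all trials)."""
    rng = random.Random(seed)
    history = []  # per-step metrics of earlier trials
    results = []
    for _ in range(n_trials):
        config = sample_config(rng)
        metrics = []
        stopped_early = False
        for step in range(n_steps):
            metrics.append(train_step(config, step))
            # Early termination (simplified median stopping): stop a trial
            # whose metric is below the median of earlier trials at this step.
            earlier = [m[step] for m in history if len(m) > step]
            if len(earlier) >= 3 and metrics[-1] < statistics.median(earlier):
                stopped_early = True
                break
        history.append(metrics)
        results.append({
            "config": config,
            "accuracy": max(metrics),  # primary metric, to be maximized
            "stopped_early": stopped_early,
        })
    best = max(results, key=lambda r: r["accuracy"])
    return best, results


if __name__ == "__main__":
    best, results = run_sweep()
    print("best config:", best["config"])
    print("trials stopped early:", sum(r["stopped_early"] for r in results))
```

In a real sweep the same four decisions are declared to the service rather than hand-coded, but the roles are the same: the sampler proposes configurations, the primary metric ranks them, and the termination policy cuts unpromising trials short to save compute.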
Prepare a model for deployment (20–25%)
Run model training scripts
Configure job run settings for a script
Configure compute for a job run
Consume data from a data asset in a job
Run a script as a job by using Azure Machine Learning
Use MLflow to log metrics from a job run
Use logs to troubleshoot job run errors
Configure an environment for a job run
Define parameters for a job
Implement training pipelines
Create a pipeline
Pass data between steps in a pipeline
Run and schedule a pipeline
Monitor pipeline runs
Create custom components
Use component-based pipelines
Manage models in Azure Machine Learning
Describe MLflow model output
Identify an appropriate framework to package a model
Assess a model by using responsible AI principles
Deploy and retrain a model (10–15%)
Deploy a model
Configure settings for online deployment
Configure compute for a batch deployment
Deploy a model to an online endpoint
Deploy a model to a batch endpoint
Test an online deployed service
Invoke the batch endpoint to start a batch scoring job
Apply machine learning operations (MLOps) practices
Trigger an Azure Machine Learning job, including from Azure DevOps or GitHub
Automate model retraining based on new data additions or data changes
Define event-based retraining triggers
Duration
4 Days
Prerequisites
None
Level
Intermediate
Role
Data Scientist
Product
Azure