You have an Azure Data Lake Storage account that has a virtual network service endpoint configured.
You plan to use Azure Data Factory to extract data from the Data Lake Storage account. The data will then be loaded to a data warehouse in Azure Synapse Analytics by using PolyBase.
Which authentication method should you use to access Data Lake Storage?
You are monitoring an Azure Stream Analytics job by using metrics in Azure.
You discover that during the last 12 hours, the average watermark delay is consistently greater than the configured late arrival tolerance.
What is a possible cause of this behavior?
Watermark Delay indicates how far the job's processing of the streaming data lags behind real time (the event time of the incoming data).
Several resource constraints can cause the streaming pipeline to slow down. The Watermark Delay metric can rise for the following reasons (a monitoring sketch follows the reference link below):
Not enough processing resources in Stream Analytics to handle the volume of input events. To scale up resources, see Understand and adjust Streaming Units.
Not enough throughput within the input event brokers, so they are throttled. For possible solutions, see Automatically scale up Azure Event Hubs throughput units.
Output sinks are not provisioned with enough capacity, so they are throttled. The possible solutions vary widely based on the flavor of output service being used.
https://docs.microsoft.com/en-us/azure/stream-analytics/stream-analytics-time-handling
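To see the behavior described above, you can pull the Watermark Delay metric for the job over the last 12 hours. The sketch below is a minimal illustration using the azure-monitor-query and azure-identity Python packages; the subscription ID, resource group, job name, and the metric name OutputWatermarkDelaySeconds are assumptions that should be verified against your own job in Azure Monitor.

```python
from datetime import timedelta

from azure.identity import DefaultAzureCredential
from azure.monitor.query import MetricsQueryClient, MetricAggregationType

# Assumed resource path; replace the placeholders with your own values.
resource_uri = (
    "/subscriptions/<subscription-id>/resourceGroups/<resource-group>"
    "/providers/Microsoft.StreamAnalytics/streamingjobs/<job-name>"
)

client = MetricsQueryClient(DefaultAzureCredential())

# "OutputWatermarkDelaySeconds" is the assumed metric name behind the
# "Watermark Delay" chart shown in the portal.
result = client.query_resource(
    resource_uri,
    metric_names=["OutputWatermarkDelaySeconds"],
    timespan=timedelta(hours=12),
    granularity=timedelta(minutes=5),
    aggregations=[MetricAggregationType.MAXIMUM],
)

for metric in result.metrics:
    for series in metric.timeseries:
        for point in series.data:
            if point.maximum is not None:
                print(f"{point.timestamp}: watermark delay {point.maximum:.0f} s")
```

A sustained climb in this metric, rather than a brief spike, is what points to one of the resource constraints listed above.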
Note: This question is part of a series of questions that present the same scenario. Each question in the series contains a unique solution that might meet the stated goals. Some question sets might have more than one correct solution, while others might not have a correct solution.
After you answer a question in this scenario, you will NOT be able to return to it. As a result, these questions will not appear in the review screen.
You have an Azure Storage account that contains 100 GB of files. The files contain text and numerical values. 75% of the rows contain description data that has an average length of 1.1 MB.
You plan to copy the data from the storage account to an Azure SQL data warehouse.
You need to prepare the files to ensure that the data copies quickly.
Solution: You modify the files to ensure that each row is less than 1 MB.
Does this meet the goal?
PolyBase cannot load rows that contain more than 1,000,000 bytes of data, so modifying the files so that each row is less than 1 MB keeps the rows within that limit and allows the copy to complete quickly. Also note that when exporting data into the ORC file format, you might get Java out-of-memory errors when there are large text columns. To work around this limitation, export only a subset of the columns.
https://docs.microsoft.com/en-us/azure/sql-data-warehouse/guidance-for-loading-data
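Because PolyBase rejects rows of more than 1,000,000 bytes, a quick pre-load check of the files can confirm that the modified rows are within the limit. The following is a minimal sketch only; the exported/*.csv path and the pipe delimiter are hypothetical placeholders for whatever the actual files contain.

```python
import csv
import glob

# PolyBase cannot load rows that contain more than 1,000,000 bytes of data,
# so oversized rows must be trimmed or split before the copy.
MAX_ROW_BYTES = 1_000_000

# Allow very large description fields to be parsed at all.
csv.field_size_limit(16_000_000)

for path in glob.glob("exported/*.csv"):            # hypothetical file location
    with open(path, newline="", encoding="utf-8") as handle:
        reader = csv.reader(handle, delimiter="|")  # assumed column delimiter
        for line_number, row in enumerate(reader, start=1):
            row_size = sum(len(value.encode("utf-8")) for value in row)
            if row_size > MAX_ROW_BYTES:
                print(f"{path}:{line_number} is {row_size} bytes; "
                      "shorten the description column before loading")
```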
You are designing an Azure Data Lake Storage solution that will transform raw JSON files for use in an analytical workload.
You need to recommend a format for the transformed files. The solution must meet the following requirements:
Contain information about the data types of each column in the files.
Support querying a subset of columns in the files.
Support read-heavy analytical workloads.
Minimize the file size.
What should you recommend?
Parquet, an open-source file format for Hadoop, stores nested data structures in a flat columnar format.
Compared with a traditional row-oriented storage layout, the Parquet file format is more efficient in terms of both storage and performance.
It is especially good for queries that read particular columns from a "wide" table (one with many columns), since only the needed columns are read and I/O is minimized.
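As a concrete illustration of those properties, the sketch below uses the pyarrow Python library to write a small Parquet file and then read back only a subset of its columns; the column names and file name are made up for the example.

```python
import pyarrow as pa
import pyarrow.parquet as pq

# A small table standing in for the transformed JSON records; the column
# names here are illustrative only.
table = pa.table({
    "id": [1, 2, 3],
    "category": ["a", "b", "a"],
    "amount": [10.5, 7.25, 3.0],
})

# Parquet stores the schema (column names and data types) in the file
# footer and compresses each column independently, which keeps files small.
pq.write_table(table, "transformed.parquet", compression="snappy")

# Column pruning: only the requested columns are read from the file,
# which is what makes read-heavy analytical queries cheap.
subset = pq.read_table("transformed.parquet", columns=["id", "amount"])
print(subset.schema)
```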
You have two Azure Data Factory instances named ADFdev and ADFprod. ADFdev connects to an Azure DevOps Git repository.
You publish changes from the main branch of the Git repository to ADFdev.
You need to deploy the artifacts from ADFdev to ADFprod.
What should you do first?
In Azure Data Factory, continuous integration and delivery (CI/CD) means moving Data Factory pipelines from one environment (development, test, production) to another.
Note:
The following is a guide for setting up an Azure Pipelines release that automates the deployment of a data factory to multiple environments.
In Azure DevOps, open the project that's configured with your data factory.
On the left side of the page, select Pipelines, and then select Releases.
Select New pipeline, or, if you have existing pipelines, select New and then New release pipeline.
In the Stage name box, enter the name of your environment.
Select Add artifact, and then select the Git repository that is configured with your development data factory. For Default branch, select the publish branch of the repository. By default, this publish branch is adf_publish.
Select the Empty job template.
https://docs.microsoft.com/en-us/azure/data-factory/continuous-integration-deployment
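For context on what the release stage ultimately runs: it deploys the ARM template that publishing to ADFdev generates in the adf_publish branch into the production factory's resource group. The sketch below shows an equivalent deployment by using the azure-identity and azure-mgmt-resource Python packages; the subscription ID, resource group, and deployment name are placeholders, and the template file names assume the defaults that Data Factory publishes (ARMTemplateForFactory.json and ARMTemplateParametersForFactory.json).

```python
import json

from azure.identity import DefaultAzureCredential
from azure.mgmt.resource import ResourceManagementClient
from azure.mgmt.resource.resources.models import Deployment, DeploymentProperties

# Placeholder subscription ID; the credential is resolved from the environment.
client = ResourceManagementClient(DefaultAzureCredential(), "<subscription-id>")

# Default file names that Data Factory writes to the adf_publish branch.
with open("ARMTemplateForFactory.json") as template_file, \
        open("ARMTemplateParametersForFactory.json") as parameters_file:
    template = json.load(template_file)
    parameters = json.load(parameters_file)["parameters"]

# Hypothetical resource group and deployment name for the ADFprod factory.
poller = client.deployments.begin_create_or_update(
    "<adfprod-resource-group>",
    "adfprod-release",
    Deployment(
        properties=DeploymentProperties(
            mode="Incremental",
            template=template,
            parameters=parameters,
        )
    ),
)
poller.result()
print("Deployment finished")
```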