Microsoft Perform Data Engineering on Microsoft Azure HDInsight

1.

You have an Azure HDInsight cluster.
You need to build a solution to ingest real-time streaming data into a nonrelational distributed database.
What should you use to build the solution?

Apache Hive and Apache Kafka
Spark and Phoenix
Apache Storm and Apache HBase
Apache Pig and Apache HCatalog

2.

You have an Apache Hive table that contains one billion rows.
You plan to use queries that will filter the data by using the WHERE clause. The values of the columns will be known only while the data loads into a Hive table.
You need to decrease the query runtime.
What should you configure?

static partitioning
bucket sampling
parallel execution
dynamic partitioning
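As background for the scenario above, the following is a minimal plain-Python sketch of how partitioning a table lets a WHERE filter skip most of the data. It is not Hive code: the function names, the `country` column, and the sample rows are hypothetical, and the sketch only models the case where partition values are discovered from the data while it loads.

```python
from collections import defaultdict


def load_with_partitions(rows, partition_key):
    """Group rows by the value of partition_key as they are loaded.

    No partition values are declared up front; each new value seen
    during the load creates its own partition.
    """
    partitions = defaultdict(list)
    for row in rows:
        partitions[row[partition_key]].append(row)
    return dict(partitions)


def query(partitions, partition_value, predicate=lambda row: True):
    """Scan only the partition that matches the filter and skip the rest."""
    return [row for row in partitions.get(partition_value, []) if predicate(row)]


# Hypothetical sample data standing in for the large table.
rows = [
    {"country": "US", "amount": 10},
    {"country": "DE", "amount": 25},
    {"country": "US", "amount": 40},
    {"country": "JP", "amount": 5},
]

partitions = load_with_partitions(rows, "country")
us_large = query(partitions, "US", lambda r: r["amount"] > 20)
print(sorted(partitions))
print(us_large)
```

The query touches only the rows stored under the matching partition value, which is the effect a partitioned table aims for when a filter column lines up with the partition column.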

3.

You plan to copy data from Azure Blob storage to an Azure SQL database by using Azure Data Factory.
Which file formats can you use?

binary, JSON, Apache Parquet, and ORC
OXPS, binary, text and JSON
XML, Apache Avro, text, and ORC
text, JSON, Apache Avro, and Apache Parquet

4.

You have an Apache Spark cluster in Azure HDInsight.
You plan to join a large table and a lookup table.
You need to minimize data transfers during the join operation.
What should you do?

Use the reduceByKey function.
Use a Broadcast variable.
Repartition the data.
Use the DISK_ONLY storage level.
Store the lookup table to a disk.
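To make the data-transfer concern in this scenario concrete, here is a small plain-Python sketch of a map-side join in which every partition of the large table is joined against a full local copy of the small lookup table, so rows of the large table never move between partitions. It does not use the Spark API; `broadcast_join` and the sample data are hypothetical. In Spark, shipping a small read-only value to every executor is what `SparkContext.broadcast` is for.

```python
def broadcast_join(large_partitions, lookup):
    """Join each partition of the large table against a local copy of lookup.

    large_partitions: list of partitions, each a list of (key, value) pairs.
    lookup: small dict that is copied to every partition instead of shuffled.
    """
    joined = []
    for partition in large_partitions:
        local_lookup = dict(lookup)  # each worker holds its own copy
        for key, value in partition:
            if key in local_lookup:
                joined.append((key, value, local_lookup[key]))
    return joined


# Hypothetical sample data: two partitions of a large table and a small lookup.
large_partitions = [
    [(1, "a"), (2, "b")],
    [(1, "c"), (3, "d")],
]
lookup = {1: "one", 2: "two"}

result = broadcast_join(large_partitions, lookup)
print(result)
```

Only the small table is copied; the large table is read in place, which is why this pattern suits a large table joined to a lookup table.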

5.

You have an Apache Spark cluster in Azure HDInsight.
You execute the following command.
What is the result of running the command?

the Hive ORC library is imported to Spark and external tables in ORC format are created
the Spark library is imported and the data is loaded to an Apache Hive table
the Hive ORC library is imported to Spark and the ORC-formatted data stored in Apache Hive tables becomes accessible
the Spark library is imported and Scala functions are executed

6.

You use YARN to manage the resources for a Spark Thrift Server running on a Linux-based Apache Spark cluster in Azure HDInsight.
You discover that the cluster does not fully utilize the resources. You want to increase resource allocation.
You need to increase the number of executors and the allocation of memory to the Spark Thrift Server driver.
Which two parameters should you modify? Each correct answer presents part of the solution.
NOTE: Each correct selection is worth one point.

spark.dynamicAllocation.maxExecutors
spark.cores.max
spark.executor.memory
spark_thrift_cmd_opts
spark.executor.instances

7.

You are configuring the Hive views on an Azure HDInsight cluster that is configured to use Kerberos.
You plan to use the YARN logs to troubleshoot a query that runs against Apache Hadoop.
You need to view the method, the service, and the authenticated account used to run the query.
Which method call should you view in the YARN logs?

HQL
WebHDFS
HDFS C* API
Ambari REST API

8.

You have an Azure HDInsight cluster.
You need to store data in a file format that maximizes compression and increases read performance.
Which type of file format should you use?

ORC
Apache Parquet
Apache Avro
Apache Sequence

9.

You have an Apache Hadoop cluster in Azure HDInsight that has a head node and three data nodes. You have a MapReduce job.
You receive a notification that a data node failed.
You need to identify which component caused the failure.
Which tool should you use?

JobTracker
TaskTracker
ResourceManager
ApplicationMaster

10.

You deploy Apache Kafka to an Azure HDInsight cluster.
You plan to load data into a topic that has a specific schema.
You need to load the data while maintaining the existing schema.
Which file format should you use to receive the data?

JSON
Kudu
Apache Sequence
CSV

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:

Answer & Solution

Answer: Option B

Solution:
