Amazon AWS Certified Big Data Specialist

Questions: 30. Maximum marks: 30.

A city has been collecting data on its public bicycle share program for the past three years. The 5 PB dataset currently resides on Amazon S3. The data contains the following data points:
Bicycle origination points
Bicycle destination points
Mileage between the points
Number of bicycle slots available at the station (which is variable based on the station location)
Number of slots available and taken at a given time
The program has received additional funds to increase the number of bicycle stations available. All data is regularly archived to Amazon Glacier.
The new bicycle stations must be located to provide the most riders access to bicycles.
How should this task be performed?

A. Move the data from Amazon S3 into Amazon EBS-backed volumes and use an EC2-based Hadoop cluster with spot instances to run a Spark job that performs a stochastic gradient descent optimization.
B. Use the Amazon Redshift COPY command to move the data from Amazon S3 into Redshift and perform a SQL query that outputs the most popular bicycle stations.
C. Persist the data on Amazon S3 and use a transient EMR cluster with spot instances to run a Spark streaming job that will move the data into Amazon Kinesis.
D. Keep the data on Amazon S3 and use an Amazon EMR-based Hadoop cluster with spot instances to run a Spark job that performs a stochastic gradient descent optimization over EMRFS.

A company is centralizing a large number of unencrypted small files from multiple Amazon S3 buckets. The company needs to verify that the files contain the same data after centralization.
Which method meets the requirements?

A. Compare the S3 ETags from the source and destination objects.
B. Call the S3 CompareObjects API for the source and destination objects.
C. Place a HEAD request against the source and destination objects comparing SIG v4.
D. Compare the size of the source and destination objects.
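
As background for the ETag option: for objects uploaded in a single part, either unencrypted or encrypted with SSE-S3, the S3 ETag is the MD5 hex digest of the object data. The sketch below only illustrates that relationship on local bytes. The function names are hypothetical, and it makes no calls to S3; multipart uploads produce ETags with a "-" suffix that are not plain MD5 digests, so the check would not apply to them.

```python
import hashlib


def expected_etag(data: bytes) -> str:
    """MD5 hex digest of the content.

    Assumption: this matches the S3 ETag only for single-part uploads
    that are unencrypted or use SSE-S3.
    """
    return hashlib.md5(data).hexdigest()


def is_multipart_etag(etag: str) -> bool:
    """Multipart ETags look like '<hex>-<part count>' and are not plain MD5."""
    return "-" in etag.strip('"')


def same_content(source_etag: str, dest_etag: str) -> bool:
    """Compare two ETags, ignoring the double quotes S3 wraps around them."""
    return source_etag.strip('"') == dest_etag.strip('"')


# Hypothetical example: the same bytes stored in two places.
payload = b"hello"
source = '"' + expected_etag(payload) + '"'   # as S3 would return it, quoted
dest = expected_etag(payload)                 # unquoted form
print(same_content(source, dest))
```

Because the files in the question are small and unencrypted, they would normally be single-part uploads, which is the case this sketch assumes.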

An administrator receives about 100 files per hour into Amazon S3 and will be loading the files into Amazon Redshift. Customers who analyze the data within Redshift gain significant value when they receive data as quickly as possible. The customers have agreed to a maximum loading interval of 5 minutes.
Which loading approach should the administrator use to meet this objective?

A. Load each file as it arrives because getting data into the cluster as quickly as possible is the priority.
B. Load the cluster as soon as the administrator has the same number of files as nodes in the cluster.
C. Load the cluster when the administrator has an even multiple of files relative to Cluster Slice Count, or 5 minutes, whichever comes first.
D. Load the cluster when the number of files is less than the Cluster Slice Count.
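
For reference, the rule described in option C can be written as a small predicate. Redshift's COPY command loads files in parallel across slices, which is why the file count is compared against the slice count. This is a minimal sketch: the function name, parameter names, and the example slice count are assumptions for illustration, not part of the question.

```python
def should_load(pending_files: int, slice_count: int,
                seconds_waiting: float,
                max_wait_seconds: float = 300.0) -> bool:
    """Return True when a load should start under the option C rule.

    Load when the pending file count is an exact multiple of the slice
    count, or when the agreed maximum interval (5 minutes in the
    question) has elapsed, whichever comes first.
    """
    if pending_files <= 0:
        return False          # nothing to load
    if seconds_waiting >= max_wait_seconds:
        return True           # interval reached: load whatever is pending
    return pending_files % slice_count == 0


# Hypothetical cluster with 4 slices.
print(should_load(pending_files=8, slice_count=4, seconds_waiting=10))
print(should_load(pending_files=5, slice_count=4, seconds_waiting=10))
print(should_load(pending_files=5, slice_count=4, seconds_waiting=300))
```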

An enterprise customer is migrating to Redshift and is considering using dense storage nodes in its Redshift cluster. The customer wants to migrate 50 TB of data. The customer’s query patterns involve performing many joins with thousands of rows.
The customer needs to know how many nodes are needed in its target Redshift cluster. The customer has a limited budget and needs to avoid performing tests unless absolutely needed. Which approach should this customer use?

A. Start with many small nodes.
B. Start with fewer large nodes.
C. Have two separate clusters with a mix of small and large nodes.
D. Insist on performing multiple tests to determine the optimal configuration.

A system needs to collect on-premises application spool files into a persistent storage layer in AWS. Each spool file is 2 KB. The application generates 1 M files per hour. Each source file is automatically deleted from the local server after an hour.
What is the most cost-efficient option to meet these requirements?

A. Write file contents to an Amazon DynamoDB table.
B. Copy files to Amazon S3 Standard Storage.
C. Write file contents to Amazon ElastiCache.
D. Copy files to Amazon S3 Infrequent Access Storage.
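
The workload figures in the question can be turned into rough volumes before comparing options. The sketch below uses only the numbers given (2 KB files, 1 M files per hour) plus one assumption that is labeled in the code: S3 Standard-IA bills small objects at a minimum object size of 128 KB. No prices are included, and the variable names are illustrative.

```python
FILES_PER_HOUR = 1_000_000      # from the question
FILE_SIZE_KB = 2                # from the question
# Assumption: S3 Standard-IA minimum billable object size, in KB.
IA_MIN_BILLABLE_KB = 128

KB_PER_GB = 1_000_000           # decimal units, for a rough estimate

# Sustained write rate the storage layer has to absorb.
writes_per_second = FILES_PER_HOUR / 3600

# Data actually produced per hour.
actual_gb_per_hour = FILES_PER_HOUR * FILE_SIZE_KB / KB_PER_GB

# Data billed per hour if every object is charged at the IA minimum size.
ia_billed_gb_per_hour = (
    FILES_PER_HOUR * max(FILE_SIZE_KB, IA_MIN_BILLABLE_KB) / KB_PER_GB
)

print(f"writes per second: {writes_per_second:.0f}")
print(f"actual data per hour: {actual_gb_per_hour} GB")
print(f"billed at IA minimum size: {ia_billed_gb_per_hour} GB")
```

Under that assumption, each 2 KB object would be billed as if it were 64 times larger, which is the kind of effect to weigh when comparing per-object and per-request storage options for very small files.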

A telecommunications company needs to predict customer churn (i.e., customers who decide to switch to a competitor). The company has historic records of each customer, including monthly consumption patterns, calls to customer service, and whether the customer ultimately quit the service. All of this data is stored in Amazon S3. The company needs to know which customers are likely going to churn soon so that they can win back their loyalty.
What is the optimal approach to meet these requirements?

A. Use the Amazon Machine Learning service to build the binary classification model based on the dataset stored in Amazon S3. The model will be used regularly to predict the churn attribute for existing customers.
B. Use Amazon QuickSight to connect to the data stored in Amazon S3 to obtain the necessary business insight. Plot the churn trend graph to extrapolate churn likelihood for existing customers.
C. Use EMR to run Hive queries to build a profile of a churning customer. Apply the profile to existing customers to determine the likelihood of churn.
D. Use a Redshift cluster to COPY the data from Amazon S3. Create a User Defined Function in Redshift that computes the likelihood of churn.

The department of transportation for a major metropolitan area has placed sensors on roads at key locations around the city. The goal is to analyze the flow of traffic and notifications from emergency services to identify potential issues and to help planners correct trouble spots. A data engineer needs a scalable and fault-tolerant solution that allows planners to respond to issues within 30 seconds of their occurrence.
Which solution should the data engineer choose?

A. Collect the sensor data with Amazon Kinesis Firehose and store it in Amazon Redshift for analysis. Collect emergency services events with Amazon SQS and store them in Amazon DynamoDB for analysis.
B. Collect the sensor data with Amazon SQS and store it in Amazon DynamoDB for analysis. Collect emergency services events with Amazon Kinesis Firehose and store them in Amazon Redshift for analysis.
C. Collect both sensor data and emergency services events with Amazon Kinesis Streams and use DynamoDB for analysis.
D. Collect both sensor data and emergency services events with Amazon Kinesis Firehose and use Amazon Redshift for analysis.

An administrator is processing events in near real-time using Kinesis Streams and Lambda. Lambda intermittently fails to process batches from one of the shards due to a 5-minute time limit.
What is a possible solution for this problem?

A. Add more Lambda functions to improve concurrent batch processing.
B. Reduce the batch size that Lambda is reading from the stream.
C. Ignore and skip events that are older than 5 minutes and send them to a Dead Letter Queue (DLQ).
D. Configure Lambda to read from fewer shards in parallel.
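
The relationship between batch size and the time limit in this question can be sketched with simple arithmetic: a batch finishes in time only if the per-record processing time multiplied by the number of records stays under the limit. The function below is illustrative only. The 5-minute limit comes from the question; the per-record time and the headroom percentage are hypothetical inputs.

```python
def max_batch_size(timeout_ms: int, ms_per_record: int,
                   headroom_percent: int = 20) -> int:
    """Largest batch that should finish inside the time limit.

    Uses integer milliseconds to avoid floating-point rounding.
    headroom_percent reserves part of the limit for startup and
    slow records (an assumed safety margin, not an AWS setting).
    """
    budget_ms = timeout_ms * (100 - headroom_percent) // 100
    # Always allow at least one record per batch.
    return max(1, budget_ms // ms_per_record)


FIVE_MINUTES_MS = 5 * 60 * 1000  # time limit stated in the question

# Hypothetical: each record takes about 500 ms to process.
print(max_batch_size(FIVE_MINUTES_MS, ms_per_record=500))
```

A batch larger than this estimate would be expected to run past the limit and be retried, which matches the intermittent failures on a shard described in the question.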

An organization uses Amazon Elastic MapReduce (EMR) to process a series of extract-transform-load (ETL) steps that run in sequence. The output of each step must be fully processed in subsequent steps but will not be retained.
Which of the following techniques will meet this requirement most efficiently?

A. Use the EMR File System (EMRFS) to store the outputs from each step as objects in Amazon Simple Storage Service (S3).
B. Use the s3n URI to store the data to be processed as objects in Amazon S3.
C. Define the ETL steps as separate AWS Data Pipeline activities.
D. Load the data to be processed into HDFS, and then write the final output to Amazon S3.

An organization uses a custom map reduce application to build monthly reports based on many small data files in an Amazon S3 bucket. The data is submitted from various business units on a frequent but unpredictable schedule. As the dataset continues to grow, it becomes increasingly difficult to process all of the data in one day. The organization has scaled up its Amazon EMR cluster, but other optimizations could improve performance.
The organization needs to improve performance with minimal changes to existing processes and applications.
What action should the organization take?

A. Use Amazon S3 Event Notifications and AWS Lambda to create a quick search file index in DynamoDB.
B. Add Spark to the Amazon EMR cluster and utilize Resilient Distributed Datasets in-memory.
C. Use Amazon S3 Event Notifications and AWS Lambda to index each file into an Amazon Elasticsearch Service cluster.
D. Schedule a daily AWS Data Pipeline process that aggregates content into larger files using S3DistCp.
E. Have business units submit data via Amazon Kinesis Firehose to aggregate data hourly into Amazon S3.
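
Several of the options above concern combining many small files into fewer, larger ones. The idea can be sketched as a greedy grouping by target size. This is not how S3DistCp or Kinesis Firehose is implemented; the function name, file names, and sizes are hypothetical and serve only to illustrate the concept.

```python
def plan_batches(files, target_mb):
    """Group (name, size_mb) pairs, in order, into batches near target_mb.

    A new batch starts whenever adding the next file would push the
    current batch over the target. A single file larger than the
    target gets a batch of its own.
    """
    batches = []
    current, current_size = [], 0
    for name, size_mb in files:
        if current and current_size + size_mb > target_mb:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size_mb
    if current:
        batches.append(current)
    return batches


# Hypothetical small files and their sizes in MB.
small_files = [("a", 40), ("b", 50), ("c", 30), ("d", 100)]
print(plan_batches(small_files, target_mb=100))
```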
