Download Professional-Machine-Learning-Engineer Dumps (2022) - Free PDF Exam Demo [Q24-Q49]

Share

Download Professional-Machine-Learning-Engineer Dumps (2022) - Free PDF Exam Demo

Enhance your career with Professional-Machine-Learning-Engineer PDF Dumps - True Google Exam Questions

NEW QUESTION 24
A retail chain has been ingesting purchasing records from its network of 20,000 stores to Amazon S3 using Amazon Kinesis Data Firehose. To support training an improved machine learning model, training records will require new but simple transformations, and some attributes will be combined. The model needs to be retrained daily.
Given the large number of stores and the legacy data ingestion, which change will require the LEAST amount of development effort?

  • A. Spin up a fleet of Amazon EC2 instances with the transformation logic, have them transform the data records accumulating on Amazon S3, and output the transformed records to Amazon S3.
  • B. Deploy an Amazon EMR cluster running Apache Spark with the transformation logic, and have the cluster run each day on the accumulating records in Amazon S3, outputting new/transformed records to Amazon S3.
  • C. Insert an Amazon Kinesis Data Analytics stream downstream of the Kinesis Data Firehose stream that transforms raw record attributes into simple transformed values using SQL.
  • D. Require that the stores to switch to capturing their data locally on AWS Storage Gateway for loading into Amazon S3, then use AWS Glue to do the transformation.

Answer: C

 

NEW QUESTION 25
Your team is building an application for a global bank that will be used by millions of customers. You built a forecasting model that predicts customers1 account balances 3 days in the future. Your team will use the results in a new feature that will notify users when their account balance is likely to drop below $25. How should you serve your predictions?

  • A. 1. Create a Pub/Sub topic for each user
    2. Deploy an application on the App Engine standard environment that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold
  • B. 1 Build a notification system on Firebase
    2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when your model predicts that a user's account balance will drop below the $25 threshold
  • C. 1. Build a notification system on Firebase
    2. Register each user with a user ID on the Firebase Cloud Messaging server, which sends a notification when the average of all account balance predictions drops below the $25 threshold
  • D. 1. Create a Pub/Sub topic for each user
    2 Deploy a Cloud Function that sends a notification when your model predicts that a user's account balance will drop below the $25 threshold.

Answer: D

 

NEW QUESTION 26
Your data science team needs to rapidly experiment with various features, model architectures, and hyperparameters. They need to track the accuracy metrics for various experiments and use an API to query the metrics over time. What should they use to track and report their experiments while minimizing manual effort?

  • A. Use Al Platform Training to execute the experiments Write the accuracy metrics to BigQuery, and query the results using the BigQueryAPI.
  • B. Use Al Platform Notebooks to execute the experiments. Collect the results in a shared Google Sheets file, and query the results using the Google Sheets API
  • C. Use Kubeflow Pipelines to execute the experiments Export the metrics file, and query the results using the Kubeflow Pipelines API.
  • D. Use Al Platform Training to execute the experiments Write the accuracy metrics to Cloud Monitoring, and query the results using the Monitoring API.

Answer: C

Explanation:
https://codelabs.developers.google.com/codelabs/cloud-kubeflow-pipelines-gis Kubeflow Pipelines (KFP) helps solve these issues by providing a way to deploy robust, repeatable machine learning pipelines along with monitoring, auditing, version tracking, and reproducibility. Cloud AI Pipelines makes it easy to set up a KFP installation.
https://www.kubeflow.org/docs/components/pipelines/introduction/#what-is-kubeflow-pipelines
"Kubeflow Pipelines supports the export of scalar metrics. You can write a list of metrics to a local file to describe the performance of the model. The pipeline agent uploads the local file as your run-time metrics. You can view the uploaded metrics as a visualization in the Runs page for a particular experiment in the Kubeflow Pipelines UI." https://www.kubeflow.org/docs/components/pipelines/sdk/pipelines-metrics/

 

NEW QUESTION 27
You developed an ML model with Al Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?

  • A. Switch to the tensorflow-model-server-universal version of TensorFlow Serving
  • B. Recompile TensorFlow Serving using the source to support CPU-specific optimizations Instruct GKE to choose an appropriate baseline minimum CPU platform for serving nodes
  • C. Significantly increase the max_enqueued_batches TensorFlow Serving parameter
  • D. Significantly increase the max_batch_size TensorFlow Serving parameter

Answer: D

 

NEW QUESTION 28
A Data Scientist wants to gain real-time insights into a data stream of GZIP files.
Which solution would allow the use of SQL to query the stream with the LEAST latency?

  • A. Amazon Kinesis Data Analytics with an AWS Lambda function to transform the data.
  • B. An Amazon Kinesis Client Library to transform the data and save it to an Amazon ES cluster.
  • C. Amazon Kinesis Data Firehose to transform the data and put it into an Amazon S3 bucket.
  • D. AWS Glue with a custom ETL script to transform the data.

Answer: A

Explanation:
Explanation/Reference: https://aws.amazon.com/big-data/real-time-analytics-featured-partners/

 

NEW QUESTION 29
A large mobile network operating company is building a machine learning model to predict customers who are likely to unsubscribe from the service. The company plans to offer an incentive for these customers as the cost of churn is far greater than the cost of the incentive.
The model produces the following confusion matrix after evaluating on a test dataset of 100 customers:

Based on the model evaluation results, why is this a viable model for production?

  • A. The precision of the model is 86%, which is less than the accuracy of the model.
  • B. The model is 86% accurate and the cost incurred by the company as a result of false negatives is less than the false positives.
  • C. The precision of the model is 86%, which is greater than the accuracy of the model.
  • D. The model is 86% accurate and the cost incurred by the company as a result of false positives is less than the false negatives.

Answer: B

 

NEW QUESTION 30
You are working on a Neural Network-based project. The dataset provided to you has columns with different ranges. While preparing the data for model training, you discover that gradient optimization is having difficulty moving weights to a good solution. What should you do?

  • A. Improve the data cleaning step by removing features with missing values.
  • B. Change the partitioning step to reduce the dimension of the test set and have a larger training set.
  • C. Use the representation transformation (normalization) technique.
  • D. Use feature construction to combine the strongest features.

Answer: C

Explanation:
https://developers.google.com/machine-learning/data-prep/transform/transform-numeric
- NN models needs features with close ranges
- SGD converges well using features in [0, 1] scale
- The question specifically mention "different ranges"
Documentation - https://developers.google.com/machine-learning/data-prep/transform/transform-numeric

 

NEW QUESTION 31
Your team is building a convolutional neural network (CNN)-based architecture from scratch. The preliminary experiments running on your on-premises CPU-only infrastructure were encouraging, but have slow convergence. You have been asked to speed up model training to reduce time-to-market. You want to experiment with virtual machines (VMs) on Google Cloud to leverage more powerful hardware. Your code does not include any manual device placement and has not been wrapped in Estimator model-level abstraction. Which environment should you train your model on?

  • A. A Deep Learning VM with an n1-standard-2 machine and 1 GPU with all libraries pre-installed.
  • B. AVM on Compute Engine and 8 GPUs with all dependencies installed manually.
  • C. A Deep Learning VM with more powerful CPU e2-highcpu-16 machines with all libraries pre-installed.
  • D. AVM on Compute Engine and 1 TPU with all dependencies installed manually.

Answer: D

 

NEW QUESTION 32
You work for a bank and are building a random forest model for fraud detection. You have a dataset that includes transactions, of which 1% are identified as fraudulent.
Which data transformation strategy would likely improve the performance of your classifier?

  • A. Z-normalize all the numeric features.
  • B. Oversample the fraudulent transaction 10 times.
  • C. Use one-hot encoding on all categorical features.
  • D. Write your data in TFRecords.

Answer: B

 

NEW QUESTION 33
A real estate company wants to create a machine learning model for predicting housing prices based on a historical dataset. The dataset contains 32 features.
Which model will meet the business requirement?

  • A. Logistic regression
  • B. Principal component analysis (PCA)
  • C. K-means
  • D. Linear regression

Answer: D

 

NEW QUESTION 34
You are an ML engineer at a regulated insurance company. You are asked to develop an insurance approval model that accepts or rejects insurance applications from potential customers. What factors should you consider before building the model?

  • A. Traceability, reproducibility, and explainability
  • B. Federated learning, reproducibility, and explainability
  • C. Redaction, reproducibility, and explainability
  • D. Differential privacy federated learning, and explainability

Answer: A

Explanation:
https://www.oecd.org/finance/Impact-Big-Data-AI-in-the-Insurance-Sector.pdf
https://medium.com/artefact-engineering-and-data-science/including-ethics-best-practices-in-your-data-science-project-from-day-one-c15b26c2bf99

 

NEW QUESTION 35
A data scientist wants to use Amazon Forecast to build a forecasting model for inventory demand for a retail company. The company has provided a dataset of historic inventory demand for its products as a .csv file stored in an Amazon S3 bucket. The table below shows a sample of the dataset.

How should the data scientist transform the data?

  • A. Use AWS Batch jobs to separate the dataset into a target time series dataset, a related time series dataset, and an item metadata dataset. Upload them directly to Forecast from a local machine.
  • B. Use ETL jobs in AWS Glue to separate the dataset into a target time series dataset and an item metadata dataset. Upload both datasets as .csv files to Amazon S3.
  • C. Use a Jupyter notebook in Amazon SageMaker to transform the data into the optimized protobuf recordIO format. Upload the dataset in this format to Amazon S3.
  • D. Use a Jupyter notebook in Amazon SageMaker to separate the dataset into a related time series dataset and an item metadata dataset. Upload both datasets as tables in Amazon Aurora.

Answer: D

 

NEW QUESTION 36
A large company has developed a BI application that generates reports and dashboards using data collected from various operational metrics. The company wants to provide executives with an enhanced experience so they can use natural language to get data from the reports. The company wants the executives to be able ask questions using written and spoken interfaces.
Which combination of services can be used to build this conversational interface? (Choose three.)

  • A. Amazon Connect
  • B. Amazon Transcribe
  • C. Amazon Polly
  • D. Alexa for Business
  • E. Amazon Comprehend
  • F. Amazon Lex

Answer: A,B,E

 

NEW QUESTION 37
Your team needs to build a model that predicts whether images contain a driver's license, passport, or credit card. The data engineering team already built the pipeline and generated a dataset composed of 10,000 images with driver's licenses, 1,000 images with passports, and 1,000 images with credit cards. You now have to train a model with the following label map: ['driversjicense', 'passport', 'credit_card']. Which loss function should you use?

  • A. Categorical cross-entropy
  • B. Sparse categorical cross-entropy
  • C. Binary cross-entropy
  • D. Categorical hinge

Answer: B

Explanation:
se sparse_categorical_crossentropy. Examples for above 3-class classification problem: [1] , [2], [3]

 

NEW QUESTION 38
A Machine Learning Specialist is configuring Amazon SageMaker so multiple Data Scientists can access notebooks, train models, and deploy endpoints. To ensure the best operational performance, the Specialist needs to be able to track how often the Scientists are deploying models, GPU and CPU utilization on the deployed SageMaker endpoints, and all errors that are generated when an endpoint is invoked.
Which services are integrated with Amazon SageMaker to track this information? (Choose two.)

  • A. AWS Health
  • B. Amazon CloudWatch
  • C. AWS Trusted Advisor
  • D. AWS CloudTrail
  • E. AWS Config

Answer: B,D

Explanation:
Explanation/Reference: https://aws.amazon.com/sagemaker/faqs/

 

NEW QUESTION 39
You work with a data engineering team that has developed a pipeline to clean your dataset and save it in a Cloud Storage bucket. You have created an ML model and want to use the data to refresh your model as soon as new data is available. As part of your CI/CD workflow, you want to automatically run a Kubeflow Pipelines training job on Google Kubernetes Engine (GKE). How should you architect this workflow?

  • A. Configure a Cloud Storage trigger to send a message to a Pub/Sub topic when a new file is available in a storage bucket. Use a Pub/Sub-triggered Cloud Function to start the training job on a GKE cluster
  • B. Configure your pipeline with Dataflow, which saves the files in Cloud Storage After the file is saved, start the training job on a GKE cluster
  • C. Use App Engine to create a lightweight python client that continuously polls Cloud Storage for new files As soon as a file arrives, initiate the training job
  • D. Use Cloud Scheduler to schedule jobs at a regular interval. For the first step of the job. check the timestamp of objects in your Cloud Storage bucket If there are no new files since the last run, abort the job.

Answer: A

Explanation:
https://cloud.google.com/architecture/architecture-for-mlops-using-tfx-kubeflow-pipelines-and-cloud-build#triggering-and-scheduling-kubeflow-pipelines

 

NEW QUESTION 40
You are developing ML models with Al Platform for image segmentation on CT scans. You frequently update your model architectures based on the newest available research papers, and have to rerun training on the same dataset to benchmark their performance. You want to minimize computation costs and manual intervention while having version control for your code. What should you do?

  • A. Use Cloud Build linked with Cloud Source Repositories to trigger retraining when new code is pushed to the repository
  • B. Use the gcloud command-line tool to submit training jobs on Al Platform when you update your code
  • C. Use Cloud Functions to identify changes to your code in Cloud Storage and trigger a retraining job
  • D. Create an automated workflow in Cloud Composer that runs daily and looks for changes in code in Cloud Storage using a sensor.

Answer: B

 

NEW QUESTION 41
A Data Science team within a large company uses Amazon SageMaker notebooks to access data stored in Amazon S3 buckets. The IT Security team is concerned that internet-enabled notebook instances create a security vulnerability where malicious code running on the instances could compromise data privacy. The company mandates that all instances stay within a secured VPC with no internet access, and data communication traffic must stay within the AWS network.
How should the Data Science team configure the notebook instance placement to meet these requirements?

  • A. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Ensure the VPC has S3 VPC endpoints and Amazon SageMaker VPC endpoints attached to it.
  • B. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Ensure the VPC has a NAT gateway and an associated security group allowing only outbound connections to Amazon S3 and Amazon SageMaker.
  • C. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Use IAM policies to grant access to Amazon S3 and Amazon SageMaker.
  • D. Associate the Amazon SageMaker notebook with a private subnet in a VPC. Place the Amazon SageMaker endpoint and S3 buckets within the same VPC.

Answer: B

 

NEW QUESTION 42
You are training a deep learning model for semantic image segmentation with reduced training time. While using a Deep Learning VM Image, you receive the following error: The resource 'projects/deeplearning-platforn/zones/europe-west4-c/acceleratorTypes/nvidia-tesla-k80' was not found. What should you do?

  • A. Ensure that you have preemptible GPU quota in the selected region.
  • B. Ensure that the required GPU is available in the selected region.
  • C. Ensure that you have GPU quota in the selected region.
  • D. Ensure that the selected GPU has enough GPU memory for the workload.

Answer: C

 

NEW QUESTION 43
You are training an LSTM-based model on Al Platform to summarize text using the following job submission script:

You want to ensure that training time is minimized without significantly compromising the accuracy of your model. What should you do?

  • A. Modify the 'epochs' parameter
  • B. Modify the batch size' parameter
  • C. Modify the 'scale-tier' parameter
  • D. Modify the 'learning rate' parameter

Answer: A

 

NEW QUESTION 44
You are going to train a DNN regression model with Keras APIs using this code:

How many trainable weights does your model have? (The arithmetic below is correct.)

  • A. 501*256+257*128+128*2=161408
  • B. 500*256*0 25+256*128*0 25+128*2 = 40448
  • C. 500*256+256*128+128*2 = 161024
  • D. 501*256+257*128+2 = 161154

Answer: B

 

NEW QUESTION 45
A Machine Learning Specialist uploads a dataset to an Amazon S3 bucket protected with server-side encryption using AWS KMS.
How should the ML Specialist define the Amazon SageMaker notebook instance so it can read the same dataset from Amazon S3?

  • A. Сonfigure the Amazon SageMaker notebook instance to have access to the VPC. Grant permission in the KMS key policy to the notebook's KMS role.
  • B. Define security group(s) to allow all HTTP inbound/outbound traffic and assign those security group(s) to the Amazon SageMaker notebook instance.
  • C. Assign the same KMS key used to encrypt data in Amazon S3 to the Amazon SageMaker notebook instance.
  • D. Assign an IAM role to the Amazon SageMaker notebook with S3 read access to the dataset. Grant permission in the KMS key policy to that role.

Answer: C

Explanation:
Explanation/Reference: https://docs.aws.amazon.com/sagemaker/latest/dg/encryption-at-rest.html

 

NEW QUESTION 46
You work for a credit card company and have been asked to create a custom fraud detection model based on historical data using AutoML Tables. You need to prioritize detection of fraudulent transactions while minimizing false positives. Which optimization objective should you use when training the model?

  • A. An optimization objective that maximizes the area under the precision-recall curve (AUC PR) value
  • B. An optimization objective that minimizes Log loss
  • C. An optimization objective that maximizes the area under the receiver operating characteristic curve (AUC ROC) value
  • D. An optimization objective that maximizes the Precision at a Recall value of 0.50

Answer: A

 

NEW QUESTION 47
Your organization wants to make its internal shuttle service route more efficient. The shuttles currently stop at all pick-up points across the city every 30 minutes between 7 am and 10 am. The development team has already built an application on Google Kubernetes Engine that requires users to confirm their presence and shuttle station one day in advance. What approach should you take?

  • A. 1. Build a tree-based classification model that predicts whether the shuttle should pick up passengers at each shuttle station.
    2. Dispatch an available shuttle and provide the map with the required stops based on the prediction
  • B. 1. Build a reinforcement learning model with tree-based classification models that predict the presence of passengers at shuttle stops as agents and a reward function around a distance-based metric
    2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the simulated outcome.
  • C. 1. Build a tree-based regression model that predicts how many passengers will be picked up at each shuttle station.
    2. Dispatch an appropriately sized shuttle and provide the map with the required stops based on the prediction.
  • D. 1. Define the optimal route as the shortest route that passes by all shuttle stations with confirmed attendance at the given time under capacity constraints.
    2 Dispatch an appropriately sized shuttle and indicate the required stops on the map

Answer: B

 

NEW QUESTION 48
You work on a growing team of more than 50 data scientists who all use Al Platform. You are designing a strategy to organize your jobs, models, and versions in a clean and scalable way. Which strategy should you choose?

  • A. Separate each data scientist's work into a different project to ensure that the jobs, models, and versions created by each data scientist are accessible only to that user.
  • B. Use labels to organize resources into descriptive categories. Apply a label to each created resource so that users can filter the results by label when viewing or monitoring the resources
  • C. Set up a BigQuery sink for Cloud Logging logs that is appropriately filtered to capture information about Al Platform resource usage In BigQuery create a SQL view that maps users to the resources they are using.
  • D. Set up restrictive I AM permissions on the Al Platform notebooks so that only a single user or group can access a given instance.

Answer: A

 

NEW QUESTION 49
......

100% Free Professional-Machine-Learning-Engineer Files For passing the exam Quickly: https://www.troytecdumps.com/Professional-Machine-Learning-Engineer-troytec-exam-dumps.html

New Download free Professional-Machine-Learning-Engineer PDF for Google Practice Tests: https://drive.google.com/open?id=1SiFKfUxSsk3IFB0zQpTrbt2HA8p97Qou