Amazon MLA-C01 Practice Exams
- Exam code: MLA-C01
- Exam name: AWS Certified Machine Learning Engineer - Associate
- Certification provider: Amazon
- Last updated: 26.04.2025
You are a data scientist at a retail company responsible for deploying a machine learning model that predicts customer purchase behavior. The model needs to serve real-time predictions with low latency to support the company’s recommendation engine on its e-commerce platform. The deployment solution must also be scalable to handle varying traffic loads during peak shopping periods, such as Black Friday and holiday sales. Additionally, you need to monitor the model’s performance and automatically roll out updates when a new version of the model is available.
Given these requirements, which AWS deployment service and configuration is the MOST SUITABLE for deploying the machine learning model?
- A . Deploy the model on Amazon EC2 instances with a load balancer to distribute traffic, manually scaling the instances based on expected traffic during peak periods
- B . Deploy the model on Amazon SageMaker with batch transform jobs, running the jobs periodically to generate predictions and storing the results in Amazon S3 for the recommendation engine
- C . Deploy the model using Amazon SageMaker real-time hosting services with an auto-scaling endpoint, enabling you to automatically adjust the number of instances based on traffic demand
- D . Use AWS Lambda to deploy the model as a serverless function, automatically scaling based on the number of requests, and store the model artifacts in Amazon S3
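For context on option C, the auto-scaling behavior it describes is configured through Application Auto Scaling on top of an existing SageMaker real-time endpoint. A minimal sketch, assuming a hypothetical endpoint named `purchase-predictor` with a production variant `AllTraffic`:

```python
import boto3

# Application Auto Scaling manages SageMaker endpoint variant capacity.
autoscaling = boto3.client("application-autoscaling")

# Hypothetical endpoint/variant names; replace with your own deployment.
resource_id = "endpoint/purchase-predictor/variant/AllTraffic"

# Register the production variant as a scalable target (1 to 8 instances).
autoscaling.register_scalable_target(
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    MinCapacity=1,
    MaxCapacity=8,
)

# Target-tracking policy: scale so each instance serves roughly 100 invocations.
autoscaling.put_scaling_policy(
    PolicyName="purchase-predictor-invocations-scaling",
    ServiceNamespace="sagemaker",
    ResourceId=resource_id,
    ScalableDimension="sagemaker:variant:DesiredInstanceCount",
    PolicyType="TargetTrackingScaling",
    TargetTrackingScalingPolicyConfiguration={
        "TargetValue": 100.0,
        "PredefinedMetricSpecification": {
            "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
        },
        "ScaleInCooldown": 300,
        "ScaleOutCooldown": 60,
    },
)
```

With a policy like this, capacity expands automatically during peak traffic (Black Friday, holiday sales) and contracts afterwards, without manual intervention.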
When is unexplainability not acceptable?
- A . When determining the result of a sports match
- B . When making product recommendations
- C . When explaining why a loan was declined
- D . When explaining why a transaction was deemed fraudulent
What is the bias versus variance trade-off in machine learning?
- A . The bias versus variance trade-off refers to the balance between underfitting and overfitting, where high bias leads to overfitting and high variance leads to underfitting
- B . The bias versus variance trade-off is a technique used to improve model performance by increasing both bias and variance simultaneously to achieve better generalization
- C . The bias versus variance trade-off refers to the challenge of balancing the error due to the model’s complexity (variance) and the error due to incorrect assumptions in the model (bias), where high bias can cause underfitting and high variance can cause overfitting
- D . The bias versus variance trade-off involves choosing between a model with high complexity that may capture more noise (high bias) and a simpler model that may generalize better but miss important patterns (high variance)
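For intuition, the trade-off is easy to see by fitting models of different complexity to the same noisy data. A minimal sketch using scikit-learn; the dataset and polynomial degrees are illustrative, not taken from the question:

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(60, 1))
y = np.sin(2 * np.pi * X).ravel() + rng.normal(0, 0.2, size=60)

# Degree 1 underfits (high bias), degree 15 overfits (high variance),
# an intermediate degree usually generalizes best.
for degree in (1, 4, 15):
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    score = cross_val_score(model, X, y, cv=5,
                            scoring="neg_mean_squared_error").mean()
    print(f"degree={degree:2d}  cross-validated MSE={-score:.3f}")
```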
A company has recently migrated to AWS Cloud and it wants to optimize the hardware used for its AI workflows.
Which of the following would you suggest?
- A . Leverage either AWS Trainium or AWS Inferentia for the deep learning (DL) and generative AI inference applications
- B . Leverage AWS Trainium for high-performance, cost-effective Deep Learning training. Leverage AWS Inferentia for the deep learning (DL) and generative AI inference applications
- C . Leverage either AWS Trainium or AWS Inferentia for high-performance, cost-effective Deep Learning training
- D . Leverage AWS Inferentia for high-performance, cost-effective Deep Learning training. Leverage AWS Trainium for the deep learning (DL) and generative AI inference applications
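In SageMaker terms, the split described in option B maps to choosing Trainium (`ml.trn1.*`) instance types for training jobs and Inferentia (`ml.inf2.*`) instance types for inference endpoints. A minimal sketch with the SageMaker Python SDK; the image URIs, IAM role, and S3 paths are placeholders:

```python
import sagemaker
from sagemaker.estimator import Estimator
from sagemaker.model import Model

session = sagemaker.Session()
role = "arn:aws:iam::123456789012:role/SageMakerExecutionRole"  # placeholder

# Training on AWS Trainium (trn1) instances.
estimator = Estimator(
    image_uri="<neuron-compatible-training-image>",  # placeholder
    role=role,
    instance_count=1,
    instance_type="ml.trn1.2xlarge",
    sagemaker_session=session,
)
# estimator.fit({"train": "s3://my-bucket/train/"})  # hypothetical S3 path

# Inference on AWS Inferentia (inf2) instances.
model = Model(
    image_uri="<neuron-compatible-inference-image>",  # placeholder
    model_data="s3://my-bucket/model.tar.gz",          # hypothetical artifact
    role=role,
    sagemaker_session=session,
)
# predictor = model.deploy(initial_instance_count=1,
#                          instance_type="ml.inf2.xlarge")
```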
What role will machine learning play in the future of technology?
- A . It will only be used in healthcare
- B . It will be boundless with applications in various fields
- C . It will become obsolete
- D . It will be limited to fraud detection
You are a data engineer responsible for monitoring the performance of a suite of machine learning models deployed across multiple environments at an e-commerce company. The models are used for various tasks, including recommendation engines, demand forecasting, and customer segmentation. To ensure that these models are performing optimally, you need to set up a centralized dashboard that allows stakeholders to monitor key performance metrics such as latency, accuracy, throughput, and resource utilization. The dashboard should be user-friendly, provide insights at a glance, and support both technical and non-technical users.
Which approach is MOST SUITABLE for setting up a dashboard to monitor the performance metrics of these ML models?
- A . Use Amazon QuickSight to create a visual dashboard that integrates data from Amazon CloudWatch Logs via Amazon S3, providing interactive charts and graphs that allow stakeholders to drill down into specific metrics as needed
- B . Create a custom web application that pulls data from CloudWatch Logs, generating real-time visualizations of performance metrics. Embed the application within the company’s internal portal for easy access
- C . Set up a CloudWatch Dashboard to aggregate and display performance metrics such as CPU utilization, memory usage, and latency from various services. Use Amazon SNS to send daily reports based on these metrics to stakeholders
- D . Use AWS Lambda to query CloudWatch Logs and send the results to an S3 bucket daily, where stakeholders can manually review the metrics and create their own visualizations using spreadsheet software
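Whichever visualization layer is chosen, the models first have to publish their operational metrics. A minimal sketch, assuming a hypothetical custom namespace `ModelMetrics` and model name, pushes latency and accuracy values to CloudWatch so they can be charted in a CloudWatch Dashboard or exported for QuickSight:

```python
import boto3

cloudwatch = boto3.client("cloudwatch")

# Hypothetical namespace, model name, and values; adjust to your deployment.
cloudwatch.put_metric_data(
    Namespace="ModelMetrics",
    MetricData=[
        {
            "MetricName": "InferenceLatency",
            "Dimensions": [{"Name": "ModelName", "Value": "recommendation-engine"}],
            "Value": 42.0,
            "Unit": "Milliseconds",
        },
        {
            "MetricName": "OfflineAccuracy",
            "Dimensions": [{"Name": "ModelName", "Value": "recommendation-engine"}],
            "Value": 0.93,
            "Unit": "None",
        },
    ],
)
```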
You are a Machine Learning Engineer at a healthcare company working on a binary classification model to predict whether a patient has a particular disease based on several medical features. The consequences of misclassifications are severe: false positives lead to unnecessary and expensive follow-up tests, while false negatives could result in a failure to provide critical treatment. You need to evaluate the model using appropriate metrics to balance the risks associated with these types of errors. Given the critical nature of the application, which combination of evaluation metrics should you prioritize to minimize both false positives and false negatives while ensuring that the model is reliable for deployment? (Select two)
- A . Prioritize accuracy, as it provides a general sense of how well the model is performing across all predictions
- B . Focus on precision to reduce the number of false positives, thus avoiding unnecessary follow-up tests for patients who do not have the disease
- C . Evaluate the model using the Area Under the ROC Curve (AUC) to understand its performance across different classification thresholds
- D . Use the F1 score to balance the trade-off between precision and recall, ensuring that both false positives and false negatives are considered
- E . Prioritize recall to reduce the number of false negatives, ensuring that as many patients with the disease as possible are correctly identified
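All of the candidate metrics are one-liners in scikit-learn. A minimal sketch with made-up labels and scores, purely to show how each metric in the options is computed:

```python
from sklearn.metrics import (precision_score, recall_score, f1_score,
                             roc_auc_score, accuracy_score)

# Illustrative ground truth, hard predictions, and predicted probabilities.
y_true  = [0, 0, 1, 1, 1, 0, 1, 0, 1, 0]
y_pred  = [0, 0, 1, 0, 1, 0, 1, 1, 1, 0]
y_score = [0.1, 0.2, 0.8, 0.4, 0.9, 0.3, 0.7, 0.6, 0.85, 0.05]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))  # penalizes false positives
print("recall   :", recall_score(y_true, y_pred))     # penalizes false negatives
print("F1 score :", f1_score(y_true, y_pred))         # harmonic mean of the two
print("ROC AUC  :", roc_auc_score(y_true, y_score))   # threshold-independent
```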
Which solution will meet these requirements?
- A . Use the SageMaker Feature Store GetRecord API with the record identifier.
- B . Use the SageMaker Feature Store BatchGetRecord API with the record identifier. Filter to find the latest record.
- C . Create an Amazon Athena query to retrieve the data from the feature table. Use the write_time value to find the latest record.
- D . Create an Amazon Athena query to retrieve the data from the feature table.
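For reference, the online-store lookup in options A and B goes through the `sagemaker-featurestore-runtime` client. A minimal sketch, with a hypothetical feature group name and record identifier:

```python
import boto3

featurestore_runtime = boto3.client("sagemaker-featurestore-runtime")

# Hypothetical feature group and record identifier value.
response = featurestore_runtime.get_record(
    FeatureGroupName="customer-features",
    RecordIdentifierValueAsString="customer-1234",
)

# The online store holds only the latest record for a given identifier.
for feature in response.get("Record", []):
    print(feature["FeatureName"], "=", feature["ValueAsString"])
```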
How does the Amazon flywheel model ensure continuous improvement and growth?
- A . By using feedback loops that integrate customer experience, selection, and pricing.
- B . By limiting the scope of operational improvements to cost reduction only.
- C . By avoiding changes to business practices.
- D . By maintaining a static approach to business operations.
How does model training work in Deep Learning?
- A . Model training in deep learning involves only the use of support vector machines and decision trees to create predictive models
- B . Model training in deep learning involves using large datasets to adjust the weights and biases of a neural network through multiple iterations, using techniques such as gradient descent to minimize the error
- C . Model training in deep learning involves manually setting the weights and biases of a neural network based on predefined rules
- D . Model training in deep learning requires no data; the neural network automatically learns from predefined algorithms without any input
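As option B describes, training repeatedly adjusts weights and biases to reduce a loss over many iterations. A minimal sketch of gradient descent on a single-neuron linear model in NumPy; the data and learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data: y = 3x + 2 plus noise.
X = rng.uniform(-1, 1, size=(200, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 0.1, size=200)

# Initialize weight and bias, then iterate gradient descent on the MSE loss.
w, b, lr = 0.0, 0.0, 0.1
for epoch in range(200):
    y_pred = w * X[:, 0] + b
    error = y_pred - y
    grad_w = 2 * np.mean(error * X[:, 0])  # dL/dw
    grad_b = 2 * np.mean(error)            # dL/db
    w -= lr * grad_w
    b -= lr * grad_b

print(f"learned w={w:.2f}, b={b:.2f}")  # should approach w=3, b=2
```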