[PMLE CERTIFICATE - EXAMTOPIC] DUMPS Q29-Q32

EXAMTOPIC DUMPS Q29-Q32; online prediction architecture on AI Platform, monitoring data value skew that results in bad performance in production

Q 29.

You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?

Online prediction architecture on AI Platform
  • A. Validate the accuracy of the model that you trained on preprocessed data. Create a new model that uses the raw data and is available in real time. Deploy the new model onto AI Platform for online prediction.
  • B. Send incoming prediction requests to a Pub/Sub topic. Transform the incoming data using a Dataflow job. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
  • C. Stream incoming prediction request data into Cloud Spanner. Create a view to abstract your preprocessing logic. Query the view every second for new records. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
  • D. Send incoming prediction requests to a Pub/Sub topic. Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic. Implement your preprocessing logic in the Cloud Function. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.

You can implement the computationally expensive preprocessing operations in Apache Beam and run them at scale using Dataflow.

Dataflow
- a fully managed autoscaling service for batch and stream data processing.
- performs instance-level transformations, stateful full-pass transformations, and window-aggregation feature transformations.
  • Most of the time, when you need to execute a full transformation pipeline and are comparing Dataflow against Cloud Functions, the recommendation is to go with Dataflow: it is the solution better prepared for these cases (see the sketch below).
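
A minimal sketch of the option-B architecture, assuming hypothetical names (PROJECT_ID, MODEL_NAME, and the two Pub/Sub topics) and a placeholder preprocess step standing in for the expensive training-time transformations:

```python
import json

import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions


def preprocess(record):
    # Placeholder for the computationally expensive preprocessing
    # used at training time (scaling, encoding, feature crosses, ...).
    record["features"] = [float(v) for v in record["raw"]]
    return record


class PredictFn(beam.DoFn):
    """Calls the model deployed on AI Platform with preprocessed instances."""

    def setup(self):
        from googleapiclient import discovery
        self._service = discovery.build("ml", "v1")

    def process(self, record):
        name = "projects/PROJECT_ID/models/MODEL_NAME"  # hypothetical IDs
        response = (
            self._service.projects()
            .predict(name=name, body={"instances": [record["features"]]})
            .execute()
        )
        yield json.dumps(response["predictions"][0]).encode("utf-8")


def run():
    with beam.Pipeline(options=PipelineOptions(streaming=True)) as p:
        (
            p
            | "ReadRequests" >> beam.io.ReadFromPubSub(
                topic="projects/PROJECT_ID/topics/prediction-requests")
            | "Parse" >> beam.Map(lambda msg: json.loads(msg.decode("utf-8")))
            | "Preprocess" >> beam.Map(preprocess)
            | "Predict" >> beam.ParDo(PredictFn())
            | "WriteResults" >> beam.io.WriteToPubSub(
                topic="projects/PROJECT_ID/topics/prediction-results")
        )


if __name__ == "__main__":
    run()
```

Run the same pipeline with --runner=DataflowRunner (plus project/region flags) to execute it at scale on Dataflow.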

Q 30.

Your team trained and tested a DNN regression model with good results. Six months after deployment, the model is performing poorly due to a change in the distribution of the input data. How should you address the input differences in production?

Monitoring data value skew that results in bad performance in production
  • A. Create alerts to monitor for skew, and retrain the model.
  • B. Perform feature selection on the model, and retrain the model with fewer features.
  • 🚩 C. Retrain the model, and select an L2 regularization parameter with a hyperparameter tuning service.
    L2 regularization prevents overfitting, which could help maintain model performance if the data distribution were only slightly skewed. HOWEVER, THE QUESTION SAID THE TEST RESULTS WERE GOOD, so overfitting is not the problem here.
  • D. Perform feature selection on the model, and retrain the model on a monthly basis with fewer features.

You need to monitor for data value skew and trigger retraining of the model to capture the new distribution (answer A); a skew check is sketched below.
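
A minimal sketch of detecting training/serving skew with TensorFlow Data Validation, assuming hypothetical file paths and a hypothetical watched feature payment_type; in production the anomaly result would feed an alert and a retraining trigger:

```python
import tensorflow_data_validation as tfdv

# Statistics over the original training data and over recent serving
# requests (both paths are hypothetical).
train_stats = tfdv.generate_statistics_from_csv("gs://my-bucket/train.csv")
serving_stats = tfdv.generate_statistics_from_csv("gs://my-bucket/serving_log.csv")

# Infer a schema from the training statistics and set a skew threshold
# (L-infinity distance between the two feature distributions).
schema = tfdv.infer_schema(train_stats)
feature = tfdv.get_feature(schema, "payment_type")  # hypothetical feature
feature.skew_comparator.infinity_norm.threshold = 0.01

# Any feature whose distance crosses its threshold is reported as a
# skew anomaly; this is what the alert (and retraining job) keys on.
skew_anomalies = tfdv.validate_statistics(
    statistics=train_stats,
    schema=schema,
    serving_statistics=serving_stats,
)
print(skew_anomalies)
```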

Q 31.

You need to train a computer vision model that predicts the type of government ID present in a given image using a GPU-powered virtual machine on Compute Engine. You use the following parameters:

  • Optimizer: SGD
  • Image shape = 224×224
  • Batch size = 64
  • Epochs = 10
  • Verbose = 2

During training you encounter the following error: ResourceExhaustedError: Out Of Memory (OOM) when allocating tensor. What should you do?

Image classification using a GPU-powered virtual machine on Compute Engine; to resolve the ResourceExhaustedError: Out Of Memory (OOM) error ⇒ NEED TO REDUCE MEMORY USE for ML training with image data

  • ❌ A. Change the optimizer.
    → Learning rate and optimizer shouldn't really impact memory utilisation.
  • B. Reduce the batch size.
  • ❌ C. Change the learning rate.
  • 🚩 D. Reduce the image shape.
    Decreasing the image size would work, but might be costly in terms of final model performance.

Batch size, Image shape

Batch size
Defines the number of data samples used to compute each update to the model's trainable parameters (i.e. weights and biases).
It has a critical impact on training time and on the resulting accuracy of the trained model.
The larger the batch, the more samples propagate through the model in the forward pass. Since a larger batch size requires more GPU memory, a lack of GPU memory can prevent you from increasing it.
  1. Since an OOM (out-of-memory) error occurred, you can try reducing the batch size or reducing the image shape so that training uses less memory.
  2. AI Platform Training > Hyperparameters of built-in image classification
    • train_batch_size : # of images used in one training step. If this number is too big, the job may fail with out-of-memory (OOM). (Default: 32)
    • image_size : The image size (width and height) used for training. Note that the training job may be OOM if its value is too big. (Default: 224)
  3. Hyperparameters used to train the model
    • Image shape = 224×224 matches the default of 224, but
    • Batch size = 64 is higher than the default of 32, so you can reduce the batch size and retrain (answer B); a retry loop is sketched below.
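
A minimal sketch of that fix, assuming a hypothetical train_dataset (a tf.data.Dataset of (image, label) pairs) and a stand-in CNN; training is retried with a halved batch size whenever the GPU runs out of memory:

```python
import tensorflow as tf

IMAGE_SHAPE = (224, 224, 3)


def build_model():
    # Stand-in CNN for the government-ID classifier (architecture assumed).
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, activation="relu",
                               input_shape=IMAGE_SHAPE),
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(4, activation="softmax"),  # 4 ID types assumed
    ])


def train(dataset, batch_size):
    model = build_model()
    model.compile(optimizer=tf.keras.optimizers.SGD(),
                  loss="sparse_categorical_crossentropy")
    model.fit(dataset.batch(batch_size), epochs=10, verbose=2)


# Start from the failing batch size and halve it on each OOM.
batch_size = 64
while batch_size >= 1:
    try:
        train(train_dataset, batch_size)  # train_dataset: assumed tf.data.Dataset
        break
    except tf.errors.ResourceExhaustedError:
        batch_size //= 2
        print(f"OOM; retrying with batch_size={batch_size}")
```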

Q 32.

You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?

  • ❌ A. Significantly increase the max_batch_size TensorFlow Serving parameter.
    A significantly bigger batch size makes requests wait for a batch to fill, which increases latency.
  • ❌ B. Switch to the tensorflow-model-server-universal version of TensorFlow Serving.
    The universal build is compiled with only basic optimizations for portability, so it would not reduce latency.
  • ❌ C. Significantly increase the max_enqueued_batches TensorFlow Serving parameter.
    A longer batch queue only lets more work pile up waiting to be processed, which increases latency.
  • D. Recompile TensorFlow Serving using the source to support CPU-specific optimizations. Instruct GKE to choose an appropriate baseline minimum CPU platform for serving nodes.
    Correct: the prebuilt TensorFlow Serving binary is compiled for broad CPU compatibility, so recompiling from source enables CPU-specific optimizations (e.g. AVX/FMA instruction sets), and setting a baseline minimum CPU platform ensures the GKE serving nodes actually support those instructions ⇒ better latency without changing the underlying infrastructure.