Q 29.
You have trained a model on a dataset that required computationally expensive preprocessing operations. You need to execute the same preprocessing at prediction time. You deployed the model on AI Platform for high-throughput online prediction. Which architecture should you use?
Online prediction architecture on AI Platform
- A. Validate the accuracy of the model that you trained on preprocessed data. Create a new model that uses the raw data and is available in real time. Deploy the new model onto AI Platform for online prediction.
- ⭕ B. Send incoming prediction requests to a Pub/Sub topic. Transform the incoming data using a Dataflow job. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
- C. Stream incoming prediction request data into Cloud Spanner. Create a view to abstract your preprocessing logic. Query the view every second for new records. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
- D. Send incoming prediction requests to a Pub/Sub topic. Set up a Cloud Function that is triggered when messages are published to the Pub/Sub topic. Implement your preprocessing logic in the Cloud Function. Submit a prediction request to AI Platform using the transformed data. Write the predictions to an outbound Pub/Sub queue.
You can implement computationally expensive preprocessing operations in Apache Beam and run them at scale using Dataflow.
Dataflow |
---|
- A fully managed autoscaling service for batch and stream data processing. |
- Performs instance-level transformations, stateful full-pass transformations, and window-aggregation feature transformations. |
- Most of the time, when you need to execute a full transformation pipeline and are comparing `Dataflow` and `Cloud Functions`, `Dataflow` is the recommended choice: it is the solution better suited to these cases. |
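The chosen architecture (option B) can be sketched end to end in plain Python, with the Dataflow transform and the AI Platform prediction call replaced by hypothetical stubs (every name below is illustrative, not a real API):

```python
import json

def expensive_preprocess(record):
    # Stand-in for the computationally expensive transform that would
    # run inside a Dataflow (Apache Beam) pipeline at scale.
    return {"features": [float(v) * 2.0 for v in record["raw"]]}

def predict(features):
    # Stand-in for the AI Platform online-prediction request.
    return {"score": sum(features["features"])}

def handle_message(message_bytes):
    # Pub/Sub in -> transform -> predict -> Pub/Sub out (simulated).
    record = json.loads(message_bytes)
    transformed = expensive_preprocess(record)
    prediction = predict(transformed)
    return json.dumps(prediction).encode()

out = handle_message(b'{"raw": [1, 2, 3]}')
print(out)  # b'{"score": 12.0}'
```

In the real architecture, `expensive_preprocess` would be an Apache Beam `DoFn` running on Dataflow, and `predict` would call the AI Platform online-prediction endpoint.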
Q 30.
Your team trained and tested a DNN regression model with good results. Six months after deployment, the model is performing poorly due to a change in the distribution of the input data. How should you address the input differences in production?
Monitoring for data value skew, which results in bad performance in production
- ⭕ A. Create alerts to monitor for skew, and retrain the model.
- B. Perform feature selection on the model, and retrain the model with fewer features.
- 🚩 C. Retrain the model, and select an L2 regularization parameter with a hyperparameter tuning service.
→ `L2 regularization` prevents overfitting, which could potentially maintain model performance if the data distribution is only slightly skewed. HOWEVER, THE QUESTION SAYS THE TEST RESULTS WERE GOOD.
- D. Perform feature selection on the model, and retrain the model on a monthly basis with fewer features.
You need to trigger retraining of the model to capture data value skews.
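One simple way to "create alerts to monitor for skew" (option A) is to compare the serving distribution of each feature against its training distribution, for example with a Population Stability Index; the sketch below is illustrative, and the 0.1 / 0.25 thresholds are common rules of thumb, not values from the question:

```python
import math

def psi(expected, actual, bins=10):
    # Population Stability Index between a training (expected) and a
    # serving (actual) sample of one numeric feature.
    lo, hi = min(expected), max(expected)
    step = (hi - lo) / bins
    edges = [lo + i * step for i in range(1, bins)]

    def fractions(sample):
        counts = [0] * bins
        for v in sample:
            counts[sum(v > e for e in edges)] += 1
        # Tiny smoothing term avoids log(0) on empty bins.
        return [(c + 1e-6) / len(sample) for c in counts]

    e, a = fractions(expected), fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

train = [i / 100 for i in range(100)]                # uniform on [0, 1)
serve_ok = [i / 100 for i in range(100)]             # same distribution
serve_shifted = [0.5 + i / 200 for i in range(100)]  # distribution shift

assert psi(train, serve_ok) < 0.1        # no alert
assert psi(train, serve_shifted) > 0.25  # alert: trigger retraining
```

When the index crosses the alert threshold, the monitoring system would kick off the retraining pipeline, which is exactly what option A describes.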
Q 31.
You need to train a computer vision model that predicts the type of government ID present in a given image using a GPU-powered virtual machine on Compute Engine. You use the following parameters:
- Optimizer: SGD
- Image shape = 224×224
- Batch size = 64
- Epochs = 10
- Verbose = 2
During training you encounter the following error: ResourceExhaustedError: Out Of Memory (OOM) when allocating tensor. What should you do?
NEED TO REDUCE MEMORY USE for ML training with image data
Image classification using a GPU-powered virtual machine on Compute Engine; to resolve the ResourceExhaustedError: Out of Memory (OOM) error
⇒ NEED TO REDUCE MEMORY USE
- ❌ A. Change the optimizer.
→ The learning rate and optimizer shouldn't really impact memory utilization.
- ⭕ B. Reduce the batch size.
- ❌ C. Change the learning rate.
- 🚩 D. Reduce the image shape.
→ Decreasing the image size would work, but might be costly in terms of final performance.
Batch size, Image shape
Batch size |
---|
✔ Defines the number of data samples used to compute each update to the model's trainable parameters (i.e., weights and biases). |
✔ Has a critical impact on training time and the resulting accuracy of the trained model. |
✔ The larger the batch, the more samples propagate through the model in the forward pass. Since a batch size increase requires more GPU memory, a lack of GPU memory can prevent you from increasing the batch size. |
- Overview of hyperparameter tuning
- train_batch_size in hyperparameter examples
- TensorFlow guide to mixed precision training
- Discussions
- Since an out-of-memory (OOM) error occurred, you can try reducing the batch size or the image shape to train with less memory.
- AI Platform Training > Hyperparameters of built-in image classification
- train_batch_size: the number of images used in one training step. If this number is too big, the job may fail with out-of-memory (OOM). (Default: 32)
- image_size: the image size (width and height) used for training. Note that the training job may OOM if its value is too big. (Default: 224)
- Hyperparameters used for this training job: `Image shape = 224×224` matches the default (`224`), but `Batch size = 64` is higher than the default (`32`), so you can reduce the batch size and retrain.
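A back-of-envelope sketch of why reducing the batch size (B) or the image shape (D) relieves the OOM: activation memory scales linearly with the batch size and with each spatial dimension. This ignores weights, gradients, and optimizer state, which add their own overhead, but those scale the same way with batch size:

```python
def activation_bytes(batch_size, height, width, channels, dtype_bytes=4):
    # Rough memory for one activation tensor of shape (N, H, W, C),
    # assuming float32 (4 bytes) by default.
    return batch_size * height * width * channels * dtype_bytes

full = activation_bytes(64, 224, 224, 3)          # the question's settings
halved_batch = activation_bytes(32, 224, 224, 3)  # option B
halved_shape = activation_bytes(64, 112, 112, 3)  # option D

print(full // 2**20, "MiB")  # 36 MiB for a single input-sized tensor
assert halved_batch == full // 2  # halving the batch halves the memory
assert halved_shape == full // 4  # halving H and W quarters it
```

Every layer's activations shrink proportionally, so dropping the batch size from 64 back toward the default 32 roughly halves peak activation memory.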
Q 32.
You developed an ML model with AI Platform, and you want to move it to production. You serve a few thousand queries per second and are experiencing latency issues. Incoming requests are served by a load balancer that distributes them across multiple Kubeflow CPU-only pods running on Google Kubernetes Engine (GKE). Your goal is to improve the serving latency without changing the underlying infrastructure. What should you do?
- ❌ A. Significantly increase the `max_batch_size` TensorFlow Serving parameter.
→ bigger batches; increases latency
- B. Switch to the `tensorflow-model-server-universal` version of TensorFlow Serving.
- ❌ C. Significantly increase the `max_enqueued_batches` TensorFlow Serving parameter.
→ a longer queue of pending batches; increases latency
- ⭕ D. Recompile TensorFlow Serving using the source to support CPU-specific optimizations. Instruct GKE to choose an appropriate baseline minimum CPU platform for serving nodes.
→ Improves serving performance on the existing CPU-only nodes, which meets the goal of reducing latency without changing the underlying infrastructure.
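For context, the `max_batch_size` and `max_enqueued_batches` parameters from options A and C are set in TensorFlow Serving's batching parameters file (enabled with `--enable_batching` and passed via `--batching_parameters_file`); the values below are illustrative only, not recommendations:

```
max_batch_size { value: 128 }
batch_timeout_micros { value: 1000 }
max_enqueued_batches { value: 100 }
num_batch_threads { value: 8 }
```

Raising `max_batch_size` or `max_enqueued_batches` trades latency for throughput, which is why A and C move in the wrong direction for this question.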