Q 61.
You recently designed and built a custom neural network that uses critical dependencies specific to your organization's framework. You need to train the model using a managed training service on Google Cloud. However, the ML framework and related dependencies are not supported by AI Platform Training. Also, both your model and your data are too large to fit in memory on a single machine. Your ML framework of choice uses the scheduler, workers, and servers distribution structure. What should you do?
Custom NN model with an ML framework not supported by AI Platform & too large for a single machine
- A. Build your custom container to run jobs on AI Platform Training
→ PARTIALLY CORRECT; REQUIRES DISTRIBUTED TRAINING
- B. Use a built-in model available on AI Platform Training
→ REQUIRES CUSTOM MODEL & DISTRIBUTED TRAINING
- ⭕ C. Build your custom containers to run distributed training jobs on AI Platform Training
- D. Reconfigure your code to an ML framework with dependencies that are supported by AI Platform Training
→ NOT OPTIMAL
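Inside a custom container, AI Platform Training tells each replica its role through the `TF_CONFIG` environment variable, which the training code parses to decide whether it should act as master/scheduler, worker, or parameter server. A minimal sketch (the cluster layout below is illustrative, mimicking what the service injects; it is not a real job):

```python
import json
import os

def get_role_and_cluster():
    """Read TF_CONFIG (set by AI Platform Training on each replica) and
    return this replica's role, its index, and the full cluster layout."""
    tf_config = json.loads(os.environ.get("TF_CONFIG", "{}"))
    task = tf_config.get("task", {})
    return task.get("type", "master"), task.get("index", 0), tf_config.get("cluster", {})

# Illustrative TF_CONFIG for the second worker of a 1-master/2-worker/1-ps job.
os.environ["TF_CONFIG"] = json.dumps({
    "cluster": {
        "master": ["master-0:2222"],
        "worker": ["worker-0:2222", "worker-1:2222"],
        "ps": ["ps-0:2222"],  # parameter servers ("servers" in the question)
    },
    "task": {"type": "worker", "index": 1},
})

role, index, cluster = get_role_and_cluster()
print(role, index, len(cluster["ps"]))  # → worker 1 1
```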
Q 62.
You need to train a regression model based on a dataset containing 50,000 records that is stored in BigQuery. The data includes a total of 20 categorical and numerical features with a target variable that can include negative values. You need to minimize effort and training time while maximizing model performance. What approach should you take to train this regression model?
BQML vs. AutoML
- A. Create a custom TensorFlow DNN model.
→ NOT OPTIMAL FOR CONDITION 1: MINIMIZE EFFORT
- ⭕ B. Use BQML XGBoost regression to train the model.
→ MOST STRAIGHTFORWARD & FASTER
- C. Use AutoML Tables to train the model without early stopping.
→ NO NEED TO CODE, BUT NOT OPTIMAL
→ WITHOUT EARLY STOPPING, TRAINING TIME MIGHT BE LONGER
- D. Use AutoML Tables to train the model with RMSLE as the optimization objective.
→ GIVEN THE CONDITIONS, THE TARGET VARIABLE CAN INCLUDE NEGATIVE VALUES
→ RMSLE is valid only if all label and predicted values are non-negative
RMSLE: root-mean-squared logarithmic error
- similar to RMSE, except that it uses the natural logarithm of the predicted and actual values plus 1.
- The RMSLE evaluation metric is returned only if all label and predicted values are non-negative.
- RMSLE penalizes under-prediction more heavily than over-prediction.
- Also a good metric when you don't want to penalize errors on large prediction values more heavily than on small prediction values.
- Metric ranges [from zero to infinity]
- A lower value indicates a higher quality model
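The definition above can be made concrete with a short sketch (not AutoML Tables' internal implementation) that also shows both properties: the metric is undefined for negative values, and it penalizes under-prediction more heavily than over-prediction:

```python
import math

def rmsle(y_true, y_pred):
    """Root-mean-squared logarithmic error:
    sqrt(mean((log(1 + pred) - log(1 + true))^2)).
    Defined only when every label and prediction is non-negative,
    because log(1 + x) requires 1 + x > 0."""
    if any(v < 0 for v in list(y_true) + list(y_pred)):
        raise ValueError("RMSLE requires non-negative labels and predictions")
    return math.sqrt(
        sum((math.log1p(p) - math.log1p(t)) ** 2
            for t, p in zip(y_true, y_pred)) / len(y_true)
    )

# Under-prediction is penalized more heavily than over-prediction of equal size:
under = rmsle([100.0], [50.0])   # predicted 50 below the label
over = rmsle([100.0], [150.0])   # predicted 50 above the label
assert under > over
```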
Q 62.
Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?
Optimal components for an ML system with Docker containers & online prediction
- A. Kubeflow Pipelines and App Engine
- ⭕ B. Kubeflow Pipelines and AI Platform Prediction
- C. Cloud Composer, BigQuery ML, and AI Platform Prediction
- D. Cloud Composer, AI Platform Training with custom containers, and App Engine
Q 63.
You work for a toy manufacturer that has been experiencing a large increase in demand. You need to build an ML model to reduce the amount of time spent by quality control inspectors checking for product defects. Faster defect detection is a priority. The factory does not have reliable Wi-Fi. Your company wants to implement the new ML model as soon as possible. Which model should you use?
3 Types of Edge model
- A. AutoML Vision Edge mobile-high-accuracy-1 model
- B. AutoML Vision Edge mobile-versatile-1 model
- C. AutoML Vision model
- ⭕ D. AutoML Vision Edge mobile-low-latency-1 model
→ OBJECTIVE: TO REDUCE THE AMOUNT OF TIME SPENT → low latency → mobile-low-latency-1
Training Edge exportable models
- When you have a dataset with a solid set of labeled training items, you are ready to create and train your custom Edge model.
- At training time, choose the type of Edge model you want:
  - low latency (mobile-low-latency-1)
  - general purpose usage (mobile-versatile-1)
  - higher prediction quality (mobile-high-accuracy-1)
- Edge models are optimized for inference on an Edge device. Consequently, Edge model accuracy will differ from Cloud model accuracy.
- Resumable training : can pause and resume your custom model training for large datasets
- Base model time limit : You can resume training only on models that have been trained within the last 14 days; base models created more than 14 days before your request are not eligible for resumable training.
- No label modification : Resumable training fails if you change the labels in the base model's dataset.
- No guarantee of better performance : Using resumable training on a model does not guarantee better model performance.
Q 64.
You are an ML engineer at a bank that has a mobile application. Management has asked you to build an ML-based biometric authentication for the app that verifies a customer's identity based on their fingerprint. Fingerprints are considered highly sensitive personal information and cannot be downloaded and stored into the bank databases. Which learning strategy should you recommend to train and deploy this ML model?
ML model with PII
- A. Differential privacy
- B. Federated learning
- C. MD5 to encrypt data
- ⭕ D. Data Loss Prevention API
Cloud Data Loss Prevention (DLP) API - architecture & de-identification methods
- Reference architecture for using Google Cloud products to add a layer of security to sensitive datasets by using de-identification techniques.
Available de-identification techniques in Cloud DLP:
- Redaction: Deletes all or part of a detected sensitive value.
- Replacement: Replaces a detected sensitive value with a specified surrogate value.
- Masking: Replaces a number of characters of a sensitive value with a specified surrogate character, such as a hash (#) or asterisk (*). (e.g., Social Security number)
- Crypto-based tokenization: Encrypts the original sensitive data value using a cryptographic key. Cloud DLP supports several types of tokenization, including transformations that can be reversed, or "re-identified."
- Bucketing: "Generalizes" a sensitive value by replacing it with a range of values. (e.g., replacing a specific age with an age range, or a specific job title with a job family)
- Date shifting: Shifts sensitive date values by a random amount of time.
- Time extraction: Extracts or preserves specified portions of date and time values.
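The masking technique above is easy to illustrate with plain Python. This is only a hedged sketch of the behavior, not the Cloud DLP API itself (in the real service, masking is configured as a character-mask transformation in a de-identify request), and the keep-last-4 / keep-hyphens choices below are illustrative:

```python
def mask_value(value: str, masking_char: str = "*", chars_to_keep: int = 4) -> str:
    """Mimic DLP-style character masking: replace each alphanumeric
    character of a sensitive value with a surrogate character, keeping
    only the trailing `chars_to_keep` characters visible. Separators
    such as hyphens are preserved here purely for readability."""
    visible = value[-chars_to_keep:]
    masked = "".join(
        masking_char if c.isalnum() else c for c in value[:-chars_to_keep]
    )
    return masked + visible

print(mask_value("123-45-6789"))  # → ***-**-6789
```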
Crypto-based tokenization transformations
- Cloud DLP supports the following types of tokenization, including transformations that can be reversed and allow for re-identification:
  - Cryptographic hashing (e.g., card PIN): Given a CryptoKey, Cloud DLP uses a SHA-256-based message authentication code (HMAC-SHA-256) on the input value, and then replaces the input value with the hashed value encoded in base64.
  - Format-preserving encryption: Replaces an input value with a token that has been generated using format-preserving encryption (FPE) with the FFX mode of operation. This transformation method produces a token that is limited to the same alphabet as the input value and is the same length as the input value. FPE also supports re-identification given the original encryption key.
  - Deterministic encryption (e.g., card number & cardholder's name): Replaces an input value with a token that has been generated using AES in Synthetic Initialization Vector mode (AES-SIV). This transformation method has no limitation on supported string character sets, generates identical tokens for each instance of an identical input value, and uses surrogates to enable re-identification given the original encryption key.
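The cryptographic hashing transform described above (HMAC-SHA-256 of the input, base64-encoded) can be sketched with the standard library. The key below is a placeholder standing in for the CryptoKey that Cloud DLP would manage:

```python
import base64
import hashlib
import hmac

def hash_token(value: str, key: bytes) -> str:
    """Replace a sensitive value with the base64-encoded HMAC-SHA-256
    of the value, as in DLP's cryptographic hashing transform."""
    digest = hmac.new(key, value.encode("utf-8"), hashlib.sha256).digest()
    return base64.b64encode(digest).decode("ascii")

key = b"example-key-not-a-real-cryptokey"  # placeholder, not a managed CryptoKey
token = hash_token("1234", key)            # e.g., a card PIN

# Deterministic (same input + same key -> same token) but one-way,
# so unlike FPE or deterministic encryption it cannot be re-identified.
assert token == hash_token("1234", key)
assert token != hash_token("1235", key)
```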