[PMLE CERTIFICATE - EXAMTOPIC] Additional DUMPS

Custom NN model with ML framework not supported by AI Platform & too large to use a single machine, BQML vs.AutoML, Optimal components for ML system with docker Containers & online prediction, 3 Types of Edge model, ML model with PII

Q 61.

You recently designed and built a custom neural network that uses critical dependencies specific to your organization's framework. You need to train the model using a managed training service on Google Cloud. However, the ML framework and related dependencies are not supported by Al Platform Training. Also, both your model and your data are too large to fit in memory on a single machine. Your ML framework of choice uses the scheduler, workers, and servers distribution structure. What should you do?

Custom NN model with ML framework not supported by AI Platform & too large to use a single machine
  • A. Build your custom container to run jobs on Al Platform Training
    → PARTIALLY CORRECT; REQUIRE DISTRIBUTED TRIAINING
  • B. Use a built-in model available on Al Platform Training
    → REQUIRE CUSTOM MODEL & DISTRIBUTED TRIAINING
  • C. Build your custom containers to run distributed training jobs on Al Platform Training
  • D. Reconfigure your code to a ML framework with dependencies that are supported by Al Platform Training
    → NOT OPTIMAL

Q 62.

You need to train a regression model based on a dataset containing 50,000 records that is stored in BigQuery. The data includes a total of 20 categorical and numerical features with a target variable that can include negative values. You need to minimize effort and training time while maximizing model performance. What approach should you take to train this regression model?

BQML vs.AutoML
  • A. Create a custom TensorFlow DNN model.
    → NOT OPTIMAL FOR CONDITION 1:MINIMIZE EFFORT
  • B. Use BQML XGBoost regression to train the model.
    → MOST STRAIGHTFORWARD & FASTER
  • C. Use AutoML Tables to train the model without early stopping.
    → NO NEED TO CODE BUT NOT OPTIMAL
    → WITHOUT EARLY STOPPING, TRAINING TIME MIGHT BE LONGER
  • D. Use AutoML Tables to train the model with RMSLE as the optimization objective.
    → WITH GIVEN CONDITIONS, THE TARGET VALUE CAN BE NEGATIVE VALUES

RMSLE only if all label and predicted values are non-negative

RMSLE: root-mean-squared logarithmic error

  • similar to RMSE, except that it uses the natural logarithm of the predicted and actual values plus 1.
    • The RMSLE evaluation metric is returned only if all label and predicted values are non-negative 모든 레이블, 예측값이 음수가 아닌 경우에만 계산 가능
  • RMSLE penalizes under-prediction more heavily than over-prediction.
    • Also can be Good metric when don't want to penalize differences for large prediction values more heavily than for small prediction values.
  • Metric ranges [from zero to infinity]
  • A lower value indicates a higher quality model

Q 62.

Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?

Optimal components for ML system with docker Containers & online prediction
  • A. Kubeflow Pipelines and App Engine
  • B. Kubeflow Pipelines and Al Platform Prediction
  • C. Cloud Composer, BigQuery ML , and Al Platform Prediction
  • D. Cloud Composer, Al Platform Training with custom containers , and App Engine

Q 63.

You work for a toy manufacturer that has been experiencing a large increase in demand. You need to build an ML model to reduce the amount of time spent by quality control inspectors checking for product defects. Faster defect detection is a priority. The factory does not have reliable Wi-Fi. Your company wants to implement the new ML model as soon as possible. Which model should you use?

3 Types of Edge model
  • A. AutoML Vision Edge mobile-high-accuracy-1 model
  • B. AutoML Vision Edge mobile-versatile-1 model
  • C. AutoML Vision model
  • ⭕ D. AutoML Vision Edge mobile-low-latency-1 model
    → OBJECTIVE : TO REDUCE THE AMOUNT OF TIME SPENT - low latency mobile-low-latency-1

Training Edge exportable models

  • When you have a dataset with a solid set of labeled training items, you are ready to create and train your custom Edge model.
  • At training time : choose the type of Edge model you want
    • low latency (mobile-low-latency-1)
    • general purpose usage (mobile-versatile-1)
    • higher prediction quality (mobile-high-accuracy-1)
  • Edge models are optimized for inference on an Edge device. Consequently, Edge model accuracy will differ from Cloud model accuracy.
  • Resumable training : can pause and resume your custom model training for large datasets
    • Base model time limit : You can resume training only on models that have been trained within the last 14 days; base models created more than 14 days before your request are not eligible for resumable training.
    • No label modification : Resumable training fails if you change the labels in the base model's dataset.
    • No guarantee of better performance : Using resumable training on a model does not guarantee better model performance.

Q 64.

You are an ML engineer at a bank that has a mobile application. Management has asked you to build an ML-based biometric authentication for the app that verifies a customer's identity based on their fingerprint. Fingerprints are considered highly sensitive personal information and cannot be downloaded and stored into the bank databases. Which learning strategy should you recommend to train and deploy this ML model?

ML model with PII
  • A. Differential privacy
  • B. Federated learning
  • C. MD5 to encrypt data
  • ⭕ D. Data Loss Prevention API

Cloud Data Loss Prevention (DLP) API - architecture & de-identification methods

  • Reference architecture for using Google Cloud products to add a layer of security to sensitive datasets by using de-identification techniques.
Available de-identification techniques in Cloud DLP.
- Redaction: Deletes all or part of a detected sensitive value.
- Replacement: Replaces a detected sensitive value with a specified surrogate value.
- Masking: Replaces a number of characters of a sensitive value with a specified surrogate character, such as a hash (#) or asterisk (*). (e.g, Social Security Number)
- Crypto-based tokenization: Encrypts the original sensitive data value using a cryptographic key. Cloud DLP supports several types of tokenization, including transformations that can be reversed, or "re-identified."
- Bucketing: "Generalizes" a sensitive value by replacing it with a range of values. (replacing a specific age with an age range, Job title)
- Date shifting: Shifts sensitive date values by a random amount of time.
- Time extraction: Extracts or preserves specified portions of date and time values.

Crypto-based tokenization transformations

Crypto-based tokenization = "pseudonymization" : transformations are de-identification methods that replace the original sensitive data values with encrypted values.

  • Cloud DLP supports the following types of tokenization, including transformations that can be reversed and allow for re-identification: