EXAMTOPICS DUMPS Q24-Q28: BINARY CLASSIFICATION - Relationship between SOFTMAX THRESHOLD & PRECISION, Codeless ETL tool, Dealing with Security & Privacy Issues, I/O-bound solutions - tf.data input pipeline
Q 25.
You work for a social media company. You need to detect whether posted images contain cars. Each training example is a member of exactly one class. You have trained an object detection neural network and deployed the model version to AI Platform Prediction for evaluation. Before deployment, you created an evaluation job and attached it to the AI Platform Prediction model version. You notice that the precision is lower than your business requirements allow. How should you adjust the model's final layer softmax threshold to increase precision?
BINARY CLASSIFICATION - Relationship between SOFTMAX THRESHOLD & PRECISION
- ❌ A. Increase the recall.
→ Would probably decrease the precision (a lower threshold admits more false positives).
- ⭕ B. Decrease the recall.
→ Improving precision typically reduces recall, and vice versa; raising the softmax threshold trades recall for precision.
- ❌ C. Increase the number of false positives.
→ Would decrease the precision, since Precision = TP/(TP+FP).
- ❌ D. Decrease the number of false negatives.
→ Would probably increase the recall and reduce precision.
Precision = TP / (TP + FP), Recall = TP / (TP + FN)
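This trade-off can be seen with a small, self-contained sketch; the softmax scores and labels below are made-up illustration values, not outputs of the question's model. Raising the decision threshold removes low-confidence predictions, which tends to drop false positives faster than true positives, so precision rises while recall falls.

```python
import numpy as np

# Illustrative softmax scores and ground-truth labels (assumed values).
scores = np.array([0.95, 0.90, 0.85, 0.70, 0.65, 0.55, 0.45, 0.30])
labels = np.array([1,    1,    1,    0,    1,    0,    0,    0])

def precision_recall(threshold):
    # Predict positive whenever the score clears the threshold.
    preds = scores >= threshold
    tp = np.sum(preds & (labels == 1))
    fp = np.sum(preds & (labels == 0))
    fn = np.sum(~preds & (labels == 1))
    return tp / (tp + fp), tp / (tp + fn)

for t in (0.5, 0.8):
    p, r = precision_recall(t)
    print(f"threshold={t:.1f}  precision={p:.2f}  recall={r:.2f}")
# threshold=0.5  precision=0.67  recall=1.00
# threshold=0.8  precision=1.00  recall=0.75
```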
Q 26.
You are responsible for building a unified analytics environment across a variety of on-premises data marts. Your company is experiencing data quality and security challenges when integrating data across the servers, caused by the use of a wide range of disconnected tools and temporary solutions. You need a fully managed, cloud-native data integration service that will lower the total cost of work and reduce repetitive work. Some members of your team prefer a codeless interface for building Extract, Transform, Load (ETL) processes. Which service should you use?
Codeless ETL tool
- ❌ A. Dataflow
- ❌ B. Dataprep
- ❌ C. Apache Flink
- ⭕ D. Cloud Data Fusion
Low/no-code ETL solutions on GCP: Cloud Data Fusion & Dataprep
**Cloud Data Fusion**
- A fully managed, cloud-native, enterprise data integration service for quickly building & managing data pipelines.
- Runs on top of Hadoop.
- Rougher from a usability perspective and generally more expensive, but it's likely where Google is putting significant investment.

**Dataprep**
- More refined and cost-effective but limited in capability.
- A third-party application offered by Trifacta through GCP.
Q 27.
You are an ML engineer at a regulated insurance company. You are asked to develop an insurance approval model that accepts or rejects insurance applications from potential customers. What factors should you consider before building the model?
Regulated Data - Dealing with Security & Privacy Issues
- A. Redaction, reproducibility, and explainability
→ Redaction's use case is handling sensitive data (e.g., removing PII).
- B. Traceability, reproducibility, and explainability
- C. Federated learning, reproducibility, and explainability
- ⭕ D. Differential privacy, Federated learning, and explainability
Federated Learning, Differential Privacy, and Traceability for Privacy
**Federated learning** (see: Federated Learning Office Hours - AI Workshop Experiments)
- A distributed machine learning approach that enables ML on decentralized datasets (_decentralized examples residing on devices such as smartphones_).
- **_Helps protect data privacy_** and improves local speed and performance.
- Open-source TensorFlow Federated library.
**Differential privacy** (see: Differential Privacy; Google Developers Blog: How we're helping developers with differential privacy)
- An engagement for industries handling sensitive personal data: you share an ML problem, a proposed architecture, and evaluation metrics, and receive guidance on training with TensorFlow Privacy.
- Users provide: a well-defined machine learning problem, a proposed model architecture, and evaluation metrics. Customers who also have data, or an expected data schema, will likely gain more from this engagement, _but there is no expectation of data sharing._
- Users receive: _advice on how to use TensorFlow Privacy to train the model in a manner that offers differential privacy._
- Goal: to train and deploy models based on sensitive training data (_health records, personal email, personal photos, etc._) without compromising the privacy of the data.
- TensorFlow Privacy is (currently) most effective with: more training data (ideally more than 10^5 or 10^6 examples), smaller models (ideally under 10^6 parameters), and classification/regression rather than generative models. (A minimal DP-SGD sketch follows.)
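A minimal, hypothetical sketch of what DP-SGD training with TensorFlow Privacy can look like. The DPKerasSGDOptimizer and the requirement for a per-example (unreduced) loss follow the library's Keras tutorial; the model architecture, feature size, and hyperparameter values are illustrative assumptions, not a recommended configuration.

```python
# Sketch of differentially private training with TensorFlow Privacy (DP-SGD).
# Model, data shape, and hyperparameter values are illustrative assumptions.
import tensorflow as tf
from tensorflow_privacy.privacy.optimizers.dp_optimizer_keras import DPKerasSGDOptimizer

# Hypothetical binary approve/reject classifier over 100 tabular features.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(100,)),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(2),  # logits for approve / reject
])

optimizer = DPKerasSGDOptimizer(
    l2_norm_clip=1.0,       # clip each microbatch's gradient norm
    noise_multiplier=1.1,   # Gaussian noise added to the clipped gradients
    num_microbatches=250,   # must evenly divide the batch size
    learning_rate=0.15)

# The loss must stay per-example (no reduction) so gradients can be
# clipped per microbatch before noise is added.
loss = tf.keras.losses.SparseCategoricalCrossentropy(
    from_logits=True, reduction=tf.losses.Reduction.NONE)

model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
# model.fit(x_train, y_train, batch_size=250, epochs=5)
```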
**Traceability** (traceability information is called data lineage)
- Use the business metadata tool Data Catalog to create tagged data lineage.
- Democratization of data within an organization is essential to help users derive innovative insights for growth. In a big data environment, traceability of where the data in the data warehouse originated and how it flows through the business is critical. This traceability information is called data lineage. Being able to track, manage, and view data lineage helps you simplify tracking data errors, forensics, and data dependency identification.
- In addition, data lineage has become essential for securing business data.
- An organization's data governance practices require tracking all movement of sensitive data, including personally identifiable information (PII). Of key concern is ensuring that metadata stays within the customer's cloud organization or project.
Q 28.
You are training a Resnet model on AI Platform using TPUs to visually categorize types of defects in automobile engines. You capture the training profile using the Cloud TPU profiler plugin and observe that it is highly input-bound. You want to reduce the bottleneck and speed up your model training process. Which modifications should you make to the tf.data dataset? (Choose two.)
I/O-bound solutions - tf.data input pipeline
- ⭕ A. Use the interleave option for reading data.
→ INTERLEAVE for parallelizing data reading.
- B. Reduce the value of the repeat parameter.
- C. Increase the buffer size for the shuffle option.
- ⭕ D. Set the prefetch option equal to the training batch size.
→ PREFETCH for pre-loading the data, reducing input wait time (see the sketch after the options).
- E. Decrease the batch size argument in your transformation.
→ BATCH SIZE is more about a memory-bound problem; decreasing it will not relieve an input-bound bottleneck or speed up training.
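A minimal sketch of options A and D applied to a tf.data pipeline. The bucket path, TFRecord format, parsing function, image size, and batch size are assumptions for illustration, and AUTOTUNE is used for the parallelism and prefetch values rather than the literal batch-size value named in option D.

```python
# Sketch: parallelize reads with interleave and overlap input with training via prefetch.
import tensorflow as tf

BATCH_SIZE = 1024  # assumed global batch size

def parse_example(serialized):
    # Hypothetical feature spec for the defect-classification records.
    features = tf.io.parse_single_example(
        serialized,
        {"image": tf.io.FixedLenFeature([], tf.string),
         "label": tf.io.FixedLenFeature([], tf.int64)})
    image = tf.io.decode_jpeg(features["image"], channels=3)
    return tf.image.resize(image, [224, 224]), features["label"]

files = tf.data.Dataset.list_files("gs://my-bucket/train-*.tfrecord")  # assumed path
dataset = (
    files
    # A. interleave: read several TFRecord files in parallel.
    .interleave(tf.data.TFRecordDataset,
                cycle_length=16,
                num_parallel_calls=tf.data.AUTOTUNE)
    .map(parse_example, num_parallel_calls=tf.data.AUTOTUNE)
    .batch(BATCH_SIZE, drop_remainder=True)
    # D. prefetch: prepare upcoming batches while the TPU trains on the current one.
    .prefetch(tf.data.AUTOTUNE)
)
```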