Google Professional Data Engineer Certificate EXAMTOPIC DUMPS Q121-Q125
Q 121.
You currently have a single on-premises Kafka cluster in a data center in the us-east region that is responsible for ingesting messages from IoT devices globally. Because large parts of the globe have poor internet connectivity, messages sometimes batch at the edge, come in all at once, and cause a spike in load on your Kafka cluster. This is becoming difficult to manage and prohibitively expensive. What is the Google-recommended cloud-native architecture for this scenario?
- ❌ A. Edge TPUs as sensor devices for storing and transmitting the messages.
- ❌ B. Cloud Dataflow connected to the Kafka cluster to scale the processing of incoming messages.
- ⭕ C. An IoT gateway connected to Cloud Pub/Sub, with Cloud Dataflow to read and process the messages from Cloud Pub/Sub.
  → The Google Cloud native alternative to a single Kafka cluster is Pub/Sub.
  → Pub/Sub scales automatically based on demand.
- ❌ D. A Kafka cluster virtualized on Compute Engine in us-east with Cloud Load Balancing to connect to the devices around the world.
Cloud native = Pub/Sub + Dataflow (see the sketch below).
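As a rough illustration of answer C, here is a minimal Apache Beam (Python) sketch of the Dataflow side, assuming a hypothetical project `my-project`, subscription `iot-messages-sub`, and BigQuery table `iot.readings` (all names are placeholders, not from the question):

```python
# Minimal sketch: Dataflow reads IoT messages from Pub/Sub and writes
# them to BigQuery. Pub/Sub absorbs the bursty global traffic, and
# Dataflow autoscales workers to drain any backlog.
import apache_beam as beam
from apache_beam.options.pipeline_options import PipelineOptions

options = PipelineOptions(
    streaming=True,
    runner="DataflowRunner",        # execute on Cloud Dataflow
    project="my-project",           # hypothetical project ID
    region="us-east1",
    temp_location="gs://my-bucket/tmp",
)

with beam.Pipeline(options=options) as pipeline:
    (
        pipeline
        | "ReadFromPubSub" >> beam.io.ReadFromPubSub(
            subscription="projects/my-project/subscriptions/iot-messages-sub")
        | "Decode" >> beam.Map(lambda msg: {"raw": msg.decode("utf-8")})
        | "WriteToBigQuery" >> beam.io.WriteToBigQuery(
            "my-project:iot.readings",
            schema="raw:STRING",
            write_disposition=beam.io.BigQueryDisposition.WRITE_APPEND)
    )
```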
Q 124.
You are designing a cloud-native historical data processing system to meet the following conditions:
- The data being analyzed is in CSV, Avro, and PDF formats and will be accessed by multiple analysis tools, including Cloud Dataproc, BigQuery, and Compute Engine.
- A streaming data pipeline stores new data daily.
- Performance is not a factor in the solution.
- The solution design should maximize availability.
How should you design data storage for this solution?
- ❌ A. Create a Cloud Dataproc cluster with high availability. Store the data in HDFS, and perform analysis as needed.
- ❌ B. Store the data in BigQuery. Access the data using the BigQuery Connector on Cloud Dataproc and Compute Engine.
  → BigQuery does not support the PDF format (unstructured data).
- ❌ C. Store the data in a regional Cloud Storage bucket. Access the bucket directly using Cloud Dataproc, BigQuery, and Compute Engine.
  → To maximize availability: multi-region (or dual-region) beats a single region.
- ⭕ D. Store the data in a multi-regional Cloud Storage bucket. Access the data directly using Cloud Dataproc, BigQuery, and Compute Engine.
  → To maximize availability: multi-region (or dual-region) beats a single region (see the bucket sketch below).
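To illustrate answer D, a minimal sketch using the google-cloud-storage Python client, assuming a hypothetical bucket name `historical-analytics-data` (the `US` location is a multi-region, which is what maximizes availability here):

```python
# Minimal sketch: create a multi-regional Cloud Storage bucket.
# "US" is a multi-region location; a regional bucket would instead
# use a location like "us-east1".
from google.cloud import storage

client = storage.Client()
bucket = client.create_bucket("historical-analytics-data", location="US")

# location_type reports "multi-region" for multi-regional buckets.
print(f"Created {bucket.name} in {bucket.location} ({bucket.location_type})")
```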
Reference: Cloud Storage Classes
BigQuery batch loading supports the following formats (see "Batch loading data | BigQuery | Google Cloud"); a load sketch follows the list:
- Avro
- Comma-separated values (CSV)
- JSON (newline-delimited)
- ORC
- Parquet
- Firestore exports stored in Cloud Storage
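A minimal batch-load sketch with the google-cloud-bigquery Python client, assuming hypothetical bucket, dataset, and table names (Avro is used here, but any format in the list above would work the same way):

```python
# Minimal sketch: batch-load Avro files from Cloud Storage into BigQuery.
from google.cloud import bigquery

client = bigquery.Client()
job_config = bigquery.LoadJobConfig(
    source_format=bigquery.SourceFormat.AVRO,  # Avro carries its own schema
)

load_job = client.load_table_from_uri(
    "gs://historical-analytics-data/daily/*.avro",  # hypothetical bucket/path
    "my-project.analytics.events",                  # hypothetical table ID
    job_config=job_config,
)
load_job.result()  # block until the load job completes
```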
Q 125.
You have a petabyte of analytics data and need to design a storage and processing platform for it. You must be able to perform data warehouse-style analytics on the data in Google Cloud and expose the dataset as files for batch analysis tools in other cloud providers. What should you do?
- ⚠️ A. Store and process the entire dataset in BigQuery.
  → We also need a way to expose the data as files: Cloud Storage.
- ❌ B. Store and process the entire dataset in Cloud Bigtable.
- ⭕ C. Store the full dataset in BigQuery, and store a compressed copy of the data in a Cloud Storage bucket.
  → Data warehouse-style analytics: BigQuery
  → Exposing the data as files: Cloud Storage (see the export sketch below)
- ❌ D. Store the warm data as files in Cloud Storage, and store the active data in BigQuery. Keep this ratio as 80% warm and 20% active.
  → Whether the data is warm is not the key point; the dataset must be exposed as files to batch analysis tools in other cloud providers.
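To illustrate the file-export step of answer C, a minimal sketch with the google-cloud-bigquery Python client, assuming hypothetical table and bucket names; Avro with Snappy compression is just one possible choice of compressed format:

```python
# Minimal sketch: export a compressed copy of a BigQuery table to
# Cloud Storage, where batch tools in other clouds can read it as files.
from google.cloud import bigquery

client = bigquery.Client()
extract_config = bigquery.ExtractJobConfig(
    destination_format=bigquery.DestinationFormat.AVRO,
    compression=bigquery.Compression.SNAPPY,  # compressed copy for the bucket
)

extract_job = client.extract_table(
    "my-project.analytics.events",              # hypothetical source table
    "gs://shared-analytics-export/events-*.avro",  # wildcard shards the output
    job_config=extract_config,
)
extract_job.result()  # block until the export completes
```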
Warm data vs. hot data (source: "What is Warm Data? - Definition from Techopedia"):
- Warm data is a term for data that gets analyzed on a fairly frequent basis, but is not constantly in play or in motion.
- By contrast, hot data is data that is used very frequently and data that administrators perceive to be always changing.