[PDE CERTIFICATE - EXAMTOPIC] DUMPS Q71-Q75


Q 71.

You are developing an application on Google Cloud that will automatically generate subject labels for users' blog posts. You are under competitive pressure to add this feature quickly, and you have no additional developer resources. No one on your team has experience with machine learning. What should you do?

  • A. Call the Cloud Natural Language API from your application. Process the generated Entity Analysis as labels.
  • ❌ B. Call the Cloud Natural Language API from your application. Process the generated Sentiment Analysis as labels.
  • ❌ C. Build and train a text classification model using TensorFlow. Deploy the model using Cloud Machine Learning Engine. Call the model from your application and process the results as labels.
  • ❌ D. Build and train a text classification model using TensorFlow. Deploy the model using a Kubernetes Engine cluster. Call the model from your application and process the results as labels.

Natural Language API

Entity Analysis vs. Sentiment Analysis
  • Entity analysis - Identify entities within documents—including receipts, invoices, and contracts—and label them by types such as date, person, and media.
  • Sentiment analysis - Understand the overall opinion, feeling, or attitude expressed in a block of text.
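As a sketch of option A, the subject labels can be read straight off the Entity Analysis response. The snippet below processes a dict in the JSON shape returned by the Natural Language API's `analyzeEntities` method; the sample response and the salience threshold are made up for illustration:

```python
def entities_to_labels(response, min_salience=0.1):
    """Turn an entity-analysis response into subject labels.

    Keeps only entities whose salience (importance to the text,
    a value between 0 and 1) clears a threshold, so incidental
    mentions do not become labels.
    """
    labels = []
    for entity in response.get("entities", []):
        if entity.get("salience", 0.0) >= min_salience:
            labels.append(entity["name"].lower())
    return labels


# Hypothetical response for a blog post about cloud databases.
sample_response = {
    "entities": [
        {"name": "Cloud Bigtable", "type": "OTHER", "salience": 0.62},
        {"name": "NoSQL", "type": "OTHER", "salience": 0.25},
        {"name": "Tuesday", "type": "DATE", "salience": 0.03},
    ]
}

print(entities_to_labels(sample_response))  # → ['cloud bigtable', 'nosql']
```

No model building or training is needed, which is why this beats options C and D when the team has no ML experience.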

Q 72.

You are designing storage for 20 TB of text files as part of deploying a data pipeline on Google Cloud. Your input data is in CSV format. You want to minimize the cost of querying aggregate values for multiple users who will query the data in Cloud Storage with multiple engines. Which storage service and schema design should you use?

  • ❌ A. Use Cloud Bigtable for storage. Install the HBase shell on a Compute Engine instance to query the Cloud Bigtable data.
  • ❌ B. Use Cloud Bigtable for storage. Link as permanent tables in BigQuery for query.
  • C. Use Cloud Storage for storage. Link as permanent tables in BigQuery for query.
  • ❌ D. Use Cloud Storage for storage. Link as temporary tables in BigQuery for query.
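With option C, the CSV files stay in Cloud Storage (where other engines can still read them) and BigQuery queries them in place through a permanent external table; you pay only for bytes scanned, not for a second copy of the 20 TB. A minimal DDL sketch, where the project, dataset, table, bucket, and column names are all placeholders:

```sql
-- Permanent external table over CSV files sitting in Cloud Storage.
-- Multiple users and engines keep reading the same gs:// objects;
-- BigQuery charges per query for bytes scanned, with no data loaded.
CREATE EXTERNAL TABLE `my_project.my_dataset.sales_csv` (
  user_id STRING,
  amount  NUMERIC,
  ts      TIMESTAMP
)
OPTIONS (
  format = 'CSV',
  uris = ['gs://my-bucket/sales/*.csv'],
  skip_leading_rows = 1
);
```

A temporary table (option D) would have to be redefined for every query session, which is why the permanent table is the better fit for multiple recurring users.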

Q 74.

Your financial services company is moving to cloud technology and wants to store 50 TB of financial time-series data in the cloud. This data is updated frequently and new data will be streaming in all the time. Your company also wants to move their existing Apache Hadoop jobs to the cloud to get insights into this data. Which product should they use to store the data?

  • A. Cloud Bigtable
  • ❌ B. Google BigQuery
  • ❌ C. Google Cloud Storage
  • ❌ D. Google Cloud Datastore

✔️ Which GCP services to use - NoSQL options for storage (Memorystore, Datastore, Bigtable)

Choosing Bigtable

  • Cloud Bigtable: Fully managed NoSQL database with petabyte scale and very low latency
    • Stores more than 1 TB of structured data
    • High volume of writes
    • Read/Write latency of less than 10 milliseconds along with strong consistency
    • Easy integration with open source big data tools (Hadoop, Cloud Dataflow, Cloud Dataproc)
    • Supports the open source industry standard HBase API
    • Great choice for both operational and analytical applications
      • Ideal for Ad Tech, Fintech, and IoT
    • Scales UP well
      • Throughput scales linearly: every node you add increases capacity proportionally
      • The smallest Cloud Bigtable cluster has 3 nodes and can handle 30,000 operations per second
      • You pay for those nodes while they are operational, whether your application is using them or not
  • Cloud Firestore
    • Storage service that scales DOWN well
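The linear-scaling point above turns into simple arithmetic: if the minimum 3-node cluster handles 30,000 operations per second, each node contributes roughly 10,000, and capacity grows in proportion to node count. A back-of-the-envelope sketch (the per-node figure is derived from the numbers above and varies with workload and storage type):

```python
import math

OPS_PER_NODE = 10_000  # ≈ 30,000 ops/s ÷ 3 nodes; workload-dependent


def cluster_throughput(nodes):
    """Estimated Bigtable throughput: linear in node count."""
    if nodes < 3:
        raise ValueError("smallest Cloud Bigtable cluster is 3 nodes")
    return nodes * OPS_PER_NODE


def nodes_needed(target_ops):
    """Smallest cluster (at least 3 nodes) meeting a target ops/s."""
    return max(3, math.ceil(target_ops / OPS_PER_NODE))


print(cluster_throughput(3))   # → 30000
print(nodes_needed(250_000))   # → 25
```

This is also why Bigtable fits the 50 TB streaming time-series workload in Q74: sizing is a matter of adding nodes, and the HBase API lets the existing Hadoop jobs move over with minimal changes.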

Cloud Bigtable for NoSQL (high-throughput streaming)

Getting started with time-series trend predictions using GCP | Google Cloud Blog