Which GCP service to use - BigQuery ML (BQML)

BigQuery options for data and analytics:

   Option               Description
  -------------------- ---------------------------------
   BigQuery             Data warehouse/analytics
   BigQuery BI Engine   In-memory analytics engine
   BigQuery ML          BigQuery model training/serving

BigQuery ML : BigQuery model training/serving

  • Create and execute machine learning models in BigQuery using standard SQL queries.
  • BigQuery ML increases development speed by eliminating the need to move data.
  • Trained models can only be exported to Cloud Storage (see the export example below)
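
For example, a trained model can be copied out to a bucket with the EXPORT MODEL statement. A minimal sketch; the model and bucket names are placeholders:

# Export the trained model's artifacts to Cloud Storage
EXPORT MODEL `mydataset.mymodel`
OPTIONS(URI = 'gs://my-bucket/exported_model/')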

Query & Terms

  • Label = alias a column as 'label' or specify the label column in OPTIONS using input_label_cols
  • Feature = an input column passed through to the model as part of your SQL SELECT statement; inspect features with ML.FEATURE_INFO
  • SELECT * FROM ML.FEATURE_INFO(MODEL `mydataset.mymodel`)
  • Model = an object created in BigQuery that resides in your BigQuery dataset
  • CREATE OR REPLACE MODEL <dataset>.<name> OPTIONS(model_type='<type>') AS <training dataset>
  • Training Progress : ML.TRAINING_INFO
  • SELECT * FROM ML.TRAINING_INFO(MODEL `mydataset.mymodel`)
  • Inspect Weights : ML.WEIGHTS
  • SELECT * FROM ML.WEIGHTS(MODEL `mydataset.mymodel`)
  • Evaluation : ML.EVALUATE
  • SELECT * FROM ML.EVALUATE(MODEL `mydataset.mymodel`)
  • Prediction : ML.PREDICT
  • SELECT * FROM ML.PREDICT(MODEL `mydataset.mymodel`, (<query>))
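
Putting these terms together, a minimal end-to-end sketch (the dataset, table, and column names below are illustrative placeholders, not from the source):

# Train: 'body_mass' is the label via input_label_cols; the other selected columns are features
CREATE OR REPLACE MODEL `mydataset.penguin_weight_model`
OPTIONS(model_type = 'linear_reg', input_label_cols = ['body_mass']) AS
SELECT species, flipper_length, body_mass
FROM `mydataset.penguins`;

# Check training progress / iterations
SELECT * FROM ML.TRAINING_INFO(MODEL `mydataset.penguin_weight_model`);

# Evaluate (by default BQML reserves part of the training data for evaluation)
SELECT * FROM ML.EVALUATE(MODEL `mydataset.penguin_weight_model`);

# Predict: output includes a predicted_body_mass column
SELECT * FROM ML.PREDICT(
  MODEL `mydataset.penguin_weight_model`,
  (SELECT species, flipper_length FROM `mydataset.new_penguins`));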

Supported Model Types

  1. Linear Regression
  2. Binary logistic regression
  3. Multi-class logistic regression
  4. K-means clustering
  5. Matrix factorization for recommendation systems
  6. Time series
  7. Boosted tree
  8. DNN
  9. AutoML Tables
  10. Import previously trained TensorFlow models
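
The model type is selected with the model_type value in OPTIONS. A k-means sketch (unsupervised, so no label column is needed; the names below are placeholders):

# Cluster rows into 4 segments
CREATE OR REPLACE MODEL `mydataset.customer_segments`
OPTIONS(model_type = 'kmeans', num_clusters = 4) AS
SELECT total_spend, visit_count
FROM `mydataset.customer_stats`;

# ML.PREDICT assigns each row a CENTROID_ID
SELECT * FROM ML.PREDICT(
  MODEL `mydataset.customer_segments`,
  (SELECT total_spend, visit_count FROM `mydataset.customer_stats`));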

Making predictions with imported TensorFlow models in BQML

3 ways to import & use TF models in BQML: console / bq CLI / API

1. How to import a TensorFlow model into a BigQuery ML dataset

# Run a batch query to import a TensorFlow model from Cloud Storage
bq query --use_legacy_sql=false \
"CREATE OR REPLACE MODEL
  example_dataset.imported_tf_model
OPTIONS
  (MODEL_TYPE='TENSORFLOW',
    MODEL_PATH='gs://cloud-training-demos/txtclass/export/exporter/1549825580/*')"


### result of bq ls [dataset_name]
$ bq ls example_dataset

       tableId        Type    Labels   Time Partitioning
 ------------------- ------- -------- -------------------
  imported_tf_model   MODEL

2. How to use the imported model to make predictions from a SQL query

bq query \
--use_legacy_sql=false \
'SELECT *
FROM ML.PREDICT(
  MODEL example_dataset.imported_tf_model,
  (SELECT title AS input FROM `bigquery-public-data.hacker_news.stories`))'

### result  
+----------------------------------------------------------------------+------------------------------------------------------------------------------------+
|                               dense_1                                |                                       input                                        |
+----------------------------------------------------------------------+------------------------------------------------------------------------------------+
|   ["0.8611106276512146","0.06648492068052292","0.07240450382232666"] | Appshare                                                                           |
|    ["0.6251608729362488","0.2989124357700348","0.07592673599720001"] | A Handfull of Gold.                                                                |
|   ["0.014276246540248394","0.972910463809967","0.01281337533146143"] | Fastest Growing Skin Care Supplement for Increased Hair, Skin and Nail Nourishment |
| ["0.9821603298187256","1.8601855117594823E-5","0.01782100833952427"] | R4 3ds sdhc                                                                        |
|   ["0.8611106276512146","0.06648492068052292","0.07240450382232666"] | Empréstimo Com Nome Sujo                                                           |
+----------------------------------------------------------------------+------------------------------------------------------------------------------------+  

Q 38.

You have trained a text classification model in TensorFlow using AI Platform. You want to use the trained model for batch predictions on text data stored in BigQuery while minimizing computational overhead. What should you do?

  • ✅ A. Export the model to BigQuery ML.
    OPTIMAL choice: the text data already lives in BigQuery, and BQML supports importing previously trained TensorFlow models, so batch predictions run where the data resides with minimal computational overhead.
  • ❌ B. Deploy and version the model on AI Platform.
    Requires exporting the data out of BigQuery and standing up serving infrastructure, which adds overhead.
  • ❌ C. Use Dataflow with the SavedModel to read the data from BigQuery.
    Adds an extra pipeline and compute just to move the data to the model.
  • ❌ D. Submit a batch prediction job on AI Platform that points to the model location in Cloud Storage.
    Also requires exporting the text data from BigQuery first.