This post summarizes how to tune prediction performance depending on the pipeline type: batch or streaming.
Prediction/Inference Performance
Performance must be considered at prediction time, not just during training.
Three performance considerations during inference: throughput, latency, and cost
- Throughput requirements: how many queries per second do you need to process?
- Latency requirements: how long can a single query take?
- Cost: infrastructure and maintenance
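As a rough illustration of how the three considerations interact, the sketch below estimates how many serving replicas a target throughput and latency imply, and what that would cost per hour. It uses Little's law (requests in flight = arrival rate × time in system). The function names, the per-replica concurrency, and the price are hypothetical values chosen for the example, not figures from any GCP pricing page.

```python
import math


def replicas_needed(target_qps: int, latency_ms: int, concurrency_per_replica: int = 1) -> int:
    """Estimate replicas via Little's law: in-flight = arrival rate * time in system."""
    in_flight = target_qps * latency_ms / 1000.0  # average concurrent requests
    return max(1, math.ceil(in_flight / concurrency_per_replica))


def hourly_cost(replicas: int, price_per_replica_hour: float) -> float:
    """Infrastructure cost per hour for a fixed number of replicas (hypothetical price)."""
    return replicas * price_per_replica_hour


# Hypothetical workload: 200 queries/s, 50 ms per query, 4 concurrent requests per replica.
n = replicas_needed(target_qps=200, latency_ms=50, concurrency_per_replica=4)
print(n, hourly_cost(n, price_per_replica_hour=0.5))
```

Tightening the latency budget or raising the throughput target changes the replica count, which is where the cost consideration enters.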
3 Options for Prediction/Inference Implementation
- Calling a deployed model through its REST/HTTP API, for streaming
- Running prediction jobs on Cloud ML Engine (CMLE), for batch
- Running the model directly inside Cloud Dataflow, for batch or streaming
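For the first option, a minimal sketch of what an online prediction call might look like is shown below. It only builds the HTTP request and does not send it. The `{"instances": [...]}` body is the JSON format used by CMLE/AI Platform online prediction; the project name, model name, feature names, and token are hypothetical placeholders, and the helper `build_predict_request` is introduced here purely for illustration.

```python
import json
import urllib.request


def build_predict_request(project: str, model: str, instances: list, access_token: str) -> urllib.request.Request:
    """Build (but do not send) an online prediction request for a deployed model."""
    url = f"https://ml.googleapis.com/v1/projects/{project}/models/{model}:predict"
    body = json.dumps({"instances": instances}).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {access_token}",  # assumption: OAuth2 bearer token
    }
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Hypothetical project, model, and feature names.
req = build_predict_request(
    project="my-project",
    model="my_model",
    instances=[{"feature_a": 1.0, "feature_b": "x"}],
    access_token="PLACEHOLDER_TOKEN",
)
print(req.get_method(), req.full_url)
```

Each streaming element (or small group of elements) would result in one such request, which is why latency and per-request overhead matter most for this option.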
Batch data pipeline
Batch = bounded dataset
- Read data & data processing
  - Read data from some persistent storage
    - Google Cloud Storage (data lake), BigQuery (data warehouse)
  - Processing, carried out by Cloud Dataflow, typically enriches the data with the predictions of an ML model
- Inference
  - Using a TensorFlow SavedModel
    - Load the TF SavedModel from Cloud Storage into the Dataflow job and invoke it
  - Using TF Serving
    - Access TF Serving via an HTTP endpoint as a microservice, either from CMLE or from Kubeflow (running on Kubernetes Engine)
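The "load the model into the job and invoke it" approach can be sketched in plain Python as follows. This is not Dataflow code: it only mimics the pattern of loading the model once per worker (in Apache Beam this would typically live in `DoFn.setup`) and invoking it on small batches of elements. `BatchedPredictor` and `load_fake_model` are hypothetical names, and the stand-in model simply doubles its inputs where a real job would load a TF SavedModel from Cloud Storage.

```python
from typing import Callable, Iterable, Iterator, List


class BatchedPredictor:
    """Loads a model once, then enriches elements with predictions in small batches."""

    def __init__(self, load_model: Callable[[], Callable[[List[float]], List[float]]], batch_size: int = 32):
        self._load_model = load_model
        self._model = None
        self.batch_size = batch_size
        self.load_count = 0  # tracks how often the (expensive) load actually ran

    def setup(self) -> None:
        # Load lazily and only once, not once per element.
        if self._model is None:
            self._model = self._load_model()
            self.load_count += 1

    def process(self, elements: Iterable[float]) -> Iterator[dict]:
        self.setup()
        batch: List[float] = []
        for element in elements:
            batch.append(element)
            if len(batch) == self.batch_size:
                yield from self._predict(batch)
                batch = []
        if batch:  # flush the final partial batch
            yield from self._predict(batch)

    def _predict(self, batch: List[float]) -> List[dict]:
        predictions = self._model(batch)
        return [{"input": x, "prediction": p} for x, p in zip(batch, predictions)]


def load_fake_model() -> Callable[[List[float]], List[float]]:
    """Stand-in for loading a SavedModel; returns a function that doubles each input."""
    return lambda batch: [2 * x for x in batch]


predictor = BatchedPredictor(load_fake_model, batch_size=2)
results = list(predictor.process([1, 2, 3]))
print(results)
```

Keeping the model in process avoids a network round trip per batch, at the price of bundling the model and its dependencies with the pipeline code.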
Prediction Performance for Batch Pipelines
The three options rank differently depending on the criterion (best first):
- Raw processing speed
  Cloud ML Engine (now AI Platform) batch predictions > TF SavedModel > TF Serving on Cloud ML Engine
- Maintainability
  Cloud ML Engine (now AI Platform) batch predictions > TF Serving on Cloud ML Engine > TF SavedModel
Laurence said, "What's not to love about a fully managed service?" As a fully managed service, CMLE (now AI Platform) batch prediction comes out on top for both raw processing speed and maintainability.
The other two options, TF SavedModel and TF Serving on CMLE, swap places between the two criteria. Loading the SavedModel directly into the Dataflow job is faster, but using online predictions as a microservice allows for easier upgradability and dependency management than loading the current model version into the Dataflow job.
Streaming data pipeline
A streaming pipeline is similar, except that the input dataset is not bounded.