3 GCP services for orchestration: Cloud Scheduler, Cloud Composer, Workflows
- Cloud Scheduler
  - Managed cron job service
  - For schedule-driven, single-service orchestration
- Cloud Composer
  - Managed workflow orchestration service
  - For orchestration of your data workloads
- Workflows
  - HTTP services orchestration
  - For complex multi-service orchestration
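The rule of thumb in the list above can be written down as a tiny helper. This is only a sketch of the notes, not an official decision procedure: `Service`, `pick_service`, and both boolean parameters are hypothetical names made up for illustration.

```python
from enum import Enum


class Service(Enum):
    SCHEDULER = "Cloud Scheduler"
    COMPOSER = "Cloud Composer"
    WORKFLOWS = "Workflows"


def pick_service(multi_service: bool, data_workload: bool) -> Service:
    """Map a use case to a service, following the summary list only (assumption)."""
    if not multi_service:
        # Schedule-driven, single-service orchestration
        return Service.SCHEDULER
    if data_workload:
        # Orchestration of data workloads
        return Service.COMPOSER
    # Complex multi-service orchestration of HTTP services
    return Service.WORKFLOWS


print(pick_service(multi_service=True, data_workload=True).value)
```

Real decisions also depend on latency and scaling needs, which this helper deliberately ignores.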
Decision tree
Composer vs. Workflows
Orchestrating multiple services or handling long-running workflows ⇒ Cloud Composer or Workflows
Cloud Composer
Commonly used for orchestrating data transformations as part of ELT or data engineering workflows.
- Can tolerate a delay of a few seconds between task executions
- Building a batch orchestration workflow for data engineering (ETL)
- The collection of tasks can be modeled as a Directed Acyclic Graph (DAG)
- Benefit from Airflow operators, especially strong for data engineering.
- Have an existing investment or experience in Airflow DAGs.
- Benefit from the open source nature of Apache Airflow project.
- ❌ NOT suitable if low latency is required between tasks
- Need to specify how many workers you need for a given Composer environment
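To make the DAG idea concrete, here is a minimal sketch in plain Python using the standard-library `graphlib` module (Python 3.9+). It is not Airflow code: the task names and the `execution_order` / `is_dag` helpers are hypothetical, and the sketch only shows how dependencies between tasks determine a valid execution order.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical ETL tasks: each key maps to the set of tasks it depends on.
dependencies = {
    "extract_orders": set(),
    "extract_users": set(),
    "transform": {"extract_orders", "extract_users"},
    "load": {"transform"},
}


def execution_order(deps):
    """Return one valid order in which every task runs after its dependencies."""
    return list(TopologicalSorter(deps).static_order())


def is_dag(deps):
    """Return False if the dependencies contain a cycle, i.e. they are not a DAG."""
    try:
        execution_order(deps)
        return True
    except CycleError:
        return False


order = execution_order(dependencies)
print(order)
```

A dependency set with a cycle, such as `{"a": {"b"}, "b": {"a"}}`, makes `is_dag` return `False`, which is the case a DAG-based orchestrator cannot model.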
Workflows
Focused on the orchestration of microservices / HTTP-based services built with Cloud Functions, Cloud Run, SaaS, or other APIs.
- ⭕ Designed for latency-sensitive use cases: workloads that need low latency or have a high execution count.
- Orchestrate microservices built with Cloud Functions, Cloud Run, SaaS, or other APIs.
- Serverless : no infrastructure to manage or scale
- No need to specify how many workers you need
- Follow spiky traffic patterns and need to scale in a serverless way
- Require loops and jumps to already executed steps (not a DAG)
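The "loops and jumps" point can be illustrated with a small step machine in plain Python. This is a sketch of the control flow only, not the Workflows definition syntax: `run_steps` and all step names are hypothetical, and the job "finishing on the third poll" is a made-up stand-in for a real status check.

```python
def run_steps(steps, start, max_transitions=100):
    """Run named steps; each step returns the next step name, or None to stop."""
    state = {}
    trace = []
    current = start
    while current is not None:
        if len(trace) >= max_transitions:
            raise RuntimeError("too many transitions, possible infinite loop")
        trace.append(current)
        current = steps[current](state)
    return state, trace


def submit_job(state):
    state["polls"] = 0
    return "check_status"


def check_status(state):
    state["polls"] += 1
    # Assumption for the sketch: the job is done on the third poll.
    return "finish" if state["polls"] >= 3 else "wait"


def wait(state):
    # A real workflow would sleep here; the sketch just jumps back
    # to a step that has already executed.
    return "check_status"


def finish(state):
    return None


steps = {
    "submit_job": submit_job,
    "check_status": check_status,
    "wait": wait,
    "finish": finish,
}
state, trace = run_steps(steps, "submit_job")
print(trace)
```

Because `wait` jumps back to `check_status`, the step graph contains a cycle, so it cannot be expressed as a DAG.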
EXAMTOPIC 1
Your data science team has requested a system that supports scheduled model retraining, Docker containers, and a service that supports autoscaling and monitoring for online prediction requests. Which platform components should you choose for this system?
- A. Kubeflow Pipelines and App Engine
- ⭕ B. Kubeflow Pipelines and AI Platform Prediction
→ Kubeflow is a platform for building and deploying portable, scalable machine learning (ML) workflows based on Docker containers. (Pipelines is probably the most commonly used functionality of Kubeflow.)
→ AI Platform Prediction is a service that supports autoscaling and online prediction requests.
- C. Cloud Composer, BigQuery ML, and AI Platform Prediction
- D. Cloud Composer, AI Platform Training with custom containers, and App Engine
- Cloud Composer is NOT suitable if low latency is required between tasks.
- Online prediction requests: latency-sensitive use cases