Training performance - I/O bound, CPU bound, Memory bound

Training performance

Tune performance to reduce training time, reduce cost, and increase scale.

Model training performance is bound by three constraints: I/O, CPU, and memory.

Constraint: I/O (input/output) bound
  • Commonly occurs when: large number of inputs; input requires parsing (heterogeneous data); small model; input data sits on a storage system with low throughput
  • Actions to improve performance: store the data efficiently; parallelize reads; consider the batch size (see the input-pipeline sketch below)
Constraint: CPU bound
  • Commonly occurs when: expensive computations; underpowered hardware
  • Actions to improve performance: train on a faster accelerator; upgrade the processor (GPUs); run on TPUs; simplify the model
Constraint: Memory bound
  • Commonly occurs when: large number of inputs; complex model
  • Actions to improve performance: add more memory; use fewer layers; reduce the batch size
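
For the I/O-bound case, the sketch below shows what the three actions look like in a tf.data input pipeline: sharded TFRecord files are read in parallel, parsing is parallelized, and the batch size is an explicit knob. This is a minimal sketch for illustration; the gs://my-bucket/train/*.tfrecord path, the image/label feature names, and the batch size of 256 are assumptions, not values from the source.

```python
import tensorflow as tf

AUTOTUNE = tf.data.AUTOTUNE
BATCH_SIZE = 256  # assumed value; tune it, keeping the memory-bound notes below in mind

def parse_example(serialized):
    # Heterogeneous inputs need parsing; do it once per record, in parallel.
    features = {
        "image": tf.io.FixedLenFeature([], tf.string),  # assumed feature names
        "label": tf.io.FixedLenFeature([], tf.int64),
    }
    parsed = tf.io.parse_single_example(serialized, features)
    image = tf.io.decode_jpeg(parsed["image"], channels=3)
    image = tf.image.convert_image_dtype(image, tf.float32)  # uint8 -> float32 in [0, 1]
    image = tf.image.resize(image, [224, 224])               # fixed shape so examples can be batched
    return image, parsed["label"]

# "Store efficiently": sharded TFRecord files on high-throughput storage (path is hypothetical).
filenames = tf.io.gfile.glob("gs://my-bucket/train/*.tfrecord")

dataset = (
    tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTOTUNE)  # "parallelize reads"
    .map(parse_example, num_parallel_calls=AUTOTUNE)                 # parallel parsing
    .shuffle(10_000)
    .batch(BATCH_SIZE, drop_remainder=True)                          # "consider batch size"
    .prefetch(AUTOTUNE)  # overlap input preparation with the training step
)
```

Prefetching overlaps the input pipeline with the training step, which is usually the cheapest way to hide any remaining I/O latency.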
  1. I/O (input/output) bound: how fast can you get data into the model in each training step?
  2. CPU bound: how fast can you compute the gradient in each training step?
    Accelerators such as GPUs and TPUs can radically reduce the time required to execute a single training step (see the strategy sketch after this list).
    • GPUs
    • The TPU option on Google Cloud
    • Train a simpler model
      1. Use a less computationally expensive activation function
      2. Train for fewer steps
  3. Memory bound: how many weights can you hold in memory, so that the matrix multiplications can run in-memory on the GPU or TPU?
    • Use a less complex model
    • Reduce the batch size (see the batch-size sketch after this list)
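
For the CPU-bound case, a common pattern on Google Cloud is to let the training job pick the fastest hardware it can find and build the model under the matching tf.distribute strategy. This is a minimal sketch, assuming TensorFlow 2.x and an auto-detectable accelerator (e.g. a Cloud TPU VM or a GPU machine); the small Dense model and its sizes are placeholders, not from the source.

```python
import tensorflow as tf

def pick_strategy():
    """Return a distribution strategy for the fastest hardware found: TPU, then GPU(s), then CPU."""
    try:
        # On Google Cloud (e.g. a Cloud TPU VM) the resolver can auto-detect the TPU;
        # elsewhere it typically raises, and we fall through to the GPU/CPU checks.
        resolver = tf.distribute.cluster_resolver.TPUClusterResolver()
        tf.config.experimental_connect_to_cluster(resolver)
        tf.tpu.experimental.initialize_tpu_system(resolver)
        return tf.distribute.TPUStrategy(resolver)
    except (ValueError, tf.errors.NotFoundError):
        pass
    if tf.config.list_physical_devices("GPU"):
        return tf.distribute.MirroredStrategy()  # one or more local GPUs
    return tf.distribute.get_strategy()          # default strategy: plain CPU training

strategy = pick_strategy()

with strategy.scope():
    # Variables created inside the scope are placed on the accelerator.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),  # relu is cheaper than e.g. tanh/sigmoid
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )
```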
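For the memory-bound case, the two knobs are the model size and the batch size. The sketch below only makes them concrete: build_model shrinks the network via num_layers/units, and the loop halves the batch size when TensorFlow reports an out-of-memory error. The make_dataset helper, the 10 output classes, and the default sizes are hypothetical, not from the source.

```python
import tensorflow as tf

def build_model(num_layers=4, units=512):
    """Fewer / smaller layers mean fewer weights to hold in accelerator memory."""
    hidden = [tf.keras.layers.Dense(units, activation="relu") for _ in range(num_layers)]
    return tf.keras.Sequential(hidden + [tf.keras.layers.Dense(10)])

def fit_with_smaller_batches(make_dataset, batch_size=1024, min_batch_size=32):
    """Halve the batch size whenever the accelerator runs out of memory.

    make_dataset(batch_size) is a hypothetical helper that returns a batched
    tf.data.Dataset of (features, labels); it stands in for whatever input
    pipeline the training job actually uses.
    """
    while batch_size >= min_batch_size:
        try:
            model = build_model()
            model.compile(
                optimizer="adam",
                loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
            )
            model.fit(make_dataset(batch_size), epochs=1)
            return model, batch_size
        except tf.errors.ResourceExhaustedError:
            batch_size //= 2  # memory bound: reduce the batch size and try again
    raise RuntimeError("Model did not fit in memory even at the minimum batch size.")
```

In practice an out-of-memory error can leave the accelerator in a bad state, so lowering the configured batch size (or the number of layers) up front is usually preferable; the retry loop is only meant to make the trade-off concrete.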