Glossary

Generator

The subsystem within a generative adversarial network that creates new examples. Contrast with discriminative model.

Generative model

Practically speaking, a model that does either of the following:

  • Creates (generates) new examples from the training dataset. For example, a generative model could create poetry after training on a dataset of poems. The generator part of a generative adversarial network falls into this category.
  • Determines the probability that a new example comes from the training set, or was created from the same mechanism that created the training set. For example, after training on a dataset consisting of English sentences, a generative model could determine the probability that new input is a valid English sentence.

A generative model can theoretically discern the distribution of examples or particular features in a dataset. That is:

p(examples)
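As a rough illustration of both behaviors described above, the sketch below fits a word-level bigram model to a tiny, made-up corpus, then uses it to score a new sentence (an estimate of p(example)) and to generate a new one. The corpus, the function names, and the add-one smoothing are all assumptions chosen for illustration, and the sketch does not handle out-of-vocabulary words properly.

```python
import math
import random
from collections import Counter, defaultdict

def train_bigram(sentences):
    """Count word bigrams; <s> and </s> mark sentence boundaries."""
    counts = defaultdict(Counter)
    vocab = {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
        vocab.update(tokens[1:])
        for prev, cur in zip(tokens, tokens[1:]):
            counts[prev][cur] += 1
    return counts, vocab

def log_prob(sentence, counts, vocab):
    """Log-probability of a sentence under the model (add-one smoothing)."""
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    total = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        numerator = counts[prev][cur] + 1
        denominator = sum(counts[prev].values()) + len(vocab)
        total += math.log(numerator / denominator)
    return total

def generate(counts, rng, max_len=20):
    """Sample a new sentence one word at a time from the bigram counts."""
    tokens, prev = [], "<s>"
    for _ in range(max_len):
        words = list(counts[prev].keys())
        weights = list(counts[prev].values())
        cur = rng.choices(words, weights=weights)[0]
        if cur == "</s>":
            break
        tokens.append(cur)
        prev = cur
    return " ".join(tokens)

# Hypothetical toy training set of English sentences.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "a cat chased the dog",
]
counts, vocab = train_bigram(corpus)

plausible = log_prob("the dog sat on the mat", counts, vocab)
scrambled = log_prob("mat the on sat dog the", counts, vocab)
sample = generate(counts, random.Random(0))

print("log p(plausible sentence):", plausible)
print("log p(scrambled sentence):", scrambled)
print("generated sentence:", sample)
```

The plausible sentence receives a higher log-probability than the scrambled one because every one of its bigrams was seen during training.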

Many unsupervised learning models are generative.

Contrast with discriminative models.

GPT (Generative Pre-trained Transformer)

A family of Transformer-based large language models developed by OpenAI. GPT variants can apply to multiple modalities, including:

  • Image generation (for example, ImageGPT)
  • Text-to-image generation (for example, DALL-E)

Cross Validation (CV)

Cross-Validation

  • Common types of cross-validation: k-fold cross-validation and hold-out cross-validation
  • CV procedure
    1. Split the dataset into two subsets: a training set and a test set.
    2. For parameter tuning, split the training set into two subsets: a training subset and a validation set.
      • Train the model on the training subset.
      • Choose the parameters that minimize the error on the validation set.
    3. Train the model on the full training set using the chosen parameters, and record the error on the test set.
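The three steps above can be sketched in a few lines. This is a minimal illustration, not a reference implementation: the synthetic data, the 1-D k-nearest-neighbor regressor, the split sizes, and the candidate values of k are all assumptions made for the example.

```python
import random

def knn_predict(train, x, k):
    """Average the targets of the k training points nearest to x."""
    nearest = sorted(train, key=lambda pair: abs(pair[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)

def mean_squared_error(train, evaluation, k):
    """Mean squared error of k-NN predictions on an evaluation set."""
    errors = [(knn_predict(train, x, k) - y) ** 2 for x, y in evaluation]
    return sum(errors) / len(errors)

# Synthetic data (an assumption): y = x^2 plus Gaussian noise.
rng = random.Random(0)
data = []
for _ in range(200):
    x = rng.uniform(-3, 3)
    data.append((x, x * x + rng.gauss(0, 0.5)))
rng.shuffle(data)

# Step 1: split the dataset into a training set and a test set.
train_full, test = data[:160], data[160:]

# Step 2: split the training set into a training subset and a validation set,
# then choose the parameter (k) with the lowest validation error.
train_sub, validation = train_full[:120], train_full[120:]
candidate_ks = [1, 3, 5, 10, 20, 40]
validation_errors = {
    k: mean_squared_error(train_sub, validation, k) for k in candidate_ks
}
best_k = min(validation_errors, key=validation_errors.get)

# Step 3: "retrain" on the full training set with the chosen parameter
# (k-NN simply stores the data) and record the error on the test set.
test_error = mean_squared_error(train_full, test, best_k)

print("chosen k:", best_k)
print("test MSE:", test_error)
```

The test set is touched only once, in step 3, so the recorded error is not biased by the parameter search.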

Nested Cross-Validation

  • Traditional CV methods should not be used on time-series (TS) data, for two reasons: temporal dependencies between observations, and the arbitrary choice of test set.
  • Nested CV procedure
    1. Predict Second Half
    2. Day Forward-Chaining
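One possible reading of these two procedures, expressed as index splits that always keep training data earlier in time than validation and test data. The function names, the validation fraction, and the placement of the validation set (directly before the test period) are assumptions made for this sketch; the notes above do not specify them.

```python
def predict_second_half(n_samples, validation_fraction=0.2):
    """Train on the first half (in time order) and test on the second half.

    The last part of the first half is held out as the validation set,
    so validation data always comes after the training subset.
    """
    half = n_samples // 2
    n_val = max(1, int(half * validation_fraction))
    train = list(range(0, half - n_val))
    validation = list(range(half - n_val, half))
    test = list(range(half, n_samples))
    return train, validation, test

def day_forward_chaining(n_days, min_train_days=1):
    """Yield (train_days, validation_day, test_day) splits.

    Each day in turn becomes the test set; the day before it is the
    validation set, and all earlier days form the training subset.
    """
    for test_day in range(min_train_days + 1, n_days):
        validation_day = test_day - 1
        train_days = list(range(0, validation_day))
        yield train_days, validation_day, test_day

half_split = predict_second_half(10)
chained_splits = list(day_forward_chaining(5))

print(half_split)
for split in chained_splits:
    print(split)
```

With forward-chaining, the model error is typically averaged over all the splits, which avoids relying on a single arbitrarily chosen test set.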

Multimodal Learning

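A learning approach that mimics human cognitive learning by training on data in several forms (modalities): multiple datasets (modalities) with different feature dimensions are brought together and learned from simultaneously. (Its importance has grown with the need for integrated data analysis.)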

Multimodal

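  • Multimodal: multiple modalities are present
    Modality: the way data collected from a particular source is represented
    → Multimodal data: data collected from several different sources that together describe a single piece of information

  • e.g., for a single semiconductor observation, sensor signals, images, and text data are collected and analyzed together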
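To make the semiconductor example concrete, the sketch below represents one observation that carries three modalities with different dimensions, then joins them into a single feature vector. The class name, the field names, the sample values, and the choice of simple concatenation are all hypothetical; concatenation is only one of several ways to combine modalities and is not prescribed by the notes above.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class WaferObservation:
    """One observation described by three modalities (hypothetical fields)."""
    sensor_signal: List[float]      # time series from a sensor
    image_pixels: List[List[int]]   # grayscale image, values 0-255
    inspection_note: str            # free-text note

def to_feature_vector(obs, vocabulary):
    """Encode each modality separately, then concatenate the results."""
    sensor_features = list(obs.sensor_signal)
    image_features = [p / 255 for row in obs.image_pixels for p in row]
    words = obs.inspection_note.lower().split()
    text_features = [float(words.count(w)) for w in vocabulary]
    return sensor_features + image_features + text_features

observation = WaferObservation(
    sensor_signal=[0.91, 0.87, 0.95],
    image_pixels=[[0, 128], [255, 64]],
    inspection_note="scratch near edge",
)
features = to_feature_vector(observation, vocabulary=["scratch", "particle"])

print(len(features), features)
```

Each modality keeps its own dimensionality (3 sensor readings, 4 pixels, 2 word counts) until the final concatenation step.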