
ML&PR_4: Multinomial Variables | 2.2.

08/05/2020 23:12

2.2. Multinomial Variables

  • Binary variables can only describe quantities that take one of two possible values. Now I will introduce the case of a discrete variable that can take one of $K$ possible values. For convenience, we represent it by a $K$-dimensional vector $\mathbf{x}$ in which one element $x_k$ equals $1$ and all remaining elements equal $0$. For instance, with $K = 6$ we can represent $x_3 = 1$ as:
    $$\mathbf{x} = (0, 0, 1, 0, 0, 0)^{\mathrm{T}}$$
    Obviously we have $\sum_{k=1}^{K} x_k = 1$.
  • If we denote the probability of $x_k = 1$ by the parameter $\mu_k$, the distribution of $\mathbf{x}$ is:
    $$p(\mathbf{x} \mid \boldsymbol{\mu}) = \prod_{k=1}^{K} \mu_k^{x_k}$$
    where $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_K)^{\mathrm{T}}$, $\mu_k \geq 0$ and $\sum_k \mu_k = 1$. It means:
    $$\sum_{\mathbf{x}} p(\mathbf{x} \mid \boldsymbol{\mu}) = \sum_{k=1}^{K} \mu_k = 1$$
    and:
    $$\mathbb{E}[\mathbf{x} \mid \boldsymbol{\mu}] = \sum_{\mathbf{x}} p(\mathbf{x} \mid \boldsymbol{\mu})\, \mathbf{x} = \boldsymbol{\mu}$$
  • Now consider a data set $\mathcal{D} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ of $N$ independent observations; we can show that the likelihood is:
    $$p(\mathcal{D} \mid \boldsymbol{\mu}) = \prod_{n=1}^{N} \prod_{k=1}^{K} \mu_k^{x_{nk}} = \prod_{k=1}^{K} \mu_k^{m_k}$$
    where $m_k = \sum_{n} x_{nk}$.
  • We can also consider the joint distribution of the quantities $m_1, \ldots, m_K$, conditioned on $\boldsymbol{\mu}$ and on the total number $N$ of observations:
    $$\mathrm{Mult}(m_1, \ldots, m_K \mid \boldsymbol{\mu}, N) = \binom{N}{m_1\, m_2 \ldots m_K} \prod_{k=1}^{K} \mu_k^{m_k}$$
    This is known as the multinomial distribution, where:
    $$\binom{N}{m_1\, m_2 \ldots m_K} = \frac{N!}{m_1!\, m_2! \cdots m_K!}$$
    and note that:
    $$\sum_{k=1}^{K} m_k = N$$
    A quick numerical check of these formulas is sketched below.
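
A minimal Python sketch to check the formulas above numerically; the values of $\boldsymbol{\mu}$ (with $K = 6$), the example vector $\mathbf{x}$, and $N$ are illustrative assumptions, not taken from the text:

```python
import math
from itertools import product

mu = [0.1, 0.2, 0.3, 0.1, 0.2, 0.1]            # assumed: mu_k >= 0, sum_k mu_k = 1
K = len(mu)

def p_x(x, mu):
    """p(x | mu) = prod_k mu_k^{x_k} for a one-hot (1-of-K) vector x."""
    return math.prod(muk ** xk for muk, xk in zip(mu, x))

x = [0, 0, 1, 0, 0, 0]                          # the example vector with x_3 = 1
print(p_x(x, mu))                               # equals mu_3 = 0.3

# Summing over all K one-hot vectors gives sum_k mu_k = 1.
one_hots = [[int(j == k) for j in range(K)] for k in range(K)]
print(sum(p_x(v, mu) for v in one_hots))        # ~1.0

def mult_pmf(m, N, mu):
    """Mult(m_1, ..., m_K | mu, N) = N! / (m_1! ... m_K!) * prod_k mu_k^{m_k}."""
    coef = math.factorial(N)
    for mk in m:
        coef //= math.factorial(mk)             # exact: the coefficient is an integer
    return coef * math.prod(muk ** mk for muk, mk in zip(mu, m))

# The pmf sums to 1 over all count vectors (m_1, ..., m_K) with sum_k m_k = N.
N = 4
print(sum(mult_pmf(m, N, mu)
          for m in product(range(N + 1), repeat=K) if sum(m) == N))   # ~1.0
```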

ML&PR_3: Binary Variables | 2.1.

05/05/2020 14:41

2.1. Binary variables

  • Imagine flipping a coin; the outcome is either 'heads' or 'tails'. We represent it by a variable $x \in \{0, 1\}$, with $x = 1$ representing 'heads' and $x = 0$ representing 'tails'. We call $x$ a binary variable.
  • The probability of $x = 1$ will be denoted by the parameter $\mu$, where $0 \leq \mu \leq 1$:
    $$p(x = 1 \mid \mu) = \mu$$
    and $p(x = 0 \mid \mu) = 1 - \mu$. The probability distribution of $x$ is the Bernoulli distribution:
    $$\mathrm{Bern}(x \mid \mu) = \mu^{x} (1 - \mu)^{1 - x}$$
    And we have:
    $$\mathbb{E}[x] = \mu, \qquad \mathrm{var}[x] = \mu(1 - \mu)$$
  • In general, suppose we flip the coin $N$ times and $x = 1$ appears $m$ times. The distribution of $m$ is called the Binomial distribution:
    $$\mathrm{Bin}(m \mid N, \mu) = \binom{N}{m} \mu^{m} (1 - \mu)^{N - m}$$
    where
    $$\binom{N}{m} = \frac{N!}{(N - m)!\, m!}$$
    And:
    $$\mathbb{E}[m] = N\mu, \qquad \mathrm{var}[m] = N\mu(1 - \mu)$$
    A small numerical check is sketched below.
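
A minimal Python sketch checking the Bernoulli and Binomial formulas above; $\mu = 0.3$ and $N = 10$ are illustrative assumptions:

```python
import math

mu, N = 0.3, 10                                  # assumed illustrative values

def bern(x, mu):
    """Bern(x | mu) = mu^x * (1 - mu)^(1 - x) for x in {0, 1}."""
    return mu ** x * (1 - mu) ** (1 - x)

print(bern(1, mu), bern(0, mu))                  # mu and 1 - mu

def binom(m, N, mu):
    """Bin(m | N, mu) = C(N, m) * mu^m * (1 - mu)^(N - m)."""
    return math.comb(N, m) * mu ** m * (1 - mu) ** (N - m)

pmf = [binom(m, N, mu) for m in range(N + 1)]
mean = sum(m * p for m, p in enumerate(pmf))
var = sum((m - mean) ** 2 * p for m, p in enumerate(pmf))
print(sum(pmf))                                  # ~1.0
print(mean, N * mu)                              # E[m] = N * mu
print(var, N * mu * (1 - mu))                    # var[m] = N * mu * (1 - mu)
```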

UML_5: Agnostic PAC Learning | 3.2.1.

28/04/2020 01:36

3.2.1. Agnostic PAC Learning

  • In previous articles of the UML series, we saw that the realizability assumption requires that there exists $h^{\star} \in \mathcal{H}$ such that $L_{\mathcal{D}, f}(h^{\star}) = 0$. In practice, however, labels are not completely determined by the features. We therefore relax the realizability assumption by replacing the "target labeling function" with something more flexible: a data-labels generating distribution.
  • We redefine $\mathcal{D}$ to be a probability distribution over $\mathcal{X} \times \mathcal{Y}$. So, $\mathcal{D}$ consists of two parts: a marginal distribution $\mathcal{D}_x$ over unlabeled domain points and a conditional probability $\mathcal{D}((x, y) \mid x)$ over labels for each domain point.
  • True Error Revised:
    $$L_{\mathcal{D}}(h) \stackrel{\mathrm{def}}{=} \underset{(x, y) \sim \mathcal{D}}{\mathbb{P}}[h(x) \neq y]$$
  • Goal: We wish to find some hypothesis $h : \mathcal{X} \to \mathcal{Y}$ that minimizes $L_{\mathcal{D}}(h)$.
  • The Bayes Optimal Predictor:
    • Given any probability distribution $\mathcal{D}$ over $\mathcal{X} \times \{0, 1\}$, the best label predicting function from $\mathcal{X}$ to $\{0, 1\}$ is:
      $$f_{\mathcal{D}}(x) = \begin{cases} 1 & \text{if } \mathbb{P}[y = 1 \mid x] \geq 1/2 \\ 0 & \text{otherwise} \end{cases}$$
      A toy sketch of this predictor and the revised true error is given below.
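
A minimal Python sketch of the Bayes optimal predictor and the revised true error, assuming a tiny hand-picked distribution over three domain points (all probabilities below are illustrative assumptions):

```python
from itertools import product

# Toy distribution D over X x {0, 1}: p_x is the marginal D_x over three
# domain points, and eta[x] = P[y = 1 | x] is the conditional label probability.
p_x = {'a': 0.5, 'b': 0.3, 'c': 0.2}
eta = {'a': 0.9, 'b': 0.4, 'c': 0.5}

def bayes_predictor(x):
    """f_D(x) = 1 exactly when P[y = 1 | x] >= 1/2, else 0."""
    return 1 if eta[x] >= 0.5 else 0

def true_error(h):
    """L_D(h) = P_{(x, y) ~ D}[h(x) != y], computed exactly on the toy D."""
    return sum(p_x[x] * ((1 - eta[x]) if h(x) == 1 else eta[x]) for x in p_x)

print(true_error(bayes_predictor))               # the Bayes error

# No deterministic predictor does better: enumerate all 2^3 labelings of X.
best = min(true_error(lambda x, lbl=dict(zip(p_x, labels)): lbl[x])
           for labels in product([0, 1], repeat=len(p_x)))
print(best)                                      # equals the Bayes error above
```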

UML_4: PAC Learning | 3.1.

26/04/2020 23:23

3.1. PAC learning

  • In UML_3, we have shown that for a finite hypothesis class $\mathcal{H}$ and a sufficiently large training sample $S$ (drawn i.i.d. from the distribution $\mathcal{D}$ and labeled by the target function $f$), the output of $\mathrm{ERM}_{\mathcal{H}}(S)$ will be probably approximately correct. Now, we define Probably Approximately Correct (PAC) learning.
  • Definition (PAC learnability): A hypothesis class $\mathcal{H}$ is PAC learnable if there exist a function $m_{\mathcal{H}} : (0, 1)^{2} \to \mathbb{N}$ and a learning algorithm with the following property:
    • For every $\epsilon, \delta \in (0, 1)$, every distribution $\mathcal{D}$ over $\mathcal{X}$, and every labeling function $f : \mathcal{X} \to \{0, 1\}$, if the realizability assumption (introduced in UML_3) holds with respect to $\mathcal{H}, \mathcal{D}, f$, then when running the learning algorithm on $m \geq m_{\mathcal{H}}(\epsilon, \delta)$ i.i.d. examples generated by $\mathcal{D}$ and labeled by $f$, the algorithm returns a hypothesis $h$ such that, with probability of at least $1 - \delta$ (over the choice of the examples), $L_{\mathcal{D}, f}(h) \leq \epsilon$. A small simulation of this guarantee for a finite class is sketched below.
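
A minimal Python simulation of the PAC guarantee for a finite hypothesis class, assuming a toy threshold class on a uniform domain and the finite-class sample-complexity bound $m_{\mathcal{H}}(\epsilon, \delta) = \lceil \ln(|\mathcal{H}| / \delta) / \epsilon \rceil$ from UML_3; every concrete value here is an illustrative assumption:

```python
import math, random

# Toy realizable setup: X = {0, ..., 99} with uniform D, H = threshold
# functions h_t(x) = 1[x >= t], and a target f that belongs to H.
domain = list(range(100))
H = [lambda x, t=t: int(x >= t) for t in range(101)]
f = H[30]

def true_error(h):
    """L_{D, f}(h) under the uniform marginal distribution over the domain."""
    return sum(h(x) != f(x) for x in domain) / len(domain)

def erm(sample):
    """Return some h in H minimizing the empirical error on the sample."""
    return min(H, key=lambda h: sum(h(x) != y for x, y in sample))

eps, delta = 0.1, 0.05
m = math.ceil(math.log(len(H) / delta) / eps)    # m_H(eps, delta) for finite H

# Over many i.i.d. samples of size m, ERM should fail (true error > eps)
# with frequency well below delta.
random.seed(0)
trials = 200
failures = sum(
    true_error(erm([(x, f(x)) for x in random.choices(domain, k=m)])) > eps
    for _ in range(trials))
print(m, failures / trials)                      # failure rate should be < delta
```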

ML&PR_2: Principal Component Analysis: Maximum variance formulation | 12.1.1.

17/04/2020 21:52

12.1. Principal Component Analysis (PCA)

  • PCA is a technique that is widely used for dimensionality reduction, lossy data compression, feature extraction, and data visualization.