ML&PR_4: Multinomial Variables | 2.2.
08/05/2020 23:12
2.2. Multinomial Variables
- Binary variables can only describe quantities that take one of two possible values. Now consider a discrete variable that can take one of $K$ possible values. It is convenient to represent it by a $K$-dimensional vector $\mathbf{x}$ in which one element $x_k$ equals $1$ and all remaining elements equal $0$.
For instance, with $K = 6$ and $x_3 = 1$, we represent $\mathbf{x}$ as:
$$\mathbf{x} = (0, 0, 1, 0, 0, 0)^{\mathrm{T}}.$$
Obviously such vectors satisfy $\sum_{k=1}^{K} x_k = 1$.
- If we denote the probability of $x_k = 1$ by the parameter $\mu_k$, the distribution of $\mathbf{x}$ is:
$$p(\mathbf{x} \mid \boldsymbol{\mu}) = \prod_{k=1}^{K} \mu_k^{x_k},$$
where $\boldsymbol{\mu} = (\mu_1, \ldots, \mu_K)^{\mathrm{T}}$, $\mu_k \geq 0$ and $\sum_{k} \mu_k = 1$. It means:
$$\sum_{\mathbf{x}} p(\mathbf{x} \mid \boldsymbol{\mu}) = \sum_{k=1}^{K} \mu_k = 1,$$
and:
$$\mathbb{E}[\mathbf{x} \mid \boldsymbol{\mu}] = \sum_{\mathbf{x}} p(\mathbf{x} \mid \boldsymbol{\mu})\, \mathbf{x} = (\mu_1, \ldots, \mu_K)^{\mathrm{T}} = \boldsymbol{\mu}.$$
(A quick numerical illustration follows below.)
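As a quick illustration (the specific numbers here are assumed for the example, not taken from the original post), take $K = 3$ and $\boldsymbol{\mu} = (0.2, 0.3, 0.5)^{\mathrm{T}}$; then for the one-hot observation $\mathbf{x} = (0, 1, 0)^{\mathrm{T}}$:
$$p(\mathbf{x} \mid \boldsymbol{\mu}) = 0.2^{0} \cdot 0.3^{1} \cdot 0.5^{0} = 0.3, \qquad \mathbb{E}[\mathbf{x} \mid \boldsymbol{\mu}] = (0.2, 0.3, 0.5)^{\mathrm{T}} = \boldsymbol{\mu}.$$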
- Now consider a data set $\mathcal{D} = \{\mathbf{x}_1, \ldots, \mathbf{x}_N\}$ of $N$ independent observations. We can show that the likelihood is:
$$p(\mathcal{D} \mid \boldsymbol{\mu}) = \prod_{n=1}^{N} \prod_{k=1}^{K} \mu_k^{x_{nk}} = \prod_{k=1}^{K} \mu_k^{m_k},$$
where $m_k = \sum_{n=1}^{N} x_{nk}$ is the number of observations with $x_k = 1$.
- We can consider the joint distribution of the quantities $m_1, \ldots, m_K$, conditioned on the parameters $\boldsymbol{\mu}$ and on the total number $N$ of observations:
$$\mathrm{Mult}(m_1, \ldots, m_K \mid \boldsymbol{\mu}, N) = \binom{N}{m_1\, m_2 \cdots m_K} \prod_{k=1}^{K} \mu_k^{m_k}.$$
This formula is known as the multinomial distribution, where:
$$\binom{N}{m_1\, m_2 \cdots m_K} = \frac{N!}{m_1!\, m_2! \cdots m_K!},$$
and note that:
$$\sum_{k=1}^{K} m_k = N.$$
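Below is a minimal sketch (with example values of my own, not from the original post) that evaluates the multinomial formula above directly and cross-checks it against `scipy.stats.multinomial`:

```python
import numpy as np
from math import factorial
from scipy.stats import multinomial

mu = np.array([0.2, 0.3, 0.5])   # assumed parameters mu_k, summing to 1
N = 10
m = np.array([2, 3, 5])          # observed counts m_k, with sum(m) == N

# Closed-form multinomial probability: N! / (m_1! ... m_K!) * prod(mu_k ** m_k)
coef = factorial(N) / np.prod([factorial(int(mk)) for mk in m])
pmf_manual = coef * np.prod(mu ** m)

# Cross-check against scipy's implementation
pmf_scipy = multinomial.pmf(m, n=N, p=mu)
print(pmf_manual, pmf_scipy)     # both ≈ 0.08505
```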
ML&PR_3: Binary Variables | 2.1.
05/05/2020 14:41
2.1. Binary variables
- Imagine flipping a coin: the outcome is either 'heads' or 'tails'. We represent it by a binary variable $x \in \{0, 1\}$, with $x = 1$ representing 'heads' and $x = 0$ representing 'tails'.
- The probability of $x = 1$ is denoted by the parameter $\mu$, where $0 \leq \mu \leq 1$:
$$p(x = 1 \mid \mu) = \mu,$$
and $p(x = 0 \mid \mu) = 1 - \mu$. The probability distribution of $x$ is the Bernoulli distribution:
$$\mathrm{Bern}(x \mid \mu) = \mu^{x} (1 - \mu)^{1 - x},$$
and we have:
$$\mathbb{E}[x] = \mu, \qquad \mathrm{var}[x] = \mu(1 - \mu).$$
- In general, when we flip the coin $N$ times, the number of times $m$ that 'heads' ($x = 1$) appears follows the binomial distribution:
$$\mathrm{Bin}(m \mid N, \mu) = \binom{N}{m} \mu^{m} (1 - \mu)^{N - m},$$
where
$$\binom{N}{m} = \frac{N!}{(N - m)!\, m!},$$
and:
$$\mathbb{E}[m] = N\mu, \qquad \mathrm{var}[m] = N\mu(1 - \mu).$$
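A small sanity-check sketch (the parameter values are assumed for illustration) that draws Bernoulli and binomial samples with numpy and compares the empirical mean and variance with the formulas above:

```python
import numpy as np

rng = np.random.default_rng(0)
mu, N = 0.3, 20                  # assumed parameter values for the sketch

# Bernoulli: single coin flips, E[x] = mu, var[x] = mu * (1 - mu)
flips = rng.binomial(n=1, p=mu, size=100_000)
print(flips.mean(), flips.var())         # ≈ 0.3, ≈ 0.21

# Binomial: number of heads m in N flips, E[m] = N*mu, var[m] = N*mu*(1 - mu)
heads = rng.binomial(n=N, p=mu, size=100_000)
print(heads.mean(), heads.var())         # ≈ 6.0, ≈ 4.2
```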
UML_5: Agnostic PAC Learning | 3.2.1.
28/04/2020 01:36
3.2.1. Agnostic PAC Learning
- In previous articles of the UML series, we saw that the realizability assumption requires that there exist $h^{\star} \in \mathcal{H}$ such that $L_{\mathcal{D}, f}(h^{\star}) = 0$. In practice, labels are not completely determined by the features. We therefore relax the realizability assumption by replacing the "target labeling function" with something more flexible: a data-labels generating distribution.
- We redefine $\mathcal{D}$ to be a probability distribution over $\mathcal{X} \times \mathcal{Y}$. So $\mathcal{D}$ consists of two parts: a marginal distribution $\mathcal{D}_x$ over unlabeled domain points and a conditional probability $\mathcal{D}((x, y) \mid x)$ over labels for each domain point.
- True Error Revised:
$$L_{\mathcal{D}}(h) \overset{\mathrm{def}}{=} \underset{(x, y) \sim \mathcal{D}}{\mathbb{P}}\left[h(x) \neq y\right] \overset{\mathrm{def}}{=} \mathcal{D}\left(\{(x, y) : h(x) \neq y\}\right).$$
- Goal: we wish to find some hypothesis $h : \mathcal{X} \to \mathcal{Y}$ that (probably approximately) minimizes $L_{\mathcal{D}}(h)$.
- The Bayes Optimal Predictor:
- Given any probability distribution $\mathcal{D}$ over $\mathcal{X} \times \{0, 1\}$, the best labeling function is:
$$f_{\mathcal{D}}(x) = \begin{cases} 1 & \text{if } \mathbb{P}[y = 1 \mid x] \geq 1/2, \\ 0 & \text{otherwise.} \end{cases}$$
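A minimal sketch of this thresholding rule, assuming we are handed the true conditional probability $\mathbb{P}[y = 1 \mid x]$ as a Python callable (the function names and the example conditional below are hypothetical, chosen only for illustration):

```python
from typing import Callable

def bayes_optimal_predictor(p_y1_given_x: Callable[[float], float]) -> Callable[[float], int]:
    """Build the Bayes-optimal labeling rule: predict 1 iff P[y=1 | x] >= 1/2."""
    def f(x: float) -> int:
        return 1 if p_y1_given_x(x) >= 0.5 else 0
    return f

# Hypothetical conditional: the label is more likely to be 1 as x grows
p = lambda x: min(max(x, 0.0), 1.0)
f = bayes_optimal_predictor(p)
print(f(0.2), f(0.8))    # -> 0 1
```

No other classifier can achieve a lower true error than this predictor, which is why it serves as the benchmark in the agnostic setting.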
UML_4: PAC Learning | 3.1.
26/04/2020 23:23
3.1. PAC learning
- In UML_3, we have shown that for a finite hypothesis class $\mathcal{H}$ and a sufficiently large training sample $S$ (drawn i.i.d. from the distribution $\mathcal{D}$ and labeled by the labeling function $f$), the output of $\mathrm{ERM}_{\mathcal{H}}$ will be probably approximately correct. Now, we define Probably Approximately Correct (PAC) learning.
- Definition (PAC learnability): A hypothesis class $\mathcal{H}$ is PAC learnable if there exist a function $m_{\mathcal{H}} : (0, 1)^2 \to \mathbb{N}$ and a learning algorithm with the following property:
- For every $\epsilon, \delta \in (0, 1)$, every distribution $\mathcal{D}$ over $\mathcal{X}$, and every labeling function $f : \mathcal{X} \to \{0, 1\}$, if the realizability assumption (introduced in UML_3) holds with respect to $\mathcal{H}, \mathcal{D}, f$, then when running the learning algorithm on $m \geq m_{\mathcal{H}}(\epsilon, \delta)$ i.i.d. examples generated by $\mathcal{D}$ and labeled by $f$, the algorithm returns a hypothesis $h$ such that, with probability of at least $1 - \delta$ (over the choice of the examples), we have $L_{\mathcal{D}, f}(h) \leq \epsilon$.
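To connect this definition back to UML_3's result for finite classes (this is the standard corollary from Understanding Machine Learning, stated here as a reminder rather than quoted from the original post): every finite hypothesis class is PAC learnable, with sample complexity
$$m_{\mathcal{H}}(\epsilon, \delta) \leq \left\lceil \frac{\ln\left(|\mathcal{H}| / \delta\right)}{\epsilon} \right\rceil.$$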
ML&PR_2: Principal Component Analysis: Maximum variance formulation | 12.1.1.
17/04/2020 21:52
12.1. Principal Component Analysis (PCA)
- PCA is a technique widely used for dimensionality reduction, lossy data compression, feature extraction, and data visualization.
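A minimal usage sketch of PCA for dimensionality reduction (using scikit-learn and random toy data purely for illustration; this code is not from the original post):

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))           # 200 samples with 10 features (toy data)

# Project the data onto the 2 directions of maximum variance
pca = PCA(n_components=2)
Z = pca.fit_transform(X)

print(Z.shape)                           # (200, 2)
print(pca.explained_variance_ratio_)     # fraction of variance captured by each component
```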