UML_4: PAC Learning | 3.1.
3.1. PAC learning
In UML_3, we have shown that for a finite hypothesis class $\mathcal{H}$ and a sufficiently large training sample $S$ (drawn i.i.d. from the distribution $\mathcal{D}$ and labeled by the labeling function $f$), the output of $\mathrm{ERM}_{\mathcal{H}}(S)$ will be probably approximately correct. Now, we define Probably Approximately Correct (PAC) learning.

Definition (PAC learnability): A hypothesis class $\mathcal{H}$ is PAC learnable if there exist a function $m_{\mathcal{H}} : (0,1)^2 \to \mathbb{N}$ and a learning algorithm with the following property: for every $\epsilon, \delta \in (0,1)$, every distribution $\mathcal{D}$ over $\mathcal{X}$, and every labeling function $f : \mathcal{X} \to \{0,1\}$, if the realizability assumption (from UML_3) holds with respect to $\mathcal{H}, \mathcal{D}, f$, then when running the learning algorithm on $m \geq m_{\mathcal{H}}(\epsilon, \delta)$ i.i.d. examples generated by $\mathcal{D}$ and labeled by $f$, the algorithm returns a hypothesis $h$ such that, with probability of at least $1 - \delta$ (over the choice of the examples), we have $L_{(\mathcal{D}, f)}(h) \leq \epsilon$.
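To make the definition concrete, here is a minimal simulation sketch under an assumed toy setup (my own, not from UML): a finite class of threshold classifiers on $[0,1]$, a uniform distribution, and a true labeling function inside the class so that realizability holds. All concrete values (class size, $\epsilon$, $\delta$, number of trials) are illustrative choices.

```python
import numpy as np

# Toy PAC-learning simulation (hypothetical setup, not from UML):
# a finite class H of threshold classifiers h_t(x) = 1[x >= t] on X = [0, 1],
# with the true labeling function f = h_{0.3} inside H, so realizability holds.

rng = np.random.default_rng(0)

thresholds = np.linspace(0.0, 1.0, 101)   # finite class H, |H| = 101
t_star = thresholds[30]                   # true threshold: f = h_{0.3}

def erm(X, y):
    """Return the threshold in H with the smallest empirical error on (X, y)."""
    errors = [np.mean((X >= t).astype(int) != y) for t in thresholds]
    return thresholds[int(np.argmin(errors))]

def true_error(t):
    """L_{(D,f)}(h_t) for D = Uniform[0,1]: the measure of the disagreement region."""
    return abs(t - t_star)

eps, delta = 0.1, 0.05
m = int(np.ceil(np.log(len(thresholds) / delta) / eps))  # the finite-class bound

# Estimate P[L(h) <= eps] over repeated draws of the training sample.
trials, successes = 1000, 0
for _ in range(trials):
    X = rng.uniform(0.0, 1.0, size=m)
    y = (X >= t_star).astype(int)
    successes += true_error(erm(X, y)) <= eps

print(f"m = {m} samples; empirical P[error <= {eps}] = {successes / trials:.3f}")
# The PAC guarantee says this probability should be at least 1 - delta = 0.95.
```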
In PAC learning, we have two parameters: the accuracy parameter $\epsilon$ determines how far the output classifier can be from the optimal one, and the confidence parameter $\delta$ indicates how likely the classifier is to meet the accuracy requirement $\epsilon$.
Sample complexity:

The function $m_{\mathcal{H}} : (0,1)^2 \to \mathbb{N}$ determines the sample complexity of learning $\mathcal{H}$: how many examples are required to guarantee the PAC property. It depends on $\epsilon$, $\delta$, and on properties of $\mathcal{H}$ such as its size. If $\mathcal{H}$ is PAC learnable, there are many functions $m_{\mathcal{H}}$ that satisfy the requirement given in the definition. Thus, we define the sample complexity of learning $\mathcal{H}$ to be the minimal such function: for any $\epsilon, \delta$, $m_{\mathcal{H}}(\epsilon, \delta)$ is the minimal integer that satisfies the requirement of PAC learning. Recalling the conclusion of the analysis of finite hypothesis classes in UML_3, we now have:

Corollary: Every finite hypothesis class $\mathcal{H}$ is PAC learnable with sample complexity

$$m_{\mathcal{H}}(\epsilon, \delta) \leq \left\lceil \frac{\log(|\mathcal{H}|/\delta)}{\epsilon} \right\rceil .$$
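As a quick numerical illustration of the corollary (all concrete values below are arbitrary example choices), evaluating the bound for a few class sizes shows how mildly it depends on $|\mathcal{H}|$:

```python
import math

# Evaluate the corollary's bound m_H(eps, delta) <= ceil(log(|H|/delta) / eps).
# The class sizes and (eps, delta) values are illustrative, not from UML.
def sample_complexity_bound(h_size, eps, delta):
    return math.ceil(math.log(h_size / delta) / eps)

for h_size in (100, 10_000, 1_000_000):
    m = sample_complexity_bound(h_size, eps=0.1, delta=0.05)
    print(f"|H| = {h_size:>9,} -> m <= {m}")
# |H| grows by a factor of 10,000 from the first line to the last,
# yet the bound only roughly doubles (77 -> 169): the dependence on the
# class size is logarithmic, while the dependence on 1/eps is linear.
```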