OPT_2: Gradient descent convergence rate (P1)

21/12/2021 15:50

Gradient descent convergence guarantee

  • Given an objective function $f$, where $x$ denotes the parameters (weights), assume $f$ satisfies:
    • Lipschitz continuous gradient: $\|\nabla f(x) - \nabla f(y)\| \le L \|x - y\|$ for all $x, y$
    • Convex function: $f(y) \ge f(x) + \nabla f(x)^\top (y - x)$ for all $x, y$
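Under these two assumptions, gradient descent with step size $\eta = 1/L$ satisfies $f(x_T) - f(x^*) \le \frac{L \|x_0 - x^*\|^2}{2T}$. A minimal sketch checking this bound on a toy quadratic (the function and all names below are illustrative, not from the post):

```python
# Gradient descent on the L-smooth convex quadratic
# f(x, y) = x^2 + 10*y^2, whose gradient is (2x, 20y) and whose
# smoothness constant is L = 20 (largest Hessian eigenvalue).

def f(x, y):
    return x**2 + 10.0 * y**2

def grad_f(x, y):
    return 2.0 * x, 20.0 * y

def gradient_descent(x, y, step=1.0 / 20.0, iters=100):
    history = [f(x, y)]
    for _ in range(iters):
        gx, gy = grad_f(x, y)
        x, y = x - step * gx, y - step * gy
        history.append(f(x, y))
    return (x, y), history

(xT, yT), history = gradient_descent(5.0, 5.0)

# With step 1/L the objective never increases, and after T iterations
# f(x_T) - f(x*) <= L * ||x_0 - x*||^2 / (2T)   (here f(x*) = 0).
T = len(history) - 1
bound = 20.0 * (5.0**2 + 5.0**2) / (2 * T)
print(history[-1], bound)
```

With step size exactly $1/L$ the coordinate with the largest curvature is eliminated in a single step, while the flatter coordinate contracts geometrically.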

UML_6: Learning via Uniform Convergence

24/11/2021 09:36

UML I.4 Learning via Uniform Convergence

  • Recall that in previous posts we discussed the realizable assumption and ERM learning. We hope that a hypothesis $h$, when minimizing the empirical risk $L_S(h)$ on the training set $S$, also does well with respect to the true risk $L_{\mathcal{D}}(h)$. In other words, we need the empirical risks of all members of $\mathcal{H}$ to be good approximations of their true risks.
  • Def 1 ($\epsilon$-representative sample): A training set $S$ is called $\epsilon$-representative (w.r.t. the hypothesis class $\mathcal{H}$, the loss function, and the distribution $\mathcal{D}$) if $\forall h \in \mathcal{H}, \; |L_S(h) - L_{\mathcal{D}}(h)| \le \epsilon$.
  • Lemma 1: Assume that a training set $S$ is $\frac{\epsilon}{2}$-representative. Then any output of the ERM rule, i.e., any $h_S \in \operatorname{argmin}_{h \in \mathcal{H}} L_S(h)$, satisfies $L_{\mathcal{D}}(h_S) \le \min_{h \in \mathcal{H}} L_{\mathcal{D}}(h) + \epsilon$.
    • This lemma implies that to show the ERM rule is an agnostic PAC learner, it suffices to show that with probability of at least $1 - \delta$ over the random choice of a training set $S$, it will be an $\frac{\epsilon}{2}$-representative training set.
    • Proof sketch: for every $h \in \mathcal{H}$, $L_{\mathcal{D}}(h_S) \le L_S(h_S) + \frac{\epsilon}{2} \le L_S(h) + \frac{\epsilon}{2} \le L_{\mathcal{D}}(h) + \epsilon$.
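A minimal numerical sketch of this uniform-convergence effect — the hypothesis class, distribution, and all names below are illustrative, not from the post:

```python
# Finite class of threshold hypotheses h_t(x) = 1[x >= t];
# data x ~ Uniform[0, 1], true label y = 1[x >= 0.5],
# so the true risk of h_t is exactly |t - 0.5|.
import random

thresholds = [i / 20 for i in range(21)]   # the finite class H

def true_risk(t):
    return abs(t - 0.5)

def sup_gap(m, seed=0):
    """sup over h in H of |L_S(h) - L_D(h)| for a sample of size m."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(m)]
    gap = 0.0
    for t in thresholds:
        # empirical risk: fraction of sample points h_t misclassifies
        errs = sum(((x >= t) != (x >= 0.5)) for x in xs) / m
        gap = max(gap, abs(errs - true_risk(t)))
    return gap

# As m grows, every hypothesis in H is simultaneously well estimated,
# which is exactly the eps-representative property.
print(sup_gap(100), sup_gap(20000))
```

For a finite class, a union bound over the 21 hypotheses plus Hoeffding's inequality gives the sample-size requirement; the simulation just makes the shrinking sup-gap visible.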

OPT_1: Gradient descent on convex function

02/07/2021 03:50

  • As we know, in the gradient descent method we shift the parameters along the descent direction, i.e., against the gradient. In this post, we denote the update as:
    $$x_{t+1} = x_t - \eta \nabla f(x_t)$$
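The gradient descent update can be sketched in code as follows (the names and the example function are illustrative):

```python
# One gradient descent step: move each coordinate against the gradient,
# scaled by the learning rate eta.
def gd_step(x, f_grad, eta=0.25):
    g = f_grad(x)
    return [xi - eta * gi for xi, gi in zip(x, g)]

# Example on f(x) = sum(x_i^2), whose gradient is 2x:
x = [1.0, -2.0]
x = gd_step(x, lambda v: [2.0 * vi for vi in v])
print(x)   # → [0.5, -1.0]: each coordinate shrinks toward the minimizer 0
```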

MATH_7: From Binomial to Poisson distribution

22/04/2021 02:59


1. Binomial distribution
  • Problem: Toss a coin $n$ times; let $p$ be the probability of the event "obtain a head" and $q$ the probability of "obtain a tail", where $p + q = 1$. The probability of obtaining a head $k$ times is:
    $$P(X = k) = \binom{n}{k} p^k q^{n-k}$$
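The transition the post's title refers to can be checked numerically: holding $np = \lambda$ fixed and letting $n \to \infty$, the binomial pmf approaches the Poisson pmf $\frac{\lambda^k e^{-\lambda}}{k!}$. A small sketch (names are illustrative):

```python
# Binomial(n, p) pmf vs Poisson(lam) pmf with n*p = lam held fixed.
from math import comb, exp, factorial

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    return lam**k * exp(-lam) / factorial(k)

lam = 3.0
gaps = {}
for n in (10, 100, 10000):
    p = lam / n
    # largest pointwise pmf difference over the first few outcomes
    gaps[n] = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam))
                  for k in range(10))
    print(n, gaps[n])
```

The gap shrinks roughly like $1/n$, which is why the Poisson distribution is the standard model for "many trials, each with a tiny success probability".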

CVX_3: Operations that preserve convexity

15/04/2021 19:20

In this post, we study the operations on sets that preserve convexity.

1. Intersection
  • Theorem 1: If the sets $S_1, \dots, S_k$ are all convex, then $\bigcap_{i=1}^{k} S_i$ is convex. (The same holds for the intersection of any, possibly infinite, family of convex sets.)
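A numerical sanity check of the theorem on two concrete convex sets — the unit disk and a half-plane (the sets and names are illustrative):

```python
# Sample points from the intersection of two convex sets, then verify
# that convex combinations of members never leave the intersection.
import random

def in_disk(p):
    return p[0]**2 + p[1]**2 <= 1.0          # convex: unit disk

def in_halfplane(p):
    return p[0] + p[1] >= 0.0                # convex: half-plane

def in_intersection(p):
    return in_disk(p) and in_halfplane(p)

rng = random.Random(0)
points = []
while len(points) < 200:                     # rejection-sample members
    p = (rng.uniform(-1, 1), rng.uniform(-1, 1))
    if in_intersection(p):
        points.append(p)

# Every convex combination t*a + (1-t)*b of two members must stay in.
ok = True
for _ in range(1000):
    a, b = rng.choice(points), rng.choice(points)
    t = rng.random()
    q = (t * a[0] + (1 - t) * b[0], t * a[1] + (1 - t) * b[1])
    ok = ok and in_intersection(q)
print(ok)
```

The check passes because each set is convex on its own, so a segment between two intersection points lies in both sets simultaneously — which is exactly the one-line proof of Theorem 1.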