https://d2l.ai/chapter_linear-classification/generalization-classification.html

popullation (spelling)

Fix the order for *"A number of burning questions demand immediate attention: 1 … 1… 1… "*

Can someone help me understand the problem of false discovery discussed in 4.6.2? I don't understand why using the same test set with multiple classifiers/models is a problem. As long as hyperparameter tuning is done on a validation set, what is the problem with benchmarking the performance of different models on the same test set? Similarly, what exactly is adaptive overfitting, and how is it different from the problem of false discovery?

“No such ~~a~~ uniform convergence result could possibly hold.”

# Addition about Hoeffding’s inequality

When I studied this chapter, what confused me was the formula. How do you learn a new formula? One of the best ways is to work through an example with numbers.

The formula is:

$$P(\epsilon_\mathcal{D}(f) - \epsilon(f) \geq t) < \exp\left( - 2n t^2 \right).$$

**Condition: 95% confidence that the distance between our estimate and the true error rate does not exceed 0.01.**
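
As a rough sketch of where the number comes from (here $\delta$ is my own symbol for the allowed failure probability, $1 - 0.95 = 0.05$; it does not appear in the formula above): require the right-hand side to be at most $\delta$ and solve for $n$.

$$\exp\left( - 2n t^2 \right) \leq \delta \iff n \geq \frac{\ln(1/\delta)}{2t^2} = \frac{\ln(20)}{2 \times 0.01^2} \approx 14978.7$$

So roughly 15000 examples are enough under this one-sided bound.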

JS code to calculate **n**:

```
// n = ln(0.05) / (-2 * t^2) with t = 0.01; evaluates to about 14978.66
Math.log(0.05) / (-2 * 0.01 * 0.01)
```

Here 0.05 is the value of the right-hand side of the inequality, computed as 1 - 0.95 (95% is the **confidence**), and 0.01 is **t**.
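
If you want to try other values, the same calculation can be wrapped in a small function. This is only a sketch: `hoeffdingSampleSize` and its parameter `delta` (the allowed failure probability, 1 minus the confidence) are names I made up, and it assumes the one-sided bound exactly as written in the formula above.

```javascript
// Hypothetical helper (not from the book): smallest integer n such that
// exp(-2 * n * t^2) <= delta, i.e. n >= ln(1 / delta) / (2 * t^2).
function hoeffdingSampleSize(t, delta) {
  if (!(t > 0) || !(delta > 0 && delta < 1)) {
    throw new RangeError("need t > 0 and 0 < delta < 1");
  }
  return Math.ceil(Math.log(1 / delta) / (2 * t * t));
}

// t = 0.01 and 95% confidence (delta = 0.05), as in the condition above
const n = hoeffdingSampleSize(0.01, 0.05);
console.log(n); // 14979
```

Rounding up gives 14979, which matches the "roughly 15000 examples" scale. Note how fast n grows when t shrinks: halving t multiplies n by four, because t is squared in the denominator.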