Please fix the list numbering in "A number of burning questions demand immediate attention: 1 … 1… 1… "
Can someone help me understand the problem of false discovery discussed in 4.6.2? I don't see why using the same test set with multiple classifiers/models is a problem. As long as hyperparameter tuning is done on a validation set, what is wrong with benchmarking the performance of different models on the same test set? Similarly, what exactly is adaptive overfitting, and how does it differ from the problem of false discovery?
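To make the scenario concrete, here is a minimal sketch of what comparing many models on one test set might look like. It assumes purely random binary classifiers (none has any real skill), and the function name `best_of_k_test_accuracy` and its parameters are hypothetical, not taken from the book. The point it probes is whether the best score among many models on a fixed test set can look better than chance even when no model is actually better.

```python
import random


def best_of_k_test_accuracy(num_models, test_size, seed=0):
    """Highest accuracy among `num_models` coin-flip classifiers
    evaluated on one shared test set of `test_size` binary labels."""
    rng = random.Random(seed)
    # One fixed test set, reused for every model.
    labels = [rng.randint(0, 1) for _ in range(test_size)]
    best = 0.0
    for _ in range(num_models):
        # Each "model" guesses at random, so its true accuracy is 0.5.
        preds = [rng.randint(0, 1) for _ in range(test_size)]
        acc = sum(p == y for p, y in zip(preds, labels)) / test_size
        best = max(best, acc)
    return best


single = best_of_k_test_accuracy(num_models=1, test_size=200)
best_of_many = best_of_k_test_accuracy(num_models=500, test_size=200)
print(f"one model: {single:.3f}, best of 500: {best_of_many:.3f}")
```

If the best of 500 comes out noticeably above 0.5 here, that gap would be entirely due to selecting the winner on the test set, which is my current guess at what "false discovery" refers to.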