Description
The following items can be improved in this notebook:

The early sections "Recap" and "Dataset" are almost identical, so one of them is redundant.

Exercise 1
Presumably the expectation is to split the data into train/test sets both for the classifier and for the voxel selection. It might be worth emphasizing that using all of the data for voxel selection is a common but subtle error; there are probably quite a few examples in the literature that got past less technical reviewers.
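
A minimal sketch of the leakage-free pattern, using SelectKBest as a stand-in for whatever voxel-selection step the notebook actually uses (the data shapes here are made up):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 5000))   # hypothetical: 100 samples x 5000 voxels
y = rng.integers(0, 2, size=100)       # binary labels

# Subtle error: selecting voxels on ALL the data leaks test information
# X_sel = SelectKBest(f_classif, k=100).fit_transform(X, y)
# cross_val_score(SVC(kernel='linear'), X_sel, y, cv=5)

# Safe version: selection sits inside the pipeline, so it is refit
# on the training portion of each fold only
pipe = Pipeline([
    ('select', SelectKBest(f_classif, k=100)),
    ('clf', SVC(kernel='linear')),
])
scores = cross_val_score(pipe, X, y, cv=5)
print(scores.mean())
```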

In this example, I consistently get slightly below-chance performance. I believe this is driven by the cross-validation; see:
"Classification based hypothesis testing in neuroscience: Below-chance level classification rates and overlooked statistical properties of linear parametric classifiers," Human Brain Mapping, 2016.
Another subtle example of bias is given by Watts et al. 😊:
"Potholes and Molehills: Bias in the Diagnostic Performance of Diffusion-Tensor Imaging in Concussion," Radiology, 2014.
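
An extreme toy illustration of the mechanism (my sketch, not from the notebook): under leave-one-out with balanced labels, the held-out sample's class is always the minority class in the training fold, so any classifier that leans on class frequencies is biased against it; real classifiers show a milder version of the same effect.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import LeaveOneOut, cross_val_score

X = np.random.randn(40, 20)   # pure noise features, no class signal
y = np.tile([0, 1], 20)       # perfectly balanced labels

# Leaving one sample out makes its class the training minority, so a
# frequency-based classifier is wrong every time: 0% accuracy on 50/50 data.
scores = cross_val_score(DummyClassifier(strategy='most_frequent'),
                         X, y, cv=LeaveOneOut())
print(scores.mean())          # 0.0
```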

3.1 Grid search
Strictly, the dependence of the number of combinations on the granularity of the grid search is not exponential: with p hyperparameters and g candidate values each, the grid has g^p combinations, which is polynomial (degree p) in the granularity g; it is exponential only in the number of parameters p.
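
A quick sanity check of the counting, with hypothetical numbers:

```python
p = 2                    # number of hyperparameters, e.g. C and gamma
for g in [4, 8, 16]:     # candidate values per parameter (granularity)
    print(g, g ** p)     # 16, 64, 256: doubling g multiplies the count by 2**p
```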

3.2 Regularization Example: L2 vs L1
The L1 penalty now requires an L1-capable solver in the LogisticRegression call (e.g. solver='saga' or solver='liblinear'). This follows from a change in Scikit-Learn's defaults: since version 0.22 the default solver is 'lbfgs', which does not support L1.
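
For instance (minimal sketch):

```python
from sklearn.linear_model import LogisticRegression

# 'saga' (or 'liblinear') supports penalty='l1'; the current default
# solver, 'lbfgs', raises an error if the L1 penalty is requested
clf = LogisticRegression(penalty='l1', solver='saga', max_iter=5000)
```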

4. Build a Pipeline
As with 3.1, there seem to be a lot of parameter settings that give perfect accuracy. Maybe classifying by blocks is too easy, and because the number of blocks is relatively low, accuracy can only move in big steps; see the illustration below.
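
To illustrate the quantization point (the block count here is hypothetical):

```python
# With n test blocks, accuracy can only take n + 1 distinct values,
# so many parameter settings easily tie at the top
n_test_blocks = 8
print([k / n_test_blocks for k in range(n_test_blocks + 1)])
# [0.0, 0.125, 0.25, ..., 1.0]
```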

c_steps = [10e-1, 10e0, 10e1, 10e2] is confusing notation for the intended exponents: these literals evaluate to 1, 10, 100, 1000 (10e0 means 10 * 10**0 = 10, not 1).
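
A clearer way to write the same values, assuming powers of ten are intended for the C parameter:

```python
import numpy as np

c_steps = [1e0, 1e1, 1e2, 1e3]        # 1, 10, 100, 1000, written explicitly
# or generated directly:
c_steps = np.logspace(0, 3, num=4)    # array([1., 10., 100., 1000.])
```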