Feature Selection
A way to reduce the dimensionality of a dataset by keeping only a subset of the original features.
Methods
-
Remove redundant or irrelevant attributes (e.g. IDs)
- manual selection with expert knowledge
-
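Univariate selection by correlation coefficient between each feature and the target (keep a feature only if its absolute correlation reaches a minimum threshold)
- might miss interactions between features, because each feature is evaluated on its own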
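A minimal sketch of the univariate approach, assuming Pearson correlation and a threshold on its absolute value. The function names `pearson` and `select_by_correlation`, the threshold of 0.5, and the toy data are made up for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant column: treat as uncorrelated (an assumption)
    return cov / math.sqrt(var_x * var_y)

def select_by_correlation(features, target, threshold):
    """Keep the features whose |correlation with the target| >= threshold."""
    return [name for name, column in features.items()
            if abs(pearson(column, target)) >= threshold]

# Toy data, invented for this example.
features = {
    "size":     [1.0, 2.0, 3.0, 4.0, 5.0],   # grows with the target
    "noise":    [1.0, -1.0, 0.0, -1.0, 1.0], # unrelated to the target
    "constant": [1.0, 1.0, 1.0, 1.0, 1.0],   # carries no information
}
target = [2.1, 3.9, 6.2, 8.0, 9.8]

selected = select_by_correlation(features, target, threshold=0.5)
print(selected)
```

Each feature is scored in isolation here, so a feature that only helps in combination with another one would be dropped.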
-
Forward Selection
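- train one model per single feature, select the feature with the best performance
- add each remaining feature in turn to the selected set, keep the one that improves performance the most
- repeat until the desired number of features is reached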
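The forward selection steps can be sketched as a greedy loop. This is an illustration under assumptions, not a reference implementation: `forward_selection` takes a caller-supplied `score(subset)` function (higher is better), and the scorer used here, leave-one-out accuracy of a 1-nearest-neighbour classifier on invented toy data, is only a stand-in for whatever model and metric are actually used.

```python
def forward_selection(all_features, score, n_features):
    """Greedy forward selection; score(subset) must return higher-is-better."""
    selected = []
    remaining = list(all_features)
    while remaining and len(selected) < n_features:
        # Try adding each remaining feature to the current selection
        # and keep the one that gives the best score.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected

def loo_1nn_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, row in enumerate(rows):
        # Nearest other row, measured only over the chosen features.
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[f] - rows[j][f]) ** 2 for f in subset),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

# Toy data, invented for this example: "a" separates the classes,
# "b" is weakly related, "c" is misleading.
rows = [
    {"a": 0.0, "b": 0.0, "c": 5.0},
    {"a": 0.1, "b": 0.8, "c": 1.0},
    {"a": 0.2, "b": 0.1, "c": 4.0},
    {"a": 1.0, "b": 0.9, "c": 4.9},
    {"a": 1.1, "b": 0.2, "c": 1.1},
    {"a": 1.2, "b": 1.1, "c": 4.1},
]
labels = [0, 0, 0, 1, 1, 1]

selected = forward_selection(
    ["a", "b", "c"],
    score=lambda subset: loo_1nn_accuracy(rows, labels, subset),
    n_features=2,
)
print(selected)
```

On this toy data the first feature picked is `"a"`. Selecting k of d features this way takes on the order of k·d model evaluations, which is far fewer than trying every subset.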
-
Backward Selection
- start with all features
- remove each feature in turn and retrain; discard the feature whose removal gives the best performance
- repeat until the desired number of features is reached
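Backward selection can be sketched the same way, starting from the full set and dropping one feature per round. As an assumption for illustration, `backward_selection` receives a caller-supplied `score(subset)` function (higher is better); the leave-one-out 1-nearest-neighbour scorer and the toy data are invented and stand in for a real model and validation metric.

```python
def backward_selection(all_features, score, n_features):
    """Greedy backward elimination; score(subset) must return higher-is-better."""
    selected = list(all_features)
    while len(selected) > n_features:
        # Score the set with each feature left out in turn. The feature whose
        # removal gives the best score is the least useful one, so discard it.
        to_drop = max(
            selected,
            key=lambda f: score([g for g in selected if g != f]),
        )
        selected.remove(to_drop)
    return selected

def loo_1nn_accuracy(rows, labels, subset):
    """Leave-one-out accuracy of a 1-nearest-neighbour classifier."""
    correct = 0
    for i, row in enumerate(rows):
        # Nearest other row, measured only over the chosen features.
        nearest = min(
            (j for j in range(len(rows)) if j != i),
            key=lambda j: sum((row[f] - rows[j][f]) ** 2 for f in subset),
        )
        correct += labels[nearest] == labels[i]
    return correct / len(rows)

# Toy data, invented for this example: "a" separates the classes,
# "b" is weakly related, "c" is misleading.
rows = [
    {"a": 0.0, "b": 0.0, "c": 5.0},
    {"a": 0.1, "b": 0.8, "c": 1.0},
    {"a": 0.2, "b": 0.1, "c": 4.0},
    {"a": 1.0, "b": 0.9, "c": 4.9},
    {"a": 1.1, "b": 0.2, "c": 1.1},
    {"a": 1.2, "b": 1.1, "c": 4.1},
]
labels = [0, 0, 0, 1, 1, 1]

kept = backward_selection(
    ["a", "b", "c"],
    score=lambda subset: loo_1nn_accuracy(rows, labels, subset),
    n_features=1,
)
print(kept)
```

Because every round starts from a model that sees all remaining features together, backward selection can keep features that are only useful in combination, at the cost of training on larger feature sets than forward selection.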