RuleFit
A combination of Linear Regression and Decision Trees. This lets the linear model also use interactions between features, which it cannot capture on its own.
Algorithm
- Fit many decision trees of random depth and extract their rules (root-to-node paths)
- Add the extracted rules as new binary features that are either fulfilled (1) or not (0)
- Clip (winsorize) the original features at a chosen quantile to limit the influence of outliers
- Standardize the clipped features so they are on the same scale as the binary rule features
- Fit LASSO on the new feature matrix
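The steps above can be sketched in a few lines with scikit-learn. This is a minimal illustration, not the reference implementation: as a simplification it uses one rule per leaf (leaf membership) rather than every root-to-node path, and the quantiles, tree depth, and LASSO penalty are assumed values.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import Lasso

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X[:, 0] * X[:, 1] + rng.normal(scale=0.1, size=200)  # target with an interaction

# Step 1: fit an ensemble of shallow trees
forest = RandomForestRegressor(n_estimators=20, max_depth=3, random_state=0)
forest.fit(X, y)

# Step 2: binary rule features; simplification: one rule per leaf
# (leaf membership) instead of every path prefix
leaves = forest.apply(X)  # (n_samples, n_trees) leaf indices
rules = np.hstack([
    (leaves[:, [t]] == np.unique(leaves[:, t])).astype(float)
    for t in range(leaves.shape[1])
])

# Step 3: clip (winsorize) the original features, here at the 5%/95% quantiles
lo, hi = np.quantile(X, [0.05, 0.95], axis=0)
X_clip = np.clip(X, lo, hi)

# Step 4: standardize the clipped features to the scale of the binary rules
X_lin = 0.4 * X_clip / X_clip.std(axis=0)

# Step 5: fit LASSO on the combined feature matrix
model = Lasso(alpha=0.01)
model.fit(np.hstack([X_lin, rules]), y)
```

The sparse LASSO fit then selects a small subset of linear terms and rules, which is what keeps the final model interpretable.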
Explainability
As the final model of RuleFit is a (sparse) Linear Regression, we can interpret the weights of the original features in the usual way.
For the new rule features we can calculate the importance using
$$I_k = |\alpha_k| \cdot \sqrt{s_k (1 - s_k)}$$
where $s_k = \frac{1}{n} \sum_{i=1}^{n} r_k(x^{(i)})$ is the rule support for rule $r_k$ (the share of datapoints that satisfy it) and $\alpha_k$ is its Linear Regression weight.
So to get the Feature Importance of a feature $x_j$ for exactly one datapoint $x$ (Local Explanation) we calculate
$$J_j(x) = I_j(x) + \sum_{k:\, x_j \in r_k} \frac{I_k(x)}{m_k}$$
where $I_j(x)$ is the importance of the linear term, the sum runs over all rules $r_k$ in which $x_j$ appears, and $m_k$ is the number of features in rule $r_k$. To get a Global Explanation for a feature we can sum over all datapoints: $J_j(X) = \sum_{i=1}^{n} J_j(x^{(i)})$.
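The rule-importance formula is easy to compute once the LASSO is fitted. In this sketch the weights and the binary rule matrix are hypothetical stand-ins for fitted values, just to show the arithmetic:

```python
import numpy as np

# hypothetical fitted LASSO weights for 4 rules
alpha = np.array([0.8, 0.0, -0.5, 0.2])

# hypothetical binary rule matrix: rows = datapoints, cols = rules
rules = np.array([[1, 0, 1, 1],
                  [0, 0, 1, 0],
                  [1, 1, 0, 1],
                  [1, 0, 0, 0]], dtype=float)

# s_k: support of rule k = share of datapoints that satisfy it
support = rules.mean(axis=0)

# I_k = |alpha_k| * sqrt(s_k * (1 - s_k))
importance = np.abs(alpha) * np.sqrt(support * (1 - support))
```

Note that a rule dropped by LASSO (weight 0) gets importance 0, and so does a rule satisfied by all or no datapoints, since $\sqrt{s_k(1-s_k)}$ vanishes at $s_k \in \{0, 1\}$.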
Resources
The Python package imodels implements RuleFit along with many other interpretable models.