Extending PySpark MLlib's native feature selection by using feature importance scores generated from a machine learning model to extract the variables that are most likely to be important.
Spark
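The selection step behind this idea can be sketched in plain Python. This is a minimal illustration, not the article's implementation: the helper `top_k_features`, the feature names, and the scores are all hypothetical. In PySpark the scores would typically come from a fitted tree-based model's `featureImportances` vector, and the resulting indices could be passed to a transformer such as `VectorSlicer`.

```python
from typing import List, Sequence, Tuple


def top_k_features(
    names: Sequence[str], importances: Sequence[float], k: int
) -> List[Tuple[int, str, float]]:
    """Return (index, name, score) for the k highest-scoring features."""
    if len(names) != len(importances):
        raise ValueError("names and importances must have the same length")
    # Rank feature positions by score, highest first.
    ranked = sorted(enumerate(importances), key=lambda pair: pair[1], reverse=True)
    return [(i, names[i], score) for i, score in ranked[:k]]


# Hypothetical inputs: in PySpark these scores might be obtained with
# model.featureImportances.toArray() on a fitted tree-based model.
names = ["age", "income", "tenure", "region_code"]
scores = [0.10, 0.55, 0.30, 0.05]

selected = top_k_features(names, scores, k=2)
# Positions of the kept features within the assembled feature vector.
selected_indices = [i for i, _, _ in selected]
```

Keeping the indices rather than only the names makes it straightforward to slice the same positions out of an assembled feature vector.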
Custom cross-validation class written in PySpark with support for user-defined fold categories, such as time periods, geographic regions, or consumer segments.
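One way to read "user-defined categories" is as a leave-one-category-out split, where each distinct category value becomes the held-out fold in turn. The plain-Python sketch below assumes that interpretation; the function `category_folds` and the sample segments are hypothetical, and the article's class may split the data differently. In PySpark the same idea would usually be expressed by filtering a DataFrame on the category column.

```python
from collections import defaultdict
from typing import Dict, Hashable, Iterator, List, Sequence, Tuple


def category_folds(
    categories: Sequence[Hashable],
) -> Iterator[Tuple[Hashable, List[int], List[int]]]:
    """Yield (held_out_category, train_rows, test_rows), one fold per category."""
    # Group row positions by their category label.
    rows_by_category: Dict[Hashable, List[int]] = defaultdict(list)
    for row, category in enumerate(categories):
        rows_by_category[category].append(row)

    # Hold out each category once; train on all remaining categories.
    for held_out, test_rows in rows_by_category.items():
        train_rows = [
            row
            for category, rows in rows_by_category.items()
            if category != held_out
            for row in rows
        ]
        yield held_out, train_rows, test_rows


# Hypothetical category labels, one per row (e.g. a consumer segment).
segments = ["north", "south", "north", "east", "south"]
folds = list(category_folds(segments))
```

Splitting by category rather than at random keeps all rows from one segment or time period on the same side of the split, which avoids leaking information between training and evaluation data.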