
FSelectorRcpp on CRAN

FSelectorRcpp, an Rcpp implementation (free of Java/Weka) of the FSelector entropy-based feature selection algorithms with sparse matrix support, has finally arrived on CRAN after a year of development. It also comes with a parallel backend.

Big thanks to the main architect: Zygmunt Zawadzki, zstat, and our reviewer: Krzysztof Słomczyński.

If something is missing or unclear, please chat with us on our Slack.

Get started: Motivation, Installation and Quick Workflow

Provided functionalities

Blog posts history with use cases

Quick Workflow

A simple entropy-based feature selection workflow. Information gain is an easy, linear algorithm that computes the entropy of the dependent variable and of each explanatory variable, and the conditional entropy of the dependent variable with respect to each explanatory variable separately. This simple statistic indicates how much can be inferred about the distribution of the dependent variable when only the distribution of an explanatory variable is known.

       
# install.packages(c('magrittr', 'FSelectorRcpp'))
library(magrittr)
library(FSelectorRcpp)       
information_gain(               # Calculate the score for each attribute
    formula = Species ~ .,      # that is on the right side of the formula.
    data = iris,                # Attributes must exist in the passed data.
    type  = "infogain",         # Choose the type of a score to be calculated.
    threads = 2                 # Set number of threads in a parallel backend.
  ) %>%                          
  cut_attrs(                    # Then take attributes with the highest rank.
    k = 2                       # For example: 2 attrs with the highest rank.
  ) %>%                         
  to_formula(                   # Create a new formula object with
    attrs = .,                  # the most influential attrs.
    class = "Species"           
  ) %>%
  glm(
    formula = .,                # Use that formula in any classification algorithm.
    data = iris,
    family = "binomial"         # With a 3-level response, binomial glm models
)                               # the first level (setosa) against the rest.
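
To make the information gain statistic described above more concrete, here is a minimal base R sketch that computes it by hand as H(Class) + H(Attribute) - H(Class, Attribute). The helper names `entropy` and `info_gain` are hypothetical, and the three equal-width bins are an assumption made only for illustration; this is not the package's own implementation or discretization, so the numbers need not match the output of `information_gain()`.

```r
# Shannon entropy (natural log) of one discrete variable, or of the joint
# distribution when several variables are passed.
entropy <- function(...) {
  p <- prop.table(table(...))   # empirical (joint) probabilities
  p <- p[p > 0]                 # drop empty cells; 0 * log(0) is treated as 0
  -sum(p * log(p))
}

# Information gain of `attr` about `target`:
# IG = H(target) + H(attr) - H(target, attr)
info_gain <- function(target, attr) {
  entropy(target) + entropy(attr) - entropy(target, attr)
}

# Illustrative discretization (an assumption of this sketch):
# three equal-width bins per numeric attribute.
binned <- lapply(iris[, 1:4], cut, breaks = 3)

# One score per explanatory variable, highest first.
scores <- sapply(binned, info_gain, target = iris$Species)
sort(scores, decreasing = TRUE)
```

Each attribute is scored independently of the others, which is why the method stays cheap as the number of columns grows. It also means that redundant attributes can all rank highly.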
       


Acknowledgements

The cover photo of this blog post comes from https://newevolutiondesigns.com/20-fire-art-wallpapers