It applies to columns of any data type, whereas caret::nearZeroVar() handles only numeric columns.
Usage
rmNZV(df1, minUniPerc = 0.05, minUniCount = 5)
Arguments
- df1
a data.frame or matrix
- minUniPerc, minUniCount
criteria for removing columns.
Unique values are all values other than the most common one, e.g. 1, 2 and 4 in c(1,2,3,3,4).
uniCount and uniPerc are the count and the percentage of samples that hold a unique value.
A column is removed if it fails to meet either threshold.
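
As a rough illustration of the criteria above, the following base R sketch computes the two statistics for a single column. The helper name uniStats is hypothetical and not part of the package. The sketch assumes that uniPerc is expressed as a fraction of samples (which would be consistent with the default minUniPerc = 0.05) and that missing values are ignored; neither assumption is confirmed by this page.

```r
# Hypothetical helper for illustration only; not the package's implementation.
uniStats <- function(x) {
  x <- x[!is.na(x)]                     # assumption: NAs are ignored
  if (length(x) == 0) {
    return(c(uniCount = 0, uniPerc = 0))
  }
  tab <- table(x)                       # frequency of each distinct value
  uniCount <- length(x) - max(tab)      # samples not holding the most common value
  c(uniCount = uniCount, uniPerc = uniCount / length(x))
}

uniStats(c(1, 2, 3, 3, 4))
# uniCount = 3, uniPerc = 0.6
```

With the defaults shown under Usage, this example column would clear minUniPerc = 0.05 but fall short of minUniCount = 5, so it would be removed.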
Value
df1 with the failing columns removed. The data type (data.frame or matrix) is kept even if only 0 or 1 column remains.
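
The type-preserving behaviour described under Value can be sketched in base R by subsetting with drop = FALSE. The function rmNZVSketch below is a hypothetical stand-in written for this example, not the package's code. It assumes both thresholds are inclusive lower bounds, that uniPerc is a fraction, and that NAs are ignored.

```r
# Hypothetical sketch for illustration only; not the package's code.
rmNZVSketch <- function(df1, minUniPerc = 0.05, minUniCount = 5) {
  keep <- vapply(seq_len(ncol(df1)), function(j) {
    x <- df1[, j]
    x <- x[!is.na(x)]                   # assumption: NAs are ignored
    if (length(x) == 0) return(FALSE)
    uniCount <- length(x) - max(table(x))
    # assumption: a column must reach both thresholds to be kept
    uniCount >= minUniCount && uniCount / length(x) >= minUniPerc
  }, logical(1))
  df1[, keep, drop = FALSE]             # drop = FALSE preserves data.frame/matrix
}

d <- data.frame(a = 1:10, b = rep("x", 10), c = c(rep(0, 9), 1))
res <- rmNZVSketch(d)
class(res)   # "data.frame", although only one column remains
names(res)   # "a"
```

Without drop = FALSE, a matrix reduced to a single column would collapse to a plain vector, which is the situation the Value note guards against.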