public class LassoAttributeSelector extends PythonBasedAttributeSelector implements Citable
This method works in two stages. First, it uses compressed sensing (i.e., LASSO) to select a subset of attributes that best describe a property. It then selects a smaller set of attributes from the LASSO set by finding the subset with maximum fitness under a linear regression model.
Several variants of this procedure are available, and all are implemented by this class.
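The LASSO stage can be illustrated with a minimal, self-contained sketch. It is not the Python script that this class runs: the class name `LassoSketch`, the penalty `lambda`, and the fixed number of `sweeps` are assumptions made for illustration, and how the real script tunes the penalty to reach the requested number of attributes is not documented here. The sketch minimizes 0.5 * ||y - Xb||^2 + lambda * ||b||_1 by cyclic coordinate descent and reports the attributes whose coefficients stay nonzero.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of LASSO-based attribute selection; not part of Magpie. */
final class LassoSketch {

    /** Soft-thresholding operator: shrinks rho toward zero by lambda. */
    static double softThreshold(double rho, double lambda) {
        if (rho > lambda) return rho - lambda;
        if (rho < -lambda) return rho + lambda;
        return 0.0;
    }

    /**
     * Fits LASSO coefficients by cyclic coordinate descent.
     * x[i][j] is the value of attribute j for entry i; y[i] is the property of entry i.
     */
    static double[] fit(double[][] x, double[] y, double lambda, int sweeps) {
        int n = x.length;
        int p = x[0].length;
        double[] beta = new double[p];
        double[] residual = y.clone(); // equals y - X * beta, with beta = 0 at the start
        for (int s = 0; s < sweeps; s++) {
            for (int j = 0; j < p; j++) {
                double rho = 0.0;
                double norm = 0.0;
                for (int i = 0; i < n; i++) {
                    // Residual with attribute j's own contribution added back
                    rho += x[i][j] * (residual[i] + x[i][j] * beta[j]);
                    norm += x[i][j] * x[i][j];
                }
                double updated = norm == 0.0 ? 0.0 : softThreshold(rho, lambda) / norm;
                for (int i = 0; i < n; i++) {
                    residual[i] -= x[i][j] * (updated - beta[j]);
                }
                beta[j] = updated;
            }
        }
        return beta;
    }

    /** Indices of attributes with nonzero coefficients (the "LASSO set"). */
    static List<Integer> selected(double[] beta) {
        List<Integer> out = new ArrayList<>();
        for (int j = 0; j < beta.length; j++) {
            if (beta[j] != 0.0) out.add(j);
        }
        return out;
    }
}
```

Calling `LassoSketch.selected(LassoSketch.fit(x, y, lambda, sweeps))` returns the candidate attributes. A larger `lambda` drives more coefficients to exactly zero, so it controls how many attributes survive.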
Usage: -n_lasso <# lasso> -max_dim <dim>
[-corr_downselect <# corr>] [-cv_method <cv frac> <cv iters>]
[-pick_best] [-debug]
# lasso: Number of attributes to select with LASSO
dim: Maximum dimension of final set
# corr: Set size after removing strongly-correlated attributes (by default,
this step is not run)
cv frac: Fraction of entries to withhold during cross validation
(by default, cross-validation is not performed)
cv iters: Number of cross-validation tests to run
-pick_best: Whether to pick dimension based on cross-validation results
-debug: Print status messages from the underlying Python script to screen
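As an illustration of the usage string above, the following sketch assembles an option list in the order the usage describes, in a form that could be passed to `setOptions(java.util.List<java.lang.Object>)`. The helper class `LassoOptionsSketch` and its `buildOptions` method are hypothetical and not part of Magpie. Values are added as strings on the assumption that the option parser reads each value through its text form.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that mirrors the usage string; not part of Magpie. */
final class LassoOptionsSketch {

    /**
     * Builds the option list. Pass null for nCorr to skip -corr_downselect,
     * and null for both cvFraction and cvIterations to skip -cv_method.
     */
    static List<Object> buildOptions(int nLasso, int maxDim, Integer nCorr,
            Double cvFraction, Integer cvIterations, boolean pickBest, boolean debug) {
        if ((cvFraction == null) != (cvIterations == null)) {
            throw new IllegalArgumentException("-cv_method needs both <cv frac> and <cv iters>");
        }
        List<Object> options = new ArrayList<>();
        // Required options
        options.add("-n_lasso");
        options.add(String.valueOf(nLasso));
        options.add("-max_dim");
        options.add(String.valueOf(maxDim));
        // Optional options, in the order given by the usage string
        if (nCorr != null) {
            options.add("-corr_downselect");
            options.add(String.valueOf(nCorr));
        }
        if (cvFraction != null) {
            options.add("-cv_method");
            options.add(String.valueOf(cvFraction));
            options.add(String.valueOf(cvIterations));
        }
        if (pickBest) options.add("-pick_best");
        if (debug) options.add("-debug");
        return options;
    }
}
```

For example, `buildOptions(20, 3, null, 0.1, 100, true, false)` corresponds to `-n_lasso 20 -max_dim 3 -cv_method 0.1 100 -pick_best`.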
| Modifier and Type | Field and Description |
|---|---|
| protected double | CVFraction: Fraction of entries to withhold as the CV test set. |
| protected int | CVIterations: Number of times to repeat the CV test. |
| protected int | MaxCount: Maximum number of attributes to select. |
| protected int | NDownselect: Number of attributes to downselect to based on correlations. |
| protected int | NLASSO: Number of attributes to pick via LASSO. |
| protected boolean | SelectSizeAutomatically: Whether to pick the number of attributes via cross-validation. |
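
The roles of CVFraction and CVIterations can be sketched as repeated random hold-out splits: in each iteration, a fraction of the entries is withheld as a test set. This is an assumption about how the two settings interact, not a description of the underlying Python script. The class `RandomSplitSketch`, its method, and the `seed` parameter are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical illustration of repeated random hold-out splits; not part of Magpie. */
final class RandomSplitSketch {

    /** Returns, for each iteration, the indices of the entries withheld as the test set. */
    static List<List<Integer>> testSets(int nEntries, double cvFraction, int cvIterations, long seed) {
        if (nEntries < 2) {
            throw new IllegalArgumentException("need at least two entries");
        }
        if (cvFraction <= 0.0 || cvFraction >= 1.0) {
            throw new IllegalArgumentException("cvFraction must be between 0 and 1");
        }
        // Keep at least one entry in the test set and at least one in the training set
        int nTest = (int) Math.min(nEntries - 1, Math.max(1L, Math.round(nEntries * cvFraction)));
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < nEntries; i++) indices.add(i);
        Random rng = new Random(seed);
        List<List<Integer>> out = new ArrayList<>();
        for (int iter = 0; iter < cvIterations; iter++) {
            Collections.shuffle(indices, rng);
            // The first nTest shuffled indices form this iteration's test set
            out.add(new ArrayList<>(indices.subList(0, nTest)));
        }
        return out;
    }
}
```

With 10 entries, a fraction of 0.2, and 5 iterations, this yields 5 test sets of 2 entries each. The remaining 8 entries in each iteration would be used to fit the model.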
Fields inherited from superclasses: Debug, trained

| Constructor and Description |
|---|
| LassoAttributeSelector() |
| Modifier and Type | Method and Description |
|---|---|
| protected java.util.List<java.lang.String> | assembleSystemCall(java.io.File codePath, Dataset data): Prepare the system call with all command-line arguments. |
| java.util.List<org.apache.commons.lang3.tuple.Pair<java.lang.String,Citation>> | getCitations(): Return a list of citations for this object and any underlying objects. |
| protected java.lang.String | getScriptPath(): Get the path to the Python script to be run. |
| java.lang.String | printDescription(boolean htmlFormat): Print full name of object, and a simple description of the options. |
| java.lang.String | printUsage(): Print out required format for options. |
| void | setCVFraction(double cvFraction): Set the fraction of entries to withhold during cross-validation. |
| void | setCVIterations(int nIter): Set the number of iterations to perform during cross-validation. |
| void | setMaximumDimension(int count): Set the maximum number of attributes to select. |
| void | setNDownselect(int num): Set number of attributes to downselect to by removing strongly-correlated entries. |
| void | setNLASSO(int NLASSO): Set the number of parameters to determine via LASSO. |
| void | setOptions(java.util.List<java.lang.Object> Options): Set any options for this object. |
| void | setSelectSizeAutomatically(boolean x): Set whether to determine number of attributes through cross-validation. |
Methods inherited from superclasses: spawnStderrReader, train_protected, about, applyAttributeSelection, clone, getSelectionNames, getSelections, isTrained, printCommand, printSelections, run, runCommand, train

protected int NLASSO
protected int NDownselect
protected int MaxCount
protected double CVFraction
protected int CVIterations
protected boolean SelectSizeAutomatically

public void setOptions(java.util.List<java.lang.Object> Options) throws java.lang.Exception
Description copied from interface: Options
setOptions in interface Options
Parameters: Options - Array of options as Objects - can be null
Throws: java.lang.Exception - if problem with inputs

public java.lang.String printUsage()
Description copied from interface: Options
printUsage in interface Options

protected java.lang.String getScriptPath()
Description copied from class: PythonBasedAttributeSelector
getScriptPath in class PythonBasedAttributeSelector

public void setCVFraction(double cvFraction)
Parameters: cvFraction - Fraction of entries to use as test set

public void setCVIterations(int nIter)
Parameters: nIter - Number of iterations

public void setMaximumDimension(int count)
Parameters: count - Maximum number of attributes

public void setNDownselect(int num)
Parameters: num - Size of dataset after downselection

public void setSelectSizeAutomatically(boolean x)
Parameters: x - Desired setting

public void setNLASSO(int NLASSO)
Parameters: NLASSO - Number of attributes

protected java.util.List<java.lang.String> assembleSystemCall(java.io.File codePath, Dataset data)
assembleSystemCall in class PythonBasedAttributeSelector
Parameters: codePath - Path to executable or script to be run
Parameters: data - Dataset being used to train attribute selector

public java.lang.String printDescription(boolean htmlFormat)
Description copied from interface: Printable
Example: For a model training a separate WekaRegression for intermetallics
magpie.models.regression.SplitRegression
printDescription in interface Printable
printDescription in class BaseAttributeSelector
Parameters: htmlFormat - Whether to format the output for an HTML page (e.g., <div> to create indentation) or for printing to screen.
See also: #printModel()

public java.util.List<org.apache.commons.lang3.tuple.Pair<java.lang.String,Citation>> getCitations()
Description copied from interface: Citable
getCitations in interface Citable
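
The correlation-based downselection controlled by setNDownselect can be sketched as a greedy filter. The exact rule used by the underlying Python script is not documented here, so the following is one plausible approach under stated assumptions: repeatedly find the pair of remaining attributes with the largest absolute Pearson correlation and drop the later one, until the target size is reached. The class `CorrelationDownselectSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical greedy removal of strongly-correlated attributes; not part of Magpie. */
final class CorrelationDownselectSketch {

    /** Pearson correlation coefficient between two equal-length columns. */
    static double pearson(double[] a, double[] b) {
        double meanA = 0.0;
        double meanB = 0.0;
        for (int i = 0; i < a.length; i++) {
            meanA += a[i];
            meanB += b[i];
        }
        meanA /= a.length;
        meanB /= b.length;
        double cov = 0.0;
        double varA = 0.0;
        double varB = 0.0;
        for (int i = 0; i < a.length; i++) {
            double da = a[i] - meanA;
            double db = b[i] - meanB;
            cov += da * db;
            varA += da * da;
            varB += db * db;
        }
        double denom = Math.sqrt(varA * varB);
        return denom == 0.0 ? 0.0 : cov / denom; // constant columns count as uncorrelated
    }

    /** columns[j] holds the values of attribute j. Returns the indices of the kept attributes. */
    static List<Integer> downselect(double[][] columns, int targetSize) {
        if (targetSize < 1) {
            throw new IllegalArgumentException("targetSize must be at least 1");
        }
        List<Integer> kept = new ArrayList<>();
        for (int j = 0; j < columns.length; j++) kept.add(j);
        while (kept.size() > targetSize) {
            int dropPosition = -1;
            double strongest = -1.0;
            for (int a = 0; a < kept.size(); a++) {
                for (int b = a + 1; b < kept.size(); b++) {
                    double r = Math.abs(pearson(columns[kept.get(a)], columns[kept.get(b)]));
                    if (r > strongest) {
                        strongest = r;
                        dropPosition = b; // drop the later attribute of the pair
                    }
                }
            }
            kept.remove(dropPosition); // removes by list position, not by value
        }
        return kept;
    }
}
```

Given three attributes where the second is an exact multiple of the first, downselecting to two keeps the first and third.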