public class LassoAttributeSelector extends PythonBasedAttributeSelector implements Citable
This selector works in two stages. It first uses compressed sensing (i.e., LASSO) to select a subset of attributes that best describe a property, and then selects a smaller set of attributes from the LASSO subset by finding the set with maximum fitness under a linear regression model.
There are a few options for how to do this, and all are implemented by this class.
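The second stage described above can be illustrated with a minimal, self-contained sketch. It assumes the LASSO stage has already produced a list of candidate attribute indices, and it assumes "fitness" means the RMSE of an ordinary least-squares fit with an intercept. Both are assumptions made for illustration, not a description of the Python script this class calls. The names `SubsetSearchSketch`, `bestSubset`, `fitRmse`, and `solve` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; not part of Magpie.
final class SubsetSearchSketch {

    /** Solve A x = b by Gaussian elimination with partial pivoting (small square systems). */
    static double[] solve(double[][] a, double[] b) {
        int n = b.length;
        for (int col = 0; col < n; col++) {
            int pivot = col;
            for (int r = col + 1; r < n; r++) {
                if (Math.abs(a[r][col]) > Math.abs(a[pivot][col])) pivot = r;
            }
            double[] tmpRow = a[col]; a[col] = a[pivot]; a[pivot] = tmpRow;
            double tmp = b[col]; b[col] = b[pivot]; b[pivot] = tmp;
            for (int r = col + 1; r < n; r++) {
                double f = a[r][col] / a[col][col];
                b[r] -= f * b[col];
                for (int c = col; c < n; c++) a[r][c] -= f * a[col][c];
            }
        }
        double[] x = new double[n];
        for (int r = n - 1; r >= 0; r--) {
            double s = b[r];
            for (int c = r + 1; c < n; c++) s -= a[r][c] * x[c];
            x[r] = s / a[r][r];
        }
        return x;
    }

    /** RMSE of a least-squares fit (with intercept) of y on the chosen columns of x. */
    static double fitRmse(double[][] x, double[] y, List<Integer> cols) {
        int p = cols.size() + 1;
        double[][] xtx = new double[p][p];
        double[] xty = new double[p];
        for (int i = 0; i < y.length; i++) {
            double[] row = new double[p];
            row[0] = 1.0; // intercept term
            for (int j = 1; j < p; j++) row[j] = x[i][cols.get(j - 1)];
            for (int a = 0; a < p; a++) {
                xty[a] += row[a] * y[i];
                for (int b = 0; b < p; b++) xtx[a][b] += row[a] * row[b];
            }
        }
        double[] beta = solve(xtx, xty); // normal equations: (X^T X) beta = X^T y
        double sse = 0;
        for (int i = 0; i < y.length; i++) {
            double pred = beta[0];
            for (int j = 1; j < p; j++) pred += beta[j] * x[i][cols.get(j - 1)];
            sse += (y[i] - pred) * (y[i] - pred);
        }
        return Math.sqrt(sse / y.length);
    }

    /**
     * Try every subset of the candidates with exactly dim members and return the
     * one with the lowest RMSE (null if no subset of that size exists).
     * Collinear subsets yield NaN and are never chosen.
     */
    static List<Integer> bestSubset(double[][] x, double[] y, List<Integer> candidates, int dim) {
        List<Integer> best = null;
        double bestRmse = Double.POSITIVE_INFINITY;
        int n = candidates.size();
        for (int mask = 0; mask < (1 << n); mask++) {
            if (Integer.bitCount(mask) != dim) continue;
            List<Integer> cols = new ArrayList<>();
            for (int k = 0; k < n; k++) {
                if ((mask & (1 << k)) != 0) cols.add(candidates.get(k));
            }
            double rmse = fitRmse(x, y, cols);
            if (rmse < bestRmse) { bestRmse = rmse; best = cols; }
        }
        return best;
    }
}
```

Exhaustive search grows exponentially with the number of candidates, which is one reason a pre-selection step that narrows the pool first is useful. This sketch is only practical for small candidate lists.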
Usage: -n_lasso <# lasso> -max_dim <dim>
[-corr_downselect <# corr>] [-cv_method <cv frac> <cv iters>]
[-pick_best] [-debug]
# lasso: Number of attributes to select with LASSO
dim: Maximum dimension of final set
# corr: Set size after removing strongly-correlated attributes (by default,
this step is not run)
cv frac: Fraction of entries to withhold during cross validation
(by default, cross-validation is not performed)
cv iters: Number of cross-validation tests to run
-pick_best: Whether to pick dimension based on cross-validation results
-debug: Print status messages from the underlying Python script to screen
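As an illustration of the usage string above, the following sketch assembles an option list in the documented order. The helper `buildOptions` and the class `LassoOptionsSketch` are hypothetical, and passing every value as a string is an assumption; the resulting list has the `java.util.List<java.lang.Object>` type that `setOptions` accepts, but how that method parses each entry is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; not part of Magpie.
final class LassoOptionsSketch {

    /**
     * Build an option list following:
     *   -n_lasso <# lasso> -max_dim <dim> [-corr_downselect <# corr>]
     *   [-cv_method <cv frac> <cv iters>] [-pick_best] [-debug]
     * Optional numeric arguments are skipped when null.
     */
    static List<Object> buildOptions(int nLasso, int maxDim, Integer nCorr,
            Double cvFraction, Integer cvIterations, boolean pickBest, boolean debug) {
        // -cv_method takes two values, so both must be given together
        if ((cvFraction == null) != (cvIterations == null)) {
            throw new IllegalArgumentException("cv frac and cv iters must be set together");
        }
        List<Object> options = new ArrayList<>();
        options.add("-n_lasso");
        options.add(String.valueOf(nLasso));
        options.add("-max_dim");
        options.add(String.valueOf(maxDim));
        if (nCorr != null) {
            options.add("-corr_downselect");
            options.add(String.valueOf(nCorr));
        }
        if (cvFraction != null) {
            options.add("-cv_method");
            options.add(String.valueOf(cvFraction));
            options.add(String.valueOf(cvIterations));
        }
        if (pickBest) options.add("-pick_best");
        if (debug) options.add("-debug");
        return options;
    }
}
```

For example, `buildOptions(16, 4, null, 0.1, 100, true, false)` requests 16 LASSO attributes, at most 4 final attributes, and 100 cross-validation tests that each withhold 10% of the entries, with the dimension picked from the cross-validation results.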
Modifier and Type | Field and Description |
---|---|
protected double | CVFraction: Fraction of entries to withhold as the cross-validation test set. |
protected int | CVIterations: Number of times to repeat the cross-validation test. |
protected int | MaxCount: Maximum number of attributes to select. |
protected int | NDownselect: Number of attributes to downselect to based on correlations. |
protected int | NLASSO: Number of attributes to pick via LASSO. |
protected boolean | SelectSizeAutomatically: Whether to pick the number of attributes via cross-validation. |
Inherited fields: Debug, trained
Constructor and Description |
---|
LassoAttributeSelector() |
Modifier and Type | Method and Description |
---|---|
protected java.util.List<java.lang.String> | assembleSystemCall(java.io.File codePath, Dataset data): Prepare the system call with all command-line arguments. |
java.util.List<org.apache.commons.lang3.tuple.Pair<java.lang.String,Citation>> | getCitations(): Return a list of citations for this object and any underlying objects. |
protected java.lang.String | getScriptPath(): Get the path to the Python script to be run. |
java.lang.String | printDescription(boolean htmlFormat): Print full name of object, and a simple description of the options. |
java.lang.String | printUsage(): Print out required format for options. |
void | setCVFraction(double cvFraction): Set the fraction of entries to withhold during cross-validation. |
void | setCVIterations(int nIter): Set the number of iterations to perform during cross-validation. |
void | setMaximumDimension(int count): Set the maximum number of attributes to select. |
void | setNDownselect(int num): Set the number of attributes to downselect to by removing strongly-correlated entries. |
void | setNLASSO(int NLASSO): Set the number of attributes to select via LASSO. |
void | setOptions(java.util.List<java.lang.Object> Options): Set any options for this object. |
void | setSelectSizeAutomatically(boolean x): Set whether to determine the number of attributes through cross-validation. |
Inherited methods:
spawnStderrReader, train_protected
about, applyAttributeSelection, clone, getSelectionNames, getSelections, isTrained, printCommand, printSelections, run, runCommand, train
protected int NLASSO
protected int NDownselect
protected int MaxCount
protected double CVFraction
protected int CVIterations
protected boolean SelectSizeAutomatically
public void setOptions(java.util.List<java.lang.Object> Options) throws java.lang.Exception
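Description copied from interface: Options
Set any options for this object.
Specified by: setOptions in interface Options
Parameters: Options - Array of options as Objects - can be null
Throws: java.lang.Exception - if problem with inputs
public java.lang.String printUsage()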
Description copied from interface: Options
Print out required format for options.
Specified by: printUsage in interface Options
protected java.lang.String getScriptPath()
Description copied from class: PythonBasedAttributeSelector
Get the path to the Python script to be run.
getScriptPath in class PythonBasedAttributeSelector
public void setCVFraction(double cvFraction)
Parameters: cvFraction - Fraction of entries to use as test set
public void setCVIterations(int nIter)
Parameters: nIter - Number of iterations
public void setMaximumDimension(int count)
Parameters: count - Maximum number of attributes
public void setNDownselect(int num)
Parameters: num - Number of attributes remaining after downselection
public void setSelectSizeAutomatically(boolean x)
Parameters: x - Desired setting
public void setNLASSO(int NLASSO)
Parameters: NLASSO - Number of attributes
protected java.util.List<java.lang.String> assembleSystemCall(java.io.File codePath, Dataset data)
assembleSystemCall in class PythonBasedAttributeSelector
Parameters:
codePath - Path to executable or script to be run
data - Dataset being used to train attribute selector
public java.lang.String printDescription(boolean htmlFormat)
Description copied from interface: Printable
Print full name of object, and a simple description of the options.
Example: For a model training a separate WekaRegression for intermetallics
magpie.models.regression.SplitRegression
Specified by: printDescription in interface Printable
printDescription in class BaseAttributeSelector
Parameters: htmlFormat - Whether to format output for an HTML page (e.g., <div> to create indentation) or for printing to screen.
See Also: #printModel()
public java.util.List<org.apache.commons.lang3.tuple.Pair<java.lang.String,Citation>> getCitations()
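Description copied from interface: Citable
Return a list of citations for this object and any underlying objects.
Specified by: getCitations in interface Citable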