public class ExhaustiveAttributeSelector extends BaseAttributeSelector
Iterates through all possible attribute subsets whose size lies between two user-defined bounds. Each subset is rated by training a model on only those attributes and scoring it, by default with cross-validation. For regression models, the subset giving the lowest RMSE is selected; for classification models, the subset giving the highest accuracy is selected.
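Because every subset in the size range is evaluated, the number of models trained grows quickly with the number of attributes. The following is a minimal sketch of that arithmetic, not code from this class; `SubsetCountSketch` and its methods are hypothetical names introduced only for illustration.

```java
import java.math.BigInteger;

/** Hypothetical helper (not part of this library): counts candidate subsets. */
final class SubsetCountSketch {

    /** Binomial coefficient C(n, k), computed exactly with BigInteger. */
    static BigInteger choose(int n, int k) {
        if (k < 0 || k > n) {
            return BigInteger.ZERO;
        }
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After this step, result == C(n - k + i, i), so the division is exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    /** Number of subsets with size between minSize and maxSize, inclusive. */
    static BigInteger subsetCount(int nAttributes, int minSize, int maxSize) {
        BigInteger total = BigInteger.ZERO;
        for (int size = minSize; size <= maxSize; size++) {
            total = total.add(choose(nAttributes, size));
        }
        return total;
    }

    public static void main(String[] args) {
        // 20 attributes with the default size range of 1 to 4.
        System.out.println(subsetCount(20, 1, 4)); // prints 6195
    }
}
```

Each counted subset corresponds to one full model evaluation, so with 10-fold CV the number of training runs is roughly ten times this count.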
Usage: $<model> [-min_size <min>] [-max_size <max>]
[-k_fold <k>] | [-random_cv <test_frac> <n_repeat>] | [-train]
model: BaseModel used to test attribute sets
min: Minimum attribute set size (default=1)
max: Maximum attribute set size (default=4)
k: Number of folds to use when evaluating the model with k-fold CV
test_frac: Fraction of entries placed in the test set when evaluating with random split CV
n_repeat: Number of times to repeat the random split test
By default, the class uses 10-fold CV. Only one of "-k_fold" (k-fold CV),
"-random_cv" (repeated random test/train splits), and "-train"
(score on the training set) can be specified.
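The option rules above (documented defaults, and at most one evaluation flag) can be sketched as a small parser. This is an illustration under assumptions, not the class's actual `setOptions` logic: `EvaluationOptionsSketch`, its `Mode` constants, and its field names are hypothetical, and the sketch handles only the flags that follow the model argument.

```java
import java.util.List;

/** Hypothetical sketch (not this library's parser) of the evaluation-option rules. */
final class EvaluationOptionsSketch {
    /** Illustrative names; the real EvaluationMethod constants are not shown here. */
    enum Mode { K_FOLD, RANDOM_SPLIT, TRAINING_SET }

    int minSize = 1;                    // documented default
    int maxSize = 4;                    // documented default
    Mode mode = Mode.K_FOLD;            // documented default: 10-fold CV
    int kFolds = 10;
    double testFraction = Double.NaN;   // only meaningful for RANDOM_SPLIT
    int nRepeat = 0;                    // only meaningful for RANDOM_SPLIT

    static EvaluationOptionsSketch parse(List<String> tokens) {
        EvaluationOptionsSketch opts = new EvaluationOptionsSketch();
        boolean methodChosen = false;
        int i = 0;
        while (i < tokens.size()) {
            String flag = tokens.get(i++);
            boolean isMethodFlag = flag.equals("-k_fold")
                    || flag.equals("-random_cv") || flag.equals("-train");
            // Reject a second evaluation flag: the three are mutually exclusive.
            if (isMethodFlag && methodChosen) {
                throw new IllegalArgumentException(
                        "Specify only one of -k_fold, -random_cv, -train");
            }
            methodChosen |= isMethodFlag;
            switch (flag) {
                case "-min_size":
                    opts.minSize = Integer.parseInt(tokens.get(i++));
                    break;
                case "-max_size":
                    opts.maxSize = Integer.parseInt(tokens.get(i++));
                    break;
                case "-k_fold":
                    opts.mode = Mode.K_FOLD;
                    opts.kFolds = Integer.parseInt(tokens.get(i++));
                    break;
                case "-random_cv":
                    opts.mode = Mode.RANDOM_SPLIT;
                    opts.testFraction = Double.parseDouble(tokens.get(i++));
                    opts.nRepeat = Integer.parseInt(tokens.get(i++));
                    break;
                case "-train":
                    opts.mode = Mode.TRAINING_SET;
                    break;
                default:
                    throw new IllegalArgumentException("Unknown option: " + flag);
            }
        }
        return opts;
    }
}
```

For example, parsing `-max_size 3 -random_cv 0.25 20` in this sketch yields a random split configuration, while `-k_fold 5 -train` is rejected.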
Modifier and Type | Class and Description |
---|---|
static class | ExhaustiveAttributeSelector.EvaluationMethod: List of methods used to evaluate model performance |
Modifier and Type | Field and Description |
---|---|
protected int | KFolds: If test method is K-Fold, number of folds used in cross-validation test |
protected int | MaxSubsetSize: Maximum attribute subset size |
protected int | MinSubsetSize: Minimum attribute subset size |
protected BaseModel | Model: Model used for cross-validation |
protected int | RandomTestCount: If method is random split, number of times to repeat test |
protected double | RandomTestFraction: If method is random split, fraction of entries withheld for test set |
protected java.util.Iterator<int[]> | SetIterator: Iterator over subsets of attributes to be tested |
protected ExhaustiveAttributeSelector.EvaluationMethod | TestMethod: Method used to evaluate models |
Inherited fields: trained
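The KFolds, RandomTestFraction, and RandomTestCount fields parameterize two common ways of splitting entries for validation. The sketch below shows one plausible way to generate such splits over entry indices; it is an assumption about the general technique, not this class's implementation, and `SplitSketch` and its method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch (not this library's code) of the two validation splits. */
final class SplitSketch {

    /** Shuffle entry indices 0..n-1 and deal them into k folds of near-equal size. */
    static List<List<Integer>> kFolds(int nEntries, int k, Random rng) {
        List<Integer> order = shuffledIndices(nEntries, rng);
        List<List<Integer>> folds = new ArrayList<>();
        for (int f = 0; f < k; f++) {
            folds.add(new ArrayList<>());
        }
        for (int pos = 0; pos < nEntries; pos++) {
            // Round-robin dealing keeps fold sizes within one entry of each other.
            folds.get(pos % k).add(order.get(pos));
        }
        return folds;
    }

    /** One random split: the first round(testFraction * n) shuffled indices form the test set. */
    static List<Integer> randomTestSet(int nEntries, double testFraction, Random rng) {
        List<Integer> order = shuffledIndices(nEntries, rng);
        int testSize = (int) Math.round(testFraction * nEntries);
        return new ArrayList<>(order.subList(0, testSize));
    }

    private static List<Integer> shuffledIndices(int n, Random rng) {
        List<Integer> indices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            indices.add(i);
        }
        Collections.shuffle(indices, rng);
        return indices;
    }

    public static void main(String[] args) {
        Random rng = new Random(42);
        // K-fold: each fold serves once as the test set.
        System.out.println(kFolds(10, 3, rng));
        // Random split: draw a fresh test set for each repeat.
        for (int repeat = 0; repeat < 2; repeat++) {
            System.out.println(randomTestSet(20, 0.25, rng));
        }
    }
}
```

In k-fold CV every entry is tested exactly once, whereas repeated random splits may test some entries several times and others never; averaging over RandomTestCount repeats reduces that variance.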
Constructor and Description |
---|
ExhaustiveAttributeSelector() |
Modifier and Type | Method and Description |
---|---|
BaseAttributeSelector | clone() |
java.lang.String | printDescription(boolean htmlFormat): Print full name of object, and a simple description of the options. |
java.lang.String | printUsage(): Print out required format for options. |
protected void | setCombinationIterator(Dataset data): Create an iterator over all attribute subsets being considered. |
void | setMaxSubsetSize(int size): Set maximum size of subset to be tested |
void | setMinSubsetSize(int size): Set minimum size of subset to be tested |
void | setModel(BaseModel model): Set model used to test subset performance |
void | setNFolds(int k): Set the number of folds used in K-fold cross-validation. |
void | setOptions(java.util.List<java.lang.Object> Options): Set any options for this object. |
void | setRandomSplit(double test_split, int n_repeats): Set the size of test set split and number of times to repeat CV. |
void | setTestMethod(ExhaustiveAttributeSelector.EvaluationMethod method): Set method used to evaluate model performance |
protected java.util.List<java.lang.Integer> | train_protected(Dataset data): Operation that actually does the work for training. |
Inherited methods: about, applyAttributeSelection, getSelectionNames, getSelections, isTrained, printCommand, printSelections, run, runCommand, train
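The SetIterator field and setCombinationIterator method suggest that subsets are produced lazily as `int[]` index arrays. The following is a minimal sketch of such an iterator together with a lowest-score selection loop; it is not this class's implementation, and `SubsetIteratorSketch`, `lowestScore`, and the scoring function are hypothetical names introduced for illustration.

```java
import java.util.Iterator;
import java.util.NoSuchElementException;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch (not this library's code) of a lazy attribute-subset iterator. */
final class SubsetIteratorSketch implements Iterator<int[]> {
    private final int nAttributes;
    private final int maxSize;
    private int[] current;   // next subset to return, or null when exhausted

    SubsetIteratorSketch(int nAttributes, int minSize, int maxSize) {
        if (minSize < 1 || minSize > maxSize) {
            throw new IllegalArgumentException("Need 1 <= minSize <= maxSize");
        }
        this.nAttributes = nAttributes;
        this.maxSize = Math.min(maxSize, nAttributes);
        this.current = minSize <= this.maxSize ? firstOfSize(minSize) : null;
    }

    private static int[] firstOfSize(int size) {
        int[] subset = new int[size];
        for (int i = 0; i < size; i++) {
            subset[i] = i;
        }
        return subset;
    }

    @Override
    public boolean hasNext() {
        return current != null;
    }

    @Override
    public int[] next() {
        if (current == null) {
            throw new NoSuchElementException();
        }
        int[] result = current.clone();
        advance();
        return result;
    }

    /** Move to the next combination in lexicographic order, growing the size when needed. */
    private void advance() {
        int size = current.length;
        int i = size - 1;
        // Find the rightmost position that can still be incremented.
        while (i >= 0 && current[i] == nAttributes - size + i) {
            i--;
        }
        if (i >= 0) {
            current[i]++;
            for (int j = i + 1; j < size; j++) {
                current[j] = current[j - 1] + 1;
            }
        } else if (size < maxSize) {
            current = firstOfSize(size + 1);
        } else {
            current = null;
        }
    }

    /** Return the subset with the lowest score (for example, RMSE for regression). */
    static int[] lowestScore(Iterator<int[]> subsets, ToDoubleFunction<int[]> score) {
        int[] best = null;
        double bestScore = Double.POSITIVE_INFINITY;
        while (subsets.hasNext()) {
            int[] candidate = subsets.next();
            double s = score.applyAsDouble(candidate);
            if (s < bestScore) {
                bestScore = s;
                best = candidate;
            }
        }
        return best;
    }
}
```

For a classification model, where higher accuracy is better, the same loop can be reused by passing the negated accuracy as the score.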
protected BaseModel Model
protected int MinSubsetSize
protected int MaxSubsetSize
protected int KFolds
protected ExhaustiveAttributeSelector.EvaluationMethod TestMethod
protected double RandomTestFraction
protected int RandomTestCount
protected java.util.Iterator<int[]> SetIterator
public BaseAttributeSelector clone()
clone in class BaseAttributeSelector
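public void setOptions(java.util.List<java.lang.Object> Options) throws java.lang.Exception
Description copied from interface Options: Set any options for this object.
Parameters: Options - Array of options as Objects - can be null
Throws: java.lang.Exception - if problem with inputs
public java.lang.String printUsage()
Description copied from interface Options: Print out required format for options.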
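public void setModel(BaseModel model)
Set model used to test subset performance.
Parameters: model - Model template
public void setMinSubsetSize(int size)
Set minimum size of subset to be tested.
Parameters: size - Desired minimum size
public void setMaxSubsetSize(int size)
Set maximum size of subset to be tested.
Parameters: size - Desired maximum size
public void setTestMethod(ExhaustiveAttributeSelector.EvaluationMethod method)
Set method used to evaluate model performance.
Parameters: method - Desired method
public void setNFolds(int k)
Set the number of folds used in K-fold cross-validation.
Parameters: k - Number of folds
public void setRandomSplit(double test_split, int n_repeats)
Set the size of test set split and number of times to repeat CV.
Parameters: test_split - Fraction of entries withheld for the test set
n_repeats - Number of times to repeat the test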
protected java.util.List<java.lang.Integer> train_protected(Dataset data)
Description copied from class BaseAttributeSelector: Operation that actually does the work for training.
train_protected in class BaseAttributeSelector
Parameters: data - Dataset used to train selector
protected void setCombinationIterator(Dataset data)
Create an iterator over all attribute subsets being considered.
Parameters: data - Dataset being used to train attribute selector
public java.lang.String printDescription(boolean htmlFormat)
Description copied from interface Printable: Print full name of object, and a simple description of the options.
Example: For a model training a separate WekaRegression for intermetallics
magpie.models.regression.SplitRegression
printDescription in interface Printable
printDescription in class BaseAttributeSelector
Parameters: htmlFormat - Whether to format the output for an HTML page (e.g., <div> to create indentation) or for printing to screen.
See also: #printModel()