See Javadoc for complete documentation of this class.
Usage: *No options*
These commands can be used to perform a variety of tasks, ranging from defining important settings about the object to actually using it.
clone – Create a copy of this model
normalize [attributes] [class] <method> [<options...>] – Define how to normalize data (data is not normalized by default)
attributes: Whether to normalize attributes
class: Whether to normalize class variable
method: Method used to normalize attributes
options...: Any options for the normalizer
output = crossvalidate $<dataset> <split size> [<n repeats>] – Cross-validation by splitting the dataset into train and test sets. The test is repeated multiple times.
dataset: Dataset to use for cross-validation
split size: Fraction of entries used in the test set
n repeats: Number of times to repeat the test
output: Dataset used to compute performance statistics
Uses the same command structure as k-fold cross-validation. This mode runs when the value given in place of the number of folds is less than 1.
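The train/test splitting described above can be sketched in Python. This is an illustration only, not the library's implementation: the function name, the random shuffling, the fixed seed, and the default of one repeat are all assumptions made for the example.

```python
import random


def random_split_indices(n_entries, test_fraction, n_repeats=1, seed=0):
    """Yield (train, test) index lists for repeated train/test splits.

    Hypothetical helper: shuffling and the default n_repeats=1 are
    assumptions, not documented behavior.
    """
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)
    # Hold out at least one entry for the test set.
    n_test = max(1, round(n_entries * test_fraction))
    for _ in range(n_repeats):
        shuffled = list(range(n_entries))
        rng.shuffle(shuffled)
        # First n_test shuffled entries form the test set, the rest train.
        yield sorted(shuffled[n_test:]), sorted(shuffled[:n_test])


splits = list(random_split_indices(10, 0.2, n_repeats=3))
```

Each repeat draws a fresh test set, so an entry may be tested more than once or not at all across repeats.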
output = crossvalidate $<dataset> [<folds>] – Use k-fold cross-validation to assess model performance.
dataset: Dataset to use for cross validation
folds: Number of cross validation folds (default = 10)
output: Dataset used to compute performance statistics
Splits the dataset into <folds> parts, trains the model on <folds> - 1 of them, and validates against the remaining part. Repeats until each part has served as the validation set.
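A minimal Python sketch of the k-fold procedure follows. The function name and the interleaved assignment of entries to parts are assumptions made for illustration; the actual implementation may shuffle or partition differently.

```python
def k_fold_indices(n_entries, folds=10):
    """Yield (train, validation) index lists, one pair per fold.

    Hypothetical helper: entries are dealt to parts in round-robin
    order, which is an assumption and not documented behavior.
    """
    if not 2 <= folds <= n_entries:
        raise ValueError("folds must be between 2 and the number of entries")
    # Part i receives entries i, i + folds, i + 2 * folds, ...
    parts = [list(range(i, n_entries, folds)) for i in range(folds)]
    for i, validation in enumerate(parts):
        # Train on every part except the one held out for validation.
        train = [idx for j, part in enumerate(parts) if j != i for idx in part]
        yield train, validation


splits = list(k_fold_indices(25, folds=5))
```

Every entry appears in exactly one validation set, so the combined validation predictions cover the whole dataset once.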
run $<dataset> – Use model to predict class values for each entry
dataset: Dataset to evaluate
set filter $<filter> – Define the BaseDatasetFilter used to filter data before attribute normalization, attribute selection, and model training.
filter: Filter to use
set selector $<selector> – Define the BaseAttributeSelector used to screen attributes before training
selector: Attribute selector to use
splitter <method> [<options...>] – Define splitter used to partition dataset between models
method: Method used to split data. Name of a BaseDatasetSplitter ("?" for options)
options: Any options for the splitter
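To illustrate how a splitter partitions a dataset between submodels, here is a small Python sketch. The entry layout, the `partition` and `train_mean_model` helpers, and the threshold rule are all hypothetical stand-ins, not part of the library.

```python
from collections import defaultdict


def partition(entries, splitter):
    """Group entries by the label the splitter assigns to each one."""
    groups = defaultdict(list)
    for entry in entries:
        groups[splitter(entry)].append(entry)
    return dict(groups)


def train_mean_model(entries):
    """Hypothetical submodel: predicts the mean class value of its partition."""
    return sum(e["class"] for e in entries) / len(entries)


entries = [
    {"x": 0.5, "class": 1.0},
    {"x": 1.5, "class": 3.0},
    {"x": 2.5, "class": 10.0},
    {"x": 0.2, "class": 2.0},
]


def threshold_splitter(entry):
    """Assumed rule: entries with x < 1.0 go to submodel 0, the rest to submodel 1."""
    return 0 if entry["x"] < 1.0 else 1


# Train one submodel per partition.
submodels = {
    label: train_mean_model(group)
    for label, group in partition(entries, threshold_splitter).items()
}
```

Each submodel only ever sees the entries routed to it, which is why the choice of splitter determines how many submodels are needed.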
submodel get <number> = <output> – Retrieve a specific submodel
number: Index of submodel to retrieve (list starts with 0)
Returns a clone of the model - you cannot use this to edit the model.
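The consequence of receiving a clone can be sketched in Python. The `CompositeSketch` class and its dictionary submodels are hypothetical; the sketch only mirrors the documented behavior that a retrieved submodel is a copy and that changes must be written back with a set operation.

```python
import copy


class CompositeSketch:
    """Hypothetical container that hands out clones of its submodels."""

    def __init__(self, submodels):
        self._submodels = list(submodels)

    def get_submodel(self, number):
        # Return a deep copy so callers cannot modify the stored submodel.
        return copy.deepcopy(self._submodels[number])

    def set_submodel(self, number, model):
        # Replacing the stored submodel is the only way to change it.
        self._submodels[number] = model


composite = CompositeSketch([{"alpha": 1.0}, {"alpha": 2.0}])
retrieved = composite.get_submodel(0)
retrieved["alpha"] = 5.0                         # edits the clone only
unchanged = composite.get_submodel(0)["alpha"]   # stored value is still 1.0
composite.set_submodel(0, retrieved)             # write the change back
updated = composite.get_submodel(0)["alpha"]
```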
submodel get generic = <output> – Retrieve the template for any unassigned submodels
submodel set <number> $<model> – Set a specific submodel
number: Index of the submodel to set (list starts with 0)
model: An instance of BaseModel to use for that submodel
submodel set generic $<model> – Define a model template to use for all submodels
model: An instance of BaseModel
Note: Do not use this command for CompositeRegression unless each model automatically uses a different random number seed. Otherwise, each submodel will be identical.
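The warning about random number seeds can be demonstrated with a short Python sketch. `BootstrapMeanModel` and `train_from_template` are hypothetical stand-ins for a stochastic model and the template mechanism; they are not the library's classes.

```python
import copy
import random


class BootstrapMeanModel:
    """Hypothetical stochastic model: mean of a bootstrap sample."""

    def __init__(self, seed):
        self.seed = seed
        self.prediction = None

    def train(self, class_values):
        # The seed fully determines which entries are resampled.
        rng = random.Random(self.seed)
        sample = [rng.choice(class_values) for _ in class_values]
        self.prediction = sum(sample) / len(sample)


def train_from_template(template, n_submodels, class_values):
    """Clone one template for every submodel, then train each clone."""
    predictions = []
    for _ in range(n_submodels):
        submodel = copy.deepcopy(template)  # the seed is copied too
        submodel.train(class_values)
        predictions.append(submodel.prediction)
    return predictions


predictions = train_from_template(BootstrapMeanModel(seed=42), 3, [1.0, 2.0, 4.0, 8.0])
```

Because every clone carries the same seed and sees the same data, all three predictions are equal, so combining them adds nothing over a single model.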
submodel – Print the number of submodels
train $<dataset> – Train model using measured class values
dataset: Dataset used to train this model
validate $<dataset> – Validate model against external dataset
dataset: Dataset to use for validation
These commands are run by calling "print <variable name> <command> [<options>]". Any output from that command will be printed to standard output.
description – Print out short description of this model.
model – Print out the model
selector – Print out attributes selected by internal BaseAttributeSelector, if defined
splitter – Print out the name of splitter used by this model
submodel <number> [<command...>] – Pass a print command to one of the submodels
number: Index of model to operate on (starts at 0)
command: Print command that gets passed to that submodel
submodel – Print out number of submodels
training [<command>] – Print out statistics generated during training
command: Command to be passed to internal BaseStatistics object.
validation [<command>] – Print out statistics generated during validation
command: Command to be passed to internal BaseStatistics object.
Variables of this type can be saved in the following formats:
training – Print out performance data for training set
validation – Print out performance data for validation set