CoulombSineMatrixRegression

See Javadoc for complete documentation of this class.

Usage: <lambda> <sigma>
lambda: Regularization parameter
sigma: Normalization parameter in kernel function.
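
To show where a regularization parameter and a kernel width typically enter a kernel-based regression, here is a minimal sketch of kernel ridge regression. It assumes a Gaussian kernel and a plain linear solve; the kernel form and the names `gaussian_kernel`, `fit_krr`, and `predict_krr` are hypothetical and are not taken from this class. The Javadoc remains the authority on what the model actually computes.

```python
import math


def gaussian_kernel(x, y, sigma):
    # Assumed kernel form: exp(-|x - y|^2 / (2 sigma^2)).
    # The kernel used by the real class may differ.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def solve_linear(a, b):
    # Gaussian elimination with partial pivoting on the augmented matrix.
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_krr(train_x, train_y, lam, sigma):
    # Solve (K + lam * I) alpha = y for the weights alpha.
    n = len(train_x)
    k = [
        [gaussian_kernel(train_x[i], train_x[j], sigma) + (lam if i == j else 0.0)
         for j in range(n)]
        for i in range(n)
    ]
    return solve_linear(k, train_y)


def predict_krr(train_x, alpha, sigma, x):
    # Prediction is a kernel-weighted sum over the training entries.
    return sum(a * gaussian_kernel(xi, x, sigma) for a, xi in zip(alpha, train_x))
```

In this sketch, a larger lambda shrinks the weights and smooths the fit, while sigma controls how quickly the influence of a training entry decays with distance.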

Available Operations

These commands can be used to perform a variety of tasks, ranging from defining important settings about the object to actually using it.

clone – Create a copy of this model

match $<dataset> <n> – Find the entries in the training set that are closest to those in a user-provided dataset
dataset: Dataset to be matched
n: Number of entries to print
Prints the n entries in the training set that are closest to the entries in the provided dataset.
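
The idea behind match can be sketched as a nearest-neighbor lookup. This is only an illustration: it assumes entries are plain attribute vectors compared by Euclidean distance, which this page does not confirm, and `closest_entries` is a hypothetical helper rather than part of the model.

```python
import math


def closest_entries(training, query, n):
    """Return the indices of the n training entries closest to query."""
    def distance(a, b):
        # Assumed metric: Euclidean distance between attribute vectors.
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    # Rank training entries by distance to the query, nearest first.
    ranked = sorted(range(len(training)), key=lambda i: distance(training[i], query))
    return ranked[:n]
```

Calling the helper once per entry in the user-provided dataset would give the n nearest training entries for each of them.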

normalize [attributes] [class] <method> [<options...>] – Define how to normalize data (data is not normalized by default)
attributes: Whether to normalize attributes
class: Whether to normalize class variable
method: Method used to normalize attributes
options...: Any options for the normalizer

output = crossvalidate $<dataset> <split size> [<n repeats>] – Cross-validation by splitting the dataset into train and test sets. The test is repeated multiple times.
dataset: Dataset to use for cross-validation
split size: Fraction of entries used in test set
n repeats: Number of times to repeat test
output: Dataset, the results of which are used to compute performance statistics
Same command structure as k-fold cross-validation. Runs if the number of folds (here, the split size) is less than 1.
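
A repeated train/test split of this kind can be sketched as follows. The sketch assumes the split is drawn at random on each repeat, which this page does not state, and `random_split_indices` is a hypothetical name used only for illustration.

```python
import random


def random_split_indices(n_entries, test_fraction, n_repeats, seed=0):
    """Return a list of (train, test) index lists, one pair per repeat."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    rng = random.Random(seed)
    # Keep at least one entry in the test set.
    n_test = max(1, round(n_entries * test_fraction))
    splits = []
    for _ in range(n_repeats):
        shuffled = list(range(n_entries))
        rng.shuffle(shuffled)  # assumed: a fresh random split per repeat
        test = sorted(shuffled[:n_test])
        train = sorted(shuffled[n_test:])
        splits.append((train, test))
    return splits
```

Each repeat trains on the train indices and evaluates on the test indices; performance statistics would then be gathered over all repeats.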

output = crossvalidate $<dataset> [<folds>] – Use k-fold cross-validation to assess model performance.
dataset: Dataset to use for cross validation
folds: Number of cross validation folds (default = 10)
output: Dataset, the results of which are used to compute performance statistics
Splits the dataset into <folds> parts. Trains the model on folds - 1 parts and validates against the remaining part. Repeats using each part as the validation set.
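
The k-fold procedure described above can be sketched as an index generator. This is a minimal illustration, not the model's implementation: it assigns entries to parts in a fixed round-robin order (a real implementation may shuffle first), and `k_fold_indices` is a hypothetical name.

```python
def k_fold_indices(n_entries, folds=10):
    """Yield (train, test) index lists, holding out each part once."""
    if folds < 2 or folds > n_entries:
        raise ValueError("folds must be between 2 and the number of entries")
    # Assign entries to parts round-robin so part sizes differ by at most one.
    parts = [list(range(i, n_entries, folds)) for i in range(folds)]
    for held_out in range(folds):
        test = parts[held_out]
        # Train on the other folds - 1 parts.
        train = sorted(
            i for p, part in enumerate(parts) if p != held_out for i in part
        )
        yield train, test
```

Every entry appears in exactly one validation part, so each entry is predicted once by a model that never saw it during training.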

robust <Q> – Define the false positive rate (FPR) used for robust regression. See Motulsky and Brown.
Q: Desired FPR during outlier detection.

run $<dataset> – Use model to predict class values for each entry
dataset: Dataset to evaluate

set selector $<filter> – Define the BaseDatasetFilter used to filter data before attribute normalization, attribute selection, and model training.
filter: Filter to use

set selector $<selector> – Define the BaseAttributeSelector used to screen attributes before training
selector: Attribute selector to use

train $<dataset> – Train model using measured class values
dataset: Dataset used to train this model

validate $<dataset> – Validate model against external dataset
dataset: Dataset to use for validation

Available Print Commands

These commands are run by calling "print <variable name> <command> [<options>]". Any output from that command will be printed to standard output.

description – Print out short description of this model.

model – Print out the model

selector – Print out attributes selected by the internal BaseAttributeSelector, if defined

training [<command>] – Print out statistics generated during training
command: Command to be passed to internal BaseStatistics object.

validation [<command>] – Print out statistics generated during validation
command: Command to be passed to internal BaseStatistics object.

Available Save Formats

Variables of this type can be saved in the following formats:

training – Print out performance data for training set

validation – Print out performance data for validation set