public abstract class CSPEngine extends java.lang.Object implements Commandable, Printable, Options
Usage Guide
To predict the probability that a crystal structure will form:
Implemented Commands:
exclude <elements...> - Remove entries containing certain elements from the list of known compounds
get model = $<model> - Get the classifier used to predict last crystal structure
predict <composition> [<# to show>] - Predict what structural prototypes are most likely for a certain compound
prototypes <filename> - Import list of known prototypes
examples <composition> - Get a list of other possible prototypes for a certain composition and all known examples for each prototype
validate <ncomp> [<folds>] - Evaluate performance of CSP algorithm using cross-validation.
Implemented Print Commands:
stats - Print out the number of predictions used to generate performance statistics.
stats list-length [<min prob>] [<max length>] - Print out the minimum number of prototypes that need to be calculated for a certain prediction success probability
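One way to read the list-length statistic is sketched below. This is only an interpretation, not the CSPEngine implementation: it assumes each validated prediction records the 1-based rank at which the correct prototype appeared, and it searches for the shortest list length whose hit rate reaches the requested probability. The class `ListLengthSketch` and the method `minimumListLength` are hypothetical names.

```java
/** Hypothetical illustration of a list-length statistic; not part of Magpie. */
final class ListLengthSketch {

    /**
     * Returns the smallest list length (up to maxLength) such that the fraction of
     * predictions whose correct prototype is ranked within that length is at least
     * minProb. Returns -1 if no length up to maxLength is sufficient.
     *
     * @param ranks 1-based rank of the correct prototype in each prediction (assumed input)
     */
    static int minimumListLength(int[] ranks, double minProb, int maxLength) {
        if (ranks.length == 0) {
            return -1; // no predictions, so no statistic can be computed
        }
        for (int length = 1; length <= maxLength; length++) {
            int hits = 0;
            for (int rank : ranks) {
                if (rank <= length) {
                    hits++;
                }
            }
            if ((double) hits / ranks.length >= minProb) {
                return length;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Made-up ranks for five predictions, for illustration only.
        int[] ranks = {1, 1, 2, 3, 5};
        System.out.println(minimumListLength(ranks, 0.75, 10));
    }
}
```

With these made-up ranks, the first three entries of each ranked list must be evaluated before at least 75% of the predictions contain the correct prototype.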
Implementation Guide:
To create your own crystal structure predictor, you only need to implement a single operation: makeClassifier(PhaseDiagramStatistics, PrototypeDataset). This operation generates a classifier that predicts the probability that a certain crystal structure will form, given a composition. The composition is supplied as a PrototypeEntry where the least-prevalent element is on the "A" site, the second least-prevalent is on the "B" site, and so on. Any sites with equal fractions are treated as equivalent (a key feature of PrototypeEntry).
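The site-ordering convention described above can be illustrated with a small standalone sketch. It does not use the Magpie classes: `SiteOrderingSketch`, `orderBySite`, and `equivalentSites` are hypothetical names, and the tolerance used to decide that two fractions are equal is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration of the site-ordering convention; not part of Magpie. */
final class SiteOrderingSketch {

    /** Assumed tolerance for treating two fractions as equal. */
    static final double TOLERANCE = 1e-6;

    /**
     * Orders elements by ascending fraction, so index 0 corresponds to the "A" site,
     * index 1 to the "B" site, and so on. Ties keep the map's iteration order.
     */
    static List<String> orderBySite(Map<String, Double> fractions) {
        List<String> elements = new ArrayList<>(fractions.keySet());
        elements.sort(Comparator.comparingDouble(fractions::get));
        return elements;
    }

    /** Groups adjacent site indices whose fractions are equal within TOLERANCE. */
    static List<List<Integer>> equivalentSites(double[] sortedFractions) {
        List<List<Integer>> groups = new ArrayList<>();
        for (int i = 0; i < sortedFractions.length; i++) {
            boolean sameAsPrevious = i > 0
                    && Math.abs(sortedFractions[i] - sortedFractions[i - 1]) < TOLERANCE;
            if (!sameAsPrevious) {
                groups.add(new ArrayList<>());
            }
            groups.get(groups.size() - 1).add(i);
        }
        return groups;
    }

    public static void main(String[] args) {
        Map<String, Double> fe2O3 = new LinkedHashMap<>();
        fe2O3.put("O", 0.6);
        fe2O3.put("Fe", 0.4);
        // Fe has the smaller fraction, so it is listed first (the "A" site).
        System.out.println(orderBySite(fe2O3));
        // Sites 0 and 1 share a fraction, so they fall into the same group.
        System.out.println(equivalentSites(new double[] {0.25, 0.25, 0.5}));
    }
}
```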
Modifier and Type | Field and Description |
---|---|
protected java.util.Map<java.lang.Integer,PhaseDiagramStatistics> | DiagramStatistics - Statistics about phase diagrams with a certain number of constituents. |
protected java.util.Map<CompositionEntry,java.lang.String> | KnownCompounds - List of known compounds (used when making predictions) |
protected int | LastCompositionBin - Matched composition bin of last entry evaluated. |
protected int | LastNComponents - Number of elements in the last entry evaluated in this class. |
protected CSPPerformanceStats | PerformanceStats - Holds statistics about CSP algorithm performance. |
Constructor and Description |
---|
CSPEngine() |
Modifier and Type | Method and Description |
---|---|
java.lang.String | about() - Prints a simple status message about this object |
void | crossvalidate(int nComp, int folds) - Validates predictive ability of this model. |
protected void | fillPrototypeEntry(PrototypeEntry entry, int[] siteIdentity) - Adjust a PrototypeEntry so that each site is occupied by a specific element |
protected java.lang.Object | getComponent(java.lang.String Name) - Get a clone of a specific component of this CSP engine. |
java.util.List<java.lang.String> | getPossibleStructures(CompositionEntry composition) - Get a list of possible structure types for a structure, given composition. |
java.util.List<java.lang.String> | getPossibleStructures(java.lang.String composition) - Get a list of possible structure types for a structure, given composition. |
protected double[] | getProbabilities(BaseClassifier classifier, java.util.List<java.lang.String> knownPrototypes, PrototypeSiteInformation siteInfo, PrototypeEntry entryToPredict) - Calculate probability that a compound will form as each of the known prototypes. |
PrototypeDataset | getTrainingSet(CompositionEntry composition) - Get a training set of all examples of all possible crystal structures at a certain composition |
void | importKnownCompounds(java.lang.String filename) - Gather a list of known compounds and their prototypes. |
protected abstract BaseClassifier | makeClassifier(PhaseDiagramStatistics statistics, PrototypeDataset trainData) - Given the dataset of training examples, make a classifier to predict the probability that a prototype will form at a certain composition. |
protected PrototypeSiteInformation | makeSiteInfo(double[] fracs) - Generate PrototypeSiteInformation appropriate for a certain crystal prototype. |
java.util.List<org.apache.commons.lang3.tuple.Pair<java.lang.String,java.lang.Double>> | predictStructure(CompositionEntry composition) - Predict what structure is most likely at a certain composition |
java.util.List<org.apache.commons.lang3.tuple.Pair<java.lang.String,java.lang.Double>> | predictStructure(java.lang.String Composition) - Predict what structure is most likely at a certain composition |
java.lang.String | printCommand(java.util.List<java.lang.String> Command) - Handles more complicated printing commands. |
java.lang.String | printDescription(boolean htmlFormat) - Print full name of object, and a simple description of the options. |
void | removeKnownCompoundsContainingElements(java.util.List<java.lang.String> ElementList) - Given a list of elements, remove all entries that contain those elements from the list of known compounds. |
java.lang.Object | runCommand(java.util.List<java.lang.Object> Command) - Process some command described by a list of Objects. |
void | setKnownCompounds(java.util.Map<CompositionEntry,java.lang.String> KnownCompounds) - Define the composition and prototype of already-known compounds. |
Methods inherited from class java.lang.Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Options: printUsage, setOptions
protected java.util.Map<CompositionEntry,java.lang.String> KnownCompounds
protected java.util.Map<java.lang.Integer,PhaseDiagramStatistics> DiagramStatistics
protected int LastNComponents
protected int LastCompositionBin
protected CSPPerformanceStats PerformanceStats
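public void importKnownCompounds(java.lang.String filename)
Gather a list of known compounds and their prototypes. The file is expected to be in the following format: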
<compound #1 composition>[tab]<compound #1 prototype>
<compound #2 composition>[tab]<compound #2 prototype>
[...]
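A minimal sketch of reading this tab-delimited layout follows. It is not the Magpie implementation: it stores compositions as plain strings rather than CompositionEntry objects, skipping blank lines is an assumption, and `KnownCompoundsFormatSketch` and `parse` are hypothetical names. The sample entries are placeholders, since the prototype naming used in real data files is not specified here.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical reader for the two-column, tab-delimited layout; not part of Magpie. */
final class KnownCompoundsFormatSketch {

    /** Maps each composition string to its prototype name, in file order. */
    static Map<String, String> parse(BufferedReader reader) throws IOException {
        Map<String, String> compounds = new LinkedHashMap<>();
        String line;
        int lineNumber = 0;
        while ((line = reader.readLine()) != null) {
            lineNumber++;
            if (line.trim().isEmpty()) {
                continue; // assumption: blank lines are ignored
            }
            String[] fields = line.split("\t");
            if (fields.length != 2) {
                throw new IOException("Line " + lineNumber
                        + ": expected <composition>[tab]<prototype>");
            }
            compounds.put(fields[0].trim(), fields[1].trim());
        }
        return compounds;
    }

    public static void main(String[] args) throws IOException {
        // Placeholder data for illustration only.
        String sample = "NaCl\tB1\nCsCl\tB2\n";
        try (BufferedReader reader = new BufferedReader(new StringReader(sample))) {
            System.out.println(parse(reader));
        }
    }
}
```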
filename - Path to compound data file
public void setKnownCompounds(java.util.Map<CompositionEntry,java.lang.String> KnownCompounds)
KnownCompounds - Map of a compound's composition to the name of its prototype
public void removeKnownCompoundsContainingElements(java.util.List<java.lang.String> ElementList)
ElementList - List of elements to be removed
public java.util.List<java.lang.String> getPossibleStructures(java.lang.String composition) throws java.lang.Exception
composition - Composition of structure in question
java.lang.Exception
public java.util.List<java.lang.String> getPossibleStructures(CompositionEntry composition)
composition - Composition of structure in question
public PrototypeDataset getTrainingSet(CompositionEntry composition)
composition - Composition of interest
public java.util.List<org.apache.commons.lang3.tuple.Pair<java.lang.String,java.lang.Double>> predictStructure(java.lang.String Composition) throws java.lang.Exception
Composition - Composition to be evaluated
java.lang.Exception - If Composition does not parse correctly
public java.util.List<org.apache.commons.lang3.tuple.Pair<java.lang.String,java.lang.Double>> predictStructure(CompositionEntry composition)
composition - Composition to be evaluated
protected PrototypeSiteInformation makeSiteInfo(double[] fracs)
fracs - Fractions of elements on each site, sorted in ascending order
protected double[] getProbabilities(BaseClassifier classifier, java.util.List<java.lang.String> knownPrototypes, PrototypeSiteInformation siteInfo, PrototypeEntry entryToPredict)
classifier - Model trained to predict which prototype will form, given composition.
knownPrototypes - List of known prototypes
siteInfo - Information about the prototypes
entryToPredict - Entry to predict
protected abstract BaseClassifier makeClassifier(PhaseDiagramStatistics statistics, PrototypeDataset trainData)
statistics - Statistics about all known phase diagrams (could be used predictively)
trainData - Training data, where each entry maps
protected void fillPrototypeEntry(PrototypeEntry entry, int[] siteIdentity)
entry - Entry to be modified
siteIdentity - Identity of element present on each site.
public void crossvalidate(int nComp, int folds)
The known compounds are split into folds different subsets. Then, the CSP algorithm is used to predict which structure will form for each compound in one of the subsets, given all of the known compounds in the remaining folds - 1 subsets. This process is then repeated for each other subset. Any compound that has a different number of compounds is always kept
Note: If a compound in the testing set has a prototype that is not present in the remaining subsets, no attempt is made to predict its structure. The number of times this occurs is recorded.
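The fold-splitting step described above can be sketched as follows. This is a generic k-fold partition rather than the CSPEngine code: the shuffle with a fixed seed, the round-robin assignment, and capping the fold count at the number of items are all assumptions, and `FoldSplitSketch`, `split`, and `trainingSetFor` are hypothetical names. Filtering by nComp and the handling of unseen prototypes are left out.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Random;

/** Hypothetical k-fold partition helper; not part of Magpie. */
final class FoldSplitSketch {

    /**
     * Splits items into subsets of near-equal size. A fold count of zero or less
     * (or more folds than items) yields one item per subset, i.e. leave-one-out.
     */
    static <T> List<List<T>> split(List<T> items, int folds, long seed) {
        int nFolds = (folds <= 0 || folds > items.size()) ? items.size() : folds;
        List<T> shuffled = new ArrayList<>(items);
        Collections.shuffle(shuffled, new Random(seed));
        List<List<T>> subsets = new ArrayList<>();
        for (int i = 0; i < nFolds; i++) {
            subsets.add(new ArrayList<>());
        }
        for (int i = 0; i < shuffled.size(); i++) {
            subsets.get(i % nFolds).add(shuffled.get(i)); // round-robin assignment
        }
        return subsets;
    }

    /** Collects every item outside the test fold, i.e. the remaining folds - 1 subsets. */
    static <T> List<T> trainingSetFor(List<List<T>> subsets, int testFold) {
        List<T> training = new ArrayList<>();
        for (int i = 0; i < subsets.size(); i++) {
            if (i != testFold) {
                training.addAll(subsets.get(i));
            }
        }
        return training;
    }

    public static void main(String[] args) {
        // Placeholder compound labels for illustration only.
        List<String> compounds = Arrays.asList("c1", "c2", "c3", "c4", "c5", "c6", "c7");
        List<List<String>> subsets = split(compounds, 3, 42L);
        for (int fold = 0; fold < subsets.size(); fold++) {
            System.out.println("test=" + subsets.get(fold)
                    + " train=" + trainingSetFor(subsets, fold));
        }
    }
}
```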
nComp - Only attempt to predict structures of compounds with this many elements
folds - Number of folds to use (if ≤0, perform leave-one-out cross-validation)
public java.lang.Object runCommand(java.util.List<java.lang.Object> Command) throws java.lang.Exception
Commandable
runCommand in interface Commandable
Command - Command as a list of objects
java.lang.Exception - If something goes wrong
public java.lang.String about()
Printable
public java.lang.String printDescription(boolean htmlFormat)
Printable
Example: For a model training a separate WekaRegression for intermetallics
magpie.models.regression.SplitRegression
printDescription in interface Printable
htmlFormat - Whether to format for output to an HTML page (e.g., <div> to create indentation) or for printing to screen.
#printModel()
public java.lang.String printCommand(java.util.List<java.lang.String> Command) throws java.lang.Exception
Printable
printCommand in interface Printable
Command - Command specifying what to print
java.lang.Exception - If command not understood
protected java.lang.Object getComponent(java.lang.String Name) throws java.lang.Exception
Name - Name of component
java.lang.Exception - For various reasons (see text)