These examples highlight some of the main capabilities of Magpie. Before working through them, you may want to read the sections of the main manual on installing Magpie and on reading Magpie commands. Once Magpie is installed, each of these examples can be run by calling:
java -jar dist/Magpie.jar examples/<example name>
Imports a small dataset containing the properties of a few crystalline compounds (taken from the OQMD) and generates attributes that describe the chemistry of those compounds (such as the mean atomic radius). Then, the script finds which of these attributes correlate best with the formation energy and determines a reduced set of attributes that would be appropriate for building a linear model.
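The attribute-ranking step described above can be illustrated outside of Magpie. The following is a minimal sketch in plain Python, assuming that "correlate best" means the largest absolute Pearson correlation coefficient; the attribute names and values are made up for illustration and are not taken from the example dataset, and this is not Magpie's own implementation.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(attributes, target):
    """Return (name, r) pairs sorted by decreasing |r| with the target."""
    scores = [(name, pearson(values, target)) for name, values in attributes.items()]
    return sorted(scores, key=lambda pair: abs(pair[1]), reverse=True)


# Hypothetical attribute columns (illustrative numbers only).
attributes = {
    "mean_radius": [1.0, 2.0, 3.0, 4.0],
    "mean_electronegativity": [2.0, 1.0, 4.0, 3.0],
    "constant": [5.0, 5.0, 5.0, 5.0],
}
formation_energy = [-1.0, -2.0, -3.0, -4.0]

ranking = rank_attributes(attributes, formation_energy)
for name, r in ranking:
    print(f"{name}: r = {r:+.3f}")
```

Ranking by a single-attribute correlation ignores redundancy between attributes, which is why the example script goes on to select a reduced set rather than simply keeping the top few.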
Trains a Reduced-Error Pruning Tree model designed to predict the formation energy of a small set of crystalline compounds. Then, validates the predictive capability of that model using ten-fold cross-validation.
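For readers unfamiliar with k-fold cross-validation, the procedure can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a one-attribute least-squares line stands in for the Reduced-Error Pruning Tree, the data are synthetic, and the helper names (`k_fold_indices`, `cross_validate`, `fit_line`, `predict_line`) are hypothetical rather than part of Magpie.

```python
import math
import random


def k_fold_indices(n_entries, k, seed=0):
    """Shuffle entry indices and deal them into k nearly equal folds."""
    indices = list(range(n_entries))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validate(xs, ys, fit, predict, k=10, seed=0):
    """Return the RMSE over all held-out predictions."""
    squared_errors = []
    for held_out in k_fold_indices(len(xs), k, seed):
        held = set(held_out)
        # Train only on the entries that are not in the held-out fold.
        train_x = [x for i, x in enumerate(xs) if i not in held]
        train_y = [y for i, y in enumerate(ys) if i not in held]
        model = fit(train_x, train_y)
        for i in held_out:
            squared_errors.append((predict(model, xs[i]) - ys[i]) ** 2)
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def fit_line(xs, ys):
    """Stand-in model: least-squares line through a single attribute."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx else 0.0
    return slope, mean_y - slope * mean_x


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Synthetic, noise-free data, so the held-out error should be close to zero.
xs = [float(i) for i in range(30)]
ys = [2.0 * x - 1.0 for x in xs]
rmse = cross_validate(xs, ys, fit_line, predict_line, k=10)
print(f"10-fold CV RMSE: {rmse:.3e}")
```

Every entry is predicted exactly once by a model that never saw it during training, which is what makes the resulting error a fair estimate of predictive capability.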
Uses two distinct models to predict whether a compound will have a non-zero band gap. One model is trained on compounds that contain only metals, and the other on those with at least one non-metal constituent. Then, this example script validates the performance of this composite model using cross-validation and prints a receiver-operating characteristic curve.
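The receiver-operating characteristic curve mentioned above can be computed from any list of classifier scores and true labels. The sketch below is a generic Python illustration, not Magpie's output routine; the scores and labels are invented, and it assumes that both classes are present in the data.

```python
def roc_curve(scores, labels):
    """Return (false-positive rate, true-positive rate) points.

    The decision threshold is swept from the highest score to the lowest.
    Assumes labels are 0/1 and that both classes occur at least once.
    """
    positives = sum(labels)
    negatives = len(labels) - positives
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    true_pos = false_pos = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume every entry tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                true_pos += 1
            else:
                false_pos += 1
            i += 1
        points.append((false_pos / negatives, true_pos / positives))
    return points


def area_under_curve(points):
    """Trapezoidal area under a curve given as ordered (x, y) points."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2.0
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Invented scores: predicted probability of a non-zero band gap.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
labels = [1, 1, 0, 1, 0, 0]

points = roc_curve(scores, labels)
area = area_under_curve(points)
print(points)
print(f"AUC = {area:.4f}")
```

For a composite model, the scores from the two sub-models would simply be pooled before the curve is computed.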
First identifies a subset of attributes that can be used to model the formation energy of a compound. Then, applies a clustering algorithm to separate entries into groups with similar values of these attributes. After this, prints the center of each group (defined as the average of all entries in that group) and saves the entries contained in each cluster in CSV format.
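To make the notion of a cluster center concrete, here is a minimal Python sketch of k-means clustering in which each center is the average of the entries assigned to it. The choice of k-means is an assumption made for illustration (the description above does not name the algorithm), and the points and starting centers are invented.

```python
def mean_point(points):
    """Component-wise average of a non-empty list of points."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def k_means(points, initial_centers, max_iterations=100):
    """Lloyd's algorithm: alternate assignment and center-update steps."""
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(
                range(len(centers)), key=lambda c: squared_distance(p, centers[c])
            )
            clusters[nearest].append(p)
        # An empty cluster keeps its previous center.
        new_centers = [
            mean_point(members) if members else centers[i]
            for i, members in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters


# Invented two-attribute entries forming two well-separated groups.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (9.0, 8.0), (8.0, 9.0)]
centers, clusters = k_means(points, initial_centers=[points[0], points[3]])

for center, members in zip(centers, clusters):
    print(center, len(members))
```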
Demonstrates the Clustering-Ranking-Modeling method of Meredig and Wolverton. First, this script separates the data into 4 subsets using KMeans++ clustering. Then, for each subset, it trains a quadratic model for dilute-limit solution energies in zirconia using the attribute that best models that property.
Uses the Data-Mining Structure Prediction (DMSP) method of Fischer et al. to predict the most likely crystal structures of two example compounds: Na5Pb2 and Al2NiCu.
Performs nonlinear regression to fit parameters of an expression.
Uses a neural network model to predict the formation energy of a few compounds. Before constructing the model, scales the attributes and the class variable so that they all span the same range.
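The rescaling step can be illustrated with simple min-max normalization. The sketch below assumes the common target range is [0, 1], which the description above does not specify; the function names and the sample values are hypothetical.

```python
def fit_min_max(values):
    """Record the range of a column so the same scaling can be reused later."""
    return min(values), max(values)


def scale(values, low, high):
    """Map values linearly so that low -> 0 and high -> 1."""
    span = high - low
    if span == 0:
        return [0.0 for _ in values]  # constant column: nothing to scale
    return [(v - low) / span for v in values]


def unscale(values, low, high):
    """Invert scale(), e.g. to report predictions in the original units."""
    return [v * (high - low) + low for v in values]


# Invented attribute column and class variable on very different scales.
mean_radius = [2.0, 4.0, 6.0]
formation_energy = [-3.0, -1.0, 0.0]

radius_range = fit_min_max(mean_radius)
energy_range = fit_min_max(formation_energy)

scaled_radius = scale(mean_radius, *radius_range)
scaled_energy = scale(formation_energy, *energy_range)
print(scaled_radius, scaled_energy)
```

The range should be fitted on the training data only and then reused for new entries, and predictions of the class variable need to be mapped back with the inverse transform.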
Uses a RandomForest model from scikit-learn to predict the formation energies of a small set of compounds.
Shows how to use the evaluate command to predict the properties of entries directly from the text interface.
Finds a simple linear model that describes the formation enthalpy of crystalline compounds using the LASSO-based technique originally developed by Ghiringhelli et al.
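The core of any LASSO-based approach is an L1 penalty that drives the coefficients of unhelpful attributes to exactly zero. The Python sketch below shows only that ingredient, solved by coordinate descent on invented data; it is not the full technique of Ghiringhelli et al. and not Magpie's implementation. It assumes the attributes and the target are already centered, so no intercept is fitted.

```python
def soft_threshold(value, penalty):
    """Shrink value toward zero by penalty, clipping at zero."""
    if value > penalty:
        return value - penalty
    if value < -penalty:
        return value + penalty
    return 0.0


def lasso_coordinate_descent(rows, targets, penalty, sweeps=100):
    """Minimize (1/2n) * ||y - Xw||^2 + penalty * ||w||_1 one weight at a time."""
    n = len(rows)
    m = len(rows[0])
    weights = [0.0] * m
    for _ in range(sweeps):
        for j in range(m):
            rho = 0.0  # correlation of attribute j with the partial residual
            z = 0.0    # squared norm of attribute j
            for row, y in zip(rows, targets):
                partial = sum(row[k] * weights[k] for k in range(m) if k != j)
                rho += row[j] * (y - partial)
                z += row[j] ** 2
            weights[j] = soft_threshold(rho / n, penalty) / (z / n) if z else 0.0
    return weights


# Invented data: the target depends strongly on the first attribute
# and only weakly on the second (y = 2.0 * x1 + 0.1 * x2).
rows = [(1.0, 1.0), (-1.0, 1.0), (1.0, -1.0), (-1.0, -1.0)]
targets = [2.0 * x1 + 0.1 * x2 for x1, x2 in rows]

weights = lasso_coordinate_descent(rows, targets, penalty=0.5)
print(weights)
```

With this penalty the weak attribute is dropped entirely and the strong one is shrunk, which is how the penalty yields a simple model with few terms.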