About: This article describes the process of scoring a file with a predictive model in Predict.
Purpose: Use the Score tab to apply a Predict model formula to a dataset, producing an outcome likelihood or a predicted value for each record.
Introduction
The Score tab is a built-in tool for applying saved or memorized models to new datasets and scoring the records. For binary y-variables (logistic regression), scoring calculates a probability score for each row in the selected dataset. For continuous y-variables (ordinary least squares linear regression), it calculates a predicted value for each row.
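The difference between the two kinds of score can be illustrated with a minimal Python sketch. It is not Predict's own code: the function name, the coefficient values, and the column names are hypothetical, and it assumes the model equation is a plain intercept plus a weighted sum of columns.

```python
import math

def score_row(row, intercept, coefficients, binary_outcome):
    """Apply a regression equation to one row (hypothetical helper).

    Returns a probability for a binary y-variable (logistic regression)
    or a predicted value for a continuous y-variable (linear regression).
    """
    linear = intercept + sum(coef * row[name] for name, coef in coefficients.items())
    if binary_outcome:
        # The logistic function maps the linear term to a value between 0 and 1.
        return 1.0 / (1.0 + math.exp(-linear))
    return linear

# Illustrative coefficients and data; none of these values come from Predict.
coefficients = {"age": 0.02, "visits": 0.5}
row = {"age": 40, "visits": 2}

probability = score_row(row, -1.8, coefficients, binary_outcome=True)
predicted_value = score_row(row, -1.8, coefficients, binary_outcome=False)
```

With a binary y-variable the result always falls between 0 and 1, while a continuous y-variable yields a value on the scale of the original outcome.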
Scoring a File
Open a File
To begin the scoring process, select the Score a File button from the Workspace tab. This launches the Open File window where users can select the data source and browse to the file to be scored.
Score Tab
Once the scoring file is located and selected, the Score tab will open. This is where the process of scoring a dataset takes place. The columns of the scoring dataset are listed on the left side of the screen. There are two options to import a model formula: Load Scoring Model or Select Memorized Model.
Choose Load Scoring Model to select a previously saved or shared Veera Predict Scoring Model (.vpsm) file. Choose Select Memorized Model to pick one of the predictive models memorized in an open Analysis tab. If only one model is memorized in the open Analysis tab, that model is selected automatically. If more than one model is memorized, a window will launch to select the desired model version.
Once the model file is located and selected, the formula will be imported into the Model Equation window. Variables used by the scoring equation must also be present in the scoring file. Columns from the scoring file that are required by the model and are present in the file are shown in green in the columns list. Columns that are required but are not present in the scoring file are shown in red in the columns list. Any columns not used by the model remain in black.
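The column check described above can be sketched in Python as a simple comparison of two lists. This is an illustration only, not how Predict performs the check: the function name and the example column names are hypothetical.

```python
def classify_columns(file_columns, model_variables):
    """Group columns the way the columns list colors them (hypothetical helper)."""
    required = set(model_variables)
    present = set(file_columns)
    return {
        # Required by the model and found in the scoring file (shown in green).
        "required_present": [c for c in file_columns if c in required],
        # Required by the model but absent from the scoring file (shown in red).
        "required_missing": [v for v in model_variables if v not in present],
        # In the scoring file but not used by the model (shown in black).
        "unused": [c for c in file_columns if c not in required],
    }

# Illustrative column names only.
file_columns = ["id", "age", "visits"]
model_variables = ["age", "visits", "income"]
status = classify_columns(file_columns, model_variables)
```

In this example, income would be listed as required but missing, so the scoring file would need that column before the equation could be applied.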
Output Options
If the scoring dataset contains a column corresponding to the model's Y-variable, the output option Score the table and validate the model will be available. Selecting this option generates a chart below the drop-down menu showing either the decile analysis or the cumulative lift of the model for that Y-variable when the dataset is scored. The toggle buttons to the right of the drop-down menu switch between the two charts.
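For readers unfamiliar with these charts, the following Python sketch shows one common way to compute a decile analysis and cumulative lift for a binary Y-variable. It assumes the conventional definitions (rows ranked by score, highest first, then split into ten equal groups) and is not taken from Predict, whose exact calculation may differ. The function name and the sample data are hypothetical.

```python
def decile_table(scores, actuals, groups=10):
    """Summarise actual outcomes by score-ranked group (hypothetical helper)."""
    n = len(scores)
    if n == 0 or sum(actuals) == 0:
        raise ValueError("need at least one row and one positive outcome")
    overall_rate = sum(actuals) / n
    # Rank rows from the highest score to the lowest.
    ranked = sorted(zip(scores, actuals), key=lambda pair: pair[0], reverse=True)
    table = []
    cumulative_hits = 0
    cumulative_rows = 0
    for g in range(groups):
        chunk = ranked[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue  # fewer rows than groups
        hits = sum(actual for _, actual in chunk)
        cumulative_hits += hits
        cumulative_rows += len(chunk)
        table.append({
            "group": g + 1,
            "rows": len(chunk),
            "response_rate": hits / len(chunk),
            # Response rate so far, relative to the rate across the whole file.
            "cumulative_lift": (cumulative_hits / cumulative_rows) / overall_rate,
        })
    return table

# Illustrative data: the ten highest-scored rows are the ten positive outcomes.
scores = [i / 20 for i in range(20)]
actuals = [1 if i >= 10 else 0 for i in range(20)]
table = decile_table(scores, actuals)
```

Under these definitions, a cumulative lift above 1 in the top groups means the model concentrates positive outcomes among its highest-scored rows, and the value in the last group is always 1 because it covers the whole file.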
Choosing the Just score the table option lists the number of records, the number of variables, and whether the scoring process was successful. Quantile information can also be output by configuring the quantile drop-down menu. This adds a column to the output containing the partition each record falls into.
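As a rough illustration of the quantile column, the Python sketch below assigns each record to a partition by ranking the scores. It assumes that partition 1 holds the highest scores and that partitions are as equal in size as possible. Neither assumption is confirmed for Predict, and the function name is hypothetical.

```python
def assign_quantiles(scores, partitions):
    """Return a partition number for each score, in the original row order."""
    # Row positions ordered from the highest score to the lowest.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    labels = [0] * len(scores)
    for rank, index in enumerate(order):
        # Assumption: partition 1 holds the top-ranked rows.
        labels[index] = rank * partitions // len(scores) + 1
    return labels

# Illustrative scores split into two partitions (halves).
labels = assign_quantiles([0.9, 0.1, 0.5, 0.7], partitions=2)
```

Here the two highest scores (0.9 and 0.7) land in partition 1 and the other two in partition 2.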
Start Scoring
Once a model is loaded and the output options are configured, the Start Scoring button initiates the scoring process. Users are prompted to choose a location and file name for the scored dataset. The scored file is a duplicate of the original scoring file with an added column called Predicted Y, which holds the probability score or predicted value calculated for each row. Once the location is set, the scoring process may take a few seconds to complete, depending on the file size. When scoring is complete, a notice will be displayed at the bottom of the Score tab.
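The shape of the scored file can be sketched in Python with the standard csv module. The example assumes a comma-separated file and uses a placeholder scoring function. Only the Predicted Y column name comes from this article, and everything else (function name, column names, data) is hypothetical.

```python
import csv
import io

def write_scored_copy(source, destination, score_fn):
    """Copy every row and append a Predicted Y column (hypothetical helper)."""
    reader = csv.DictReader(source)
    fieldnames = list(reader.fieldnames) + ["Predicted Y"]
    writer = csv.DictWriter(destination, fieldnames=fieldnames)
    writer.writeheader()
    rows_written = 0
    for row in reader:
        # Original columns are kept as-is; the score goes in the new column.
        row["Predicted Y"] = score_fn(row)
        writer.writerow(row)
        rows_written += 1
    return rows_written

# In-memory stand-ins for the scoring file and the saved output file.
source = io.StringIO("id,visits\n1,2\n2,5\n")
destination = io.StringIO()
rows_written = write_scored_copy(
    source, destination, lambda row: 0.1 * float(row["visits"])  # placeholder equation
)
```

The output keeps the original columns in their original order, so the scored file can be used anywhere the original file was.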