PMML models
Predictive Model Markup Language
During the implementation of a Pega Decision Management project, you will often find that the customer already uses predictive models and would like to reuse these assets to speed up the implementation. A data scientist is likely to be asked to evaluate the suitability of existing Predictive Model Markup Language (PMML) compliant models and to explain how they fit into the Pega Decision Management landscape.
With PMML, you can develop a model in the application of your choice and deploy it in Prediction Studio by uploading the XML file that describes the model.
In this lesson you will get a high-level introduction to PMML and learn how to use these third-party models in Pega Decision Management.
PMML is the leading standard for statistical and data mining models. The language was developed by the Data Mining Group, an independent, vendor-led consortium that includes both commercial vendors and open source representatives.
PMML is an XML-based language used to represent the predictive models that result from a predictive modelling process, which allows models to be shared easily between applications. It is the de facto standard for representing not only predictive and descriptive models but also data transformations (data pre-processing and post-processing).
PMML, like HTML, is a markup language, and a PMML document is split into common components:
- Header – It contains general information about the PMML document
- Data Dictionary – It contains a definition of all raw data fields used by the model
- Data Transformation – It provides mapping of user data to a form used by the model
- Model – It contains a definition of the model itself
You will find a wealth of information detailing the content of each section on the Data Mining Group site.
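To make the four components more concrete, the following Python sketch parses a small, hand-written PMML document and lists its top-level elements. The document content (field names, coefficients, the description text) is invented for illustration and does not come from a real model. Note that the Data Transformation component appears in PMML as the TransformationDictionary element.

```python
import xml.etree.ElementTree as ET

# A minimal, hand-written PMML document used only for illustration.
# Field names and coefficients are hypothetical.
PMML_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="Hypothetical churn model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="Age" optype="continuous" dataType="double"/>
    <DataField name="Churn" optype="continuous" dataType="double"/>
  </DataDictionary>
  <TransformationDictionary/>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="Age"/>
      <MiningField name="Churn" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="0.1">
      <NumericPredictor name="Age" coefficient="0.02"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def top_level_components(pmml_text):
    """Return the names of the direct children of the PMML root element."""
    root = ET.fromstring(pmml_text)
    return [local_name(child.tag) for child in root]


components = top_level_components(PMML_DOC)
print(components)
```

In this example the last element, RegressionModel, is the Model component; a different model type would appear under a different element name.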
Importing a PMML model in Prediction Studio
You can use PMML-compliant models through an import mechanism supported by the Predictive Model rule. A Predictive Model is a Decision-type record that is usually stored in the customer class.
During its configuration you upload the PMML model and map the predictors defined in the model to the customer properties. Some of the predictors may be mapped automatically. This happens if the name and the type of the property match those of the predictor.
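The automatic mapping rule described above (same name and same type) can be sketched in a few lines of Python. This is an illustration of the rule only, not Pega's implementation: the `Field` class, the `auto_map` function, the sample names, and the choice of exact, case-sensitive matching are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """A named, typed field: either a model predictor or a customer property."""
    name: str
    data_type: str


def auto_map(predictors, properties):
    """Map each predictor to a property with the same name and type.

    Returns (mapped, unmapped): a dict of predictor name -> property name,
    and a list of predictor names that still need manual mapping.
    Exact, case-sensitive matching is an assumption of this sketch.
    """
    by_key = {(p.name, p.data_type): p for p in properties}
    mapped, unmapped = {}, []
    for predictor in predictors:
        match = by_key.get((predictor.name, predictor.data_type))
        if match is not None:
            mapped[predictor.name] = match.name
        else:
            unmapped.append(predictor.name)
    return mapped, unmapped


# Hypothetical predictors from a model and properties from a customer class.
predictors = [Field("Age", "double"), Field("Income", "double"), Field("Region", "string")]
properties = [Field("Age", "double"), Field("Income", "string"), Field("Gender", "string")]

mapped, unmapped = auto_map(predictors, properties)
print(mapped)
print(unmapped)
```

Here only Age is mapped automatically. Income matches by name but not by type, and Region has no counterpart, so both would be mapped manually.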
Finally, you also select the desired output of the predictive model. The output of the model is mapped to the pxSegment strategy property when you reference the model in a decision strategy.
You can test the model by entering input values and running it. The model validates your inputs and returns either the output or an error message if an input value is out of range.
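The source does not describe how Prediction Studio validates inputs. As one possible illustration, a PMML data dictionary can declare a valid range for a continuous field with an Interval element, and an input can be checked against it. The sketch below assumes that approach; the field definition, the function names, and the error messages are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical DataField fragment declaring a valid range of 18 to 100 inclusive.
DATA_FIELD = """<DataField name="Age" optype="continuous" dataType="double">
  <Interval closure="closedClosed" leftMargin="18" rightMargin="100"/>
</DataField>"""


def in_interval(value, interval):
    """Check a value against one Interval element, honouring its closure."""
    closure = interval.get("closure", "closedClosed")
    low = float(interval.get("leftMargin", "-inf"))
    high = float(interval.get("rightMargin", "inf"))
    above = value >= low if closure.startswith("closed") else value > low
    below = value <= high if closure.endswith("Closed") else value < high
    return above and below


def check_input(field_xml, raw_value):
    """Return None if the input is acceptable, otherwise an error message."""
    field = ET.fromstring(field_xml)
    name = field.get("name")
    try:
        value = float(raw_value)
    except ValueError:
        return f"{name}: {raw_value!r} is not a number"
    intervals = field.findall("Interval")
    # A field with no declared intervals accepts any numeric value.
    if not intervals or any(in_interval(value, i) for i in intervals):
        return None
    return f"{name}: {value} is outside the declared range"


for candidate in ("42", "17", "abc"):
    print(candidate, "->", check_input(DATA_FIELD, candidate))
```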
Currently supported model types are:
- Clustering
- GeneralRegressionModel
- MiningModel
- NaiveBayesModel
- NearestNeighborModel
- NeuralNetwork
- RegressionModel
- RuleSetModel
- Scorecard
- SupportVectorMachineModel
- TreeModel
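Before importing a file, it can be useful to check which model element it contains. The sketch below reads the top-level elements of a PMML document and reports any model element that is not in the list above. It makes several assumptions: that the "Clustering" entry corresponds to the PMML element ClusteringModel, that every top-level element outside a small set of structural elements is a model, and that the two sample documents (which are stripped down and not schema-valid) stand in for real files.

```python
import xml.etree.ElementTree as ET

# Element names taken from the list of supported model types.
SUPPORTED_MODELS = {
    "ClusteringModel",  # assumed element name for the "Clustering" entry
    "GeneralRegressionModel", "MiningModel", "NaiveBayesModel",
    "NearestNeighborModel", "NeuralNetwork", "RegressionModel",
    "RuleSetModel", "Scorecard", "SupportVectorMachineModel", "TreeModel",
}

# Top-level PMML elements that are not models.
STRUCTURAL_ELEMENTS = {
    "Header", "MiningBuildTask", "DataDictionary",
    "TransformationDictionary", "Extension",
}


def unsupported_models(pmml_text):
    """Return the top-level model elements that are not in SUPPORTED_MODELS."""
    root = ET.fromstring(pmml_text)
    names = [child.tag.rsplit("}", 1)[-1] for child in root]
    models = [n for n in names if n not in STRUCTURAL_ELEMENTS]
    return [n for n in models if n not in SUPPORTED_MODELS]


# Stripped-down sample documents, for illustration only.
TREE_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header/><DataDictionary/><TreeModel functionName="classification"/>
</PMML>"""
ASSOCIATION_DOC = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header/><DataDictionary/><AssociationModel functionName="associationRules"/>
</PMML>"""

print(unsupported_models(TREE_DOC))
print(unsupported_models(ASSOCIATION_DOC))
```

An empty result means every model element in the file appears in the supported list.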
Once configured, the PMML model can be used directly in decision strategies by selecting the Predictive Model strategy component, found under the Decision Analytics category.
To configure the component, you enter the model name and map the model outputs to strategy properties. The actual result of the model is automatically mapped to the pxSegment strategy property. If desired, you can map the other outputs to user-defined strategy properties.