How to model ROC curves - a credit scoring perspective

Abstract

ROC curves, which derive from signal detection theory, are widely used to assess binary classifiers in various domains. The area under the ROC curve (AUROC) and its transformations, such as the Gini coefficient, are among the most widely used summary measures of the separation power of classification models, such as medical diagnostic tests or credit scoring models. A need to model an ROC curve arises frequently. In biostatistics, ROC curve modelling has been discussed mainly in the context of data scarcity and estimation, but in credit scoring the modelling may be required for other reasons. When a model for an ROC curve is needed, several options are available. This article defines binormal, bilogistic, bigamma and bibeta models, along with a novel approach: a bifractal ROC curve model. The models are tested against publicly presented empirical ROC data. In terms of goodness of fit, all the presented models except the bilogistic curve turn out to be comparable and fit the data quite well. The choice of model should therefore be driven by other features of the curves under consideration.

Błażej Kochański
Banking Risk Expert, Researcher and Management Consultant