Logistic Regression Diagnostics
This tutorial describes tools for diagnosing and assessing a logistic regression model. These tools are available in Tanagra version 1.4.33 and later. We work on a credit scoring problem: using logistic regression, we try to identify the factors that determine whether a customer's credit application is accepted or refused. We perform the following steps:
- Estimating the parameters of the classifier;
- Retrieving the covariance matrix of the coefficients;
- Assessment using the Hosmer and Lemeshow goodness-of-fit test;
- Assessment using the reliability diagram;
- Assessment using the ROC curve;
- Analysis of residuals, detection of outliers and influential points.
We first carry out the analysis with Tanagra 1.4.33, then repeat it with R 2.9.2, using the glm(.) procedure.
Keywords: logistic regression, residual analysis, outliers, influential points, Pearson residual, deviance residual, leverage, Cook's distance, dfbeta, dfbetas, hos...
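
Two of the diagnostics listed above, the Pearson and deviance residuals and the Hosmer and Lemeshow statistic, can be illustrated with a short sketch. The code below is a minimal illustration in plain Python, not the Tanagra or R implementation. The function names (pearson_residual, deviance_residual, hosmer_lemeshow) and the equal-size grouping rule are assumptions made for this example; it takes the fitted probabilities as given and assumes they lie strictly between 0 and 1.

```python
import math


def pearson_residual(y, p):
    # (observed - fitted) scaled by the binomial standard deviation
    return (y - p) / math.sqrt(p * (1.0 - p))


def deviance_residual(y, p):
    # Signed square root of the observation's contribution to the deviance
    log_lik = y * math.log(p) + (1 - y) * math.log(1.0 - p)
    sign = 1.0 if y - p >= 0 else -1.0
    return sign * math.sqrt(-2.0 * log_lik)


def hosmer_lemeshow(y_true, p_hat, n_groups=10):
    # Sort observations by fitted probability, then cut them into
    # n_groups blocks of (nearly) equal size. This grouping rule is an
    # assumption of the sketch; software packages may handle ties differently.
    pairs = sorted(zip(p_hat, y_true))
    n = len(pairs)
    stat = 0.0
    used_groups = 0
    for g in range(n_groups):
        group = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        if not group:
            continue
        n_g = len(group)
        observed = sum(y for _, y in group)   # observed positives
        expected = sum(p for p, _ in group)   # expected positives
        mean_p = expected / n_g
        denom = n_g * mean_p * (1.0 - mean_p)
        if denom > 0:
            stat += (observed - expected) ** 2 / denom
        used_groups += 1
    # The statistic is usually compared with a chi-square distribution
    # with (number of groups - 2) degrees of freedom.
    return stat, used_groups - 2
```

Under these assumptions, a call such as hosmer_lemeshow(y, p_hat, n_groups=10) returns the statistic and its degrees of freedom. The p-value is left out because it needs a chi-square distribution function, which the Python standard library does not provide.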