Linear Regression Techniques

I used five different linear fitting methods to map images of hand-drawn digits to their labels: LASSO, robustfit (least squares), QR decomposition, the Moore-Penrose pseudoinverse (pinv), and ridge regression. I then compared the characteristics of each fit. For each method, I also determined the pixels that accounted for 90% of the total pixel weightings (summed over each image) and created a sparse fit using only those pixels. Finally, I determined the pixels that accounted for 90% of the total pixel weightings for each individual digit and created a sparse fit from those per-digit masks.
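The pixel selection step can be sketched in a few lines of plain Python. This is only an illustration under assumptions that are mine, not taken from the paper: the fitted weights are stored as a list of rows (one row per pixel, one column per digit), a pixel's importance is the absolute value of its weight, and the function names are hypothetical.

```python
def mask_for_weight_fraction(weights, fraction=0.9):
    """Indices of the fewest pixels whose |weight| reaches `fraction` of the total.

    Assumption: importance is measured by absolute weight.
    """
    magnitudes = [abs(w) for w in weights]
    total = sum(magnitudes)
    if total == 0:
        return []
    # Rank pixels from largest to smallest magnitude.
    order = sorted(range(len(magnitudes)),
                   key=lambda i: magnitudes[i], reverse=True)
    kept, running = [], 0.0
    for i in order:
        if running >= fraction * total:
            break
        kept.append(i)
        running += magnitudes[i]
    return sorted(kept)


def shared_mask(weight_matrix, fraction=0.9):
    """One mask for all digits: sum |weight| across the digit columns first."""
    per_pixel = [sum(abs(w) for w in row) for row in weight_matrix]
    return mask_for_weight_fraction(per_pixel, fraction)


def per_digit_masks(weight_matrix, fraction=0.9):
    """A separate mask for each digit column of the weight matrix."""
    n_digits = len(weight_matrix[0])
    return [
        mask_for_weight_fraction([row[d] for row in weight_matrix], fraction)
        for d in range(n_digits)
    ]


# Toy weights: 4 pixels (rows) by 2 digits (columns).
toy_weights = [[4.0, -1.0], [0.1, 0.2], [-3.0, 0.5], [0.0, 2.0]]
print(shared_mask(toy_weights))
print(per_digit_masks(toy_weights))
```

A sparse fit would then refit, or simply re-evaluate, the model using only the pixels in the mask. Which of the two the project did is not stated here.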

The full paper describing this project is here.

The hand-drawn digits came from the MNIST training data set, which includes 60,000 labeled images of the digits 0 through 9.

QR decomposition and pinv gave the highest accuracy (percentage true) for the full fit, but very poor accuracy for the sparse fits. Robustfit produced the sparsest fits at comparable accuracy, especially with the individual digit pixel masks. Ridge regression required by far the most pixels to reach 90% of the weightings: 4638, compared with only 87 for robustfit. However, ridge regression gave much better accuracy. Using individual digit pixel masks generally gave better accuracy.
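To make the accuracy figure concrete, here is a minimal sketch of how "percentage true" might be computed for a full or masked linear fit. It assumes that each image is a flat list of pixel values, that the model produces one score per digit, and that the predicted digit is the one with the highest score. The project may have decoded its outputs differently, and all names here are hypothetical.

```python
def apply_linear_map(pixels, weight_matrix, mask=None):
    """Score each digit as a weighted sum of pixels, ignoring pixels outside `mask`."""
    n_digits = len(weight_matrix[0])
    keep = range(len(pixels)) if mask is None else sorted(mask)
    return [
        sum(pixels[p] * weight_matrix[p][d] for p in keep)
        for d in range(n_digits)
    ]


def predict_digit(scores):
    """Index of the largest score (the first one wins on ties)."""
    return max(range(len(scores)), key=lambda d: scores[d])


def accuracy(images, labels, weight_matrix, mask=None):
    """Percentage of images whose predicted digit matches its label."""
    hits = sum(
        predict_digit(apply_linear_map(img, weight_matrix, mask)) == lab
        for img, lab in zip(images, labels)
    )
    return 100.0 * hits / len(labels)


# Toy problem: 3 pixels, 2 classes. The numbers are made up for illustration.
toy_w = [[1.0, 0.0], [0.0, 1.0], [-0.5, 0.5]]
toy_images = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 2, 1]]
toy_labels = [0, 1, 1, 1]

full_acc = accuracy(toy_images, toy_labels, toy_w)
masked_acc = accuracy(toy_images, toy_labels, toy_w, mask=[0, 1])
print(full_acc, masked_acc)
```

The toy numbers only show the mechanics of comparing a full fit against a masked one. They say nothing about how the five methods actually ranked.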

