Image Recognition with Machine Learning
A classical machine learning approach to facial recognition of U.S. presidents using HOG features and an SVM-based scikit-learn pipeline, demonstrating that strong results are achievable with just hundreds of training images.
Abstract
We present image recognition with a classical machine learning algorithm, the Support Vector Machine (SVM), together with an image library of faces of recent U.S. presidents. We demonstrate that even with a limited number of samples, on the order of hundreds, good results can be obtained without deep learning, and that very good results can be achieved when the number of samples is higher, on the order of thousands.
Key Contributions
- Built an end-to-end scikit-learn pipeline for facial recognition using Histogram of Oriented Gradients (HOG) features, StandardScaler normalization, and Stochastic Gradient Descent classification
- Created a custom dataset by crawling 1,000+ images of six recent U.S. presidents from Google and Bing, with automated preprocessing (face cropping, resizing to 200x200, grayscale conversion, normalization)
- Achieved approximately 70% average accuracy across six classes, with the best single-model accuracy reaching 76%, using only a few hundred training images per class
- Automated model selection via GridSearchCV with 10-fold cross-validation, comparing SVM, SGD, K-Nearest Neighbors, Decision Trees, and Random Forest classifiers
- Provided a practical comparison of traditional image processing vs. deep learning trade-offs, demonstrating when classical ML is sufficient and cost-effective
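The pipeline and model-selection steps listed above can be sketched roughly as follows. This is a minimal illustration, not the authors' code: it assumes scikit-image's `hog` function for feature extraction, and the `HogTransformer` class, the `build_search` helper, and the hyperparameter grid are all hypothetical names and values chosen for the example. `SGDClassifier` with `loss="hinge"` trains a linear SVM, which is one plausible reading of the "SVM-based" pipeline described here.

```python
import numpy as np
from skimage.feature import hog
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


class HogTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical helper: map a stack of grayscale images to HOG vectors."""

    def __init__(self, orientations=9, pixels_per_cell=(8, 8),
                 cells_per_block=(2, 2)):
        self.orientations = orientations
        self.pixels_per_cell = pixels_per_cell
        self.cells_per_block = cells_per_block

    def fit(self, X, y=None):
        # HOG has no learned state, so fitting is a no-op.
        return self

    def transform(self, X):
        # One feature vector per 2-D image in X.
        return np.array([
            hog(img,
                orientations=self.orientations,
                pixels_per_cell=self.pixels_per_cell,
                cells_per_block=self.cells_per_block)
            for img in X
        ])


def build_search(cv=10):
    """Build HOG -> StandardScaler -> SGD wrapped in a cross-validated grid search."""
    pipeline = Pipeline([
        ("hog", HogTransformer()),
        ("scale", StandardScaler()),
        # Hinge loss makes SGDClassifier a linear SVM trained with SGD.
        ("clf", SGDClassifier(loss="hinge", max_iter=1000, tol=1e-3,
                              random_state=0)),
    ])
    # Assumed grid, for illustration only; the source does not list its values.
    param_grid = {
        "hog__orientations": [8, 9],
        "clf__alpha": [1e-4, 1e-3],
    }
    return GridSearchCV(pipeline, param_grid, cv=cv)


if __name__ == "__main__":
    # Placeholder data: random 32x32 "images" in two classes, standing in
    # for the cropped, resized, grayscale face images.
    rng = np.random.default_rng(0)
    X = rng.random((40, 32, 32))
    y = np.repeat([0, 1], 20)

    search = build_search(cv=10)
    search.fit(X, y)
    print(search.best_params_)
```

Wrapping feature extraction in a transformer lets `GridSearchCV` tune HOG parameters alongside the classifier's, and keeps the scaler fitted only on each training fold. Comparing other classifiers, such as K-Nearest Neighbors or Random Forest, could be done by adding the `clf` step itself to the parameter grid.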
Links
- PDF: to be hosted
- arXiv / TechRxiv: to be added