Image Recognition with Machine Learning

A classical machine learning approach to facial recognition of U.S. presidents using HOG features and an SVM-based scikit-learn pipeline, demonstrating that strong results are achievable with just hundreds of training images.

Abstract

We present an approach to image recognition based on a classical machine learning algorithm, the Support Vector Machine (SVM), and apply it to an image library of faces of recent U.S. presidents. We demonstrate that good results can be obtained without deep learning even with a limited number of samples, on the order of hundreds, and that very good results can be achieved when the number of samples is higher, on the order of thousands.

Key Contributions

  • Built an end-to-end scikit-learn pipeline for facial recognition using Histogram of Oriented Gradients (HOG) features, StandardScaler normalization, and Stochastic Gradient Descent classification
  • Created a custom dataset by crawling 1,000+ images of six recent U.S. presidents from Google and Bing, with automated preprocessing (face cropping, resizing to 200x200, grayscale conversion, normalization)
  • Achieved approximately 70% average accuracy across six classes, with the best single-model accuracy reaching 76%, using only a few hundred training images per class
  • Automated model selection via GridSearchCV with 10-fold cross-validation, comparing SVM, SGD, K-Nearest Neighbors, Decision Trees, and Random Forest classifiers
  • Provided a practical comparison of traditional image processing vs. deep learning trade-offs, demonstrating when classical ML is sufficient and cost-effective
  • PDF: to be hosted
  • arXiv / TechRxiv: to be added
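
The pipeline and model-selection steps listed above can be sketched roughly as follows. This is a minimal illustration, not the paper's actual code. It assumes scikit-image supplies the HOG features and the grayscale/resize steps. The `preprocess` and `build_search` helpers, the `HogTransformer` class, and all parameter values in the grid are hypothetical choices made for this example. Face cropping and the other classifiers compared in the paper are omitted.

```python
import numpy as np
from skimage.color import rgb2gray
from skimage.feature import hog
from skimage.transform import resize
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.linear_model import SGDClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler


def preprocess(image_rgb, size=(200, 200)):
    """Hypothetical helper: grayscale conversion and resizing (no face cropping)."""
    gray = rgb2gray(image_rgb)  # float image, one channel
    return resize(gray, size, anti_aliasing=True)


class HogTransformer(BaseEstimator, TransformerMixin):
    """Hypothetical helper: maps grayscale images to HOG feature vectors."""

    def __init__(self, orientations=9, pixels_per_cell=(16, 16),
                 cells_per_block=(2, 2)):
        self.orientations = orientations
        self.pixels_per_cell = pixels_per_cell
        self.cells_per_block = cells_per_block

    def fit(self, X, y=None):
        return self  # HOG extraction has no learned state

    def transform(self, X):
        return np.array([
            hog(img,
                orientations=self.orientations,
                pixels_per_cell=self.pixels_per_cell,
                cells_per_block=self.cells_per_block)
            for img in X
        ])


def build_search(cv=10):
    """HOG -> StandardScaler -> SGD classifier, tuned with GridSearchCV."""
    pipeline = Pipeline([
        ("hog", HogTransformer()),
        ("scale", StandardScaler()),
        # Hinge loss makes SGDClassifier a linear SVM trained by SGD.
        ("clf", SGDClassifier(loss="hinge", random_state=0)),
    ])
    # Assumed grid for illustration; the paper's grid is not given here.
    param_grid = {
        "hog__orientations": [8, 9],
        "clf__alpha": [1e-4, 1e-3],
    }
    return GridSearchCV(pipeline, param_grid, cv=cv)
```

Under these assumptions, calling `build_search().fit(images, labels)` on an array of preprocessed 200x200 grayscale images runs the 10-fold search, after which `best_params_` and `best_score_` report the selected configuration. Putting the HOG step inside the pipeline lets the grid search tune feature-extraction parameters together with the classifier.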
