Blog
Notes on machine learning, software design, and the systems work that sits under it. Reverse-chronological.
Featured
Bayes' Theorem, visualized
Prior, likelihood, posterior, and the intuition traps that bite even after you've done the math by hand.
Information Theory, visualized
Entropy, cross-entropy, KL divergence, mutual information. Why cross-entropy is the loss function it is.
Sketch Classification with CNNs
Deep learning on 250 classes of hand-drawn sketches.
Senator Tweet Analysis
NLP over 419K+ tweets from the 117th Congress.
LRU Cache
O(1) get/put with a hash map and a doubly linked list. The classic.
Queue-Based Stage Decoupling
What to do when one stage is faster than the next.
2025
Information Theory
Entropy, cross-entropy, KL divergence, mutual information.
Bayes' Theorem
Prior, likelihood, posterior, sequential updating.
Sketch Classification with CNNs
Deep learning for 250-class hand-drawn sketch recognition.
Senator Tweet Analysis
NLP-driven content analysis of 419K+ tweets from the 117th Congress.
Image Recognition with Classical ML
HOG and SVM for facial recognition without deep learning.
2024
English to ASL Translator
Web app for ASL translation with a usability study.
Vulnerability Scanning of Web Applications
OWASP-based security scanning across 50 top sites.
Cloud Asset Management for Hospitals
Cloud adoption analysis for healthcare under HIPAA.
Statistics Fundamentals
Descriptive vs. inferential, measures of center and spread, correlation.
Random Variables and Distributions
Discrete vs. continuous, PMF, PDF, CDF, named distributions.
Compound Probability
Sequential events, independence, drawing without replacement, expected value.
Probability Basics
The probability scale, conditional probability, combining events.
2023
Multi-Dimensional Aggregation
Feature store rollups without reaching for pandas.
Fair Allocation with Priority
Constrained distribution with guaranteed minimums.
Graph Traversal with Command Pattern
Separating the data graph from the traversal strategy.
Queue-Based Stage Decoupling
High-throughput message processing architecture.
Ingest-Transform-Serve Pipeline
Scheduled ingestion to a REST API.
Adjacency List vs Nested Set
Hierarchy storage for cheap reads or cheap writes. You pick.
Hierarchical Tree Flattening
Tree-building from flat parent pointers.
LRU Cache
O(1) get/put with hash map + doubly linked list.
Large File Processing
Top-K from a 50GB file using heaps.
Rule Parser
String parsing with structured rules.
Sub Domain Hits
Counting subdomain visit frequencies.
Split String
Word-boundary-aware string splitting.
Session Window Feature Engineering
Sessionization for recommendation and churn models.
String Rotation
Detecting rotation with the concatenation trick.
Memory Allocation and Fragmentation
The malloc/free patterns behind GPU caching allocators.
String Match
Pattern matching algorithms.
Sort Algorithms
Merge sort, quicksort, when to use which.
Sequential Access Optimization (SCAN)
Disk scheduling for data loader performance.
Tree Algorithms
Traversal, search, tree manipulation.
Grid Traversal Strategies
Greedy, BFS, DFS compared.
Unbiased Random Permutation
Fisher-Yates for fair sampling in ML pipelines.
Graph Algorithms
Traversal, shortest path, related problems.
Composable Unit Expression Parser
Recursive parsing for mixed-unit feature engineering.
Single-Pass Map-Side Aggregation
Lightweight ETL with dictionary accumulation.
Stack, Queue, Heap
Core data structure implementations.
2022
K Complementary
Finding pairs in an array that sum to K.
Connected Islands
BFS/DFS for the largest connected component.
Flatten Nested List
Iterator pattern over arbitrarily nested structures.
K-Nearest Points
Heap vs. sort for nearest-neighbor.
Merge Sorted Lists
Two-pointer merge.
FizzBuzz
The classic, done cleanly.