k-NN, Naive Bayes, and the chain rule — simple but powerful ideas that still matter today.
k-NN: no training phase needed — classify a new point by the majority vote of its k closest labeled samples. Simple yet surprisingly effective.
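The whole algorithm fits in a few lines — a minimal sketch (function and data names are illustrative):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    # Euclidean distance from x to every stored training point.
    dists = np.linalg.norm(X_train - x, axis=1)
    # Take the k nearest neighbors; the majority label wins.
    nearest = np.argsort(dists)[:k]
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [1.0, 1.0], [0.9, 1.1]])
y_train = np.array([0, 0, 1, 1])
print(knn_predict(X_train, y_train, np.array([0.05, 0.1])))  # -> 0
```

Note there is no fit step at all: "training" is just keeping the data around, and all the work happens at query time.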
Naive Bayes: assumes features are conditionally independent given the class (they usually aren't!), yet it remains remarkably effective for spam filtering and text classification.
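Here is a tiny multinomial Naive Bayes spam filter from scratch — a sketch with a toy corpus (all names and example documents are made up), using log-probabilities and Laplace smoothing so unseen words don't zero out a class:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    # docs: list of (token_list, label) pairs.
    priors = Counter(label for _, label in docs)
    counts = defaultdict(Counter)              # per-class word counts
    for tokens, label in docs:
        counts[label].update(tokens)
    vocab = {w for c in counts.values() for w in c}
    return priors, counts, vocab, len(docs)

def predict_nb(tokens, priors, counts, vocab, n_docs):
    best, best_lp = None, -math.inf
    for label in priors:
        total = sum(counts[label].values())
        # log P(label) + sum of log P(word | label), Laplace-smoothed.
        lp = math.log(priors[label] / n_docs)
        for w in tokens:
            lp += math.log((counts[label][w] + 1) / (total + len(vocab)))
        if lp > best_lp:
            best, best_lp = label, lp
    return best

docs = [("win cash prize".split(), "spam"),
        ("cheap prize win".split(), "spam"),
        ("meeting at noon".split(), "ham"),
        ("lunch meeting today".split(), "ham")]
model = train_nb(docs)
print(predict_nb("win a prize".split(), *model))  # -> spam
```

Multiplying per-word probabilities as if words were independent is exactly the "naive" assumption — and for bag-of-words text it works far better than it has any right to.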
Backpropagation: Linnainmaa's (1970) reverse-mode chain rule in code — propagate gradients backward from the output to the inputs. The mathematical foundation of ALL neural network training.
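A minimal reverse-mode autodiff sketch in the spirit of that idea (class and method names are illustrative; this simple stack walk is correct for tree-shaped graphs like the one below — a full implementation needs a topological sort):

```python
class Value:
    # One node in the computation graph: a number plus a rule for
    # pushing the output gradient back to its inputs (the chain rule).
    def __init__(self, data):
        self.data, self.grad = data, 0.0
        self._grad_fn = None

    def __add__(self, other):
        out = Value(self.data + other.data)
        # d(a+b)/da = d(a+b)/db = 1, so the gradient passes through.
        out._grad_fn = lambda g: [(self, g), (other, g)]
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data)
        # d(a*b)/da = b and d(a*b)/db = a.
        out._grad_fn = lambda g: [(self, g * other.data), (other, g * self.data)]
        return out

    def backward(self):
        # Walk from the output back to the inputs, accumulating gradients.
        self.grad = 1.0
        stack = [self]
        while stack:
            node = stack.pop()
            if node._grad_fn:
                for parent, g in node._grad_fn(node.grad):
                    parent.grad += g
                    stack.append(parent)

x, w, b = Value(3.0), Value(2.0), Value(1.0)
y = w * x + b          # y = 2*3 + 1 = 7
y.backward()
print(x.grad, w.grad)  # dy/dx = w = 2.0, dy/dw = x = 3.0
```

This is the same backward pass every deep learning framework runs, just stripped down to scalars: record how each value was computed, then replay the chain rule in reverse.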