20 | May | 2013 | An Ergodic Walk

I’ve started doing more machine learning research lately, which means I’ve been ~~sullying my delicate theorist’s hands~~ testing out my algorithms on data. Perhaps the most (over)used dataset is the MNIST handwritten digits collection, which was put into MATLAB form by Sam Roweis (RIP). As a baseline, I wanted to see how an SVM would perform after I projected the data (using PCA) onto the top 100 principal components. The primal program is

$\min_{\mathbf{w},b,\mathbf{z}} \frac{1}{2} \| \mathbf{w} \|_2^2 + C \sum_{i=1}^{n} z_i$
s.t. $y_i (\mathbf{w}^T \mathbf{x}_i + b) \ge 1 - z_i, \quad z_i \ge 0, \quad i = 1, \dots, n$
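
For a fixed $(\mathbf{w}, b)$, and assuming the usual nonnegativity constraint $z_i \ge 0$ on the slacks, the smallest feasible slack is $z_i = \max(0, 1 - y_i(\mathbf{w}^T \mathbf{x}_i + b))$, so the objective can be evaluated directly. Here is a minimal sketch in plain Python; the function name `primal_objective` and the toy data are made up for illustration and have nothing to do with MNIST.

```python
def primal_objective(w, b, X, y, C):
    """Soft-margin SVM primal objective at (w, b), using the smallest feasible slacks."""
    slacks = []
    for x_i, y_i in zip(X, y):
        # Functional margin y_i (w^T x_i + b) of this point.
        margin = y_i * (sum(w_j * x_j for w_j, x_j in zip(w, x_i)) + b)
        # Smallest feasible slack: z_i = max(0, 1 - margin).
        slacks.append(max(0.0, 1.0 - margin))
    # (1/2) ||w||_2^2 + C * sum_i z_i
    return 0.5 * sum(w_j * w_j for w_j in w) + C * sum(slacks)


# Toy, hand-made data (hypothetical): two points per class in the plane.
X = [[2.0, 0.0], [1.0, 1.0], [-2.0, 0.0], [0.0, 0.0]]
y = [1, 1, -1, -1]
value = primal_objective(w=[1.0, 0.0], b=0.0, X=X, y=y, C=1.0)
```

Only the last toy point sits inside the margin, so it is the only one that contributes a nonzero slack to the sum.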

I chose some “reasonable” value for C, trained a classifier on every pair of digits, and got the following error rates on the test set (in percent, rounded to two decimal places).

       0      1      2      3      4      5      6      7      8      9
0      0
1      0      0
2      0.56   0.43   0
3      0.33   0.45   2.37   0
4      0.04   0.06   1.17   0.23   0
5      1.02   0.11   1.89   3.77   0.72   0
6      0.52   0      1.31   0.08   0.60   1.66   0
7      0.01   0.15   1.01   0.80   0.80   0.42   0      0
8      0.43   1.15   2.22   2.69   0.38   3.41   0.54   0.47   0
9      0.20   0.14   0.85   1.13   3.03   1.02   0      3.82   1.27   0

The rows and columns are indexed by the digits 0 to 9, so, for example, the test error for classifying 0 versus 1 was zero percent, but deciding between 9 and 7 gives about 3.8 percent error. I did this to get a sense of which digits are “harder” for an SVM to tell apart, so that I could pick a good pair for experiments or, better yet, pick a pair based on a target error criterion. Running experiments on synthetic Gaussian examples is all well and good, but it helps to have a range of data sets to test how resilient an algorithm is to, for example, more noise.
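
One way to act on the target error idea is to search the table for the digit pair whose error is closest to a chosen target. The sketch below does that in plain Python; the helper name `closest_pair` and the 2 percent target are illustrative choices, not part of the experiments above.

```python
# Pairwise error rates (percent) copied from the table above:
# ERRORS[i][j] is the error for digit i versus digit j, with j <= i.
ERRORS = [
    [0],
    [0, 0],
    [0.56, 0.43, 0],
    [0.33, 0.45, 2.37, 0],
    [0.04, 0.06, 1.17, 0.23, 0],
    [1.02, 0.11, 1.89, 3.77, 0.72, 0],
    [0.52, 0, 1.31, 0.08, 0.60, 1.66, 0],
    [0.01, 0.15, 1.01, 0.80, 0.80, 0.42, 0, 0],
    [0.43, 1.15, 2.22, 2.69, 0.38, 3.41, 0.54, 0.47, 0],
    [0.20, 0.14, 0.85, 1.13, 3.03, 1.02, 0, 3.82, 1.27, 0],
]


def closest_pair(target):
    """Return (error, (i, j)) for the digit pair i > j whose error is nearest to target."""
    candidates = [
        (ERRORS[i][j], (i, j))
        for i in range(len(ERRORS))
        for j in range(i)  # j < i skips the diagonal
    ]
    return min(candidates, key=lambda c: abs(c[0] - target))


# Hardest pair overall, and the pair nearest a (hypothetical) 2 percent target.
hardest = max((ERRORS[i][j], (i, j)) for i in range(10) for j in range(i))
near_two = closest_pair(2.0)
```

With these numbers, the hardest pair is 9 versus 7 (3.82 percent), and the pair nearest a 2 percent target is 5 versus 2 (1.89 percent).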



MNIST error rates for SVM on projected data