SVM is most commonly used for binary classification. But one branch of SVM, support vector regression (SVR), can fit a continuous function to data. This is particularly useful when the predicted variable is continuous. Here I tried some very simple cases using the libsvm MATLAB package:
1. 1D feature; use the 1st half of the data to train and the 2nd half to test. The fit is pretty good.
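A minimal Python sketch of this first experiment, using scikit-learn's SVR (which wraps libsvm) as a stand-in for the MATLAB interface; the data and parameter choices here are mine, not those of test_svr.m:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(0)
x = np.linspace(0, 10, 200).reshape(-1, 1)
y = 2.0 * x.ravel() + 1.0 + rng.normal(0, 0.3, 200)  # noisy linear data

# 1st half to train, 2nd half to test
model = SVR(kernel="linear", C=1.0, epsilon=0.1)
model.fit(x[:100], y[:100])
pred = model.predict(x[100:])

# correlation between predicted and actual values on the held-out half
corr = np.corrcoef(pred, y[100:])[0, 1]
print(f"test correlation: {corr:.3f}")
```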
2. Still 1D, but the data is clearly nonlinear, so I use nonlinear SVR (radial basis kernel). The fit is good.
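The nonlinear case can be sketched the same way: on data that no line can fit, an RBF-kernel SVR should beat a linear one. Again this is my own scikit-learn reconstruction, not the original script:

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(1)
x = np.linspace(0, 2 * np.pi, 200).reshape(-1, 1)
y = np.sin(x).ravel() + rng.normal(0, 0.1, 200)  # nonlinear (sine) data

# train on every other sample, test on the rest
lin = SVR(kernel="linear").fit(x[::2], y[::2])
rbf = SVR(kernel="rbf", gamma="scale").fit(x[::2], y[::2])

x_test, y_test = x[1::2], y[1::2]
err_lin = np.mean((lin.predict(x_test) - y_test) ** 2)
err_rbf = np.mean((rbf.predict(x_test) - y_test) ** 2)
print(f"linear MSE: {err_lin:.3f}, RBF MSE: {err_rbf:.3f}")
```

The RBF kernel's test error should be far below the linear kernel's, since the sine shape is outside the linear model class.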
3. What if we have a lot of dimensions? Here I tried feature spaces with up to 100 dimensions and calculated the correlation between the predicted and actual values. For linear SVR (blue), the number of dimensions doesn't affect the correlation much. (red: nonlinear, blue: linear; same data for both cases)
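A rough version of this dimensionality sweep (my own simulated data, with the target scaled so its variance doesn't grow with dimension):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(2)
corrs = []
for d in (10, 50, 100):
    n = 400
    X = rng.normal(size=(n, d))
    w = rng.normal(size=d) / np.sqrt(d)      # keep y roughly unit-variance
    y = X @ w + rng.normal(0, 0.1, size=n)   # linear ground truth + noise

    # 1st half to train, 2nd half to test
    model = SVR(kernel="linear").fit(X[:n // 2], y[:n // 2])
    r = np.corrcoef(model.predict(X[n // 2:]), y[n // 2:])[0, 1]
    corrs.append(r)
    print(f"d={d:3d}  correlation: {r:.3f}")
```

On this kind of well-conditioned linear data, the correlation stays high as the dimension grows, consistent with the blue curve described above.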
One property of SVR I like is that when two features are similar (i.e. highly correlated), their weights are similar. This is in contrast with the “winner-take-all” property of the general linear model (GLM). This property is desirable in brain imaging analysis: neighboring voxels have highly correlated signals, and you want them to have similar weights.
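A quick way to see this effect (a Python sketch with scikit-learn's linear SVR, and ordinary least squares standing in for the GLM; the data is simulated by me):

```python
import numpy as np
from sklearn.svm import SVR

rng = np.random.default_rng(3)
n = 300
f = rng.normal(size=n)
# two nearly identical features (like two neighboring voxels)
X = np.column_stack([f + 0.01 * rng.normal(size=n),
                     f + 0.01 * rng.normal(size=n)])
y = f + 0.05 * rng.normal(size=n)

# linear SVR: the ||w||^2 penalty spreads weight across correlated features
w_svr = SVR(kernel="linear", C=1.0).fit(X, y).coef_.ravel()

# least squares: collinear features make the weights unstable
w_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
print("SVR weights:", w_svr)
print("OLS weights:", w_ols)
```

The SVR weights come out nearly equal because, among all weight vectors giving the same predictions, the SVR objective picks the one with the smallest norm.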
About performance: if different features have different scales, normalizing the data will improve the speed of libsvm. The cost parameter c affects the speed as well: the larger c is, the slower libsvm runs. For the simulated data I used, these parameters didn't affect the accuracy.
MATLAB code: test_svr.m
The normalization function (copy and save it into normalize.m):
function data = normalize(d)
% scale before svm
% the data is normalized so that max is 1, and min is 0
data = (d - repmat(min(d,[],1), size(d,1), 1)) * ...
    spdiags(1./(max(d,[],1) - min(d,[],1))', 0, size(d,2), size(d,2));
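For reference, the same per-column min-max scaling in Python (my translation, not part of the libsvm package):

```python
import numpy as np

def normalize(d):
    """Scale each column to [0, 1], mirroring the MATLAB normalize.m above."""
    d = np.asarray(d, dtype=float)
    mn = d.min(axis=0)
    mx = d.max(axis=0)
    return (d - mn) / (mx - mn)

X = np.array([[1.0, 10.0],
              [2.0, 30.0],
              [3.0, 20.0]])
print(normalize(X))
```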
options:
-s svm_type : set type of SVM (default 0)
	0 -- C-SVC
	1 -- nu-SVC
	2 -- one-class SVM
	3 -- epsilon-SVR
	4 -- nu-SVR
-t kernel_type : set type of kernel function (default 2)
	0 -- linear: u'*v
	1 -- polynomial: (gamma*u'*v + coef0)^degree
	2 -- radial basis function: exp(-gamma*|u-v|^2)
	3 -- sigmoid: tanh(gamma*u'*v + coef0)
-d degree : set degree in kernel function (default 3)
-g gamma : set gamma in kernel function (default 1/k)
-r coef0 : set coef0 in kernel function (default 0)
-c cost : set the parameter C of C-SVC, epsilon-SVR, and nu-SVR (default 1)
-n nu : set the parameter nu of nu-SVC, one-class SVM, and nu-SVR (default 0.5)
-p epsilon : set the epsilon in loss function of epsilon-SVR (default 0.1)
-m cachesize : set cache memory size in MB (default 100)
-e epsilon : set tolerance of termination criterion (default 0.001)
-h shrinking : whether to use the shrinking heuristics, 0 or 1 (default 1)
-b probability_estimates : whether to train a SVC or SVR model for probability estimates, 0 or 1 (default 0)
-wi weight : set the parameter C of class i to weight*C, for C-SVC (default 1)
The k in the -g option means the number of attributes in the input data.
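For readers using Python instead of the MATLAB interface, most of these option-string flags map onto constructor arguments of scikit-learn's libsvm wrappers (a rough correspondence I'm sketching here, not a complete one):

```python
from sklearn.svm import SVR, NuSVR

# -s 3 (epsilon-SVR) with -t 2 (RBF), -c, -g, -p, -e, -m, -h:
eps_svr = SVR(kernel="rbf", C=1.0, gamma="scale", epsilon=0.1,
              tol=1e-3, cache_size=100, shrinking=True)

# -s 4 (nu-SVR) with -n:
nu_svr = NuSVR(kernel="rbf", C=1.0, nu=0.5)

print(eps_svr.epsilon, nu_svr.nu)
```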