Solving SVR in MATLAB: Code Notes


My personal understanding of how multiple linear regression, BP neural networks, and support vector machines compare:

Multiple linear regression: a linear combination of several attributes. While fitting, the weight on each attribute is adjusted iteratively so that the regression function fits the set of samples as well as possible.
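As a minimal sketch of the idea (the matrix X and response y below are hypothetical, synthetic data), the attribute weights of a multiple linear regression can be obtained in MATLAB by ordinary least squares:

X = rand(100,3);                          % 100 samples, 3 attributes (synthetic)
y = X*[2; -1; 0.5] + 0.1*randn(100,1);    % synthetic response
Xa = [ones(size(X,1),1) X];               % prepend an intercept column
w  = Xa \ y;                              % least-squares weights, intercept first
yhat = Xa * w;                            % fitted values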

BP neural network: uses steepest descent, adjusting the network's weights and thresholds through back-propagation so that the network's sum of squared errors is minimized.

Support vector machine: it likewise operates on every sample, seeking the fit for which each sample's deviation from the final fitted curve stays within a small margin.

Algorithm comparison:

BP objective function:

J = \frac{1}{2} \sum_{j=1}^{m} \left( y_j^d - y_j \right)^2

Weight update:

w_{ij}^{k+1} = w_{ij}^{k} - \eta \, \frac{\partial J}{\partial w_{ij}}

Support vector machine objective function:

\min \frac{1}{2} \|w\|^2
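A minimal numeric sketch of this weight-update rule (a single linear neuron; all values hypothetical):

eta = 0.1;                 % learning rate
w = 0;                     % initial weight
x = 2; d = 1;              % one training input and its desired output
for k = 1:100
    y  = w*x;              % neuron output
    dJ = (y - d)*x;        % gradient of J = 0.5*(d - y)^2 with respect to w
    w  = w - eta*dJ;       % steepest-descent update; w converges to 0.5
end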

Support vector machines (SVM), like neural networks, are learning machines; unlike neural networks, however, SVM is built on mathematical methods and optimization techniques.

Comparison of learning efficiency:

Importing the data: File -> Import Data
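Equivalently, the matrix can be loaded programmatically (the filename data.xlsx is hypothetical; adjust it to the actual file):

data = xlsread('data.xlsx');   % column 1 = label, columns 2:14 = attributes
% or, for a MAT-file: load('data.mat');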

Common parameter-optimization routines:

[train_pca,test_pca] = pcaForSVM(train_data,test_data,97); % principal component analysis

[bestCVmse,bestc,bestg,ga_option] = gaSVMcgForRegress(train_label,train_pca);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data)
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -p 0.01'];

In these libsvm option strings, -s 3 selects epsilon-SVR, -t selects the kernel type (0 linear, 1 polynomial, 2 RBF, 3 sigmoid), -c is the penalty parameter, -g the kernel parameter gamma, and -p the epsilon of the epsilon-insensitive loss.

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
model = svmtrain(train_label,train_data,'-s 3 -t 2 -c 2.2 -g 2.8 -p 0.01');
test_label = data(51:100,1);
test_data  = data(51:100,2:14);
[predict_label,mse,dec_value] = svmpredict(test_label,test_data,model);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data)
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -p 0.01'];

Code walkthrough:

Part 1: from the kernel's point of view, does model performance improve when a different kernel type is selected?

1. RBF kernel (-t 2):

Before optimization:

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
model = svmtrain(train_label,train_data,'-s 3 -t 2 -c 2.2 -g 2.8 -p 0.01');
[predict_label,mse,dec_value] = svmpredict(train_label,train_data,model);
% The line above compares the training labels with their own predictions,
% giving the mean squared error between the model's actual and predicted values.
test_label = data(51:100,1);
test_data  = data(51:100,2:14);
[predict_label,mse,dec_value] = svmpredict(test_label,test_data,model);

After optimization:

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data) % optimization method: grid search for now
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse,dec_value] = svmpredict(train_label,train_data,model);

figure; % plot predicted vs. actual values
subplot(2,1,1);
plot(train_label,'-o'); hold on;
plot(ptrain,'r-s'); grid on;
legend('original','predict');
title('Train Set Regression Predict by SVM');
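To complete the figure, a sketch of the second subplot for the test set (assuming test_label and test_data are sliced as in the unoptimized run above):

test_label = data(51:100,1);
test_data  = data(51:100,2:14);
[ptest,test_mse,dec_value] = svmpredict(test_label,test_data,model);
subplot(2,1,2);
plot(test_label,'-o'); hold on;
plot(ptest,'r-s'); grid on;
legend('original','predict');
title('Test Set Regression Predict by SVM');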

2. Polynomial kernel (-t 1):

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data);
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 1 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);

[ptrain,mse] = svmpredict(train_label,train_data,model);
figure; % plot predicted vs. actual values
subplot(2,1,1);
plot(train_label,'-o'); hold on;
plot(ptrain,'r-s'); grid on;
legend('original','predict');
title('Train Set Regression Predict by SVM');

Mean squared error = 14505.6 (regression)

Squared correlation coefficient = 0.349393 (regression)

3. Linear kernel (-t 0): u'*v

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data);
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 0 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);

[ptrain,mse] = svmpredict(train_label,train_data,model);
figure; % plot predicted vs. actual values
subplot(2,1,1);
plot(train_label,'-o'); hold on;
plot(ptrain,'r-s'); grid on;
legend('original','predict');
title('Train Set Regression Predict by SVM');

Mean squared error = 14537 (regression)

Squared correlation coefficient = 0.389757 (regression)

4. Sigmoid kernel (-t 3): tanh(gamma*u'*v + coef0), the nonlinear activation function used in neurons

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data);
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 3 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse] = svmpredict(train_label,train_data,model);
figure; % plot predicted vs. actual values
subplot(2,1,1);
plot(train_label,'-o'); hold on;
plot(ptrain,'r-s'); grid on;
legend('original','predict');
title('Train Set Regression Predict by SVM');

Mean squared error = 24326.5 (regression)

Squared correlation coefficient = 0.271859 (regression)

[Figure: Jiang Liang's test cost vs. factor results]

Note: Part 1 builds the test efficiency-factor model from only the first 50 samples. The more samples are used for training (approaching all 100), the worse the reported fit becomes. For example, with the RBF kernel:

Mean squared error = 20424.8 (regression)

Squared correlation coefficient = 0.527831 (regression)

The more samples are selected, the larger the resulting MSE (although the MSE increases, the predictions should actually generalize better, since learning capability improves as the sample count grows), while the correlation coefficient also improves (the closer to 1, the better).

Open question: why is bestmse = 2.3162e+004 so far from the Mean squared error = 20424.8 (regression) actually obtained in training? (Presumably because bestmse is a cross-validation estimate computed on held-out folds, while the reported MSE comes from re-predicting the training data, so the two need not match.)

Part 2: comparing which parameter-search method yields better parameters. This comparison is based on the RBF kernel.

1. Grid search

Code:

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_data) % optimization method: grid search for now
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse,dec_value] = svmpredict(train_label,train_data,model);

Result:

bestmse = 1.5542e+004
bestc = 27.8576
bestg = 0.0039

Mean squared error = 14107.4 (regression)

Squared correlation coefficient = 0.386814 (regression)
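For reference, a minimal sketch of what a grid search such as SVMcgForRegress does internally (the ranges and step below are assumptions, not the toolbox's actual defaults): sweep (c, g) over a base-2 grid and keep the pair with the lowest v-fold cross-validation MSE, which libsvm's svmtrain returns when called with -v:

bestmse = Inf;
for log2c = -5:5
    for log2g = -5:5
        cmd = ['-v 5 -s 3 -t 2 -p 0.01 -c ',num2str(2^log2c),' -g ',num2str(2^log2g)];
        cvmse = svmtrain(train_label,train_data,cmd);  % with -v, svmtrain returns the CV MSE
        if cvmse < bestmse
            bestmse = cvmse; bestc = 2^log2c; bestg = 2^log2g;
        end
    end
end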

2. Genetic-algorithm (GA) search

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
[bestCVmse,bestc,bestg,ga_option] = gaSVMcgForRegress(train_label,train_data)
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse,dec_value] = svmpredict(train_label,train_data,model);

Result:

bestCVmse = 1.8944e+004
bestc = 59.5370
bestg = 778.3573
ga_option =
    maxgen: 200
    sizepop: 20
    ggap: 0.9000
    cbound: [0 100]
    gbound: [0 1000]
    v: 5

Mean squared error = 10426.1 (regression)

Squared correlation coefficient = 0.622133 (regression)

3. PSO search (here the heuristic particle swarm optimization algorithm is used for the parameter search. Although grid search can find the best c and g in the cross-validation sense, i.e. the global optimum over the grid, it becomes very time-consuming when the search range for c and g is large; a heuristic algorithm can reach the global optimum without visiting every point of the grid.)

Code:

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
[bestCVmse,bestc,bestg,pso_option] = psoSVMcgForRegress(train_label,train_data)
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_data,cmd);
[ptrain,mse,dec_value] = svmpredict(train_label,train_data,model);

Result:

bestCVmse = 1.5761e+004
bestc = 49.4305
bestg = 0.0100
pso_option =
    c1: 1.5000
    c2: 1.7000
    maxgen: 200
    sizepop: 20
    k: 0.6000
    wV: 1
    wP: 1
    v: 5
    popcmax: 100
    popcmin: 0.1000
    popgmax: 1000
    popgmin: 0.0100

Mean squared error = 12480.9 (regression)

Squared correlation coefficient = 0.434221 (regression)

Note: again, only the first 50 samples are used here.
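A minimal PSO sketch for the same two-dimensional (c, g) search; the constants mirror the pso_option printout above (c1 = 1.5, c2 = 1.7, inertia k = 0.6, 20 particles, bounds from popcmin/popcmax and popgmin/popgmax), but the update scheme itself is a generic PSO, not necessarily psoSVMcgForRegress's exact implementation:

n = 20; maxgen = 200;                        % particles and iterations
lb = [0.1 0.01]; ub = [100 1000];            % search bounds for [c g]
x = lb + rand(n,2).*(ub - lb);               % particle positions
v = zeros(n,2);                              % particle velocities
pbest = x; pval = inf(n,1);                  % personal bests
gbest = x(1,:); gval = inf;                  % global best
for gen = 1:maxgen
    for i = 1:n
        cmd = ['-v 5 -s 3 -t 2 -p 0.01 -c ',num2str(x(i,1)),' -g ',num2str(x(i,2))];
        f = svmtrain(train_label,train_data,cmd);    % CV MSE as the fitness
        if f < pval(i), pval(i) = f; pbest(i,:) = x(i,:); end
        if f < gval,    gval = f;    gbest = x(i,:);       end
    end
    r1 = rand(n,2); r2 = rand(n,2);
    v = 0.6*v + 1.5*r1.*(pbest - x) + 1.7*r2.*(gbest - x);  % inertia + cognitive + social terms
    x = min(max(x + v, lb), ub);             % move and clamp to the bounds
end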

Part 3: principal component analysis

Still with the RBF kernel: perform principal component analysis first, then optimize the parameters, and obtain the final MSE.

Code:

train_label = data(1:50,1);
train_data  = data(1:50,2:14);
test_label  = data(51:100,1);
test_data   = data(51:100,2:14);
[train_pca,test_pca] = pcaForSVM(train_data,test_data,97); % principal component analysis
[bestmse,bestc,bestg] = SVMcgForRegress(train_label,train_pca)
cmd = ['-c ',num2str(bestc),' -g ',num2str(bestg),' -s 3 -t 2 -p 0.01'];
model = svmtrain(train_label,train_pca,cmd); % train on the PCA-reduced features
[ptrain,mse,dec_value] = svmpredict(train_label,train_pca,model);

Result:

Mean squared error = 12555.9 (regression)

Squared correlation coefficient = 0.552186 (regression)
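A sketch of what a helper like pcaForSVM plausibly does (an assumption about its behavior, not its actual source): run PCA on the training attributes, keep enough components to explain 97% of the variance, and project the test attributes with the same transformation:

mu = mean(train_data,1);                          % training means
[coeff,score,latent] = pca(train_data);           % PCA on the training attributes
k = find(cumsum(latent)/sum(latent) >= 0.97, 1);  % components covering 97% variance
train_pca = score(:,1:k);
test_pca  = (test_data - mu)*coeff(:,1:k);        % project the test set with training statistics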
