The support vector machine as a quadratic programming problem: analysis and a tutorial
DOI: https://doi.org/10.33571/rpolitec.v21n42a7

Keywords: Support Vector Machine, Quadratic Programming, Classification, Regression, Machine Learning

Abstract
The transformation of the optimization problem in its dual form into its matrix equivalent is a fundamental stage in the mathematical development of a Support Vector Machine (SVM), and it is crucial for a correct implementation of the algorithm. However, the mathematical details of this transformation are rarely documented. A deeper exploration makes it possible to identify the matrix and vector operations involved, establishing the relationship between the SVM and a Quadratic Programming (QP) problem. In this article, the vector operations required to transform the dual optimization problem into its matrix notation are identified, thereby establishing the analogy between the SVM and a QP. The C-SVM and R-SVM algorithms are examined, with their mathematical development presented in full. Finally, a case study is implemented to validate the proposed development.
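The SVM-to-QP analogy described in the abstract can be made concrete for the C-SVM dual, which in matrix form reads: minimize ½αᵀPα + qᵀα subject to yᵀα = 0 and 0 ≤ αᵢ ≤ C, with P = (y yᵀ) ∘ K and q = −1. The sketch below is an illustration of that mapping only, not the article's implementation: the toy data are hypothetical, and the solver choice (SciPy's SLSQP) is an assumption.

```python
import numpy as np
from scipy.optimize import minimize

# Hypothetical toy data: two linearly separable classes in 2D
X = np.array([[2.0, 2.0], [2.5, 3.0], [3.0, 2.5],
              [0.0, 0.0], [0.5, 1.0], [1.0, 0.5]])
y = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
n, C = len(y), 10.0

# QP matrices for the C-SVM dual: min (1/2) a^T P a + q^T a
K = X @ X.T                       # linear kernel Gram matrix K_ij = x_i . x_j
P = np.outer(y, y) * K            # P_ij = y_i y_j K(x_i, x_j)
q = -np.ones(n)                   # q = -1 (dual maximizes the sum of alphas)

# Constraints: y^T a = 0 (equality), 0 <= a_i <= C (box bounds)
cons = {"type": "eq", "fun": lambda a: y @ a}
bounds = [(0.0, C)] * n

res = minimize(lambda a: 0.5 * a @ P @ a + q @ a,
               x0=np.zeros(n), jac=lambda a: P @ a + q,
               bounds=bounds, constraints=cons, method="SLSQP")
alpha = res.x

# Recover the primal solution: w = sum_i alpha_i y_i x_i,
# bias b from the KKT conditions on the support vectors
w = (alpha * y) @ X
sv = alpha > 1e-6
b = np.mean(y[sv] - X[sv] @ w)
pred = np.sign(X @ w + b)
```

With a general-purpose QP or NLP solver in place of a dedicated SVM library, the correspondence between the dual's terms and the standard QP inputs (P, q, the equality constraint, and the box bounds) is exactly the transformation the article develops.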
License

Copyright (c) 2025 Carlos Alberto Henao-Baena

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.