Title:
Subset selection by Mallows' Cp: A mixed integer programming approach
Authors:
Source:
Expert systems with applications. 42(1):325-331
Publisher Information:
Amsterdam: Elsevier, 2015.
Publication Year:
2015
Physical Description:
print, 1/4 p
Original Material:
INIST-CNRS
Subject Terms:
Computer science, Exact sciences and technology, Sciences and techniques of general use, Mathematics, Probability and statistics, Statistics, Linear inference, regression, Applied sciences, Operational research. Management science, Operational research and scientific management, Mathematical programming, Decision theory. Utility theory, Computer science; control theory; systems, Software, Memory organisation. Data processing, Information systems. Data bases, Regression analysis, Linear model, Regression model, Selection problem, Mixed integer programming, Quadratic programming, Linear regression, Very large databases, Linear regression model, Mallows' Cp, Subset selection
Document Type:
Academic journal
Article
File Description:
text
Language:
English
Author Affiliations:
Department of Computer and Information Sciences, Institute of Engineering, Tokyo University of Agriculture and Technology, 2-24-16 Naka-cho, Koganei-shi, Tokyo 184-8588, Japan
Department of Industrial Engineering and Management, Graduate School of Decision Science and Technology, Tokyo Institute of Technology, 2-12-1-W9-77 Ookayama, Meguro-ku, Tokyo 152-8552, Japan
Department of Industrial Engineering and Management, Graduate School of Decision Science and Technology, Tokyo Institute of Technology, 2-12-1-W9-77 Ookayama, Meguro-ku, Tokyo 152-8552, Japan
ISSN:
0957-4174
Rights:
Copyright 2015 INIST-CNRS
CC BY 4.0
Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS
Notes:
Computer science; theoretical automation; systems
Mathematics
Operational research. Management
Accession Number:
edscal.28843404
Database:
PASCAL Archive
Further Information
This paper concerns a method of selecting the best subset of explanatory variables for a linear regression model. Employing Mallows' Cp as a goodness-of-fit measure, we formulate the subset selection problem as a mixed integer quadratic programming problem. Computational results demonstrate that our method provides the best subset of variables in a few seconds when the number of candidate explanatory variables is less than 30. Furthermore, when handling datasets consisting of a large number of samples, it finds better-quality solutions faster than stepwise regression methods do.
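As a rough illustration of the selection criterion named in the abstract, the sketch below scores every subset of candidate variables by Mallows' Cp and returns the minimizer. It is not the paper's mixed integer quadratic programming formulation: it uses plain exhaustive enumeration, which is only practical for a small number of candidates. The Cp convention assumed here is the common one, Cp = SSE_S / s^2 - n + 2(|S| + 1), with s^2 estimated from the full model and the intercept counted as a parameter; the record itself does not state the formula. The function names `sse`, `mallows_cp`, and `best_subset_by_cp` are hypothetical.

```python
from itertools import combinations

import numpy as np


def sse(X, y, cols):
    """Residual sum of squares of an OLS fit (with intercept) on the given columns."""
    n = X.shape[0]
    A = np.column_stack([np.ones(n), X[:, list(cols)]])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    return float(resid @ resid)


def mallows_cp(X, y, cols, sigma2=None):
    """Mallows' Cp for a subset, under the assumed convention
    Cp = SSE_S / s^2 - n + 2 * (|S| + 1), with s^2 taken from the full model."""
    n, k = X.shape
    if sigma2 is None:
        sigma2 = sse(X, y, range(k)) / (n - k - 1)
    return sse(X, y, cols) / sigma2 - n + 2 * (len(cols) + 1)


def best_subset_by_cp(X, y):
    """Exhaustive search over all 2^k subsets (a stand-in for the MIQP solver)."""
    n, k = X.shape
    sigma2 = sse(X, y, range(k)) / (n - k - 1)
    candidates = (c for r in range(k + 1) for c in combinations(range(k), r))
    return min(candidates, key=lambda c: mallows_cp(X, y, c, sigma2))
```

Under this convention the full model always has Cp equal to its parameter count, which gives a quick sanity check on any implementation. Enumeration visits 2^k subsets, so it becomes impractical well before the roughly 30 candidate variables the abstract reports handling in a few seconds.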