Result: Identifying multiple cluster structures in a data matrix

Title:
Identifying multiple cluster structures in a data matrix
Source:
Communications in statistics. Simulation and computation. 32(4):1151-1177
Publisher Information:
Colchester: Taylor & Francis, 2003.
Publication Year:
2003
Physical Description:
print, 2 p.3/4
Original Material:
INIST-CNRS
Document Type:
Academic journal Article
File Description:
text
Language:
English
Author Affiliations:
Dipartimento di Scienze Statistiche, University di Bologna, Bologna, Italy
ISSN:
0361-0918
Rights:
Copyright 2004 INIST-CNRS
CC BY 4.0
Sauf mention contraire ci-dessus, le contenu de cette notice bibliographique peut être utilisé dans le cadre d’une licence CC BY 4.0 Inist-CNRS / Unless otherwise stated above, the content of this bibliographic record may be used under a CC BY 4.0 licence by Inist-CNRS / A menos que se haya señalado antes, el contenido de este registro bibliográfico puede ser utilizado al amparo de una licencia CC BY 4.0 Inist-CNRS
Notes:
Mathematics
Accession Number:
edscal.15188892
Database:
PASCAL Archive

Further Information

An aspect of cluster analysis which has been widely studied in recent years is the weighting and selection of variables. Procedures have been proposed which are able to identify the cluster structure present in a data matrix when that structure is confined to a subset of variables. Other methods assess the relative importance of each variable as revealed by a suitably chosen weight. But when a cluster structure is present in more than one subset of variables and is different from one subset to another, those solutions as well as standard clustering algorithms can lead to misleading results. Some very recent methodologies for finding consensus classifications of the same set of units can be useful also for the identification of cluster structures in a data matrix, but each one seems to be only partly satisfactory for the purpose at hand. Therefore a new more specific procedure is proposed and illustrated by analyzing two real data sets; its performances are evaluated by means of a simulation experiment.