
Result: Variable Selection Methods for Model-Based Clustering: Procedures for Functional Data and Bayesian Inference

Title:

Variable Selection Methods for Model-Based Clustering: Procedures for Functional Data and Bayesian Inference

Authors:

Singh, Kyra; Love, Tanzy

Publisher Information:

University of Rochester School of Medicine and Dentistry. Year: 2016. Sun, 7 Oct 2018; Tue, 6 Dec 2016 10:23:38

Document Type:

Electronic Resource

Index Terms:

Clustering, Functional data, Mixture models, Model selection, Reversible-jump MCMC, Variance selection, Thesis

URL:

http://hdl.handle.net/1802/31625

Availability:

Open access content.
This item is protected by copyright, with all rights reserved.

Note:

Number of Pages: xv, 146 pages
Illustrations: some color
English

Other Numbers:

RRR oai:urresearch.rochester.edu:31195
967277448

Contributing Source:

UNIV OF ROCHESTER
From OAIster®, provided by the OCLC Cooperative.

Accession Number:

edsoai.ocn967277448

Database:

OAIster

Further Information

Thesis (Ph.D.)--University of Rochester. School of Medicine & Dentistry. Dept. of Biostatistics and Computational Biology, 2016.
As technology advances, data are becoming more readily available and are collected in larger volumes and at higher frequency. Discrete, continuous, and time-course data are all easily obtained from business, medical, and biological applications. With the growth and accessibility of such diverse data, there is greater demand for extracting important information and drawing meaningful conclusions. Model-based clustering is a useful unsupervised learning technique that aims to identify subpopulations within the data using a parametric framework. However, some variables in a dataset may not contribute to the clustering model and can mask the true subgroup structure. We develop two novel variable selection procedures for model-based clustering, each with a motivating example. The first method addresses the lack of a technique that performs parametric clustering and variable selection simultaneously for functional time-course data. The procedure uses a greedy search algorithm to integrate variable selection into the clustering procedure by comparing two nested subsets to find a locally optimal solution for functional data. In a simulation study, our new method successfully identifies the most important variables for clustering. The procedure is also applied to a dataset of respiratory function measurements for irradiated and non-irradiated mice, where only a small subset of variables is found to be necessary to classify the functional data reasonably well. The second method addresses the disadvantages of the greedy search method and proposes a new method for simultaneous variable selection and model-based clustering under a fully Bayesian framework for non-functional data. This procedure enables more complete inference and the possibility of finding a set of globally optimal solutions by successfully modeling the posterior distributions of the cluster-specific means, variances, and cluster membership proportions. Ou
