Identifying and Assessing Interesting Subgroups in a Heterogeneous Population.

Sep 04, 2015
BioMed Research International

Lee W, Alexeyenko A, Pernemalm M, Guegan J, Dessen P, Lazar V, Lehtiö J, Pawitan Y

Biological heterogeneity is common in many diseases and is often the reason for therapeutic failure. Thus, there is great interest in classifying a disease into subtypes that have clinical significance in terms of prognosis or therapy response. One of the most popular methods to uncover unrecognized subtypes is cluster analysis. However, classical clustering methods such as k-means clustering or hierarchical clustering are not guaranteed to produce clinically interesting subtypes. This could be because the main statistical variability, which is the basis of cluster generation, is dominated by genes not associated with the clinical phenotype of interest. Furthermore, a strong prognostic factor might be relevant for a certain subgroup but not for the whole population; thus an analysis of the whole sample may not reveal this prognostic factor. To address these problems, we investigate methods to identify and assess clinically interesting subgroups in a heterogeneous population. The identification step uses a clustering algorithm, and to assess significance we use a measure based on the false discovery rate (FDR). Under the heterogeneity condition, the standard FDR estimate is shown to overestimate the true FDR value, but this is remedied by an improved FDR estimation procedure. As illustrations, two real data examples from gene expression studies of lung cancer are provided.
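
The abstract does not say which procedure the "standard FDR estimate" refers to, and it gives no details of the improved estimator. As a rough illustration only, the sketch below assumes a Benjamini-Hochberg style adjustment applied separately to the per-gene p-values of each subgroup found by clustering. The function names `bh_adjust` and `subgroup_fdr` and the example p-values are hypothetical, and this is not the authors' procedure.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order.

    Assumption: this stands in for a 'standard' FDR estimate; the abstract
    does not specify which estimator the authors use.
    """
    m = len(pvalues)
    # Indices of the p-values, smallest p-value first.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def subgroup_fdr(pvalues_by_subgroup):
    """Apply the adjustment separately within each (hypothetical) subgroup."""
    return {name: bh_adjust(p) for name, p in pvalues_by_subgroup.items()}


if __name__ == "__main__":
    # Made-up p-values for two clusters, purely for illustration.
    example = {
        "cluster_A": [0.01, 0.04, 0.03, 0.5],
        "cluster_B": [0.2, 0.6],
    }
    for name, q in subgroup_fdr(example).items():
        print(name, [round(v, 4) for v in q])
```

Adjusting within each subgroup reflects the abstract's point that a prognostic factor may matter in one subgroup but not in the whole population. It does not address the overestimation under heterogeneity that the paper's improved estimator targets.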
