Two-stage clustering in genotype-by-environment analyses with missing data

A. J. R. Godfrey*, G. R. Wood, S. Ganesalingam, M. A. Nichols, C. G. Qiao

*Corresponding author for this work

    Research output: Contribution to journal › Article › peer-review

    4 Citations (Scopus)

    Abstract

    Cluster analysis has been commonly used in genotype-by-environment (G × E) analyses, but current methods are inadequate when the data matrix is incomplete. This paper proposes a new method, referred to as two-stage clustering, which relies on a partitioning of squared Euclidean distance into two independent components, the G × E interaction and the genotype main effect. These components are used in the first and second stages of clustering respectively. Two-stage clustering forms the basis for imputing missing values in the G × E matrix, so that a more complete data array is available for other G × E analyses. Imputation for a given genotype uses information from genotypes with similar interaction profiles. This imputation method is shown to improve on an existing nearest cluster method that confounds the G × E interaction and the genotype main effect.
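The partition of squared Euclidean distance mentioned in the abstract can be illustrated with a small sketch. It assumes complete rows (no missing values) and uses the standard identity that the squared distance between two genotype rows over n environments equals n times the squared difference of their row means (the genotype main effect part) plus the squared distance between their mean-centred profiles (the interaction part; environment main effects cancel in the difference). The function names and the toy data are hypothetical, and this is not the authors' implementation, which also has to handle incomplete rows.

```python
from statistics import mean


def sq_euclidean(row_i, row_j):
    """Plain squared Euclidean distance between two genotype rows."""
    return sum((a - b) ** 2 for a, b in zip(row_i, row_j))


def partition_sq_distance(row_i, row_j):
    """Split the squared distance into (main_effect_part, interaction_part).

    Assumes both rows are complete and cover the same environments
    in the same order.
    """
    n = len(row_i)
    if n == 0 or n != len(row_j):
        raise ValueError("rows must be non-empty and of equal length")
    mean_i, mean_j = mean(row_i), mean(row_j)
    # Genotype main effect: difference between the two row means.
    main_part = n * (mean_i - mean_j) ** 2
    # Interaction: difference between the mean-centred profiles.
    interaction_part = sum(
        ((a - mean_i) - (b - mean_j)) ** 2 for a, b in zip(row_i, row_j)
    )
    return main_part, interaction_part


# Hypothetical yields for three genotypes in three environments.
g1 = [4.0, 5.0, 6.0]
g2 = [6.0, 7.0, 8.0]  # same profile as g1, shifted up by 2
g3 = [6.0, 5.0, 4.0]  # same mean as g1, reversed profile

print(partition_sq_distance(g1, g2))  # (12.0, 0.0): main effect only
print(partition_sq_distance(g1, g3))  # (0.0, 8.0): interaction only
```

In this toy case g1 and g2 would be grouped together in a first stage based on the interaction part alone, even though their ordinary squared distance (12.0) is larger than that between g1 and g3 (8.0). That is the kind of confounding a single combined distance introduces.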

    Original language: English
    Pages (from-to): 67-77
    Number of pages: 11
    Journal: Journal of Agricultural Science
    Volume: 139
    Issue number: 1
    Publication status: Published - Aug 2002
