Cancer microarray feature selection using support vector machines

comparing regularization techniques

Tim Peters, David W. Bulger, To-Ha Loi, Jean Yee Hwa Yang, David Ma

    Research output: Chapter in Book/Report/Conference proceeding > Conference proceeding contribution

    Abstract

    Microarray dataset dimensionality reduction is a prerequisite for avoiding overfitting, and hence developing diagnostic tools. Some previous work has selected features based, e.g., on their individual Fisher discriminants (F-values), or path-based training algorithms optimising the power of the resulting classifier. We show that a generic method, using a simple stepwise regression with the linear support vector machine penalised margin width as the objective function, subject to regularization parameter grid-search, gives superior performance to three other feature-selection methods (least-angle regression, Random Forest, and stepwise regression on Fisher discriminants). We use a hierarchical validation method, applying leave-one-out cross-validation within the training subset, and applying the trained classifier to a separate test subset, on each of four two-class gene expression cancer datasets. The generic method shows superior results when classifying unseen samples, compared to three other feature selection methods, and a fixed regularisation value appears nearly optimal for all four datasets.
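
The procedure outlined in the abstract (forward stepwise selection scored by a linear SVM objective, a grid search over the regularization parameter, leave-one-out cross-validation inside the training subset, and a final check on a separate test subset) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the objective is assumed to be the standard soft-margin primal 0.5*||w||^2 + C*sum(hinge), a hand-rolled full-batch subgradient descent stands in for a real SVM solver, feature selection is assumed to be repeated inside each cross-validation fold, and the toy data, grid values, and all function names are hypothetical.

```python
# Illustrative sketch only: toy data, a hand-rolled subgradient solver, and
# hypothetical helper names. None of this is taken from the paper itself.

def dot(w, x):
    return sum(wj * xj for wj, xj in zip(w, x))

def columns(X, feats):
    """Restrict every sample to the chosen feature indices."""
    return [[row[j] for j in feats] for row in X]

def objective(w, b, X, y, C):
    """Assumed objective: 0.5*||w||^2 + C * (sum of hinge losses)."""
    hinge = sum(max(0.0, 1.0 - yi * (dot(w, xi) + b)) for xi, yi in zip(X, y))
    return 0.5 * dot(w, w) + C * hinge

def train_svm(X, y, C, epochs=200, lr=0.01):
    """Full-batch subgradient descent; returns the best (objective, w, b) seen."""
    w, b = [0.0] * len(X[0]), 0.0
    best = (objective(w, b, X, y, C), w[:], b)
    for _ in range(epochs):
        gw, gb = w[:], 0.0  # gradient of the 0.5*||w||^2 term
        for xi, yi in zip(X, y):
            if yi * (dot(w, xi) + b) < 1.0:  # margin violated: hinge is active
                gw = [g - C * yi * xj for g, xj in zip(gw, xi)]
                gb -= C * yi
        w = [wj - lr * g for wj, g in zip(w, gw)]
        b -= lr * gb
        obj = objective(w, b, X, y, C)
        if obj < best[0]:
            best = (obj, w[:], b)
    return best

def forward_select(X, y, C, k):
    """Greedy forward selection: add the feature giving the lowest objective."""
    selected = []
    while len(selected) < k:
        rest = [j for j in range(len(X[0])) if j not in selected]
        scores = {j: train_svm(columns(X, selected + [j]), y, C)[0] for j in rest}
        selected.append(min(scores, key=scores.get))
    return selected

def fit(X, y, C, k):
    """Select k features, then train on them; returns (features, w, b)."""
    feats = forward_select(X, y, C, k)
    _, w, b = train_svm(columns(X, feats), y, C)
    return feats, w, b

def predict(model, x):
    feats, w, b = model
    return 1 if dot(w, [x[j] for j in feats]) + b >= 0 else -1

def accuracy(model, X, y):
    return sum(predict(model, xi) == yi for xi, yi in zip(X, y)) / len(X)

def loocv_accuracy(X, y, C, k):
    """Leave-one-out CV within the training subset; selection is redone per fold."""
    hits = 0
    for i in range(len(X)):
        model = fit(X[:i] + X[i + 1:], y[:i] + y[i + 1:], C, k)
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)

# Toy two-class data (made up): feature 0 is informative, features 1-2 are noise.
X_train = [[2.0, 0.3, -0.1], [2.5, -0.4, 0.2], [1.8, 0.1, 0.4], [2.2, -0.2, -0.3],
           [-2.1, 0.2, 0.1], [-1.9, -0.3, -0.2], [-2.4, 0.4, 0.3], [-2.0, -0.1, -0.4]]
y_train = [1, 1, 1, 1, -1, -1, -1, -1]
X_test = [[2.1, 0.0, 0.1], [-2.2, 0.1, -0.1], [1.9, -0.2, 0.3], [-1.8, 0.3, 0.0]]
y_test = [1, -1, 1, -1]

grid = [0.1, 1.0, 10.0]  # hypothetical regularization grid
best_C = max(grid, key=lambda C: loocv_accuracy(X_train, y_train, C, k=1))
model = fit(X_train, y_train, best_C, k=1)
selected = model[0]
test_acc = accuracy(model, X_test, y_test)
print(best_C, selected, test_acc)
```

The test subset is touched only once, after the regularization value has been chosen by cross-validation on the training subset, which is the point of the hierarchical design. On real microarray data a dedicated SVM solver would replace the toy optimiser here.
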
    Original language: English
    Title of host publication: 2009 JSM proceedings
    Subtitle of host publication: papers presented at the Joint Statistical Meetings, Washington, DC, August 1-6, 2009, and other ASA-sponsored conferences; Statistics: from evidence to policy
    Place of Publication: Alexandria, VA
    Publisher: American Statistical Association
    Pages: 2951-2965
    Number of pages: 15
    ISBN (Print): 9781223000848
    Publication status: Published - 2009
    Event: Joint Statistical Meetings: Statistics: from evidence to policy - Washington, DC
    Duration: 1 Aug 2009 to 6 Aug 2009

    Conference

    Conference: Joint Statistical Meetings: Statistics: from evidence to policy
    City: Washington, DC
    Period: 1/08/09 to 6/08/09

    Keywords

    • feature selection
    • microarrays
    • support vector machines
    • path-based algorithms
    • regularization


    Cite this

    Peters, T., Bulger, D. W., Loi, T-H., Yang, J. Y. H., & Ma, D. (2009). Cancer microarray feature selection using support vector machines: comparing regularization techniques. In 2009 JSM proceedings: papers presented at the Joint Statistical Meetings, Washington, DC, August 1-6, 2009, and other ASA-sponsored conferences; Statistics: from evidence to policy (pp. 2951-2965). Alexandria, VA: American Statistical Association.