### Abstract

Dimensionality reduction of microarray datasets is a prerequisite for avoiding overfitting, and hence for developing diagnostic tools. Some previous work has selected features based on, e.g., their individual Fisher discriminants (F-values), or on path-based training algorithms that optimise the power of the resulting classifier. We show that a generic method, using simple stepwise regression with the penalised margin width of the linear support vector machine as the objective function and a grid search over the regularization parameter, gives superior performance to three other feature-selection methods (least-angle regression, Random Forest, and stepwise regression on Fisher discriminants). We use a hierarchical validation method on each of four two-class gene expression cancer datasets: leave-one-out cross-validation within the training subset, followed by application of the trained classifier to a separate test subset. The generic method gives superior results when classifying unseen samples, and a fixed regularization value appears nearly optimal for all four datasets.
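
The stepwise idea described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the "penalised margin width" is the standard soft-margin objective 0.5·‖w‖² + C·Σ hinge losses, approximates its minimum with plain subgradient descent, and omits both the grid search over C and the leave-one-out cross-validation. The names `svm_objective`, `forward_select`, `X_toy` and `y_toy` are hypothetical, chosen only for this example.

```python
from typing import List, Sequence


def svm_objective(X: Sequence[Sequence[float]], y: Sequence[int],
                  C: float = 1.0, epochs: int = 200, lr: float = 0.01) -> float:
    """Approximate min over (w, b) of 0.5*||w||^2 + C*sum(hinge losses).

    Assumption: this stands in for the paper's penalised margin width.
    Uses full-batch subgradient descent and returns the lowest objective
    value seen. Labels in y must be +1 or -1.
    """
    n_features = len(X[0])
    w = [0.0] * n_features
    b = 0.0

    def score(row: Sequence[float]) -> float:
        return sum(wj * xj for wj, xj in zip(w, row)) + b

    def objective() -> float:
        hinge = sum(max(0.0, 1.0 - yi * score(xi)) for xi, yi in zip(X, y))
        return 0.5 * sum(wj * wj for wj in w) + C * hinge

    best = objective()
    for _ in range(epochs):
        grad_w = list(w)  # gradient of the 0.5*||w||^2 term
        grad_b = 0.0
        for xi, yi in zip(X, y):
            if yi * score(xi) < 1.0:  # sample lies inside or beyond the margin
                for j in range(n_features):
                    grad_w[j] -= C * yi * xi[j]
                grad_b -= C * yi
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
        best = min(best, objective())
    return best


def forward_select(X: Sequence[Sequence[float]], y: Sequence[int],
                   n_select: int, C: float = 1.0) -> List[int]:
    """Greedy forward selection: at each step add the feature whose
    inclusion gives the lowest approximate SVM objective."""
    selected: List[int] = []
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < n_select:
        def cost(j: int) -> float:
            cols = selected + [j]
            return svm_objective([[row[k] for k in cols] for row in X], y, C)
        best_j = min(remaining, key=cost)
        selected.append(best_j)
        remaining.remove(best_j)
    return selected


# Toy data (made up): feature 0 separates the classes, feature 1 is noise.
X_toy = [[2.0, 0.1], [2.5, -0.2], [3.0, 0.3],
         [-2.0, 0.2], [-2.5, -0.1], [-3.0, -0.3]]
y_toy = [1, 1, 1, -1, -1, -1]
chosen = forward_select(X_toy, y_toy, n_select=1)
```

On this toy data the separating feature is picked first. In a fuller pipeline, one would presumably repeat the selection for each value of C on a grid and compare the candidates by leave-one-out cross-validation within the training subset, before a final check on the held-out test subset.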

Original language | English |
---|---|
Title of host publication | 2009 JSM proceedings |
Subtitle of host publication | papers presented at the Joint Statistical Meetings, Washington, DC, August 1-6, 2009, and other ASA-sponsored conferences; Statistics: from evidence to policy |
Place of Publication | Alexandria, VA |
Publisher | American Statistical Association |
Pages | 2951-2965 |
Number of pages | 15 |
ISBN (Print) | 9781223000848 |
Publication status | Published - 2009 |
Event | Joint Statistical Meetings : Statistics : from evidence to policy - Washington, DC. Duration: 1 Aug 2009 → 6 Aug 2009 |

### Conference

Conference | Joint Statistical Meetings : Statistics : from evidence to policy |
---|---|
City | Washington, DC |
Period | 1/08/09 → 6/08/09 |

### Keywords

- feature selection
- microarrays
- support vector machines
- path-based algorithms
- regularization

## Cite this

Peters, T., Bulger, D. W., Loi, T-H., Yang, J. Y. H., & Ma, D. (2009). Cancer microarray feature selection using support vector machines: comparing regularization techniques. In *2009 JSM proceedings: papers presented at the Joint Statistical Meetings, Washington, DC, August 1-6, 2009, and other ASA-sponsored conferences; Statistics: from evidence to policy* (pp. 2951-2965). Alexandria, VA: American Statistical Association.