Scientific Computing Seminar

Date:
Friday, January 23, 2004
Time:
1:00pm-2:00pm
Location:
50B-4205
Seminar Speaker:
Hanchuan Peng
Computer Science and Mathematics Division
Oak Ridge National Laboratory
Email: penghanchuan at yahoo.com
http://www.hpeng.net
Title:
Minimum Redundancy Feature Selection in DNA Gene Expression Classification
Abstract:
In many pattern recognition and data mining applications, e.g. cancer classification using microarray data, one often needs to select the most characterizing genes so that they jointly have high discriminative strength to predict the target classification variable (i.e. the cancer type in the example). We developed a minimal-redundancy-maximal-relevance feature selection method. Based on information theory, we proved that for incremental gene selection, a combination of low-dimensional max-relevance and min-redundancy criteria is equivalent to the high-dimensional max-dependency criterion. Our comprehensive experimental results on five microarray gene expression datasets, one handwritten character dataset, and one arrhythmia dataset show that this novel method is very effective in selecting a small set of features for accurate prediction in a variety of applications.
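
As a rough illustration of the incremental selection idea described in the abstract, the following is a minimal sketch and not the speaker's implementation. It assumes discrete-valued features, estimates mutual information from empirical counts, and combines the two criteria as relevance minus mean redundancy, which is one possible combination that the abstract itself does not specify. The names mutual_information, mrmr_select, and the toy features g1, g2, g3 are hypothetical.

```python
from collections import Counter
from math import log


def mutual_information(xs, ys):
    """Empirical mutual information (in nats) between two discrete sequences."""
    n = len(xs)
    px = Counter(xs)
    py = Counter(ys)
    pxy = Counter(zip(xs, ys))
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log(c * n / (px[x] * py[y]))
    return mi


def mrmr_select(features, target, k):
    """Greedily pick up to k feature names (hypothetical helper).

    Assumed scoring rule: relevance to the target minus the mean
    redundancy with the features already selected.
    """
    relevance = {
        name: mutual_information(col, target) for name, col in features.items()
    }
    selected = []
    remaining = list(features)
    while remaining and len(selected) < k:

        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(
                mutual_information(features[name], features[s]) for s in selected
            ) / len(selected)
            return relevance[name] - redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected


# Toy data (made up for illustration): g2 duplicates g1, g3 is independent of g1.
features = {
    "g1": [0, 0, 1, 1],
    "g2": [0, 0, 1, 1],
    "g3": [0, 1, 0, 1],
}
target = [0, 1, 2, 3]  # class label determined jointly by g1 and g3

chosen = mrmr_select(features, target, 2)
print(chosen)
```

In this toy case all three features are equally relevant on their own, but the redundancy term penalizes g2 once g1 has been chosen, so the second pick is the complementary feature g3 rather than the duplicate.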
Sponsor of Seminar:
Chris Ding
Scientific Computing

Contact Esmond G. Ng EGNg@lbl.gov