Semi Supervised Under-Sampling: A Solution to the Class Imbalance Problem for Classification and Feature Selection
By:
Rahman, M. Mostafizur; Davis, Darryl N.
Type:
Article from Books - E-Book
In collection:
Transactions on Engineering Technologies: Special Volume of the World Congress on Engineering 2013, pages 611-625.
Topics:
Class imbalance; Clustering; Over sampling; ReliefF; SMOTE; Under sampling
Fulltext:
44_978-94-017-8831-1_Rahman_Davis.pdf
(681.42KB)
Abstract
Most medical datasets are not balanced in their class labels. Furthermore, in some cases the given class labels do not accurately represent the characteristics of the data records. Most existing classification methods tend to perform poorly on minority-class examples when the dataset is extremely imbalanced, because they aim to optimize overall accuracy without considering the relative distribution of each class. The class imbalance problem can also affect the feature selection process. In this paper we propose a cluster-based under-sampling technique that solves the class imbalance problem for our cardiovascular data. Data prepared using this technique show significantly better performance than data prepared with existing methods. A feature selection framework for unbalanced data is also proposed in this paper. The research found that, for data balanced by the proposed under-sampling method, ReliefF can be used to select fewer attributes with no degradation of subsequent classifier performance.
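The abstract does not spell out the algorithm, so the following is only a minimal sketch of one common form of cluster-based under-sampling, not the authors' method: the majority class is clustered with k-means, and the real record nearest each centroid is kept alongside all minority records. The function names (`kmeans`, `cluster_undersample`), the choice of one representative per cluster, the number of clusters, and the synthetic data are all assumptions made for illustration. The semi-supervised relabelling and the ReliefF step mentioned in the abstract are not covered.

```python
import numpy as np


def kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means (illustrative helper); returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    # Initialise centroids from k distinct data points.
    centroids = X[rng.choice(len(X), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        converged = np.allclose(new_centroids, centroids)
        centroids = new_centroids
        if converged:
            break
    # Final assignment against the final centroids.
    dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return centroids, dists.argmin(axis=1)


def cluster_undersample(X, y, majority_label, seed=0):
    """Keep all minority rows plus one representative majority row per cluster."""
    maj_idx = np.flatnonzero(y == majority_label)
    min_idx = np.flatnonzero(y != majority_label)
    k = len(min_idx)  # assumption: aim for roughly a 1:1 class ratio
    centroids, labels = kmeans(X[maj_idx], k, seed=seed)
    keep = []
    for j in range(k):
        members = maj_idx[labels == j]
        if len(members) == 0:
            continue  # an empty cluster contributes no representative
        d = ((X[members] - centroids[j]) ** 2).sum(axis=1)
        keep.append(members[d.argmin()])  # real record nearest the centroid
    selected = np.sort(np.concatenate([np.array(keep, dtype=int), min_idx]))
    return X[selected], y[selected]


# Synthetic, made-up data: 200 majority rows (label 0), 20 minority rows (label 1).
rng = np.random.default_rng(42)
X = np.vstack([rng.normal(0.0, 1.0, size=(200, 2)),
               rng.normal(3.0, 0.5, size=(20, 2))])
y = np.array([0] * 200 + [1] * 20)

X_bal, y_bal = cluster_undersample(X, y, majority_label=0)
```

Selecting actual records rather than the centroids themselves keeps every retained row a genuine observation, which matters for clinical data; using the centroids as synthetic records would be an equally plausible variant.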