Small and Unbalanced Data Set Problem in Classification

Par, Öznur Esra; Sezer, Ebru Akçapınar; Sever, Hayri

URI: http://hdl.handle.net/20.500.12416/6019

Date: 2019

Abstract:

Classifying data is difficult when the data set is small and unbalanced, and this problem directly affects classification performance. Small and/or imbalanced data sets have become a major problem in data mining. Classification algorithms are developed on the assumption that data sets are balanced and large enough. Most algorithms focus on the majority class and ignore or misclassify examples of the minority class. The small and unbalanced data set problem is frequently encountered in medical data mining due to some limitations. Within the scope of this study, the publicly accessible hepatitis data set was divided into small and imbalanced data subsets, and each subset was oversampled using distance-based data generation methods. The oversampled data sets were classified using four different machine learning algorithms (Artificial Neural Networks, Support Vector Machines, Naive Bayes and Decision Tree), and the classification scores were compared.
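
The abstract does not name the distance-based data generation methods that were used. As a rough illustration of the general idea only, the sketch below assumes a SMOTE-style interpolation: each synthetic minority sample is placed on the line segment between a minority sample and one of its nearest minority neighbours (Euclidean distance). The function names, the parameter `k`, and the toy data are hypothetical and are not taken from the study.

```python
import math
import random


def nearest_neighbors(point, others, k):
    """Return the k points in `others` closest to `point` (Euclidean distance)."""
    return sorted(others, key=lambda q: math.dist(point, q))[:k]


def oversample_minority(minority, n_new, k=3, seed=0):
    """Generate n_new synthetic minority samples by SMOTE-style interpolation.

    Assumption: this is one common distance-based scheme, not necessarily
    the method used in the study.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        # Pick one of the (at most k) nearest minority neighbours at random.
        neighbor = rng.choice(nearest_neighbors(base, others, min(k, len(others))))
        gap = rng.random()  # position along the segment, in [0, 1)
        synthetic.append(tuple(b + gap * (n - b) for b, n in zip(base, neighbor)))
    return synthetic


# Toy, made-up data: 8 majority-class points and 3 minority-class points.
majority = [(float(x), float(y)) for x in range(4) for y in range(2)]
minority = [(5.0, 5.0), (6.0, 5.5), (5.5, 6.5)]

# Generate enough synthetic points to match the majority class size.
new_points = oversample_minority(minority, n_new=len(majority) - len(minority))
balanced_minority = minority + new_points
print(len(majority), len(balanced_minority))  # prints: 8 8
```

In a setup like the one described, the balanced set would then be passed to each classifier and the scores compared. Because every synthetic point is an interpolation of two real minority samples, it stays inside the region those samples already span.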

This item appears in the following Collection(s)

Yazılım Mühendisliği Bölümü Yayın Koleksiyonu [42]
Contains the publications of the Software Engineering Department (Yazılım Mühendisliği Bölümü).
