Gender identification of authors of Turkish text

Yaşar Öntürk, Ceren

dc.contributor.author	Yaşar Öntürk, Ceren
dc.date.accessioned	2020-04-27T20:24:52Z
dc.date.available	2020-04-27T20:24:52Z
dc.date.issued	2019
dc.identifier.citation	Ceren Yaşar Öntürk (2019). Gender identification of authors of Turkish text / Türkçe metinlerde yazarın cinsiyet tahmini. Published master's thesis. Ankara: Çankaya Üniversitesi Fen Bilimleri Enstitüsü.	tr_TR
dc.identifier.uri	http://hdl.handle.net/20.500.12416/3456
dc.description.abstract	The number of documents stored electronically is increasing day by day. With the spread of the internet, the number of users of text-based social media applications is also growing. Because social media is so widely used, identifying the gender of authors of short texts has become an active research topic within text classification. The field has gained popularity because users often conceal their gender online. In this work, a dataset is built from articles on different subjects, chosen randomly from the internet, and author gender is used as the class label. Sentence-level, word-level, character-level, and punctuation features are extracted from these articles. The performance of five classification methods is then compared on these features, and the results show that the random forest algorithm is the most successful.	tr_TR
dc.description.abstract	In recent years, the number of documents stored on computers has been growing day by day. As the internet has become widespread, the number of users of text-based social media applications has also been increasing. Because social media is in active use, determining author gender in short texts has become a current research topic within text classification. This field of study has become popular because people conceal their gender on the internet. In this study, a dataset was created from articles on different subjects selected randomly from the internet. The gender attribute was used for classification in this dataset. Sentence features, word features, character features, and punctuation features were used on the dataset created during the study. Five different classification methods were applied and their performance was compared. According to the results, the most successful method is the Random Forest algorithm.	tr_TR
dc.language.iso	eng	tr_TR
dc.rights	info:eu-repo/semantics/openAccess	tr_TR
dc.subject	Gender Identification	tr_TR
dc.subject	Naive Bayes	tr_TR
dc.subject	Decision Tree	tr_TR
dc.subject	WEKA	tr_TR
dc.subject	Logistic Regression	tr_TR
dc.subject	Random Forest	tr_TR
dc.subject	Cinsiyet Belirleme	tr_TR
dc.subject	Naive Bayes	tr_TR
dc.subject	Karar Ağaçları	tr_TR
dc.subject	Weka	tr_TR
dc.subject	Support Vector Machine	tr_TR
dc.subject	Rastgele Orman	tr_TR
dc.subject	Linear	tr_TR
dc.title	Gender identification of authors of Turkish text	tr_TR
dc.title.alternative	Türkçe metinlerde yazarın cinsiyet tahmini	tr_TR
dc.type	masterThesis	tr_TR
dc.identifier.startpage	1	tr_TR
dc.identifier.endpage	55	tr_TR
dc.contributor.department	Çankaya Üniversitesi, Fen Bilimleri Enstitüsü, Bilgisayar Mühendisliği Bölümü	tr_TR

Files in this item:

Name: Yaşar Öntürk, ...

Size: 1.155Mb

Format: PDF

Description: Author's version

This item appears in the following collection(s).

Computer Engineering Department Theses [253]
Contains the theses of the Computer Engineering Department.