Result: Category Prediction of Turkish Poems Using Artificial Intelligence and Natural Language Processing Methods With MLP and SVM Algorithms

Title:
Category Prediction of Turkish Poems Using Artificial Intelligence and Natural Language Processing Methods With MLP and SVM Algorithms
Publication Year:
2023
Document Type:
Conference object
Language:
English
Relation:
Conference Item - International - Institutional Faculty Member; https://hdl.handle.net/20.500.13091/6456; N/A
Rights:
open
Accession Number:
edsbas.E2A3B3C9
Database:
BASE

Further Information

People communicate with each other through language, and the languages they use, such as English, Turkish, and French, are called natural languages. People can also communicate with machines, provided that natural language is first made machine-understandable through a series of processing steps. This requires analyzing the canonical structures of natural languages, which is basically carried out on four levels: Lexical Analysis, Syntactic Analysis, Semantic Analysis, and Discourse Analysis. Natural Language Processing (NLP) is a branch of artificial intelligence that deals with the processing of natural language input in the form of speech and text. NLP is widely used in intelligent virtual assistants, search engines, social media monitoring platforms, automatic translation systems, text summarization systems, and text categorization systems. This study presents a model for predicting the categories of Turkish poems using natural language processing and machine learning methods. The project code was written in Python using the Anaconda development environment, and the Zemberek library was used to perform various operations on Turkish texts. The dataset consisted of 4198 poems taken from a website and grouped into 21 categories. During the data preprocessing stage, the texts were converted to lower case; punctuation marks, spaces, and stop-words were removed; and root extraction was performed. The Term Frequency-Inverse Document Frequency (TF-IDF) method was used for text representation, and the success rates of models created using Support Vector Machine (SVM) and Multilayer Perceptron (MLP) classifiers were evaluated. The findings indicated that the SVM classifier outperformed the MLP classifier.
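
The preprocessing and TF-IDF steps described in the abstract can be illustrated with a minimal, standard-library-only Python sketch. It is not the authors' code. The stop-word list, the sample texts, and the function names `preprocess` and `tfidf` are hypothetical, and the smoothed IDF formula is one common variant chosen here as an assumption. Root extraction, which the study performed with Zemberek, is deliberately left out.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stop-word list for illustration only;
# the study's actual list is not given in the abstract.
STOP_WORDS = {"ve", "bir", "bu", "da", "de", "ile"}


def preprocess(text):
    """Lower-case, strip punctuation and extra spaces, drop stop-words."""
    # Python's str.lower() does not apply Turkish casing rules,
    # so map the dotted/dotless capital I explicitly first.
    text = text.replace("I", "ı").replace("İ", "i").lower()
    # Replace anything that is not a word character or whitespace.
    text = re.sub(r"[^\w\s]", " ", text)
    # split() with no arguments also collapses repeated whitespace.
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed weighting: tf = count / document length,
    idf = ln((1 + N) / (1 + df)) + 1 (a smoothed variant).
    """
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + df)) + 1
           for t, df in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t]
                        for t, c in counts.items()})
    return vectors


# Made-up sample lines, not taken from the study's dataset.
poems = ["Gül ve bülbül, bahar!", "Bahar geldi; gül açtı.", "Işık ve İstanbul"]
tokens = [preprocess(p) for p in poems]
vectors = tfidf(tokens)
```

Vectors of this kind would then be passed to the classifiers. In scikit-learn, for instance, `TfidfVectorizer` combined with `LinearSVC` or `MLPClassifier` would be one plausible setup, though the abstract does not state which implementations were used.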