Developing an Approach Based on Soft Frequent Pattern Mining for Detecting Significant Events in Arabic Microblogs (PDF)

Study Details

Study Abstract:

Recently, microblogs have become a new communication medium between users. They allow millions of users to post and share content about their own activities and their opinions on different topics. Posts about ongoing real-world events have attracted people to follow events through microblogs instead of mainstream media. As a result, there is an urgent need to detect events from microblogs so that users can identify events quickly and, more importantly, so that authorities can respond faster to ongoing events by taking proper action.
While considerable research has been conducted on event detection for the English language, the Arabic context has not received much attention, even though there are millions of Arabic-speaking users. Moreover, existing approaches rely on platform-dependent features such as hashtags, mentions, and retweets, which makes them fail when these features are absent. In addition, approaches that depend only on the presence of frequently used words do not always detect real events, because they cannot differentiate between events and general viral topics.
In this thesis, we propose an approach for Arabic event detection from microblogs. We first collect the data, then apply a preprocessing step to enhance data quality and reduce noise. The sentence text is analyzed and part-of-speech tags are identified. A set of rules is then used to extract event-indicator keywords, called event triggers. The frequency of each event trigger is calculated; triggers whose frequency is higher than the average are kept, and the rest are removed. We detect events by clustering similar event triggers together: an adapted soft frequent pattern mining algorithm is applied to the remaining event triggers for clustering.
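The frequency-filtering step described in this paragraph can be illustrated with a minimal Python sketch. It is not the thesis implementation: the function name `filter_triggers`, the use of a strict "greater than the mean" comparison, and the sample trigger words are all assumptions made for illustration.

```python
from collections import Counter


def filter_triggers(triggers):
    """Keep only triggers whose frequency is strictly above the mean frequency.

    `triggers` is a list of trigger occurrences (one entry per occurrence).
    Whether the original method uses a strict or non-strict comparison is
    not stated in the abstract; a strict comparison is assumed here.
    """
    counts = Counter(triggers)
    if not counts:
        return {}
    # Mean frequency over distinct triggers.
    mean_freq = sum(counts.values()) / len(counts)
    return {t: c for t, c in counts.items() if c > mean_freq}


# Hypothetical trigger occurrences (English placeholders for Arabic keywords).
sample = ["explosion", "explosion", "explosion", "election", "election", "match"]
kept = filter_triggers(sample)
print(kept)
```

In this sample the mean frequency is 2, so only the trigger that occurs three times survives the filter. The surviving triggers would then be passed on to the clustering stage.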
We used a dataset called Evetar to evaluate the proposed approach. The dataset contains tweets that cover different types of Arabic events that occurred within a one-month period. We split the dataset into subsets using different time intervals to mimic the streaming behavior of microblogs. We used precision, recall, and F-measure as evaluation metrics. The highest average F-measure achieved was 0.717. Our results were acceptable compared to three popular approaches applied to the same dataset.
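For readers unfamiliar with the evaluation metrics named above, the following Python sketch shows how precision, recall, and the balanced F-measure (F1) are commonly computed. How the thesis matches detected events to ground-truth events is not described in the abstract, so matching by event identifier is an assumption here, as are the function name and the sample identifiers.

```python
def precision_recall_f1(detected, relevant):
    """Compute precision, recall, and balanced F-measure for two collections.

    `detected` holds identifiers of events reported by a detector and
    `relevant` holds identifiers of ground-truth events (both hypothetical).
    """
    detected, relevant = set(detected), set(relevant)
    true_positives = len(detected & relevant)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(relevant) if relevant else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical identifiers: 4 detected events, 3 true events, 2 in common.
p, r, f = precision_recall_f1({"e1", "e2", "e3", "e4"}, {"e1", "e2", "e5"})
print(round(p, 3), round(r, 3), round(f, 3))
```

An average F-measure such as the one reported above would be obtained by averaging this value over the evaluated subsets, again assuming the balanced F1 form.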

Study Properties

  • Author

    Jehad H., Zendah

  • Publication Year

    2018-08

  • Publisher:

    Islamic University of Gaza

  • Source:

    Digital Repository of the Islamic University of Gaza

  • Content Type:

    Master's thesis

  • Language:

    English

  • Peer-reviewed:

    Yes

  • Country:

    Palestine

  • Text:

    Full study

  • File Type:

    pdf
