Please use this identifier to cite or link to this item:
http://univ-bejaia.dz/dspace/123456789/25740
Title: | Developing and evaluating a large-scale entity linking system. |
Authors: | Belalta, Ramla ; Meziane, Farid (thesis supervisor) |
Keywords: | Named Entity Disambiguation : Entity linking : Clique Partitioning : Semantic Relatedness : Graph-Based Approaches |
Issue Date: | 2025 |
Publisher: | Université Abderrahmane Mira-Bejaia |
Abstract: | Disambiguating name mentions in text is a crucial task in Natural Language Processing, especially in entity linking, and the credibility and efficiency of entity linking systems largely depend on it. A given named entity mention in a text may correspond to many candidate entities in the knowledge base, which makes it difficult to select the correct candidate from the full candidate set. Collective entity disambiguation is a prominent approach to this problem. In this thesis, we present CPSR, a new algorithm for collective entity disambiguation based on a graph approach and semantic relatedness. A clique partitioning algorithm is used to find the best clique, which contains a set of candidate entities; these candidates are the answers assigned to the corresponding mentions in the disambiguation process. To evaluate the algorithm, we carried out a series of experiments on seven well-known datasets, namely AIDA/CoNLL2003-TestB, IITB, MSNBC, AQUAINT, ACE2004, Cweb and Wiki. The Kensho Derived Wikimedia Dataset (KDWD) is used as the knowledge base for our system. In the experimental results, the CPSR algorithm outperforms both the baselines and other well-known state-of-the-art approaches. |
Description: | Option : Cloud Computing |
URI: | http://univ-bejaia.dz/dspace/123456789/25740 |
Appears in Collections: | Thèses de Doctorat (Doctoral Theses) |
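The abstract describes collective disambiguation as choosing, for each mention, one candidate entity so that the chosen candidates form the most coherent group. The following is a minimal sketch of that general idea only, not the CPSR algorithm from the thesis: it uses exhaustive search instead of clique partitioning, and the mentions, candidate names, relatedness scores, and function names are all hypothetical values made up for illustration.

```python
from itertools import combinations, product


def relatedness(a, b, scores):
    """Symmetric lookup of a (hypothetical) pairwise relatedness score."""
    return scores.get((a, b), scores.get((b, a), 0.0))


def best_assignment(candidates, scores):
    """Pick one candidate per mention so that the sum of pairwise
    relatedness over the chosen set is maximal.

    Exhaustive search: only practical for tiny toy inputs.
    """
    mentions = list(candidates)
    best, best_score = None, float("-inf")
    for choice in product(*(candidates[m] for m in mentions)):
        total = sum(relatedness(a, b, scores) for a, b in combinations(choice, 2))
        if total > best_score:
            best, best_score = choice, total
    return dict(zip(mentions, best)), best_score


# Toy input: three mentions, each with made-up candidate entities.
candidates = {
    "Paris": ["Paris_(France)", "Paris_Hilton"],
    "Seine": ["Seine_(river)", "Seine_(department)"],
    "Louvre": ["Louvre_Museum"],
}
# Made-up relatedness scores; missing pairs default to 0.0.
scores = {
    ("Paris_(France)", "Seine_(river)"): 0.9,
    ("Paris_(France)", "Seine_(department)"): 0.6,
    ("Paris_(France)", "Louvre_Museum"): 0.8,
    ("Seine_(river)", "Louvre_Museum"): 0.7,
    ("Seine_(department)", "Louvre_Museum"): 0.3,
    ("Paris_Hilton", "Louvre_Museum"): 0.1,
}

result, total = best_assignment(candidates, scores)
print(result)
```

With these toy scores, the mutually related candidates (the city, the river, and the museum) are selected together. A real system would draw candidates and relatedness from a knowledge base and would need a far more scalable search than brute force.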
Files in This Item:
File | Description | Size | Format | |
---|---|---|---|---|
Thèse Ramla BELALTA.pdf | | 10.59 MB | Adobe PDF | View/Open |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.