Title: Plink-LDA: Using link as prior information in topic modeling
Authors: Xia, Huan
Li, Juanzi
Tang, Jie
Moens, Marie-Francine #
Issue Date: 2012
Publisher: Springer
Host Document: Lecture Notes in Computer Science vol:7238 pages:213-227
Conference: International conference on database systems for advanced applications - 17th International conference edition:17 location:Busan, South Korea date:15-19 April, 2012
Abstract: Citations are highly valuable for analyzing documents and have been widely studied in recent years. In document modeling, citations are usually treated either as attributes of a document, just like the words it contains, or as degrees in the graph-theoretic sense. These methods add citations to the word sampling process to reshape the document representation, but they miss the impact of citations on the generation of content. In this paper, we view citations as prior information that the authors already had. In the generation of a document, its content is split into two parts: the ideas of the author and the knowledge drawn from the cited papers. We propose a prior-information-enabled topic model, PLDA. In this model, both the document and its citations play an important role in generating the topic layer. Our experiments on two linked datasets show that our model greatly outperforms basic LDA procedures on a clustering task while also maintaining the dependencies among documents. In addition, we demonstrate the feasibility of the model on a citation recommendation task.
ISSN: 0302-9743
Publication status: published
KU Leuven publication type: IC
Appears in Collections: Informatics Section
# (joint) last author

Files in This Item:
File: Xiaetal2012.pdf
Description: main article
Status: Published
Size: 6198Kb
Format: Adobe PDF
