RefSeer Data


You are welcome to use the code under the terms of the license; however, please acknowledge its use by citing:
W. Huang, Z. Wu, C. Liang, P. Mitra, and C. Lee Giles. A Neural Probabilistic Model for Context Based Citation Recommendation. In the Twenty-Ninth AAAI Conference on Artificial Intelligence (AAAI'15), 2015.


The shared data is a SQL dump of the CiteSeerX database with three tables: citations, citationContexts, and papers.

  • Important fields of table papers:
    • (1) id: each PDF has its own id; this id is referred to as paperid in table citations;
    • (2) cluster: all PDFs of the same paper (a paper may have more than one PDF in our database) share a single cluster number.
  • Important fields of table citations:
    • (1) id: this id is referred to as citationid in table citationContexts;
    • (2) cluster: the cluster number of the cited document;
    • (3) paperid: the id of the citing document.
  • Important fields of table citationContexts:
    • (1) citationid: links to id in table citations;
    • (2) context: the citation context; within each context, the citation is surrounded by the markers =-= and -=-.
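
The relationships between the fields listed above can be sketched in a few lines of Python. This is only an illustration, not part of the released data or code: it uses an in-memory SQLite database as a stand-in for the actual dump, the column types and the toy rows are assumptions, and the helper names (extract_mentions, build_toy_db, citing_contexts) are hypothetical. It also assumes that each citation appears as =-=...-=- inside the context text.

```python
import re
import sqlite3

# Assumption: a citation inside a context looks like "=-=Some Author, 2005-=-".
MARKED = re.compile(r"=-=(.*?)-=-", re.DOTALL)


def extract_mentions(context):
    """Return the text spans enclosed by the =-= and -=- markers."""
    return [m.strip() for m in MARKED.findall(context)]


def build_toy_db():
    """Create an in-memory stand-in holding only the fields described above.

    Column types and rows are made up for illustration; the real dump
    contains more columns and may use different types.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE papers (id TEXT PRIMARY KEY, cluster INTEGER);
        CREATE TABLE citations (id INTEGER PRIMARY KEY, cluster INTEGER, paperid TEXT);
        CREATE TABLE citationContexts (citationid INTEGER, context TEXT);
        """
    )
    # Papers p2 and p3 are two PDFs of the same paper, so they share a cluster.
    conn.executemany(
        "INSERT INTO papers VALUES (?, ?)",
        [("p1", 100), ("p2", 200), ("p3", 200)],
    )
    # Paper p1 cites the paper whose cluster number is 200.
    conn.execute("INSERT INTO citations VALUES (1, 200, 'p1')")
    conn.execute(
        "INSERT INTO citationContexts VALUES "
        "(1, 'as shown by =-=Smith et al., 2005-=- the model')"
    )
    conn.commit()
    return conn


def citing_contexts(conn):
    """Return rows of (citing paper id, citing cluster, cited cluster, context)."""
    query = """
        SELECT p.id, p.cluster, c.cluster, x.context
        FROM citationContexts AS x
        JOIN citations AS c ON c.id = x.citationid
        JOIN papers AS p ON p.id = c.paperid
        ORDER BY c.id
    """
    return conn.execute(query).fetchall()


conn = build_toy_db()
for paper_id, citing_cluster, cited_cluster, context in citing_contexts(conn):
    print(paper_id, citing_cluster, cited_cluster, extract_mentions(context))
```

The join follows the links described in the field list: citationContexts.citationid points to citations.id, and citations.paperid points to papers.id, while citations.cluster identifies the cited paper regardless of which PDF of it is stored.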

Download link: