BLENDEET: Cross-document Cross-lingual Event Extraction and Tracking



  • Award No. IIS-0953149
  • Duration: 2010-2016 (expected)
  • Title: Cross-Document Cross-Lingual Event Extraction and Tracking
  • Institution: Rensselaer Polytechnic Institute
  • Abstract

The goal of this research project is to define several new extensions to the state-of-the-art Information Extraction paradigm beyond ‘slot filling’, and to achieve more accurate, salient, complete, concise and coherent extraction results by exploiting dynamic background knowledge and cross-document cross-lingual event ranking and tracking. The approach consists of cross-document inference, prediction of and reasoning about unknown implicit event times, cross-document entity coreference resolution with global contexts, centroid entity detection, event attribute extraction and graph-based clustering algorithms for redundancy and contradiction detection, automatic new event clustering and active learning, abstractive summary generation based on extraction results, name translation with comparable corpora, and cross-lingual co-training.

The broader impacts of the project are twofold. First, the experimental research is linked to educational activities, including project-related curriculum development. The project supports two PhD students and two undergraduate students in each of the five years, involves non-CS undergraduate students through utility evaluation and corpus annotation, and attracts elementary school and high school students through tutorials, regular research seminars and an extensive summer workshop. Second, the results of the project will benefit E-Science and E-Learning by extracting and tracking related knowledge from scientific literature and from learning materials used in elementary schools and high schools.

  • Research Challenges
    • Cross-document event coreference resolution
    • Event ranking by salience and novelty
    • Event organization by participant, time, and place
    • Name translation
    • Knowledge Discovery for IE
    • Domain adaptation techniques for Information Extraction
  • Point of Contact

Prof. Heng Ji

  • Acknowledgement

This material is based upon work supported by the U.S. National Science Foundation under Grant No. IIS-0953149. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author and do not necessarily reflect the views of the National Science Foundation.

  • Date of Last Update: 01/12/2015