  • September 2016 We ran a panel with UIUC and UCLA on biomedical knowledge discovery at SciDataCon2016.
  • August 2016 Lifu Huang will present Liberal IE at ACL16, one of the best papers from our group.
  • August 2016 Three of our IE/KBP algorithms are featured in the DARPA DEFT program-wide end-to-end demonstration system. We were also awarded extra grants to extend our DEFT research for one more year and to build a powerful cold-start KBP system called "Tinker Bell" together with Stanford, UIUC and Columbia.
  • July 2016 Two tasks are being renewed and funded by ARL NS-CTA. The knowledge networks construction task has been running since 2009 and now engages great new social scientists from USC.
  • June 2016 We ran a panel on scientific knowledge discovery at the World Economic Forum.

Recent years have witnessed a big data boom that spans a wide spectrum of heterogeneous data types, from image, speech, and multimedia signals to text documents and labels. Much of this information is encoded in natural language, which makes it accessible to some people (for example, those who can read that particular language) but much less amenable to computer processing beyond a simple keyword search. Blender Lab's research area, cross-source information extraction (IE) on a massive scale, aims to create the next generation of information access, in which humans can communicate with computers in natural language beyond keyword search, and computers can discover the accurate, concise, and trustworthy information embedded in big data from heterogeneous sources.

Traditional IE techniques pull information from individual documents in isolation, but users might need to gather information that’s scattered among a variety of sources (for example, across multiple languages, documents, genres, and data modalities). Complicating matters, these facts might be redundant, complementary, incorrect, or ambiguously worded; the extracted information might also need to augment an existing Knowledge Base (KB), which requires the ability to link events, entities, and associated relations to the KB. In our research, we aim to define several new extensions to the state-of-the-art IE paradigm beyond “slot filling,” and to systematically develop the foundations, methodologies, algorithms, and implementations needed for more accurate, coherent, complete, concise, and, most importantly, dynamic and resilient extraction capabilities.

The general principle of RPI Blender Lab is to do creative, ground-breaking, and enjoyable research. Each member is a serious researcher, a critical thinker, and an efficient engineer. We aim to deliver each research project as a piece of art, and to have great fun with each creative idea.