Home page of Hristo Tanev



Projects

Relation Extraction

This project focuses on detecting relations between named entities. Several novel algorithms for syntactic pattern learning and matching were developed, and a patent was filed for one of them. The Social Network Browser is an online service which visualises an automatically learned social network of famous people.



Event Extraction

This project focuses on detecting violent events and disasters in news articles. Its current outcome is the NEXUS event extraction system. Several event extraction applications are currently running online and collecting data about crisis events. For each detected event, NEXUS finds the victims, the perpetrators, the type of the event, the number of people killed, injured and kidnapped, and other information. A real-time media-derived crisis map (best viewed with Internet Explorer) shows violent events and disasters detected in real time and geo-located on the world map. A Google Earth interface for event visualisation is also provided.
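
As an illustration only, the information extracted for each event could be held in a record like the one sketched below. The class name `EventRecord`, its field names and the helper `total_affected` are hypothetical and are not taken from NEXUS itself; the sketch simply mirrors the slots listed above.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class EventRecord:
    """Hypothetical container for the slots of one detected event."""
    event_type: str                       # e.g. "explosion", "flood"
    victims: List[str] = field(default_factory=list)
    perpetrators: List[str] = field(default_factory=list)
    killed: Optional[int] = None          # None means "not reported"
    injured: Optional[int] = None
    kidnapped: Optional[int] = None
    location: Optional[str] = None        # place name used for geo-location

    def total_affected(self) -> int:
        """Sum only the casualty counts that were actually reported."""
        counts = (self.killed, self.injured, self.kidnapped)
        return sum(n for n in counts if n is not None)


# Toy example with made-up values, not real extracted data.
event = EventRecord(event_type="explosion", killed=3, injured=10)
print(event.total_affected())
```

Keeping unreported counts as `None` rather than `0` distinguishes "no information in the article" from "zero casualties reported".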



Paraphrase Acquisition - the TEASE system, GSL

This was a cooperative activity with Bar-Ilan University and Tel Aviv University. It is also a research line in the MoreWeb project. The scope of the project was the development of a framework for automatic acquisition of paraphrases. Our efforts resulted in TEASE, a system for web-based paraphrase acquisition, and GSL, a General Structure Learning algorithm which learns repeating graph structures.



MoreWeb (Multilingual Question Answering on the Web)

This project started in January 2003 and lasted three years. Its outcome was several multilingual QA modules capable of finding answers to questions posed in natural language. We built the DIOGENE system, which answers questions in English and Italian from local collections. It also searches for definitions on the Web. DIOGENE can be used in cross-language Italian-English mode: the system takes a question in Italian, translates it into English and then searches an English collection. We also developed SOCRATES, a QA prototype for Bulgarian. Within the project we developed multilingual libraries for QA. One such library handles definition questions and is implemented for English, Italian and Bulgarian. Moreover, we built language-specific libraries using the output of GSL.
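
The cross-language mode described above can be pictured as a two-step pipeline: translate the question, then search. The sketch below is only a schematic of that flow and is not DIOGENE's code; the function name `cross_language_answer` and the `translate` and `search` callables are hypothetical placeholders for whatever translation and retrieval components are actually used.

```python
from typing import Callable, List


def cross_language_answer(
    question_it: str,
    translate: Callable[[str], str],
    search: Callable[[str], List[str]],
) -> List[str]:
    """Translate an Italian question into English, then search an English collection."""
    question_en = translate(question_it)   # step 1: Italian -> English
    return search(question_en)             # step 2: retrieve answers in English


# Toy stand-ins for the real components (assumptions for illustration only).
def toy_translate(question: str) -> str:
    return {"Chi ha scritto Amleto?": "Who wrote Hamlet?"}.get(question, question)


def toy_search(question: str) -> List[str]:
    return ["Shakespeare"] if "Hamlet" in question else []


print(cross_language_answer("Chi ha scritto Amleto?", toy_translate, toy_search))
```

Passing the translator and the search component in as arguments keeps the two steps independent, so either can be replaced without touching the other.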



LINGUA

LINGUA is a language engine for the Bulgarian language. The engine was developed in Delphi. LINGUA performs sentence splitting, part-of-speech tagging, clause chunking, anaphora resolution and shallow parsing. This language engine underlies the QA prototype SOCRATES.



