Resolve coreference using Stanford CoreNLP

Coreference resolution is the task of finding all expressions in a text that refer to the same entity. The Stanford CoreNLP coreference resolution system is a state-of-the-art system for resolving coreference in text. To use it, we usually create a pipeline, which requires tokenization, sentence splitting, part-of-speech tagging, lemmatization, named entity recognition, and parsing. Sometimes, however, we use other tools for preprocessing, particularly when working in a specific domain. In these cases, we need a stand-alone coreference resolution system. This post demonstrates how to create such a system using Stanford CoreNLP.

Load properties

In general, we can just create an empty Properties, because the Stanford CoreNLP tool…

How to access file resources in JUnit tests

In Maven, any file under src/test/resources is copied to target/test-classes. How can a JUnit test access these resource files? Use the test class's getResource method, which locates the file on the test classpath (target/test-classes):

URL url = this.getClass().getResource("/" + TEST_FILENAME);
File file = new File(url.getFile());
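
A slightly more defensive variant of the lookup above can be sketched as follows. The class name TestResources and its two methods are hypothetical helpers, not part of JUnit or Maven. The sketch assumes the resource is a plain file on disk (as it is under target/test-classes) and converts the URL through a URI, because url.getFile() leaves characters such as spaces percent-encoded (%20), which can produce a path that does not exist.

```java
import java.io.File;
import java.net.URISyntaxException;
import java.net.URL;
import java.nio.file.Paths;

// Hypothetical helper class; the name and methods are illustrative only.
class TestResources {

    // Looks up fileName at the root of the classpath that loaded `owner`.
    static File find(Class<?> owner, String fileName) {
        URL url = owner.getResource("/" + fileName);
        if (url == null) {
            // getResource returns null when the resource is missing.
            throw new IllegalArgumentException("Not on classpath: " + fileName);
        }
        return toFile(url);
    }

    // Converts a file: URL to a File, decoding escapes such as %20.
    static File toFile(URL url) {
        try {
            return Paths.get(url.toURI()).toFile();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException("Not a valid URI: " + url, e);
        }
    }
}
```

Inside a test, the call would look like File file = TestResources.find(getClass(), TEST_FILENAME);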

Referees’ quotes published by Environmental Microbiology

I recently read a series of referees’ quotes published by Environmental Microbiology. As stated in every article, the referees are busy, serious individuals who give selflessly of their precious time to improve manuscripts. But, once in a while, their humour (or admiration) gets the better of them. Here are some quotes that I like most.

2007

I recommend the authors to get in contact with, e.g. sanitary engineers or fermentation/process engineers and not to try and invent the wheel again.

I only am willing to read this again if it has less than 20 pages (Ed.: the original submission had 54)!

Video recordings of CLSP JHU summer workshop 2014

The Summer Workshop brought together three recurring themes: improved recognition of conversational speech, probabilistic representations of linguistic meaning, and abstract meaning representations for machine translation.

Some highlights

Martha Palmer – Designing Abstract Meaning Representations for Machine Translation
Percy Liang – The State of the Art in Semantic Parsing
Shalom Lappin – A Rich Probabilistic Type Theory for the Semantics of Natural Language
David McAllester – The Problem of Reference
Stephan Oepen – Broad-Coverage Semantic Dependency Parsing
Giorgio Satta – Synchronous Rewriting for Natural Language Processing

How to view differences of branches with meld?

git-meld is a Git command that lets you compare and edit tree-ishes between revisions using meld or any other diff tool that supports directory comparison. git meld is a frontend to git diff and accepts the same options and arguments. It is essentially an extended git-difftool for tools that can compare directories, so Git does not have to call the external tool once for every changed file.

How to Install SmartGit via PPA in Ubuntu

SmartGit is a graphical Git and Mercurial client. It runs on Linux, Mac OS X (10.5 or newer) and Windows (XP or newer).

sudo add-apt-repository ppa:eugenesan/ppa
sudo apt-get update
sudo apt-get install smartgithg

How to use DecimalFormat in Java

The DecimalFormat class in Java is used to format numbers based on the pattern you specify yourself. This post explains how to use the DecimalFormat class to format different types of numbers.
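
As a minimal sketch of pattern-based formatting, the example below wraps java.text.DecimalFormat in a small helper. The class name DecimalFormatDemo and the method format are hypothetical, and the four patterns are chosen only for illustration. The sketch pins the symbols to Locale.US so that the grouping and decimal separators do not depend on the machine's default locale.

```java
import java.text.DecimalFormat;
import java.text.DecimalFormatSymbols;
import java.util.Locale;

// Hypothetical demo class; only DecimalFormat itself comes from the JDK.
class DecimalFormatDemo {

    // Formats value with the given pattern, using US separators (',' and '.').
    static String format(String pattern, double value) {
        DecimalFormatSymbols symbols = DecimalFormatSymbols.getInstance(Locale.US);
        return new DecimalFormat(pattern, symbols).format(value);
    }

    public static void main(String[] args) {
        // '#' is an optional digit, '0' a required digit, ',' the grouping separator.
        System.out.println(format("#,##0.00", 1234567.891)); // 1,234,567.89
        // Up to three fraction digits, rounded (HALF_EVEN by default).
        System.out.println(format("0.###", 3.14159));        // 3.142
        // Leading zeros pad the integer part to three digits.
        System.out.println(format("000.0", 7.5));            // 007.5
        // '%' multiplies by 100 and appends a percent sign.
        System.out.println(format("0.0%", 0.256));           // 25.6%
    }
}
```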

[FWD] Two stories from a research paper: Content Without Context is Meaningless

Two stories from a research paper: Content Without Context is Meaningless.

1.1 Machine Learning Hammer

Mark Twain once said: “To a man with a hammer, everything looks like a nail.” His observation is definitely very relevant to current trends in content analysis. We have a Machine Learning Hammer (ML Hammer) that we want to use for solving any problem that needs to be solved. The problem is neither with learning nor with the hammer; the problem is with people who fail to learn that not every problem is a new learning problem [1]. … If we can identify such a feature set, then we can easily model each object by…