Category Archives: toolkits

How to use Whatizit Web Services in Java

Whatizit is a text processing system that lets you run text-mining tasks on text. It is also available as a Web Service, whose underlying idea is to ensure that software from various sources works well together. Whatizit is built on the open standards Simple Object Access Protocol (SOAP) and Web Services Description Language (WSDL). For the transport layer itself, Web Services can use most of the commonly available network protocols, especially Hypertext Transfer Protocol (HTTP). For more information on WSDL, please refer to the W3C WSDL v1.1 Document.

How to convert sourcecode to HTML, RTF, SVG, etc.

I am working on a paper that needs a piece of XML to be syntax-highlighted. I’ve found that Sublime with the “Copy as RTF” plugin is useful, but as a programmer I prefer something that can be done from the command line and, more importantly, is easily customizable.

So I did some searching and came across highlight. Installing it on Ubuntu is quite simple:

sudo apt-get install highlight

Then I can use highlight to convert the XML file to RTF and copy it into the paper I am working on.

Resolve coreference using Stanford CoreNLP

Coreference resolution is the task of finding all expressions that refer to the same entity in a text. The Stanford CoreNLP coreference resolution system is a state-of-the-art system for resolving coreference in text. To use it, we usually create a pipeline, which requires tokenization, sentence splitting, part-of-speech tagging, lemmatization, named entity recognition, and parsing. Sometimes, however, we use other tools for preprocessing, particularly when we are working in a specific domain. In these cases, we need a stand-alone coreference resolution system. This post demonstrates how to create such a system using Stanford CoreNLP.
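For reference, the usual full pipeline mentioned above is normally configured through an "annotators" property. The sketch below only builds that property with the JDK, so it runs without CoreNLP. The class name CorefPipelineProps is made up for illustration, and the annotator name "dcoref" assumes the deterministic coreference annotator (newer CoreNLP releases also offer "coref").

```java
import java.util.Properties;

final class CorefPipelineProps {

  /** Builds the annotator list for the usual full pipeline ending in coreference. */
  static Properties fullPipeline() {
    Properties props = new Properties();
    // Assumption: "dcoref" is the coreference annotator in use.
    props.setProperty("annotators", "tokenize, ssplit, pos, lemma, ner, parse, dcoref");
    return props;
  }

  public static void main(String[] args) {
    Properties props = fullPipeline();
    // With CoreNLP on the classpath, these properties would typically be
    // passed to new StanfordCoreNLP(props); that call is omitted here so
    // the sketch stays stdlib-only.
    System.out.println(props.getProperty("annotators"));
  }
}
```

Every annotator in that list has to run before coreference, which is exactly the cost a stand-alone setup tries to avoid when preprocessing comes from other tools.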

Load properties

In general, we can just create an empty Properties, because the Stanford CoreNLP tool can automatically load the default one in the model jar file, which is under edu.stanford.nlp.pipeline.

In other cases, we would like to use specific properties. The following code shows one example of loading the property file from the working directory.

import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.util.Properties;
import edu.stanford.nlp.io.IOUtils;

private static final String PROPS_SUFFIX = ".properties";

private Properties loadProperties(String name) {
  return loadProperties(name, Thread.currentThread().getContextClassLoader());
}

private Properties loadProperties(String name, ClassLoader loader) {
  // Normalize "a.b" or "a.b.properties" to the resource path "a/b.properties"
  if (name.endsWith(PROPS_SUFFIX)) {
    name = name.substring(0, name.length() - PROPS_SUFFIX.length());
  }
  name = name.replace('.', '/');
  name += PROPS_SUFFIX;
  Properties result = null; // Returns null on lookup failures
  System.err.println("Searching for resource: " + name);
  InputStream in = loader.getResourceAsStream(name);
  try {
    if (in != null) {
      InputStreamReader reader = new InputStreamReader(in, "utf-8");
      result = new Properties();
      result.load(reader); // Can throw IOException
    }
  } catch (IOException e) {
    result = null;
  } finally {
    IOUtils.closeIgnoringExceptions(in);
  }
  return result;
}
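Note that the loader above resolves the name through a class loader, so the file has to be reachable on the classpath. If the file should instead be read by path (relative paths are resolved against the working directory), a minimal sketch could look like the following. The class and method names here are hypothetical, not part of CoreNLP, and the null-on-failure behavior simply mirrors the snippet above.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Properties;

final class FilePropertiesLoader {

  /** Loads a UTF-8 properties file from the file system; returns null on failure. */
  static Properties loadFromFile(Path file) {
    try (Reader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
      Properties result = new Properties();
      result.load(reader); // Can throw IOException
      return result;
    } catch (IOException e) {
      return null; // Missing or unreadable file
    }
  }

  public static void main(String[] args) throws IOException {
    // Write a small temporary file so the example is runnable on its own.
    Path file = Files.createTempFile("coref-demo", ".properties");
    Files.write(file, "annotators = tokenize, ssplit\n".getBytes(StandardCharsets.UTF_8));
    Properties props = loadFromFile(file);
    System.out.println(props.getProperty("annotators"));
    Files.delete(file);
  }
}
```

The try-with-resources block closes the reader automatically, so no helper such as IOUtils.closeIgnoringExceptions is needed in this variant.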

How to view differences of branches with meld?

git-meld is a git command that lets you compare and edit tree-ishes between revisions using meld or any other diff tool that supports directory comparison. git meld is a frontend to git diff and accepts the same options and arguments.

It is essentially an extended git-difftool for tools that support comparing directories, rather than having git call the external tool once for every file that has changed.

Hosting a Maven repository on github (with sources and javadoc)

How do you make a small open-source library available to other developers via Maven? One way is to deploy it to the Maven Central Repository. What I’d like to do instead is deploy it to GitHub, so I can modify it freely. This post explains how to do that.

The typical way I deploy artifacts to GitHub is to use mvn deploy. Here are the steps:

  • Use site-maven-plugin to push the artifacts to github
  • Use maven-javadoc-plugin to push the javadoc
  • Use maven-source-plugin to push the source
  • Configure maven to use the remote mvn-repo as a maven repository

Configure maven-deploy-plugin

First, I add the following snippet to tell Maven to deploy artifacts to a temporary location inside my target directory:

<distributionManagement>
  <repository>
    <id>internal.repo</id>
    <name>Temporary Staging Repository</name>
    <url>file://${}/mvn-repo</url>
  </repository>
</distributionManagement>

<plugins>
  <plugin>
    <artifactId>maven-deploy-plugin</artifactId>
    <version>2.8.1</version>
    <configuration>
      <altDeploymentRepository>internal.repo::default::file://${}/mvn-repo</altDeploymentRepository>
    </configuration>
  </plugin>
</plugins>

Mockito 101

Mockito is a mocking framework that lets you write beautiful tests with a clean and simple API. It biases toward minimal specifications, makes different behaviors look different, and displays clear error messages.

Creating Mocks

To create a mock using Mockito, simply annotate mocks with @Mock and call MockitoAnnotations.initMocks(this).

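import org.junit.Before;
import org.mockito.Mock;
import org.mockito.MockitoAnnotations;

public class FooClassTest {

  @Mock
  private Foo mockFoo; // Foo is the (assumed) type being mocked

  @Before
  public void setUp() {
    MockitoAnnotations.initMocks(this); // creates every @Mock field
  }
}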

Stubbing values

Stubbed values can simulate the behavior of existing code or serve as a temporary substitute for yet-to-be-developed code. By default, for every method that returns a value, a mock returns null, an empty collection, or the appropriate primitive/primitive wrapper value (e.g., 0, false, …). You can override the stubbed values as shown below. Once stubbed, the method always returns the stubbed value, regardless of how many times it is called. A method with a void return type usually does not need to be stubbed.

import static org.mockito.Mockito.doThrow;
import static org.mockito.Mockito.when;
...
// a method that returns values
when(mockFoo.someCall()).thenReturn(someValue);
when(mockFoo.someCall()).thenThrow(new FooException());

// a method with a void return
doThrow(new FooException()).when(mockFoo).voidMethodThatThrows();
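To make the default-versus-stubbed behavior concrete without requiring Mockito on the classpath, here is a small illustration built on the JDK's dynamic proxies. This is not how Mockito is implemented and it is not Mockito's API; it is a hand-rolled stand-in, and the names TinyStubDemo, Foo, stub, and defaultFor are invented for this sketch. It also skips Mockito's empty-collection defaults and only covers null and primitive values.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

final class TinyStubDemo {

  /** Hypothetical collaborator interface, used only for this illustration. */
  interface Foo {
    String name();
    int count();
    boolean ready();
  }

  /** Default value returned for an unstubbed method with the given return type. */
  static Object defaultFor(Class<?> type) {
    if (!type.isPrimitive() || type == void.class) return null;
    if (type == boolean.class) return false;
    if (type == char.class) return '\0';
    if (type == byte.class) return (byte) 0;
    if (type == short.class) return (short) 0;
    if (type == int.class) return 0;
    if (type == long.class) return 0L;
    if (type == float.class) return 0f;
    return 0d; // double
  }

  /** Creates a proxy that returns stubbed values by method name, defaults otherwise. */
  static <T> T stub(Class<T> iface, Map<String, Object> stubbed) {
    InvocationHandler handler = (proxy, method, args) ->
        stubbed.containsKey(method.getName())
            ? stubbed.get(method.getName())
            : defaultFor(method.getReturnType());
    Object instance =
        Proxy.newProxyInstance(iface.getClassLoader(), new Class<?>[] {iface}, handler);
    return iface.cast(instance);
  }

  public static void main(String[] args) {
    Map<String, Object> stubbed = new HashMap<>();
    stubbed.put("name", "stubbed"); // only name() is stubbed
    Foo foo = stub(Foo.class, stubbed);
    System.out.println(foo.name());  // stubbed value, on every call
    System.out.println(foo.count()); // unstubbed int method: default value
    System.out.println(foo.ready()); // unstubbed boolean method: default value
  }
}
```

The stubbed value comes back on every call, while unstubbed methods fall through to the type-based default, which is the same contract described above for Mockito mocks.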

Install brat on Apache2

Install brat

Download, unzip, and run ./install.sh.

Change the webapp location in Apache2

  1. In /etc/apache2/sites-available, edit the default file and add Alias /brat "/home/brat"
  2. Reload apache2:
sudo service apache2 reload