Monthly Archives: September 2013

Split a string into pairs of words

From Stack Overflow:

Question

Given a string such as “aa bb cc dd ee ff”, is there a regex that works with String.split() to extract two words at a time? The expected result is:

[aa bb, cc dd, ee ff]

Note: This question is about the split regex. It is not about “finding a work-around” or other “making it work another way” solutions.

Solution

The regular expression is (?<!\G\w+)\s, written as "(?<!\\G\\w+)\\s" in a Java string literal. Here is an explanation of the regex:
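One way to read the pattern, offered as a sketch rather than the original answer's wording: \G matches where the previous match ended (or at the start of the input), so the negative lookbehind rejects any space that directly follows a word beginning at that position. Only every second space is then left as a split point. The class name PairSplit and the method splitPairs are hypothetical names chosen for illustration. The pattern also appears to rely on Java's regex engine tolerating \w+ inside a lookbehind in this particular position, so it should not be assumed to work in other regex engines.

```java
import java.util.Arrays;

public class PairSplit {
    // \G   : position where the previous match ended (start of input at first)
    // \w+  : one word beginning at that position
    // (?<!...)\s : a whitespace character NOT preceded by such a word,
    //              i.e. every second space in a space-separated list.
    static final String PAIR_SEPARATOR = "(?<!\\G\\w+)\\s";

    // Hypothetical helper: split a space-separated string into word pairs.
    public static String[] splitPairs(String text) {
        return text.split(PAIR_SEPARATOR);
    }

    public static void main(String[] args) {
        String[] pairs = splitPairs("aa bb cc dd ee ff");
        System.out.println(Arrays.toString(pairs));
    }
}
```

With the input from the question, this should print the expected result given above, [aa bb, cc dd, ee ff].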

read more

TikZ example – SVM trained with samples from two classes

In machine learning, Support Vector Machines are supervised learning models used for classification and regression analysis. The basic SVM takes a set of input data and predicts, for each given input, which of two possible classes forms the output, making it a non-probabilistic binary linear classifier. To classify examples, we choose the hyperplane so that the distance from it to the nearest data point on each side is maximized. If such a hyperplane exists, it is known as the maximum-margin hyperplane and the linear classifier it defines is known as a maximum margin classifier.
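The maximum-margin idea can be written out as a short worked formulation. This is a sketch in the standard hard-margin notation, which is an assumption and not taken from the original post: samples x_i with labels y_i in {-1, +1}, normal vector w, and offset b.

```latex
% Assumed notation (not from the original post): samples (x_i, y_i),
% labels y_i in {-1, +1}, normal vector w, offset b. Requires amsmath.
\begin{align*}
  \text{separating hyperplane:} \quad & \mathbf{w}\cdot\mathbf{x} + b = 0 \\
  \text{margin constraints:} \quad & y_i\,(\mathbf{w}\cdot\mathbf{x}_i + b) \ge 1,
      \quad i = 1,\dots,n \\
  \text{margin width:} \quad & \frac{2}{\lVert\mathbf{w}\rVert} \\
  \text{maximum-margin problem:} \quad & \min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert\mathbf{w}\rVert^{2}
      \quad \text{subject to the constraints above}
\end{align*}
```

Maximizing the margin width 2/||w|| is equivalent to minimizing ||w||^2 / 2, which is why the problem is usually stated in the latter form.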

Yet another way to use Chinese characters in LaTeX

First, install texlive-lang-cjk, texlive-lang-chinese, or texlive-lang-all. Then, in the .tex file, add:

\usepackage[T1]{fontenc}
\usepackage{CJKutf8}
\newenvironment{SChinese}{%
  \CJKfamily{gbsn}%
  \CJKtilde
  \CJKnospace}{}

Whenever a Chinese character is needed, use

\begin{SChinese}凡\end{SChinese}

Other examples like underline, underdot, etc., can be found in

/usr/share/texmf/doc/latex/latex-cjk/examples

Change gedit embedded terminal colors

For Ubuntu only:

  1. install dconf-tools (which provides dconf-editor) and gconf-editor
  2. in gconf-editor, navigate to apps → gnome-terminal → profiles → Default
  3. in dconf-editor, navigate to org → gnome → gedit → plugins → terminal
  4. uncheck “use-theme-colors”
  5. copy the values of “background-color”, “foreground-color”, and “palette” from gconf-editor to dconf-editor

Install Gnuplot 4.6 with PDF on Ubuntu

Installing gnuplot manually on Ubuntu can be tricky, especially if you want to output plots in PDF, JPEG, or PNG format. This short introductory article describes one way to build gnuplot with PDF support on Ubuntu.

  1. download pdflib-light and extract to $PDFLIB
  2. compile and install pdflib-light

    cd $PDFLIB
    ./configure  
    make  
    sudo make install
    
  3. refresh lib cache: sudo ldconfig

  4. download gnuplot and extract to $GNUPLOT

  5. compile and install gnuplot

    cd $GNUPLOT  
    ./configure --with-pdf  
    make  
    sudo make install  
    

Note: other packages which can be installed via apt-get

read more

The best way to place figures side-by-side in LaTeX

There are several ways to place figures side by side in LaTeX: subcaption, subfig, subfigure, or even minipage. This post explains which one is the best.

subcaption

A useful extension is the subcaption package (the subfigure and subfig packages are deprecated and shouldn’t be used any more), which uses subfloats within a single float. This gives the author the ability to have subfigures within figures, or subtables within table floats. Subfloats have their own caption, and an optional global caption. An example will best illustrate the usage of this package:

\usepackage{subcaption} 
... 
\begin{figure}
  \begin{subfigure}[b]{0.4\textwidth}
    \includegraphics[width=\textwidth]{1.png}
    \caption{Picture 1}
    \label{fig:1}
  \end{subfigure}
  %
  \begin{subfigure}[b]{0.4\textwidth}
    \includegraphics[width=\textwidth]{2.png}
    \caption{Picture 2}
    \label{fig:2}
  \end{subfigure}
\end{figure}

minipage

The minipage environment can also be used to place figures side by side. However, it is not a floating environment, so it has to be placed inside a figure environment. Another disadvantage of minipage is that it does not align figures. Therefore, subcaption is still the best package to use.

\begin{figure}
  \begin{minipage}[b]{0.4\textwidth}
    \includegraphics[width=\textwidth]{1.png}
    \caption{Picture 1}
    \label{fig:1}
  \end{minipage}
  %
  \begin{minipage}[b]{0.4\textwidth}
    \includegraphics[width=\textwidth]{2.png}
    \caption{Picture 2}
    \label{fig:2}
  \end{minipage}
\end{figure}

Best Markdown Editors for Windows, Linux, and the web

Markdown is a lightweight markup language, allowing people “to write using an easy-to-read, easy-to-write plain text format, then convert it to structurally valid XHTML (or HTML)”. Daring Fireball provides an excellent Markdown syntax guide. Sites such as GitHub, reddit, Diaspora, Stack Overflow, OpenStreetMap, and SourceForge use Markdown to facilitate discussion between users. GitHub uses “GitHub Flavored Markdown” (GFM) for messages, issues, and comments. It differs from standard Markdown (SM) in a few significant ways and adds some extra functionality.

An Alternative Guide to Functional Programming (4)

The following part is no longer maintained. Please go to 函数式程序设计的另类指南 (An Alternative Guide to Functional Programming) for the whole translation.

Original article: Functional Programming For The Rest of Us
Original author: Vyacheslav Akhmechet

Functional Programming

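Functional programming is an implementation of Alonzo Church's ideas. Not all of the lambda calculus has been implemented, however, because the lambda calculus was not originally designed for computers with physical limitations. Therefore, like object-oriented programming, functional programming is a set of ideas rather than a strict manual. There are many functional programming languages today, and each takes a different approach. In this article, I will write functional programs in Java and explain the common features of functional languages (indeed, if you are feeling masochistic, you can write functional programs in Java). In the next few sections, I will modify Java slightly to turn it into a usable functional programming language. Let's begin.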

Book review: Introduction to Machine Learning (2ed)

Introduction to Machine Learning (2ed), by Ethem Alpaydin, MIT Press, 2010. ISBN 0-262-01243-X.

This book gives students, researchers, and developers a comprehensive introduction to machine learning techniques. It is structured primarily as a coursebook and is a valuable textbook for graduate and undergraduate teaching. It is also a good resource for self-study by researchers and developers, provided they are familiar with AI and advanced mathematics.

The book begins with an introductory chapter, followed by 18 chapters plus an appendix. Each chapter presents a stand-alone topic, beginning with a brief introduction and ending with notes, so readers can quickly get an overview of the topic and see possible directions for further development in the subject area. The book covers a variety of machine learning techniques: supervised and unsupervised learning, and parametric and nonparametric methods. These are followed by methods for assessing and comparing classification algorithms, combining multiple learners, and reinforcement learning.
