Recommend: Rule-based Information Extraction is Dead! Long Live Rule-based Information Extraction Systems!

By | October 12, 2013

The EMNLP 2013 publications have been released: http://aclweb.org/anthology/D/D13/

On the list, I found a very interesting article, “Rule-based Information Extraction is Dead! Long Live Rule-based Information Extraction Systems!”. It discusses the disconnect between industry and academia: while rule-based IE dominates the commercial world, academia widely regards it as a dead-end technology. The following table summarizes the pros and cons of machine learning and rule-based information extraction technologies (reproduced from the above paper).

Rule-based
  Pros:
  • Declarative
  • Easy to comprehend
  • Easy to maintain
  • Easy to incorporate domain knowledge
  • Easy to trace and fix the cause of errors
  Cons:
  • Heuristic
  • Requires tedious manual labor

ML-based
  Pros:
  • Trainable
  • Adaptable
  • Reduces manual effort
  Cons:
  • Requires labeled data
  • Requires retraining for domain adaptation
  • Requires ML expertise to use or maintain
  • Opaque
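
To make the "declarative" and "easy to trace" entries a bit more concrete, here is a minimal sketch of a rule-based extractor. It is not taken from the paper: the names `RULES`, `extract`, and `sample` are hypothetical, and the two regular expressions are deliberately naive illustrations rather than production-quality rules.

```python
import re

# Hypothetical rule set: each rule is a (label, pattern) pair.
# The rules are plain data, so a domain expert can read and edit them directly.
MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)
RULES = [
    ("DATE", re.compile(r"\b(?:" + MONTHS + r") \d{1,2}, \d{4}\b")),
    ("URL", re.compile(r"https?://\S+")),
]


def extract(text):
    """Return (label, matched_text, rule_index) for every rule match.

    Keeping the rule index with each result makes it possible to trace
    any extraction (or error) back to the rule that produced it.
    """
    results = []
    for index, (label, pattern) in enumerate(RULES):
        for match in pattern.finditer(text):
            results.append((label, match.group(0), index))
    return results


sample = "Posted October 12, 2013: see http://aclweb.org/anthology/D/D13/ for the papers."
for label, value, rule_index in extract(sample):
    print(label, value, "from rule", rule_index)
```

The same sketch also hints at the cons: every new date format or entity type means another hand-written rule, which is the tedious manual labor the table mentions.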
