Of interest from our perspective (although somewhat dated, as it is from 1996) is Tomaz Erjavec's survey “Public Domain Generic Tools: An Overview” ( http://citeseer.nj.nec.com/430552.html)
Paai's text utilities: a set of utilities consisting of Unix scripts and C programs for frequency counts and lexical cohesion. By J.J. Paijmans ( http://pi0959.kub.nl:2080/Paai/Publiek). Last additions: 23 December 2000.
tea (a KWIC, i.e. KeyWord In Context, tool), by Masao Utiyama (mutiyama@crl.go.jp), latest version May 2002 ( http://www2.crl.go.jp/jt/a132/members/mutiyama/software.html ). It displays keywords along with their contexts. tea allows you to search multiple text files, list search words in a tree structure, sort retrieved contexts in various ways, etc.
textseg ( http://www2.crl.go.jp/jt/a132/members/mutiyama/software.html )
openNLP ( http://opennlp.sourceforge.net/)
GATE (General Architecture for Text Engineering, http://gate.ac.uk/)
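
To make the keyword-in-context (KWIC) display mentioned for tea more concrete, here is a minimal Python sketch of the general idea: find each occurrence of a keyword, keep a fixed window of text on either side, and optionally sort the hits by the context that follows. This is not tea's implementation or interface; the function names (kwic, sort_by_right_context, format_kwic) and the character-based window are assumptions made purely for illustration.

```python
import re


def kwic(text, keyword, width=20):
    """Return (left, keyword, right) tuples, one per occurrence of `keyword`.

    `width` is the number of context characters kept on each side
    (an illustrative choice; real tools often use word-based windows).
    """
    # Collapse runs of whitespace so line breaks do not split a context.
    text = " ".join(text.split())
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    rows = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        rows.append((left, match.group(0), right))
    return rows


def sort_by_right_context(rows):
    """Order the hits alphabetically by the text that follows the keyword."""
    return sorted(rows, key=lambda row: row[2].lower())


def format_kwic(rows, width=20):
    """Align the keyword in a central column, concordance style."""
    return [f"{left:>{width}} [{kw}] {right:<{width}}" for left, kw, right in rows]


if __name__ == "__main__":
    sample = "the cat sat on the mat. The dog saw the cat run."
    for line in format_kwic(sort_by_right_context(kwic(sample, "cat", width=10)), width=10):
        print(line)
```

Searching several files, as tea is described as doing, would amount to calling kwic on each file's contents and merging the resulting rows before sorting.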