changeset 13210:f7b10bfa6bbf

fts-lucene: Added missing textcat.conf
author Timo Sirainen <tss@iki.fi>
date Wed, 10 Aug 2011 16:50:19 +0300
parents 1fb6cc545575
children eb6d2fcca15b
files src/plugins/fts-lucene/textcat.conf
diffstat 1 files changed, 25 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/src/plugins/fts-lucene/textcat.conf	Wed Aug 10 16:50:19 2011 +0300
@@ -0,0 +1,25 @@
+#
+# A sample config file for the language models 
+# provided with Gertjan van Noord's language guesser
+# (http://odur.let.rug.nl/~vannoord/TextCat/)
+#
+# Notes: 
+# - You may consider eliminating a couple of small languages from this
+# list because they cause false positives with big languages and are
+# bad for performance. (Do you really want to recognize Drents?)
+# - Putting the most probable languages at the top of the list
+# improves performance, because this will raise the threshold for
+# likely candidates more quickly.
+#
+LM/english.lm			english
+LM/italian.lm			italian
+LM/danish.lm			danish
+LM/dutch.lm			dutch
+LM/finnish.lm			finnish
+LM/french.lm			french
+LM/german.lm			german
+LM/norwegian.lm			norwegian
+LM/portuguese.lm		portuguese
+LM/russian.lm			russian
+LM/spanish.lm			spanish
+LM/swedish.lm			swedish
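
The entries in the file added above appear to follow a simple layout: one language model per line, with the model path and the language name separated by whitespace, and `#` starting a comment line. A minimal sketch of reading that layout, assuming exactly those rules, could look like the following. This is not the parser used by the TextCat library or by fts-lucene; the function name `parse_textcat_conf` and the decision to skip malformed lines are hypothetical choices made for illustration.

```python
from typing import List, Tuple


def parse_textcat_conf(text: str) -> List[Tuple[str, str]]:
    """Return (model_path, language) pairs in the order they appear.

    Assumed format (inferred from the sample file, not from any spec):
    blank lines and lines starting with '#' are ignored; every other
    line holds a model path and a language name separated by whitespace.
    """
    entries: List[Tuple[str, str]] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 2:
            # Assumption: silently skip lines without a language name.
            continue
        entries.append((fields[0], fields[1]))
    return entries


# A shortened sample in the same layout as the committed file.
SAMPLE = """\
# most probable languages first
LM/english.lm\t\t\tenglish
LM/finnish.lm\t\t\tfinnish
"""

if __name__ == "__main__":
    for path, language in parse_textcat_conf(SAMPLE):
        print(f"{language}: {path}")
```

The list preserves file order, which matters here because the notes in the file say that placing the most probable languages first improves performance.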