log src/lib-fts/fts-tokenizer-generic.c @ 22656:1789bf2a1e01

age author description
Wed, 11 Jan 2017 02:51:13 +0100 Stephan Bosch Updated copyright notices to include the year 2017.
Tue, 15 Mar 2016 10:47:20 +0200 Teemu Huovila lib-fts: Lift helper function out of generic tokenizer.
Wed, 13 Jan 2016 12:24:03 +0200 Timo Sirainen global: freshen copyright
Tue, 17 Nov 2015 11:45:24 +0200 Teemu Huovila lib-fts: Removed TODO comment.
Tue, 17 Nov 2015 11:44:53 +0200 Teemu Huovila lib-fts: Minor code cleanup - Rename some internal functions.
Mon, 17 Aug 2015 13:18:03 +0300 Teemu Huovila lib-fts: Add Unicode TR29 rule WB5a setting to tokenizer.
Mon, 17 Aug 2015 13:14:44 +0300 Teemu Huovila lib-fts: Update comment on tr29 rules.
Wed, 03 Jun 2015 01:04:07 +0300 Timo Sirainen lib-fts: Use UTF8_IS_START_SEQ()
Wed, 03 Jun 2015 00:39:11 +0300 Timo Sirainen lib-fts: Moved IS_APOSTROPHE() to fts-common.h
Tue, 02 Jun 2015 22:01:07 +0300 Timo Sirainen lib-fts: Optimized truncation of partial trailing UTF-8 characters in tokenizers.
Tue, 02 Jun 2015 20:50:23 +0300 Timo Sirainen lib-fts: Fixed tr29 tokenizer to delete last character correctly when it's preceded by non-ASCII
Mon, 01 Jun 2015 22:14:19 +0300 Timo Sirainen lib-fts: Use new uni_utf8_get_char*() interface
Mon, 01 Jun 2015 21:58:30 +0300 Timo Sirainen lib-fts: tokenizers - Fixed removal of trailing character in truncated tokens.
Mon, 01 Jun 2015 21:51:33 +0300 Timo Sirainen lib-fts: Optimize tokenizers - Rewrite of apostrophe handling.
Mon, 01 Jun 2015 21:49:18 +0300 Timo Sirainen lib-fts: tr29 tokenizer - rename variable in preparation for the next patch
Mon, 01 Jun 2015 21:48:59 +0300 Timo Sirainen lib-fts: tokenizers - don't include removed apostrophes as part of the token size
Mon, 01 Jun 2015 21:35:39 +0300 Timo Sirainen lib-fts: simple tokenizer minor cleanup - removed unnecessary token length > 0 check
Mon, 01 Jun 2015 21:33:11 +0300 Timo Sirainen lib-fts: tr29 tokenizer cleanup - Avoid unnecessary goto.
Mon, 01 Jun 2015 21:28:42 +0300 Timo Sirainen lib-fts: simple tokenizer optimization - don't check unicode word breaks for ASCII chars.
Mon, 01 Jun 2015 21:27:09 +0300 Timo Sirainen lib-fts: simple tokenizer cleanup - make prev_letter updating more explicit.
Mon, 01 Jun 2015 21:19:47 +0300 Timo Sirainen lib-fts: simple tokenizer cleanup - removed unnecessary variables
Mon, 01 Jun 2015 21:16:35 +0300 Timo Sirainen lib-fts: tr29 cleanup - consistently call valid chars "token" and "non-token" chars.
Mon, 01 Jun 2015 21:11:55 +0300 Timo Sirainen lib-fts: tr29 cleanup - Avoid i++ in the for loop to avoid extra calculations
Mon, 01 Jun 2015 21:10:11 +0300 Timo Sirainen lib-fts: tr29 cleanup - token can never be empty by the time it's being returned.
Mon, 01 Jun 2015 21:08:27 +0300 Timo Sirainen lib-fts: Optimization for tr29 - we don't need to track last_size explicitly
Mon, 01 Jun 2015 18:35:58 +0300 Teemu Huovila lib-fts: Correct internal helper function for tr29.
Mon, 01 Jun 2015 18:35:58 +0300 Teemu Huovila lib-fts: Change TR29 tokenizer to break at full stop (and others).
Thu, 21 May 2015 06:35:59 -0400 Timo Sirainen lib-fts: Fixed handling tokens that contain only apostrophes
Thu, 21 May 2015 06:29:15 -0400 Teemu Huovila lib-fts: Fix simple tokenizer apostrophe handling.
Thu, 21 May 2015 06:17:32 -0400 Teemu Huovila lib-fts: Fix tr29 tokenizer apostrophe handling.
Mon, 11 May 2015 14:42:18 +0300 Timo Sirainen lib-fts: Fixed assert-crash in fts-tokenizer-generic
Sat, 09 May 2015 19:26:01 +0300 Timo Sirainen lib-fts: Changed fts_tokenizer_next/final() to return error string.
Sat, 09 May 2015 19:21:45 +0300 Timo Sirainen lib-fts: Minor code cleanup - avoid functions always returning same value
Sat, 09 May 2015 17:34:59 +0300 Timo Sirainen lib-fts: Fixed token truncation.
Sat, 09 May 2015 15:06:45 +0300 Timo Sirainen lib-fts: fts-tokenizer-generic-private.h had content that didn't really belong there.
Sat, 09 May 2015 14:56:33 +0300 Timo Sirainen lib-fts: Removed tokenizer name macros from fts-tokenizer.h
Sat, 09 May 2015 13:52:37 +0300 Timo Sirainen lib-fts: Added fts_tokenizer_reset()
Sat, 09 May 2015 13:31:14 +0300 Timo Sirainen fts: When tokenizing a search word, give "search" parameter to all the tokenizers.
Sat, 09 May 2015 13:20:29 +0300 Timo Sirainen lib-fts: Use case-sensitive settings comparisons in fts-tokenizer
Sat, 09 May 2015 11:17:03 +0300 Teemu Huovila lib-fts: Improve using max_length in tr29 tokenizer
Sat, 09 May 2015 11:16:22 +0300 Teemu Huovila lib-fts: Fixed using max_length setting in simple tokenizer
Sat, 09 May 2015 11:15:50 +0300 Teemu Huovila lib-fts: Default to simple tokenizer algorithm
Sat, 09 May 2015 11:05:04 +0300 Teemu Huovila fts: Change tokenizer API to be able to return errors
Mon, 20 Apr 2015 16:19:07 +0300 Timo Sirainen Initial import for lib-fts.