annotate src/lib-charset/charset-utf8.h @ 4605:e6cb9f75b76a HEAD

Added charset_is_utf8() and charset_to_ucase_utf8_full().
author Timo Sirainen <tss@iki.fi>
date Sat, 16 Sep 2006 16:50:21 +0300
parents 928229f8b3e6
children e5451501ff2f
rev   line source
568
f2aa58c2afd0 SEARCH CHARSET support. Currently we do it through iconv() and only ASCII
Timo Sirainen <tss@iki.fi>
parents:
1 #ifndef __CHARSET_UTF8_H
f2aa58c2afd0 SEARCH CHARSET support. Currently we do it through iconv() and only ASCII
Timo Sirainen <tss@iki.fi>
parents:
2 #define __CHARSET_UTF8_H
f2aa58c2afd0 SEARCH CHARSET support. Currently we do it through iconv() and only ASCII
Timo Sirainen <tss@iki.fi>
parents:
3
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
4 enum charset_result {
765
553f050c8313 Added buffer API. Point is to hide all buffer writing behind this API which
Timo Sirainen <tss@iki.fi>
parents: 608
5 CHARSET_RET_OK = 1,
553f050c8313 Added buffer API. Point is to hide all buffer writing behind this API which
Timo Sirainen <tss@iki.fi>
parents: 608
6 CHARSET_RET_OUTPUT_FULL = 0,
553f050c8313 Added buffer API. Point is to hide all buffer writing behind this API which
Timo Sirainen <tss@iki.fi>
parents: 608
7 CHARSET_RET_INCOMPLETE_INPUT = -1,
553f050c8313 Added buffer API. Point is to hide all buffer writing behind this API which
Timo Sirainen <tss@iki.fi>
parents: 608
8 CHARSET_RET_INVALID_INPUT = -2
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
9 };
608
debb8468514e SEARCH CHARSET now works properly with message bodies, and in general body
Timo Sirainen <tss@iki.fi>
parents: 568
10
debb8468514e SEARCH CHARSET now works properly with message bodies, and in general body
Timo Sirainen <tss@iki.fi>
parents: 568
11 /* Begin translation to UTF-8. */
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
12 struct charset_translation *charset_to_utf8_begin(const char *charset,
3863
55df57c028d4 Added "bool" type and changed all ints that were used as booleans to bool.
Timo Sirainen <tss@iki.fi>
parents: 903
13 bool *unknown_charset);
608
debb8468514e SEARCH CHARSET now works properly with message bodies, and in general body
Timo Sirainen <tss@iki.fi>
parents: 568
14
3879
928229f8b3e6 deinit, unref, destroy, close, free, etc. functions now take a pointer to
Timo Sirainen <tss@iki.fi>
parents: 3863
15 void charset_to_utf8_end(struct charset_translation **t);
608
debb8468514e SEARCH CHARSET now works properly with message bodies, and in general body
Timo Sirainen <tss@iki.fi>
parents: 568
16
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
17 void charset_to_utf8_reset(struct charset_translation *t);
608
debb8468514e SEARCH CHARSET now works properly with message bodies, and in general body
Timo Sirainen <tss@iki.fi>
parents: 568
18
4605
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
19 /* Returns TRUE if charset is UTF-8 or ASCII */
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
20 bool charset_is_utf8(const char *charset);
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
21
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
22 /* Translate src to uppercased UTF-8. src_size is updated to contain the number of
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
23 bytes actually translated from src. Note that the dest buffer is used
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
24 only up to its current size; to grow it automatically, use
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
25 charset_to_ucase_utf8_full(). */
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
26 enum charset_result
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
27 charset_to_ucase_utf8(struct charset_translation *t,
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
28 const unsigned char *src, size_t *src_size,
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
29 buffer_t *dest);
4605
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
30 enum charset_result
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
31 charset_to_ucase_utf8_full(struct charset_translation *t,
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
32 const unsigned char *src, size_t *src_size,
e6cb9f75b76a Added charset_is_utf8() and charset_to_ucase_utf8_full().
Timo Sirainen <tss@iki.fi>
parents: 3879
33 buffer_t *dest);
608
debb8468514e SEARCH CHARSET now works properly with message bodies, and in general body
Timo Sirainen <tss@iki.fi>
parents: 568
34
792
d573c53946ac Full not-too-well-tested support for SORT extension. Required a few
Timo Sirainen <tss@iki.fi>
parents: 785
35 /* Simple wrappers for the above functions. If utf8_size_r is non-NULL, it's set
765
553f050c8313 Added buffer API. Point is to hide all buffer writing behind this API which
Timo Sirainen <tss@iki.fi>
parents: 608
36 to the same value as strlen(returned data). */
608
debb8468514e SEARCH CHARSET now works properly with message bodies, and in general body
Timo Sirainen <tss@iki.fi>
parents: 568
37 const char *
3863
55df57c028d4 Added "bool" type and changed all ints that were used as booleans to bool.
Timo Sirainen <tss@iki.fi>
parents: 903
38 charset_to_utf8_string(const char *charset, bool *unknown_charset,
792
d573c53946ac Full not-too-well-tested support for SORT extension. Required a few
Timo Sirainen <tss@iki.fi>
parents: 785
39 const unsigned char *data, size_t size,
d573c53946ac Full not-too-well-tested support for SORT extension. Required a few
Timo Sirainen <tss@iki.fi>
parents: 785
40 size_t *utf8_size_r);
d573c53946ac Full not-too-well-tested support for SORT extension. Required a few
Timo Sirainen <tss@iki.fi>
parents: 785
41 const char *
3863
55df57c028d4 Added "bool" type and changed all ints that were used as booleans to bool.
Timo Sirainen <tss@iki.fi>
parents: 903
42 charset_to_ucase_utf8_string(const char *charset, bool *unknown_charset,
785
d96cbba73a8b Don't use Buffers with read-only data, just makes it more difficult without
Timo Sirainen <tss@iki.fi>
parents: 766
43 const unsigned char *data, size_t size,
d96cbba73a8b Don't use Buffers with read-only data, just makes it more difficult without
Timo Sirainen <tss@iki.fi>
parents: 766
44 size_t *utf8_size_r);
568
f2aa58c2afd0 SEARCH CHARSET support. Currently we do it through iconv() and only ASCII
Timo Sirainen <tss@iki.fi>
parents:
45
766
03832c7f389b Compiles again without iconv()
Timo Sirainen <tss@iki.fi>
parents: 765
46 void _charset_utf8_ucase(const unsigned char *src, size_t src_size,
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 792
47 buffer_t *dest, size_t destpos);
785
d96cbba73a8b Don't use Buffers with read-only data, just makes it more difficult without
Timo Sirainen <tss@iki.fi>
parents: 766
48 const char *_charset_utf8_ucase_strdup(const unsigned char *data, size_t size,
d96cbba73a8b Don't use Buffers with read-only data, just makes it more difficult without
Timo Sirainen <tss@iki.fi>
parents: 766
49 size_t *utf8_size_r);
766
03832c7f389b Compiles again without iconv()
Timo Sirainen <tss@iki.fi>
parents: 765
50
568
f2aa58c2afd0 SEARCH CHARSET support. Currently we do it through iconv() and only ASCII
Timo Sirainen <tss@iki.fi>
parents:
51 #endif
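
The comment on source lines 22-25 says that charset_to_ucase_utf8() fills dest only up to its current size, and that charset_to_ucase_utf8_full() grows it automatically. The relationship between the two can be sketched roughly as below. This is only an illustration under stated assumptions, not the code from this changeset: struct sketch_buf, sketch_ucase() and sketch_ucase_full() are hypothetical stand-ins for buffer_t and the two declared functions, the doubling growth policy is assumed, and the conversion is reduced to ASCII uppercasing so the example compiles on its own. Only the enum values are copied from the header.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Result codes as declared in the header above. */
enum charset_result {
	CHARSET_RET_OK = 1,
	CHARSET_RET_OUTPUT_FULL = 0,
	CHARSET_RET_INCOMPLETE_INPUT = -1,
	CHARSET_RET_INVALID_INPUT = -2
};

/* Hypothetical stand-in for buffer_t: storage with a fixed capacity. */
struct sketch_buf {
	unsigned char *data;
	size_t used, size;
};

/* Hypothetical stand-in for charset_to_ucase_utf8(): uppercases ASCII input
   into dest without growing it. When it stops early, *src_size is set to the
   number of input bytes consumed so far. */
static enum charset_result
sketch_ucase(const unsigned char *src, size_t *src_size,
	     struct sketch_buf *dest)
{
	size_t i;

	for (i = 0; i < *src_size; i++) {
		if (src[i] >= 0x80) {
			/* non-ASCII is out of scope for this sketch */
			*src_size = i;
			return CHARSET_RET_INVALID_INPUT;
		}
		if (dest->used == dest->size) {
			*src_size = i;
			return CHARSET_RET_OUTPUT_FULL;
		}
		dest->data[dest->used++] = (unsigned char)toupper(src[i]);
	}
	return CHARSET_RET_OK;
}

/* Hypothetical stand-in for charset_to_ucase_utf8_full(): calls the
   non-growing variant and doubles dest each time it reports a full buffer. */
static enum charset_result
sketch_ucase_full(const unsigned char *src, size_t *src_size,
		  struct sketch_buf *dest)
{
	size_t pos = 0, left, new_size;
	unsigned char *p;
	enum charset_result ret;

	for (;;) {
		left = *src_size - pos;
		ret = sketch_ucase(src + pos, &left, dest);
		pos += left;
		if (ret != CHARSET_RET_OUTPUT_FULL)
			break;

		new_size = dest->size == 0 ? 8 : dest->size * 2;
		p = realloc(dest->data, new_size);
		if (p == NULL)
			break; /* out of memory: report the buffer as full */
		dest->data = p;
		dest->size = new_size;
	}
	*src_size = pos;
	return ret;
}
```

A caller of the sketch would start with an empty struct sketch_buf (data NULL, used and size 0), pass the input length in *src_size, and free() the data pointer afterwards. The real API takes a struct charset_translation and a buffer_t instead, so the growth step there goes through the buffer API rather than realloc().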