annotate src/lib-charset/charset-utf8.h @ 4605:e6cb9f75b76a HEAD
Added charset_is_utf8() and charset_to_ucase_utf8_full().
author | Timo Sirainen <tss@iki.fi> |
---|---|
date | Sat, 16 Sep 2006 16:50:21 +0300 |
parents | 928229f8b3e6 |
children | e5451501ff2f |
rev | line source |
---|---|
568 | 1 #ifndef __CHARSET_UTF8_H |
568 | 2 #define __CHARSET_UTF8_H |
568 | 3 |
903 | 4 enum charset_result { |
765 | 5 CHARSET_RET_OK = 1, |
765 | 6 CHARSET_RET_OUTPUT_FULL = 0, |
765 | 7 CHARSET_RET_INCOMPLETE_INPUT = -1, |
765 | 8 CHARSET_RET_INVALID_INPUT = -2 |
903 | 9 }; |
608 | 10 |
608 | 11 /* Begin translation to UTF-8. */ |
903 | 12 struct charset_translation *charset_to_utf8_begin(const char *charset, |
3863 | 13 bool *unknown_charset); |
608 | 14 |
3879 | 15 void charset_to_utf8_end(struct charset_translation **t); |
608 | 16 |
903 | 17 void charset_to_utf8_reset(struct charset_translation *t); |
608 | 18 |
4605 | 19 /* Returns TRUE if charset is UTF-8 or ASCII */ |
4605 | 20 bool charset_is_utf8(const char *charset); |
4605 | 21 |
4605 | 22 /* Translate src to UTF-8. src_size is updated to contain the number of |
4605 | 23 characters actually translated from src. Note that dest buffer is used |
4605 | 24 only up to its current size, for growing it automatically use |
4605 | 25 charset_to_ucase_utf8_full(). */ |
903 | 26 enum charset_result |
903 | 27 charset_to_ucase_utf8(struct charset_translation *t, |
903 | 28 const unsigned char *src, size_t *src_size, |
903 | 29 buffer_t *dest); |
4605 | 30 enum charset_result |
4605 | 31 charset_to_ucase_utf8_full(struct charset_translation *t, |
4605 | 32 const unsigned char *src, size_t *src_size, |
4605 | 33 buffer_t *dest); |
608 | 34 |
792 | 35 /* Simple wrappers for above functions. If utf8_size is non-NULL, it's set |
765 | 36 to same as strlen(returned data). */ |
608 | 37 const char * |
3863 | 38 charset_to_utf8_string(const char *charset, bool *unknown_charset, |
792 | 39 const unsigned char *data, size_t size, |
792 | 40 size_t *utf8_size_r); |
792 | 41 const char * |
3863 | 42 charset_to_ucase_utf8_string(const char *charset, bool *unknown_charset, |
785 | 43 const unsigned char *data, size_t size, |
785 | 44 size_t *utf8_size_r); |
568 | 45 |
766 | 46 void _charset_utf8_ucase(const unsigned char *src, size_t src_size, |
903 | 47 buffer_t *dest, size_t destpos); |
785 | 48 const char *_charset_utf8_ucase_strdup(const unsigned char *data, size_t size, |
785 | 49 size_t *utf8_size_r); |
766 | 50 |
568 | 51 #endif |
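
The comment on charset_to_ucase_utf8() describes a calling convention that is easy to misread: src_size is an in/out parameter, and the destination is filled only up to its current size, with CHARSET_RET_OUTPUT_FULL reported when it runs out. The following is a minimal sketch of a caller handling that contract. It is not Dovecot code: `struct demo_buf`, `demo_to_ucase()` and `demo_upcase_all()` are hypothetical stand-ins invented for illustration, the stand-in translator handles ASCII input only, and only the enum values are taken from the header.

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Same values as the enum in charset-utf8.h. */
enum charset_result {
	CHARSET_RET_OK = 1,
	CHARSET_RET_OUTPUT_FULL = 0,
	CHARSET_RET_INCOMPLETE_INPUT = -1,
	CHARSET_RET_INVALID_INPUT = -2
};

/* Hypothetical stand-in for buffer_t: a small fixed-capacity buffer. */
struct demo_buf {
	unsigned char data[8];
	size_t used;
};

/* Hypothetical stand-in for charset_to_ucase_utf8(), ASCII input only.
   On return *src_size holds the number of bytes consumed from src. */
static enum charset_result
demo_to_ucase(const unsigned char *src, size_t *src_size,
	      struct demo_buf *dest)
{
	size_t i;

	for (i = 0; i < *src_size; i++) {
		if (src[i] >= 0x80) {
			/* non-ASCII byte: this sketch treats it as invalid */
			*src_size = i;
			return CHARSET_RET_INVALID_INPUT;
		}
		if (dest->used == sizeof(dest->data)) {
			/* destination is used only up to its current size */
			*src_size = i;
			return CHARSET_RET_OUTPUT_FULL;
		}
		dest->data[dest->used++] = (unsigned char)toupper(src[i]);
	}
	return CHARSET_RET_OK;
}

/* Uppercase all of src into out (capacity out_size, NUL-terminated),
   draining the small buffer whenever the translator reports it is full.
   Returns 0 on success, -1 on invalid input or if out is too small. */
static int demo_upcase_all(const char *src, char *out, size_t out_size)
{
	const unsigned char *p = (const unsigned char *)src;
	size_t left = strlen(src), out_used = 0;
	struct demo_buf buf = { {0}, 0 };
	enum charset_result ret;

	do {
		size_t n = left;

		ret = demo_to_ucase(p, &n, &buf);
		if (ret < 0)
			return -1;
		/* n is now the number of input bytes actually consumed */
		p += n;
		left -= n;
		if (out_used + buf.used >= out_size)
			return -1;
		memcpy(out + out_used, buf.data, buf.used);
		out_used += buf.used;
		buf.used = 0;
	} while (ret == CHARSET_RET_OUTPUT_FULL);

	out[out_used] = '\0';
	return 0;
}
```

Because CHARSET_RET_OUTPUT_FULL is 0 and the error codes are negative, a caller can test `ret < 0` for failure and loop while the result is CHARSET_RET_OUTPUT_FULL. According to the header comment, charset_to_ucase_utf8_full() exists so that callers who want the destination to grow automatically do not need a loop like this one.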