annotate src/lib-mail/message-parser.h @ 1322:97f8c00b8d4c HEAD

Better handling for multiline headers. Previously we skipped headers larger than the input buffer size (8k with read, the default; 256k with mmap), and the skipping was also a bit buggy. Now we parse the lines one at a time. There's also a way to read the header fully into memory before parsing it, if really needed.
author Timo Sirainen <tss@iki.fi>
date Wed, 26 Mar 2003 19:29:01 +0200
parents 60646878858e
children 676995d7c0ca
rev   line source
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
1 #ifndef __MESSAGE_PARSER_H
2 #define __MESSAGE_PARSER_H
3
988
8028c4dcf38f mail-storage.h interface changes, affects pretty much everything.
Timo Sirainen <tss@iki.fi>
parents: 953
diff changeset
4 #include "message-size.h"
5
896
21ffcce83c70 Rewrote rfc822-tokenize.c to work one token at a time so it won't uselessly
Timo Sirainen <tss@iki.fi>
parents: 764
diff changeset
6 #define IS_LWSP(c) \
7 ((c) == ' ' || (c) == '\t')
8
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 898
diff changeset
9 enum message_part_flags {
106
5fe3e04ca8d9 Added support for caching of MessagePart data. This is useful for fetching
Timo Sirainen <tss@iki.fi>
parents: 105
diff changeset
10 MESSAGE_PART_FLAG_MULTIPART = 0x01,
11 MESSAGE_PART_FLAG_MULTIPART_DIGEST = 0x02,
12 MESSAGE_PART_FLAG_MESSAGE_RFC822 = 0x04,
13
14 /* content-type: text/... */
15 MESSAGE_PART_FLAG_TEXT = 0x08,
16
17 /* content-transfer-encoding: binary */
18 MESSAGE_PART_FLAG_BINARY = 0x10
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 898
diff changeset
19 };
106
5fe3e04ca8d9 Added support for caching of MessagePart data. This is useful for fetching
Timo Sirainen <tss@iki.fi>
parents: 105
diff changeset
20
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 898
diff changeset
21 struct message_part {
22 struct message_part *parent;
23 struct message_part *next;
24 struct message_part *children;
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
25
105
31034993473c there was no need for MessagePart->pos.virtual_pos, so removed it.
Timo Sirainen <tss@iki.fi>
parents: 50
diff changeset
26 uoff_t physical_pos; /* absolute position from beginning of message */
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 898
diff changeset
27 struct message_size header_size;
28 struct message_size body_size;
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
29
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 898
diff changeset
30 enum message_part_flags flags;
10
82b7de533f98 s/user_data/context/ and some s/Data/Context/
Timo Sirainen <tss@iki.fi>
parents: 9
diff changeset
31 void *context;
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
32 };
33
1322
97f8c00b8d4c Better handling for multiline headers. Before we skipped headers larger than
Timo Sirainen <tss@iki.fi>
parents: 1038
diff changeset
34 struct message_header_parser_ctx;
35
36 struct message_header_line {
37 const char *name;
38 size_t name_len;
39
40 const unsigned char *value;
41 size_t value_len;
42
43 const unsigned char *full_value;
44 size_t full_value_len;
45
46 unsigned int continues:1; /* multiline header, continues in next line */
47 unsigned int continued:1; /* multiline header, continued from previous line */
48 unsigned int eoh:1; /* "end of headers" line */
49 unsigned int no_newline:1; /* no \n after this line */
50 unsigned int use_full_value:1; /* set if you want full_value */
51 };
52
53 /* called once with hdr = NULL at end of headers */
1038
60646878858e Function typedefs now define them as functions, not function pointers.
Timo Sirainen <tss@iki.fi>
parents: 988
diff changeset
54 typedef void message_header_callback_t(struct message_part *part,
1322
97f8c00b8d4c Better handling for multiline headers. Before we skipped headers larger than
Timo Sirainen <tss@iki.fi>
parents: 1038
diff changeset
55 struct message_header_line *hdr,
1038
60646878858e Function typedefs now define them as functions, not function pointers.
Timo Sirainen <tss@iki.fi>
parents: 988
diff changeset
56 void *context);
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
57
953
411006be3c66 Naming change for function typedefs.
Timo Sirainen <tss@iki.fi>
parents: 903
diff changeset
58 /* callback is called for each field in message header. */
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 898
diff changeset
59 struct message_part *message_parse(pool_t pool, struct istream *input,
1038
60646878858e Function typedefs now define them as functions, not function pointers.
Timo Sirainen <tss@iki.fi>
parents: 988
diff changeset
60 message_header_callback_t *callback,
953
411006be3c66 Naming change for function typedefs.
Timo Sirainen <tss@iki.fi>
parents: 903
diff changeset
61 void *context);
903
fd8888f6f037 Naming style changes, finally got tired of most of the typedefs. Also the
Timo Sirainen <tss@iki.fi>
parents: 898
diff changeset
62 void message_parse_header(struct message_part *part, struct istream *input,
63 struct message_size *hdr_size,
1038
60646878858e Function typedefs now define them as functions, not function pointers.
Timo Sirainen <tss@iki.fi>
parents: 988
diff changeset
64 message_header_callback_t *callback, void *context);
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
65
1322
97f8c00b8d4c Better handling for multiline headers. Before we skipped headers larger than
Timo Sirainen <tss@iki.fi>
parents: 1038
diff changeset
66 struct message_header_parser_ctx *
67 message_parse_header_init(struct istream *input, struct message_size *hdr_size);
68 void message_parse_header_deinit(struct message_header_parser_ctx *ctx);
69
70 /* Read and return next header line. */
71 struct message_header_line *
72 message_parse_header_next(struct message_header_parser_ctx *ctx);
73
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
74 #endif
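
The init/next/deinit functions declared above suggest a pull-style loop: the caller asks for one physical line at a time and uses the continued and eoh flags to see where a folded header or the header block ends. The sketch below illustrates that pattern only. It is not Dovecot's implementation: the toy_* names are hypothetical, the parser reads from an in-memory string rather than a struct istream, it assumes bare LF line endings, and it copies only a subset of the struct message_header_line fields.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Subset of the fields of struct message_header_line (hypothetical copy). */
struct toy_header_line {
	const char *name;
	size_t name_len;
	const unsigned char *value;
	size_t value_len;
	unsigned int continued:1; /* folded line, continues previous header */
	unsigned int eoh:1;       /* empty "end of headers" line */
};

/* Hypothetical stand-in for struct message_header_parser_ctx. */
struct toy_parser {
	const char *pos;        /* next unread byte of the input string */
	const char *last_name;  /* name of the most recent header field */
	size_t last_name_len;
	int done;               /* set once the end-of-headers line is seen */
	struct toy_header_line line;
};

static void toy_init(struct toy_parser *p, const char *input)
{
	assert(p != NULL && input != NULL);
	memset(p, 0, sizeof(*p));
	p->pos = input;
}

/* Return the next physical header line, or NULL when nothing is left. */
static struct toy_header_line *toy_next(struct toy_parser *p)
{
	struct toy_header_line *l = &p->line;
	const char *start = p->pos, *end, *colon, *val;

	if (p->done || *start == '\0')
		return NULL;

	end = strchr(start, '\n');
	if (end == NULL)
		end = start + strlen(start);
	p->pos = (*end == '\n') ? end + 1 : end;

	memset(l, 0, sizeof(*l));
	if (end == start) {
		/* empty line: headers end here */
		l->eoh = 1;
		p->done = 1;
		return l;
	}
	if (*start == ' ' || *start == '\t') {
		/* leading whitespace: continuation of the previous header */
		l->continued = 1;
		l->name = p->last_name;
		l->name_len = p->last_name_len;
		l->value = (const unsigned char *)start;
		l->value_len = (size_t)(end - start);
		return l;
	}
	colon = memchr(start, ':', (size_t)(end - start));
	val = colon == NULL ? end : colon + 1;
	while (val < end && (*val == ' ' || *val == '\t'))
		val++;
	l->name = p->last_name = start;
	l->name_len = p->last_name_len =
		(size_t)((colon == NULL ? end : colon) - start);
	l->value = (const unsigned char *)val;
	l->value_len = (size_t)(end - val);
	return l;
}

/* Count header fields; folded continuation lines are not counted again. */
static unsigned int toy_count_fields(const char *input)
{
	struct toy_parser p;
	struct toy_header_line *l;
	unsigned int n = 0;

	toy_init(&p, input);
	while ((l = toy_next(&p)) != NULL) {
		if (!l->eoh && !l->continued)
			n++;
	}
	return n;
}

/* Sum the value bytes of one header across all of its physical lines. */
static size_t toy_value_len(const char *input, const char *name)
{
	struct toy_parser p;
	struct toy_header_line *l;
	size_t total = 0, len = strlen(name);

	toy_init(&p, input);
	while ((l = toy_next(&p)) != NULL) {
		if (l->name != NULL && l->name_len == len &&
		    memcmp(l->name, name, len) == 0)
			total += l->value_len;
	}
	return total;
}
```

Because each call hands back a single line, the caller never needs the whole header in memory; a folded header simply arrives as several lines that share a name, which is what toy_value_len() relies on when it adds up the pieces.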