annotate src/lib-mail/message-parser.h @ 23017:c1d36f2575c7 default tip

lib-imap: Fix "Don't accept strings with NULs" cherry-pick
author Timo Sirainen <timo.sirainen@open-xchange.com>
date Thu, 29 Aug 2019 09:55:25 +0300
parents 653a6a1bfd61
children
rev   line source
6410
e4eb71ae8e96 Changed .h ifdef/defines to use <NAME>_H format.
Timo Sirainen <tss@iki.fi>
parents: 6156
diff changeset
1 #ifndef MESSAGE_PARSER_H
e4eb71ae8e96 Changed .h ifdef/defines to use <NAME>_H format.
Timo Sirainen <tss@iki.fi>
parents: 6156
diff changeset
2 #define MESSAGE_PARSER_H
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
3
4259
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
4 #include "message-header-parser.h"
17237
63f12bb366b0 lib-mail: Moved struct message_part to a separate message-part.h
Timo Sirainen <tss@iki.fi>
parents: 14921
diff changeset
5 #include "message-part.h"
988
8028c4dcf38f mail-storage.h interface changes, affects pretty much everything.
Timo Sirainen <tss@iki.fi>
parents: 953
diff changeset
6
5522
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
7 enum message_parser_flags {
6156
e18086698ebf By default assume MIME message if Content-Type: exists even if Mime-Version:
Timo Sirainen <tss@iki.fi>
parents: 5522
diff changeset
8 /* Don't return message bodies in message_blocks. */
14605
6846c2e50eba message parser: Added MESSAGE_PARSER_FLAG_INCLUDE_BOUNDARIES flag.
Timo Sirainen <tss@iki.fi>
parents: 14147
diff changeset
9 MESSAGE_PARSER_FLAG_SKIP_BODY_BLOCK = 0x01,
6156
e18086698ebf By default assume MIME message if Content-Type: exists even if Mime-Version:
Timo Sirainen <tss@iki.fi>
parents: 5522
diff changeset
10 /* Buggy software creates Content-Type: headers without Mime-Version:
e18086698ebf By default assume MIME message if Content-Type: exists even if Mime-Version:
Timo Sirainen <tss@iki.fi>
parents: 5522
diff changeset
11 header. By default we allow this and assume message is MIME if
e18086698ebf By default assume MIME message if Content-Type: exists even if Mime-Version:
Timo Sirainen <tss@iki.fi>
parents: 5522
diff changeset
12 Content-Type: is found. This flag disables this. */
14605
6846c2e50eba message parser: Added MESSAGE_PARSER_FLAG_INCLUDE_BOUNDARIES flag.
Timo Sirainen <tss@iki.fi>
parents: 14147
diff changeset
13 MESSAGE_PARSER_FLAG_MIME_VERSION_STRICT = 0x02,
14147
e7854f8d7213 message parser: Added MESSAGE_PARSER_FLAG_INCLUDE_MULTIPART_BLOCKS.
Timo Sirainen <tss@iki.fi>
parents: 7243
diff changeset
14 /* Return multipart (preamble and epilogue) blocks */
14605
6846c2e50eba message parser: Added MESSAGE_PARSER_FLAG_INCLUDE_BOUNDARIES flag.
Timo Sirainen <tss@iki.fi>
parents: 14147
diff changeset
15 MESSAGE_PARSER_FLAG_INCLUDE_MULTIPART_BLOCKS = 0x04,
6846c2e50eba message parser: Added MESSAGE_PARSER_FLAG_INCLUDE_BOUNDARIES flag.
Timo Sirainen <tss@iki.fi>
parents: 14147
diff changeset
16 /* Return --boundary lines */
6846c2e50eba message parser: Added MESSAGE_PARSER_FLAG_INCLUDE_BOUNDARIES flag.
Timo Sirainen <tss@iki.fi>
parents: 14147
diff changeset
17 MESSAGE_PARSER_FLAG_INCLUDE_BOUNDARIES = 0x08
5522
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
18 };
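The flag values above are distinct bits, so they can presumably be combined with bitwise OR and tested with bitwise AND. A minimal sketch, assuming the enum is copied locally so that it compiles without the Dovecot headers; `example_flags()` and `flag_is_set()` are hypothetical helpers, not part of lib-mail:

```c
#include <stdbool.h>

/* Copied from the listing so this sketch builds without Dovecot. */
enum message_parser_flags {
	MESSAGE_PARSER_FLAG_SKIP_BODY_BLOCK = 0x01,
	MESSAGE_PARSER_FLAG_MIME_VERSION_STRICT = 0x02,
	MESSAGE_PARSER_FLAG_INCLUDE_MULTIPART_BLOCKS = 0x04,
	MESSAGE_PARSER_FLAG_INCLUDE_BOUNDARIES = 0x08
};

/* Hypothetical helper: ask for preamble/epilogue blocks and for the
   --boundary lines in one flags value. */
static enum message_parser_flags example_flags(void)
{
	return MESSAGE_PARSER_FLAG_INCLUDE_MULTIPART_BLOCKS |
		MESSAGE_PARSER_FLAG_INCLUDE_BOUNDARIES;
}

/* Hypothetical helper: true if the given bit is set in flags. */
static bool flag_is_set(enum message_parser_flags flags,
			enum message_parser_flags flag)
{
	return (flags & flag) != 0;
}
```

A value built this way would be passed as the `flags` argument of message_parser_init() or message_parser_init_from_parts().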
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
19
1697
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
20 struct message_parser_ctx;
1322
97f8c00b8d4c Better handling for multiline headers. Before we skipped headers larger than
Timo Sirainen <tss@iki.fi>
parents: 1038
diff changeset
21
4259
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
22 struct message_block {
4265
75d5843153f1 Added message_part to struct message_block and some cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 4259
diff changeset
23 /* Message part this block belongs to */
75d5843153f1 Added message_part to struct message_block and some cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 4259
diff changeset
24 struct message_part *part;
75d5843153f1 Added message_part to struct message_block and some cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 4259
diff changeset
25
4259
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
26 /* non-NULL if a header line was read */
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
27 struct message_header_line *hdr;
2404
8ef002a26f1c Added struct message_header_line.middle and middle_len to contain the ':'
Timo Sirainen <tss@iki.fi>
parents: 2150
diff changeset
28
17693
051a96b3960f lib-mail: Added comments to message-parser.h
Timo Sirainen <tss@iki.fi>
parents: 17237
diff changeset
29 /* hdr = NULL, size = 0 block returned at the end of headers for the
051a96b3960f lib-mail: Added comments to message-parser.h
Timo Sirainen <tss@iki.fi>
parents: 17237
diff changeset
30 empty line between header and body (unless the header is truncated).
051a96b3960f lib-mail: Added comments to message-parser.h
Timo Sirainen <tss@iki.fi>
parents: 17237
diff changeset
31 Later on data and size>0 are returned for blocks of mail body that
051a96b3960f lib-mail: Added comments to message-parser.h
Timo Sirainen <tss@iki.fi>
parents: 17237
diff changeset
32 is read (see message_parser_flags for what is actually returned) */
4259
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
33 const unsigned char *data;
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
34 size_t size;
1322
97f8c00b8d4c Better handling for multiline headers. Before we skipped headers larger than
Timo Sirainen <tss@iki.fi>
parents: 1038
diff changeset
35 };
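The comments in struct message_block suggest three cases a caller has to tell apart: a header line (hdr is non-NULL), the end-of-headers marker (hdr is NULL and size is 0), and body data (hdr is NULL and size > 0). A minimal sketch of that reading, with the struct repeated locally so it compiles on its own; `enum block_kind` and `classify_block()` are hypothetical names, not part of lib-mail:

```c
#include <stddef.h>

struct message_part;
struct message_header_line;

/* Same fields as in the listing, repeated so the sketch is standalone. */
struct message_block {
	struct message_part *part;
	struct message_header_line *hdr;
	const unsigned char *data;
	size_t size;
};

/* Hypothetical classification, for illustration only. */
enum block_kind {
	BLOCK_KIND_HEADER_LINE,
	BLOCK_KIND_END_OF_HEADERS,
	BLOCK_KIND_BODY_DATA
};

static enum block_kind classify_block(const struct message_block *block)
{
	if (block->hdr != NULL)
		return BLOCK_KIND_HEADER_LINE;	/* a header line was read */
	if (block->size == 0)
		return BLOCK_KIND_END_OF_HEADERS; /* empty line after headers */
	return BLOCK_KIND_BODY_DATA;		/* data/size hold body bytes */
}
```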
97f8c00b8d4c Better handling for multiline headers. Before we skipped headers larger than
Timo Sirainen <tss@iki.fi>
parents: 1038
diff changeset
36
1697
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
37 /* called once with hdr = NULL at the end of headers */
4259
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
38 typedef void message_part_header_callback_t(struct message_part *part,
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
39 struct message_header_line *hdr,
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
40 void *context);
1697
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
41
4903
204d7edc7cdc Added context parameter type safety checks for most callback APIs.
Timo Sirainen <tss@iki.fi>
parents: 4603
diff changeset
42 extern message_part_header_callback_t *null_message_part_header_callback;
204d7edc7cdc Added context parameter type safety checks for most callback APIs.
Timo Sirainen <tss@iki.fi>
parents: 4603
diff changeset
43
1697
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
44 /* Initialize message parser. part_pool specifies where struct message_parts
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
45 are allocated from. */
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
46 struct message_parser_ctx *
5522
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
47 message_parser_init(pool_t part_pool, struct istream *input,
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
48 enum message_header_parser_flags hdr_flags,
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
49 enum message_parser_flags flags);
5506
6cd889c652b0 Removed message_parse_from_parts(). Added message_parser_init_from_parts()
Timo Sirainen <tss@iki.fi>
parents: 4906
diff changeset
50 /* Use preparsed parts to speed up parsing. */
6cd889c652b0 Removed message_parse_from_parts(). Added message_parser_init_from_parts()
Timo Sirainen <tss@iki.fi>
parents: 4906
diff changeset
51 struct message_parser_ctx *
6cd889c652b0 Removed message_parse_from_parts(). Added message_parser_init_from_parts()
Timo Sirainen <tss@iki.fi>
parents: 4906
diff changeset
52 message_parser_init_from_parts(struct message_part *parts,
5522
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
53 struct istream *input,
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
54 enum message_header_parser_flags hdr_flags,
5dee807e53cf Header parser has now flags parameter to tell it how to handle linefeeds.
Timo Sirainen <tss@iki.fi>
parents: 5506
diff changeset
55 enum message_parser_flags flags);
7243
289765861d66 Changed message_parser_deinit() to return -1 if the parser was using
Timo Sirainen <tss@iki.fi>
parents: 6529
diff changeset
56 /* Returns 0 if parts were returned, -1 if we used preparsed parts and they
289765861d66 Changed message_parser_deinit() to return -1 if the parser was using
Timo Sirainen <tss@iki.fi>
parents: 6529
diff changeset
57 didn't match the current message */
289765861d66 Changed message_parser_deinit() to return -1 if the parser was using
Timo Sirainen <tss@iki.fi>
parents: 6529
diff changeset
58 int message_parser_deinit(struct message_parser_ctx **ctx,
289765861d66 Changed message_parser_deinit() to return -1 if the parser was using
Timo Sirainen <tss@iki.fi>
parents: 6529
diff changeset
59 struct message_part **parts_r);
19883
653a6a1bfd61 lib-mail: Added message_parser_deinit_from_parts()
Timo Sirainen <timo.sirainen@dovecot.fi>
parents: 17693
diff changeset
60 /* Same as message_parser_deinit(), but return an error message describing
653a6a1bfd61 lib-mail: Added message_parser_deinit_from_parts()
Timo Sirainen <timo.sirainen@dovecot.fi>
parents: 17693
diff changeset
61 why the preparsed parts didn't match the message. This can also safely be
653a6a1bfd61 lib-mail: Added message_parser_deinit_from_parts()
Timo Sirainen <timo.sirainen@dovecot.fi>
parents: 17693
diff changeset
62 called even when preparsed parts weren't used - it'll always just return
653a6a1bfd61 lib-mail: Added message_parser_deinit_from_parts()
Timo Sirainen <timo.sirainen@dovecot.fi>
parents: 17693
diff changeset
63 success in that case. */
653a6a1bfd61 lib-mail: Added message_parser_deinit_from_parts()
Timo Sirainen <timo.sirainen@dovecot.fi>
parents: 17693
diff changeset
64 int message_parser_deinit_from_parts(struct message_parser_ctx **_ctx,
653a6a1bfd61 lib-mail: Added message_parser_deinit_from_parts()
Timo Sirainen <timo.sirainen@dovecot.fi>
parents: 17693
diff changeset
65 struct message_part **parts_r,
653a6a1bfd61 lib-mail: Added message_parser_deinit_from_parts()
Timo Sirainen <timo.sirainen@dovecot.fi>
parents: 17693
diff changeset
66 const char **error_r);
1697
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
67
4259
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
68 /* Read the next block of a message. Returns 1 if block is returned, 0 if
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
69 input stream is non-blocking and more data needs to be read, -1 when all is
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
70 done or error occurred (see stream's error status). */
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
71 int message_parser_parse_next_block(struct message_parser_ctx *ctx,
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
72 struct message_block *block_r);
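The 1 / 0 / -1 return contract documented for message_parser_parse_next_block() implies a driving loop of roughly the following shape. This is only a sketch under stated assumptions: `fake_parser`, `fake_block`, `fake_parse_next_block()` and `drive_parser()` are hypothetical stand-ins that replay scripted return values, so the loop can be compiled and exercised without Dovecot. They are not the lib-mail implementation.

```c
#include <stddef.h>

/* Hypothetical stand-in for struct message_block (data and size only). */
struct fake_block {
	const unsigned char *data;
	size_t size;
};

/* Hypothetical stand-in for the parser context: replays a script of
   return values that must end with -1. */
struct fake_parser {
	const int *results;
	size_t pos;
};

static const unsigned char dummy_data[] = "x";

/* Mimics the documented contract: 1 = block returned, 0 = more input
   needed (non-blocking stream), -1 = all done or an error occurred. */
static int fake_parse_next_block(struct fake_parser *p,
				 struct fake_block *block_r)
{
	int ret = p->results[p->pos];

	if (ret != -1)
		p->pos++;
	block_r->data = ret == 1 ? dummy_data : NULL;
	block_r->size = ret == 1 ? 1 : 0;
	return ret;
}

struct loop_stats {
	unsigned int blocks;	/* calls that returned a block */
	unsigned int waits;	/* calls that asked for more input */
};

static struct loop_stats drive_parser(struct fake_parser *p)
{
	struct loop_stats stats = { 0, 0 };
	struct fake_block block;
	int ret;

	while ((ret = fake_parse_next_block(p, &block)) >= 0) {
		if (ret == 0) {
			/* A real caller would go back to its event loop and
			   retry once the stream has more data. */
			stats.waits++;
			continue;
		}
		/* ret == 1: block.data / block.size are valid here. */
		stats.blocks++;
	}
	/* ret == -1: finished or failed; per the header comment, the
	   stream's error status tells which. */
	return stats;
}
```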
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
73
1697
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
74 /* Read and parse header. */
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
75 void message_parser_parse_header(struct message_parser_ctx *ctx,
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
76 struct message_size *hdr_size,
4259
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
77 message_part_header_callback_t *callback,
14629
c93ca5e46a8a Marked functions parameters that are allowed to be NULL. Some APIs were also changed.
Timo Sirainen <tss@iki.fi>
parents: 14605
diff changeset
78 void *context) ATTR_NULL(4);
14921
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
79 #define message_parser_parse_header(ctx, hdr_size, callback, context) \
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
80 message_parser_parse_header(ctx, hdr_size + \
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
81 CALLBACK_TYPECHECK(callback, void (*)( \
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
82 struct message_part *, \
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
83 struct message_header_line *, typeof(context))), \
4906
0c3c948412c5 Type safe callbacks weren't as easy as I thought. Only callback(void
Timo Sirainen <tss@iki.fi>
parents: 4903
diff changeset
84 (message_part_header_callback_t *)callback, context)
0c3c948412c5 Type safe callbacks weren't as easy as I thought. Only callback(void
Timo Sirainen <tss@iki.fi>
parents: 4903
diff changeset
85
1697
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
86 /* Read and parse body. If message is a MIME multipart or message/rfc822
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
87 message, hdr_callback is called for all headers. body_callback is called
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
88 for the body content. */
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
89 void message_parser_parse_body(struct message_parser_ctx *ctx,
4259
fd315deac28f Rewrote the message bodystructure parser to allow parsing from non-blocking streams. Also did a couple of API changes and cleanups.
Timo Sirainen <timo.sirainen@movial.fi>
parents: 3879
diff changeset
90 message_part_header_callback_t *hdr_callback,
14629
c93ca5e46a8a Marked functions parameters that are allowed to be NULL. Some APIs were also changed.
Timo Sirainen <tss@iki.fi>
parents: 14605
diff changeset
91 void *context) ATTR_NULL(3);
14921
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
92 #define message_parser_parse_body(ctx, callback, context) \
4906
0c3c948412c5 Type safe callbacks weren't as easy as I thought. Only callback(void
Timo Sirainen <tss@iki.fi>
parents: 4903
diff changeset
93 message_parser_parse_body(ctx, \
14921
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
94 (message_part_header_callback_t *)callback, \
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
95 (void *)((char *)context + CALLBACK_TYPECHECK(callback, \
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
96 void (*)(struct message_part *, \
d3db2ba15d00 Removed CONTEXT_TYPE_SAFETY macro and reimplemented its functionality better.
Timo Sirainen <tss@iki.fi>
parents: 14629
diff changeset
97 struct message_header_line *, typeof(context)))))
1697
ef79ce6507ff Message parsing can now be done in two parts - header and body. We're now
Timo Sirainen <tss@iki.fi>
parents: 1689
diff changeset
98
0
3b1985cbc908 Initial revision
Timo Sirainen <tss@iki.fi>
parents:
diff changeset
99 #endif