Mercurial > dovecot > original-hg > dovecot-1.2
comparison doc/thread-refs.txt @ 7800:524b4acad38b HEAD
Added documentation about how thread indexes work.
author | Timo Sirainen <tss@iki.fi> |
---|---|
date | Mon, 09 Jun 2008 05:55:51 +0300 |
parents | |
children | 04a3be30e5a6 |
comparison
equal
deleted
inserted
replaced
7799:4aa830b4f8bc | 7800:524b4acad38b |
---|---|
1 Optimistic incremental THREAD=REFERENCES | |
2 | |
3 Step (1) is the slowest stage for building a THREAD=REFERENCES tree. If its | |
4 result tree is permanently saved, the following thread builds can be based | |
5 on it by updating the tree incrementally. | |
6 | |
7 Adding new messages to the tree is simple: simply follow the normal rules | |
8 as when building a new tree from scratch. Expunging messages gets more | |
9 problematic though. | |
10 | |
11 Each node in the tree keeps a "link reference count" which specifies how | |
12 many messages contain a "this message" -> "parent message" reference. The | |
13 first reference is usually added by the message's own References: or | |
14 In-Reply-To: header and the latter references are added by References: | |
15 headers. This link refcount must be updated when adding and expunging | |
16 messages. When the link refcount drops to zero, the message becomes a root. | |
17 The link refcount doesn't tell much about the number of children the node | |
18 has, because References: headers may reference any number of its ancestors. | |
19 | |
20 The optimistic approach assumes that usually there are no problematic | |
21 links. In such case expunging a message simply updates the link refcounts | |
22 and marks the message's node as expunged. Expunged messages may still act | |
23 as dummy nodes. The node may be removed only after there are no nodes which | |
24 point to it. | |
25 | |
26 Problematic links are handled by marking nodes affected by them. If such a | |
27 node is expunged or its link is unreferenced, the thread tree must be | |
28 rebuilt. This is assumed to be a rarely occurring event. The problematic | |
29 cases are: | |
30 | |
31 1) Duplicate Message-ID: headers. If the first message having it is | |
32 expunged, the thread tree must be rebuilt. | |
33 | |
34 2) Message-ID loops. If a message referencing the looping path gets | |
35 expunged, the loop may break and the thread tree must be rebuilt. Because | |
36 it can't be determined easily which loops get broken by which expunges, | |
37 this case can be handled in a bit easier way: When detecting a loop between | |
38 parent and child, rebuild the tree if any link between the parent and child | |
39 gets unreferenced. | |
40 | |
41 3) A message changes its parent because an earlier message's References: | |
42 header contained a different link. If the message gets expunged, the thread | |
43 tree must be rebuilt to get the original parent back. | |
44 | |
45 4) A link in a message's References: header is ignored, because the | |
46 referenced child already specified a different parent to itself. If the | |
47 message gets expunged, the thread tree must be rebuilt to determine its new | |
48 parent. | |
49 | |
50 5) A link in a message's References: header is ignored, because an earlier | |
51 message's References: header already specified a different link. If the | |
52 earlier message gets expunged, the parent may change. The earlier message | |
53 could be found out quickly by keeping some extra state (or with a slow | |
54 scan), but since this is assumed to be a rare problem, there's an easier | |
55 (but less specific) way to handle this: Rebuild the tree if a link to the | |
56 child node is unreferenced (or alternatively if a link to the original | |
57 parent node is unreferenced, but it probably happens more often). | |
58 | |
59 Pseudocode: | |
60 | |
61 node { | |
62 parent: Pointer to parent node. Children pointers aren't required. | |
63 uid: Message's UID (0 = expunged or never even existed) | |
64 | |
65 link_refcount: Number of messages referencing "this node" -> "parent node" | |
66 (e.g. "References: a b" would increase this message's reference as well as | |
67 b's reference, regardless of how the links are really added to the thread | |
68 tree). | |
69 expunge_rebuilds: If this message gets expunged, rebuild the thread tree. | |
70 parent_unref_rebuilds: If a link between this node and its child gets | |
71 unreferenced, rebuild the thread tree. | |
72 } | |
73 | |
74 link_reference(parent_node, child_node) | |
75 parent_node.link_refcount++ | |
76 if is_ancestor(child_node, parent_node) | |
77 // child_node is an ancestor of parent_node. Adding child_node -> | |
78 // parent_node would introduce a loop. If any messages referencing the | |
79 // path between parent_node's parent and child_node get expunged, we | |
80 // have to rebuild the tree because the loop might break. For example: | |
81 // #1: a -> b (a.ref=1, b.ref=1) | |
82 // #2: b -> a (a.ref=2, b.ref=2) | |
83 // #3: c -> a -> b (a.ref=3, b.ref=3, c.ref=1) | |
84 // Expunging #3 wouldn't break the loop, but expunging #1 would. | |
85 for node in nodes[parent_node.parent .. child_node] | |
86 node.parent_unref_rebuilds = true | |
87 else if child_node.parent == parent_node | |
88 // The same link already exists | |
89 else | |
90 // Set parent_node as child_node's parent | |
91 if child_node.parent == NIL | |
92 child_node.parent = parent_node | |
93 else | |
94 // Conflicting parent already exists, keep the original. | |
95 // We get here only when handling References: header. | |
96 if child_node.uid != 0 | |
97 // We've already seen this message. It specifies its own parent and | |
98 // it doesn't matter what any other reference says. The only way its | |
99 // parent can change is if the message itself gets expunged. | |
100 child_node.expunge_rebuilds = true | |
101 else | |
102 // Message doesn't exist, so it was one of the node's children | |
103 // that created the original reference. If that reference gets | |
104 // dropped, the parent is changed. We could catch this in one of | |
105 // several ways: | |
106 // a) Link to original parent node gets unreferenced | |
107 // b) Link to this node gets unreferenced | |
108 // c) Any of the child nodes gets expunged | |
109 // b) is probably the least likely to happen, so use it | |
110 child_node.parent_unref_rebuilds = true | |
111 | |
112 thread_add_msg(uid) | |
113 // get the Message-IDs as specified by the thread spec | |
114 (msgid, parent_msgid, references) = message_get_thread_headers(uid) | |
115 | |
116 if msgid != NIL | |
117 if nodes[msgid].uid == 0 | |
118 nodes[msgid].uid = uid | |
119 else | |
120 // duplicate Message-ID. if the original ever gets expunged, | |
121 // rebuild the thread tree | |
122 nodes[msgid].expunge_rebuilds = true | |
123 msgid = NIL | |
124 | |
125 if msgid == NIL | |
126 msgid = get_unique_msg_id() | |
127 | |
128 node = nodes[msgid] | |
129 if node.parent != NIL and | |
130 (parent_msgid == NIL or node.parent.msgid != parent_msgid) | |
131 // Conflicting parent, remove it. If this message gets expunged, we have | |
132 // to revert back to the original parent. | |
133 node.parent = NIL | |
134 node.expunge_rebuilds = true | |
135 | |
136 if parent_msgid != NIL | |
137 link_reference(nodes[parent_msgid], node) | |
138 | |
139 // go through References (skipping the last one) | |
140 for (parent_msgid, child_msgid) in references | |
141 link_reference(nodes[parent_msgid], nodes[child_msgid]) | |
142 | |
143 unref_link(parent, child) | |
144 if parent.parent_unref_rebuilds | |
145 return false | |
146 | |
147 child.link_refcount-- | |
148 if child.link_refcount == 0 | |
149 child.parent = NIL | |
150 return true | |
151 | |
152 // returns false if thread tree needs to be rebuilt | |
153 thread_expunge_msg(uid) | |
154 // get the Message-IDs as specified by the thread spec | |
155 (msgid, in_reply_to_msgid, references) = message_get_thread_headers(uid) | |
156 | |
157 node = nodes[msgid] | |
158 if node.uid != uid | |
159 // Removing a duplicate Message-ID | |
160 return false | |
161 | |
162 if node.expunge_rebuilds | |
163 return false | |
164 | |
165 if parent_msgid != NIL and | |
166 not unref_link(nodes[parent_msgid], nodes[child_msgid]) | |
167 return false | |
168 | |
169 if references != NIL | |
170 // go through References | |
171 for (parent_msgid, child_msgid) in references | |
172 if not unref_link(nodes[parent_msgid], nodes[child_msgid]) | |
173 return false | |
174 last_parent_msgid = references.last | |
175 else if in_reply_to_msgid != NIL | |
176 last_parent_msgid = in_reply_to_msgid | |
177 | |
178 if last_parent_msgid != NIL and | |
179 not unref_link(nodes[last_parent_msgid], node) | |
180 return false | |
181 | |
182 // mark this node as expunged | |
183 node.uid = 0 | |
184 return true | |
185 | |
186 thread_iterate() | |
187 root_nodes = [] | |
188 referenced = [] | |
189 children = [][] | |
190 // Steps (2) and (3) | |
191 for node in nodes | |
192 if node.parent != NIL | |
193 root_nodes[] = node | |
194 else | |
195 referenced[node.parent] = true | |
196 if node.uid != 0 | |
197 // Find the node's first non-dummy parent and add the node as its child. | |
198 // If there are no non-dummy parents, add it as the highest dummy's | |
199 // child. | |
200 nondummy_parent = node.parent | |
201 while nondummy_parent.uid == 0 and nondummy_parent.parent != NIL | |
202 nondummy_parent = nondummy_parent.parent | |
203 children[nondummy_parent][] = node | |
204 | |
205 for node in root_nodes | |
206 if node.uid == 0 | |
207 if children[node] == NIL | |
208 // remove dummy roots that have no children. | |
209 delete(node) | |
210 else if count(children[node]) == 1 | |
211 // dummy root has a single child, replace the root with its child | |
212 node = children[node] | |
213 | |
214 for node in nodes | |
215 if node.uid == 0 and !referenced[node] | |
216 free(node) | |
217 | |
218 // root_nodes and children now contain a tree equivalent to a tree built by | |
219 // THREAD=REFERENCES specification steps (1)-(3). The rest of the steps | |
220 // can be performed using them. Note that node.parent should not (and need | |
221 // not) be used because it points its parent before steps (2) and (3). |