Commit 746c74a

Spitballing ideas for better indexing
There are three ways in which an object may be modified: 1) write/update by client, 2) handoff, 3) read-repair. A post-commit hook only handles the first case. By adding an "object modified" hook to the vnode, all three cases can be handled in the same manner. After all, in every case the ultimate goal is to update the object, and when the object is updated its indexes should be updated as well.
1 parent 4387a91 commit 746c74a

1 file changed: +76 -0 lines changed

IMPLEMENTATION.md

@@ -5,6 +5,82 @@ Notes on the implementation _before_ it is implemented. Think of it
something like [readme driven development] [rdd].


Indexing
----------

### Avoid Post-Commit Hook

* The object must be sent `2 * N` times: `N` times for the KV write
  and another `N` times for indexing. In the worst case all of those
  messages have to traverse the network, and in the best case
  `2N - 2` of them do. If the object size is 5MB (the block size in
  RiakCS) and `N = 3`, then up to 30MB of data must traverse the
  network, 15MB of which is redundant.

* Post-commit is executed on the coordinator after the object is
  written and a client reply is sent. This provides no back-pressure
  to the client.

* Post-commit error handling is wrong. It hides errors and just
  increments a counter kept by stats. You must alter the logging
  levels at runtime to discover the cause of the errors.

* Post-commit is only invoked during user put requests. Indexing
  changes also need to occur during read-repair and handoff events.
  Any time the KV object changes the index needs to change as well (at
  minimum the object hash must be updated).

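The transfer cost in the first bullet can be checked with a little arithmetic. The sketch below is a plain Python model, not Riak code: the function name is hypothetical, and `n_val=3` is an assumption inferred from the 30MB figure.

```python
def post_commit_transfer(n_val, object_mb):
    """Model the data sent when indexing runs as a post-commit hook.

    n_val     -- number of replicas (N); 3 is assumed in the usage below
    object_mb -- object size in megabytes
    """
    kv_mb = n_val * object_mb      # N copies for the KV write
    index_mb = n_val * object_mb   # N more copies for indexing
    return {
        "messages": 2 * n_val,
        # Best case quoted in the notes: 2N - 2 messages cross the network.
        "best_case_network_messages": 2 * n_val - 2,
        # Worst case: every message crosses the network.
        "worst_case_network_mb": kv_mb + index_mb,
        # The indexing copies repeat data the KV write already sent.
        "redundant_mb": index_mb,
    }


cost = post_commit_transfer(n_val=3, object_mb=5)
print(cost)
```

With a 5MB object and three replicas the model reproduces the 30MB total and 15MB redundant figures quoted above.
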
### Add Event Hooks to VNodes

* Gives Yokozuna access to all object modifications.

* Exploits locality and avoids redundant transmission of the object
  across the network.

* Provides back-pressure during client writes.

* Could set the stage for atomic commits between KV and other
  services if that's something we wanted to pursue.

* A downside is that more now happens on the KV vnode, which is
  already a high contention point. Measurement and careful thought
  are needed here.

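The back-pressure point comes down to where indexing sits relative to the client reply. The following is a minimal Python sketch of that ordering only. Both function names and the event labels are made up for illustration, and it assumes the vnode hook runs synchronously before the reply is sent.

```python
def put_with_post_commit(log):
    """Post-commit: indexing runs after the client already has its reply."""
    log.append("kv_write")
    log.append("client_reply")
    log.append("index")   # slow or failing indexing never reaches the client
    return log


def put_with_vnode_hook(log):
    """Vnode hook (assumed synchronous): indexing runs before the reply."""
    log.append("kv_write")
    log.append("index")   # the client waits on this step
    log.append("client_reply")
    return log


post_commit_order = put_with_post_commit([])
vnode_hook_order = put_with_vnode_hook([])
print(post_commit_order)
print(vnode_hook_order)
```

In the second ordering a slow indexer slows the write itself, which is the back-pressure the list refers to, and also the reason the extra work on the vnode needs measuring.
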
### Ideas for Implementation

* I'm not sure if this is a generic vnode thing or specific to the KV
  vnode. Right now I'm leaning towards the latter.

* The events Yokozuna needs to react to: put, read-repair (which is
  ultimately a put), and handoff (which once again is just a put).
  Maybe all I need is a low-level hook into the KV backend put. It
  might help to think of Yokozuna as a backend to KV that complements
  the primary backend. Using a low-level put hook covers all cases
  since it is invoked any time the object is modified. It also
  provides some future proofing, as it should always be the least
  common denominator for any future object mutation (e.g. if some new
  type of event was added to KV that causes the object to be
  modified).

* Invoke the hook if and only if the object has changed and the write
  to the backend was successful. Have to look at `PrepPutRes` from
  `prepare_put` and `Reply` from `perform_put`.

* The deletion of an object from a backend is a separate code path.
  Need to hook into that as well.

* The handoff object put is a different code path, see
  `do_diffobj_put`. Need to hook into this.

* Yokozuna handoff piggy-backs on KV handoff (by re-indexing on put
  versus sending Solr index data across) and therefore Yokozuna vnode
  handoff is a simple matter of dropping the index. Actually, this is
  a lie. If the KV vnode doesn't hand off first then an entire
  partition of replicas is lost temporarily. The Yokozuna vnode needs
  a way to tell the handoff system that it cannot start handoff until
  the KV service for the same partition performs handoff. This could
  be done by returning `{waiting_for, [riak_kv]}`. The vnode manager
  will probably have to be modified.

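The "low-level put hook" idea can be sketched as a toy model. This is Python rather than the Erlang of the real KV vnode, and every name in it (`KVVnodeModel`, `BackendError`, `FailingBackend`, the hook signature) is hypothetical. For brevity it also funnels client put, read-repair, and handoff through one write path, whereas the notes above point out that handoff and delete are separate code paths in the actual code.

```python
class BackendError(Exception):
    """Raised by the hypothetical backend when a write fails."""


class FailingBackend(dict):
    """Hypothetical backend whose writes always fail."""

    def __setitem__(self, key, value):
        raise BackendError("write failed")


class KVVnodeModel:
    """Toy model of a KV vnode with "object modified" hooks."""

    def __init__(self, backend=None):
        self.backend = {} if backend is None else backend
        self.hooks = []  # callables of the form hook(event, key, value)

    def _notify(self, event, key, value):
        for hook in self.hooks:
            hook(event, key, value)

    def _backend_put(self, key, value):
        """Single low-level write path shared by every caller."""
        if key in self.backend and self.backend[key] == value:
            return False                  # unchanged: nothing to index
        self.backend[key] = value         # may raise BackendError
        self._notify("put", key, value)   # reached only after a successful write
        return True

    # All three sources of modification funnel into _backend_put.
    def client_put(self, key, value):
        return self._backend_put(key, value)

    def read_repair(self, key, value):
        return self._backend_put(key, value)

    def handoff_put(self, key, value):
        return self._backend_put(key, value)

    def delete(self, key):
        """Deletion is a separate path, so it needs its own hook call."""
        if key not in self.backend:
            return False
        del self.backend[key]
        self._notify("delete", key, None)
        return True


events = []
vnode = KVVnodeModel()
vnode.hooks.append(lambda event, key, value: events.append((event, key)))
vnode.client_put("k1", "v1")    # changed, hook fires
vnode.client_put("k1", "v1")    # unchanged, hook does not fire
vnode.read_repair("k1", "v2")   # changed, hook fires
vnode.handoff_put("k2", "v1")   # changed, hook fires
vnode.delete("k2")              # separate path, hook fires
print(events)
```

Placing the notification after the backend write means a raised `BackendError` skips the hook, which matches the "changed and written successfully" condition in the list.
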
Searching
----------
