Skip to content

router: make full map_callrw with split args#644

Open
mrForza wants to merge 5 commits into
tarantool:masterfrom
mrForza:gh-559-full-map-call-rw-with-split-args
Open

router: make full map_callrw with split args#644
mrForza wants to merge 5 commits into
tarantool:masterfrom
mrForza:gh-559-full-map-call-rw-with-split-args

Conversation

@mrForza
Copy link
Copy Markdown
Contributor

@mrForza mrForza commented Mar 4, 2026

This patch introduces a new way of map_callrw execution by which we can
pass some arguments to all storages and split buckets' arguments to those
storages that have at least one bucket of bucket_ids. To achieve this we
introduce a new string option - mode to map_callrw api.

Also we change the logic of router_ref_storage_all ref function. Firstly
we ref all storages and get back an amount of "moved" buckets according
to the previously built router's cache. Then if there are no "moved"
buckets we accumulate and check total amount of buckets on all storages
and finish map_callrw ref stage. Otherwise, if there are some "moved"
buckets we perform the second network hop by checking on which replicasets
do the remaining "moved" buckets reside on.

Closes #559

Copy link
Copy Markdown
Collaborator

@Serpentian Serpentian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well done, only one major comment (upgrade), we must address, other ones are nits and smth to think about)

Comment thread vshard/router/init.lua Outdated
Comment thread vshard/router/init.lua
end
-- Netbox async requests work only with active connections.
-- So, we need to wait for the master connection explicitly.
timeout = deadline - fiber_clock()
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's consider returning the remaining timeout from grouped_buckets, so that we don't have to do that crutchy things with deadline. Moreover, fiber_clock is not so cheap, as it seems (it goes to C and back), and if you return the remaining timeout from grouped_buckets you'll need to make one additional fiber_clock there and not two of them inside router_ref_prepare

Consider moving that to the separate commit, since the same applies even before refactoring

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The grouped_buckets function is used in router API. If our new version of grouped_buckets will return timeout in addition to grouped_buckets and err it can cause some problems of our clients who used this function in their services and don't expect returning of 3 params instead of 2.

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Firstly, buckets_group is internal, since its name starts with _ in public API, so we can change it however we like, user application should not rely on that function. Secondly, if we really care, we can always make a wrapper around buckets_group, so that it returns only 2 arguments, as it was before.

But it's too minor to care about, so feel free to resolve. Up to you

Comment thread vshard/router/init.lua Outdated
Comment thread vshard/router/init.lua
Comment thread vshard/router/init.lua Outdated
-- high-level ref functions (such as router_ref_storage_all and router_ref_
-- storage_by_buckets).
--
local function router_ref_send(router, timeout, args_builder, grouped_buckets)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, did you consider making it even more general: router_map_callrw_send and router_map_callrw_collect and also reuse them in the replicasets_map_reduce too? There we have almost the same code. Not a call to action, just something to think about.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, router_map_callrw_prepare/send/collect sounds better.

But I don't think that we can easily reuse these functions in replicasets_map_reduce because:

  1. We need to change api of router_map_callrw_send by
    • changing router param into replicasets_all, as replicasets_map_reduce does not have router variable in it.
    • adding extra argument - return_raw
  2. The logic of sending map stage in replicasets_map_reduce is more complex than logic of router_map_callrw_send. In last one we only do RPC with no arguments or with grouped_buckets[rs_id] and that's all. But in replicasets_map_reduce we need to dynamically change those arguments which should be passed in RPC. Before RPC we add bucket arguments to args and after RPC we delete them. The main problem here is that we can only easily add groupped buckets in args builder, but not delete, because the deletion is happened after RPC. If we want to do it in our new router_map_callrw_send function we need to complicate args_builder and use table.deepcopy (it can slow down perf).
  3. We need to change api of router_map_callrw_collect by passing extra return_raw argument.
  4. Also router_map_callrw_collect will be complicated as we should add two different ways of extracting results.

Copy link
Copy Markdown
Collaborator

@Serpentian Serpentian Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changing router param into replicasets_all, as replicasets_map_reduce does not have router variable in it.

1.1 Yes, and it will become replicasets_map_send, which is way cleaner, since you will explicitly pass the replicasets and not use some non obvious loigic to create the list of replicasets based on optional argument grouped_buckets (which won't be needed). You can return replicasets from the router_map_prepare.

adding extra argument - return_raw

1.2. Let's just pass the opts and add is_async explicitly inside the function. And call the funciton router_map_callrw_send_async. The name should always say, what the function does.

The logic of sending map stage in replicasets_map_reduce is more complex than logic of router_map_callrw_send.

  1. Agree, see no other way than deepcopy.

We need to change api of router_map_callrw_collect by passing extra return_raw argument.
Also router_map_callrw_collect will be complicated as we should add two different ways of extracting results.

  1. You can always just create the new function router_map_callrw_collect_raw and not pass the raw to it

Again, up to you, if you think, that the current variant is better, I'm ok

Comment thread vshard/storage/init.lua Outdated
Comment thread vshard/storage/init.lua
return res
end

local function storage_ref_check_existent(rid, bucket_ids)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe storage_ref_check_with_present_buckets, so that the name is similar to storage_ref_make_with_buckets

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

storage_ref_check_with_present_buckets is very long name. May be a little shorter?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But storage_ref_check_existent doesn't say, what the function does. Maybe smb will propose better naming, I don't really like the current one, but I don't insist

Comment thread vshard/storage/init.lua Outdated
local allstatus = consts.BUCKET
for _, bucket_id in pairs(bucket_ids) do
local bucket = box.space._bucket:get{bucket_id}
if bucket and bucket.status ~= allstatus.GARBAGE and
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for allstatus, we have BGARGABE. If you don't mind, we can also refactor that small thing in the bucket_get_moved

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Copy link
Copy Markdown
Collaborator

@Serpentian Serpentian Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And in the buckets_are_all_rw_not_cache) Will be ok in the second commit

@Serpentian Serpentian requested a review from Gerold103 March 12, 2026 18:23
@Serpentian Serpentian assigned Gerold103 and mrForza and unassigned Serpentian Mar 12, 2026
Copy link
Copy Markdown
Contributor

@kamenkremen kamenkremen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great patch! Left some minor comments below

Comment thread vshard/router/init.lua Outdated
local futures = {}
local opts_async = {is_async = true}
local replicasets_all = router.replicasets
local rs_ids = grouped_buckets and grouped_buckets or replicasets_all
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't that equivalent to local rs_ids = grouped_buckets or replicasets_all?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread vshard/router/init.lua

--
-- Perform Ref stage of the Ref-Map-Reduce process on a subset of all the
-- replicasets, which contains all the listed bucket IDs.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment got cut out

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread vshard/router/init.lua Outdated
--
-- Ref stage: collect.
--
futures = futures or {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

futures is already either {}(because of line 820) or some table, so we don't need this

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread vshard/router/init.lua Outdated
-- Group the buckets by replicasets according to the router cache.
grouped_buckets, err = buckets_group(router, bucket_ids, timeout)
if err ~= nil then
return nil, err
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: triple space in indentation

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread vshard/router/init.lua
Comment on lines +1157 to +1158
return nil, lerror.make('Router can\'t execute map_callrw with ' ..
'\'partial\' mode and nil bucket_ids')
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return nil, lerror.make('Router can\'t execute map_callrw with ' ..
'\'partial\' mode and nil bucket_ids')
return nil, lerror.make("Router can't execute map_callrw with " ..
"'partial' mode and nil bucket_ids")

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I prefer to use single quotes for string literals in lua. The same for other comments below.

Comment thread vshard/router/init.lua
Comment on lines +1161 to +1162
return nil, lerror.make('Router can\'t execute map_callrw with ' ..
'\'full\' mode and numeric bucket_ids')

This comment was marked as duplicate.

Comment on lines +696 to +697
t.assert_equals(res.err.message, 'Router can\'t execute map_callrw ' ..
'with \'full\' mode and numeric bucket_ids')

This comment was marked as duplicate.

Comment on lines +708 to +709
t.assert_equals(res.err.message, 'Router can\'t execute map_callrw ' ..
'with \'partial\' mode and nil bucket_ids')

This comment was marked as duplicate.

@kamenkremen kamenkremen removed their assignment Mar 20, 2026
@mrForza mrForza force-pushed the gh-559-full-map-call-rw-with-split-args branch from 1bbdade to 8962951 Compare March 30, 2026 10:54
mrForza added 5 commits April 1, 2026 15:50
Before this patch the main `map_callrw` ref functions such as
`router_ref_storage_all` and `router_ref_storage_by_buckets` were
enormous (71 and 108 lines of code). Also these functions have a large
number of similar functional code blocks such as "sending refs",
"collecting refs" e.t.c. Since in tarantool#559 patch we will
extend the logic of full map_callrw making it able to work with split
args, the `router_ref_storage_all` can double in size. It can lead to
degradation of our codebase due to less readability.

To fix it we firstly determine general and repeated code blocks in ref
functions:
1) `ref-prepare`: groups buckets by replicasets with router's cache,
   builds a table of "target" replicasets and waits necessary masters.
2) `ref-send`: sends refs to the remote storage asynchronously and builds
   a table of future objects for the next processing.
3) `ref-collect`: waits until future objects are ready in order to
   extract payload from it (responses of storages' functions).
4) `ref-process`: a custom logic for `full` or `partial` map_callrw
   modes which describes how we should process results from future
   objects.

After defining the main stages of ref map_callrw functions we should
unify them so that we can use them in both `router_ref_storage_all` and
`router_ref_storage_by_buckets`.

Needed for tarantool#559

NO_TEST=refactoring
NO_DOC=refactoring
In this patch we change `allstatus.GARBAGE/SENT` on `BACTIVE/BSENT` to
not repeat the code.

Needed for tarantool#214

NO_DOC=refactoring
NO_TEST=refactoring
This patch takes initialization of `rid` out to `router_map_callrw` and
passes this variable to ref-functions. It is needed for future features
tidiness, for example - `make full map_callrw with split args` in which
the logic of `router_map_callrw` becomes more complex.

Needed for tarantool#559

NO_DOC=refactoring
NO_TEST=refactoring
Before this patch the `router-luatest/reload_test` checked router's
services only with old routers. However in future patch (tarantoolgh-214) we
need to check map_callrw with old storages.

In order to make it able we:
1) change `vtest.cluster_new` so that we can pass server_config with
   certain ENV (LUA_PATH) variable into it. It can help us to create
   a new cluster on old version of vshard.
2) change `reload_router` to more general `reload_server` in order to
   unify the process of servers (router / storage) upgrade in
   `router-luatest/reload_test`.
3) unify the process of cluster creation -
   `create_cluster_on_specific_version` and the process of getting
   server's config with new ENV (LUA_PATH) variable -
   `get_config_for_specific_vshard_version`.

Needed for tarantool#214

NO_DOC=refactoring
NO_TEST=refactoring
This patch introduces a new way of `map_callrw` execution by which we can
pass some arguments to all storages and split buckets' arguments to those
storages that have at least one bucket of `bucket_ids`. To achieve this we
introduce a new string option - `mode` to `map_callrw` api.

Also we change the logic of `router_ref_storage_all` ref function. Firstly
we ref all storages and get back an amount of "moved" buckets according
to the previously built router's cache. Then if there are no "moved"
buckets we accumulate and check total amount of buckets on all storages
and finish map_callrw ref stage. Otherwise, if there are some "moved"
buckets we perform the second network hop by checking on which replicasets
do the remaining "moved" buckets reside on.

Closes tarantool#559

@TarantoolBot document

Title: vshard: `mode` option for `router.map_callrw()`

This string option regulates on which storages the user function will be
executed via `map_callrw`. Possible values:
1) mode = 'partial'. In this mode user function will be executed on
   storages that have at least one bucket of 'bucket_ids'. The
   'bucket_ids' option can be presented in two ways: like a numeric array
   of buckets' ids or like a map of buckets' arguments. In first one user
   function will only receive args, in second one it will additionally
   receive buckets' arguments.
2) mode = 'full'. In this mode user function will be executed with args on
   all storages in cluster. If we pass 'bucket_ids' like a map of bucket's
   arguments the user function will additionally receive buckets'
   arguments on those storages that have at least one bucket of
   'bucket_ids'.

If we didn't specify the 'mode' option, then it is set based on
'bucket_ids' option - if 'bucket_ids' is presented, the mode will be
'partial' otherwise 'full'. Also now `map_callrw` ends with error in
cases of `<mode = 'full', bucket_ids = {1, 2, ...}>` and `<mode =
'partial', bucket_ids = nil>`.
@mrForza mrForza force-pushed the gh-559-full-map-call-rw-with-split-args branch from 8962951 to 8f5223c Compare April 1, 2026 12:58
@mrForza mrForza requested a review from Serpentian April 1, 2026 13:33
@mrForza mrForza assigned Serpentian and unassigned Gerold103 and mrForza Apr 1, 2026
Copy link
Copy Markdown
Collaborator

@Serpentian Serpentian left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any major comments, I think it's time to ask @Gerold103 for review, so that we can be sure, that there no major flaws, which I've missed

local server = g.cluster:build_server({
alias = replica_name,
box_cfg = box_cfg,
env = server_config.env
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not a server config, if you use just one variable from it, let's instead set server_config.alias and server_config.box_cfg and pass it fully to build_server

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

local global_cfg

local function get_config_for_specific_vshard_version(hash)
git_util.exec('checkout', {args = hash .. ' -f', dir = g.vshard_copy_path})
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is definitely not about get config, it checkouts the repo. From the function name it looks as read-only, but it's not. Let's either say about that in the function name or move it out of the function (I'd prefer the latter, it's not that widely used)

-- The test works in the following directory
local vardir = vtest.vardir or fio.tempdir()
g.vshard_copy_path_load = vardir .. '/vshard_copy'
t.assert_equals(fio.mkdir(g.vshard_copy_path_load), true)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The catch_flaky workflow fails on that line, it looks like the test doesn't support parallel runs, since the vardir points to /tmp/t/00X_router-luatest , which is shared among several workers. We should use smth else


g.before_test('test_map_callrw', function(g)
g.cluster:drop()
-- Full mapp_callrw with split args was introduced just right after
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: mapp

And let's be consistent and specify the name of the commit:

    -- Latest meaningful commit:
    --     "router: fix reload problem with global function refs".

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

-- storage versions in order to check that there will be no crashes
-- of storages due to changes in storage_ref_* functions.
create_cluster_on_specific_version(
'1be7b8e1055ecd2f2033d6304408e246b0f2ba46')
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not an option, the hash of commits change on merge, let's pick some commit from the master and not your branch

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


g.after_test('test_map_callrw', function(g)
g.cluster:drop()
create_cluster_on_specific_version(g.latest_hash)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need to recreate it, it's enough to drop your cluster, it'll be automatically created in before_each later

Comment thread vshard/storage/init.lua
return res
end

local function storage_ref_check_existent(rid, bucket_ids)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But storage_ref_check_existent doesn't say, what the function does. Maybe smb will propose better naming, I don't really like the current one, but I don't insist

Comment thread vshard/router/init.lua Outdated
local replicasets_all = router.replicasets
local deadline = fiber_clock() + timeout
if bucket_ids then
bucket_ids = bucket_ids or {}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be in the first commit

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread vshard/router/init.lua
local bucket_ids = {}
for _, res in pairs(results) do
for _, bucket in pairs(res.moved) do
if type(res) ~= 'table' then
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we start checking for return type and why we don't do that in the router_map_callrw_process_existent

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread vshard/router/init.lua
args, grouped_args, {
timeout = timeout, return_raw = do_return_raw
})
opts = {timeout = timeout, return_raw = do_return_raw, mode = mode}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You don't need mode, it's unused in the replicasets_map_reduce

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@Serpentian Serpentian assigned Gerold103 and mrForza and unassigned Serpentian Apr 9, 2026
Copy link
Copy Markdown
Collaborator

@Gerold103 Gerold103 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for working on this complex topic and being patient about comments 🙏.

Comment thread vshard/router/init.lua
-- high-level ref functions (such as router_ref_storage_all and router_ref_
-- storage_by_buckets).
--
local function router_map_callrw_send(router, timeout, args_builder,
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name send is very generic. When I saw this function, I was confused what it is sending - ref? function calls? Since this stage is about refs exclusively, I suggest to include "ref" into the name.

Same about router_map_callrw_collect() - collect what: results of refs? results of the user function calls? From its code I see that it is only applicable to refs. So lets name its name reflect that, and be in sync with the send-function.

Something like '..._ref_send' + '..._ref_wait'?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, router_map_callrw_ref_send and router_map_callrw_ref_wait sound better

Comment thread vshard/router/init.lua
-- Waits until all future objects are ready and extracts results from it.
--
local function router_map_callrw_collect(futures, timeout)
local results = {}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was at least one reason in the previous code to have the stages not extracted into separate functions - the ability to reuse Lua tables. Lua table, as we use them here, are essentially just hash-tables. With all the consequences, that they are allocated on the heap, might need re-allocation, re-hashing, etc. And the goal was to minimize the number of such tables created and destroyed per request.

But I see also your point about code readability. I just hope it won't hurt perf. Probably we shouldn't notice much, since map-requests in general are quite expensive anyway due to other bigger reasons, than Lua tables.

Copy link
Copy Markdown
Contributor Author

@mrForza mrForza Apr 15, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite understand how the usage of hash-tables in separate functions can significantly degradate the perf of map_callrw.

And the goal was to minimize the number of such tables created and destroyed per request.

In our new map_callrw version we additionally create only results table after router_map_callrw_ref_send - this is the only place where some extra time can be spend on hash-table actions. futures, grouped_buckets, replicasets_to_check(wait) and bucket_ids we also created in the previous version of map_callrw (there is nothing new here).

Comment thread vshard/router/init.lua
local replicasets_all = router.replicasets
local rs_ids = grouped_buckets or replicasets_all
for rs_id, _ in pairs(rs_ids) do
local args_ref = args_builder(rs_id)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks expensive and complicated. It worries me that this requires a function call + each of them will return a new Lua table, while previously it was possible to reuse them.

I don't know tbh. Is this code really much more readable? For example: local rs_ids = grouped_buckets or replicasets_all. That looks quite untrivial to me. The caller not only must be aware of how to make this args_builder callback, but also know that grouped_buckets is optional and its default is "all replicasets".

Comment thread vshard/router/init.lua Outdated
Comment on lines +866 to +871
for _, bucket in pairs(res.moved) do
if type(res) ~= 'table' then
goto continue
end
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is it then if not a table?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A number for example. During the send stage of router_ref_storage_all we make RPCs of next functions: storage_ref and storage_ref_make_with_buckets.

storage_ref always returns a number.
storage_ref_make_with_buckets always returns a table.

In router_map_callrw_process_moved we use type(res) ~= 'table' in order to explicitly distinguish from which function the result was came back (we need only moved buckets which can be received only from storage_ref_make_with_buckets).

Comment thread vshard/router/init.lua
return bucket_ids
end

local function router_map_callrw_process_existent(router, results)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect you have meant here existing, not existent. The meaning is slightly different. It also took me a while to understand what this function does. Given that it is used in a single place, and looks very short, maybe lets better inline it. Perhaps the purpose of its code will be more obvious in the inlined place.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread vshard/router/init.lua Outdated
Comment on lines 895 to 923
args_builder = function() return {'storage_ref', rid, timeout} end
args_builder = function(rs_id)
local buckets = grouped_buckets[rs_id] or {}
if grouped_buckets[rs_id] then
return {'storage_ref_make_with_buckets', rid, timeout, buckets}
else
return {'storage_ref', rid, timeout}
end
end
timeout, err, err_id, futures = router_map_callrw_send(router, timeout,
args_builder)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of curiosity. How much shorter is the code now compared with when we would inline router_map_callrw_send right here?

Comment thread vshard/storage/init.lua
return ok, ret1, ret2, ret3
end

local function bucket_get_existent(bucket_ids)
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same about existent vs existing.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Comment thread vshard/storage/init.lua
Comment on lines +3185 to +3186
local bucket = box.space._bucket:get{bucket_id}
if bucket and bucket.status ~= BGARBAGE and bucket.status ~= BSENT then
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not completely sure, so this is why I am asking. Do you know if space:get() returns nil or box.NULL when nothing is found? If the latter, then this code will break when nothing is found, because box.NULL is cast to true implicitly. But perhaps I am wrong and :get() returns nil.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

get method of space object always returns nil if there is no such tuple.

reproducer
box.cfg{}

box.schema.create_space('test')

box.space.test:format({{name = 'id', type = 'unsigned'}})

box.space.test:create_index('pk', {parts = {'id'}})

a = box.space.test:get(123)

type(a) -- nil

b = box.NULL

type(b) -- cdata

Comment thread vshard/storage/init.lua
local function bucket_get_existent(bucket_ids)
local res = {}
for _, bucket_id in pairs(bucket_ids) do
local bucket = box.space._bucket:get{bucket_id}
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is expensive to go space-get on each bucket ID, given the potential number of those ids. I suggest to use bucket ref cache instead. High chance, that most buckets, if not all, will be already there. Lua tables and orders of magnitude faster than spaces.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Full map_callrw with split args

4 participants