Skip to content

Starting an iterator in the middle of a values list in a label-ordered RecordTrie #30

Open
@gmossessian

Description

@gmossessian

Hi Mike,

Thanks for building this wrapper and providing the additional Bytes and RecordTrie classes, they're great and extremely useful and have been largely easy to build additional features into.

I would like to implement a RecordTrie feature as follows: if there are key-value pairs (u'a', (1, N_1)), (u'a', (2, N_2)), (u'a',(3, N_3)), ..., (u'a', (i, N_i)), and I know that I only need the values stored in N_p through N_q. If i is very large (say, a million), calling my_list = my_trie[u'a'] is prohibitively slow, and so I would like to start the loop at the pth value, that is, at b_prefix = <bytes>u'a'.encode('utf8') + self._b_value_separator + <bytes>bytes(struct(">I", p)).

Of course, if I set Agent key to this, the loop will not continue on to (u'a', (p+1, N_{p+1})), and I cannot figure out how to "trick" the predictive search into continuing to loop past that specific prefix and on to anything with prefix simply <bytes>u'a'.encode('utf8') + self._b_value_separator without resetting the entire loop from the top.

Do you have any idea of how this might be accomplished, or if it is a limitation of the marisa-trie base library?

Thanks,
-George

Metadata

Metadata

Assignees

No one assigned

    Labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions