-
Notifications
You must be signed in to change notification settings - Fork 2
[WIP] pyDKB: storages #277
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: pyDKB-loggable-object
Are you sure you want to change the base?
Conversation
174a9b3 to
d0eb05e
Compare
I presume that the last line of the example should be Side note - I've missed the [WIP] status of the PR, so my apologies if the problems here were already noted and planned to be dealt with later, and I started pointing at them too early. |
Right, it was just a copy-paste mistake, thank you for bringing it to my attention.
It`s Ok, the main idea of creating this PR was exactly to let others see what`s going on and where the development goes. Any comments are welcome, including general ones like "what it is supposed to look like for $this or $that kind of storage?", "suggested way of usage does not look very natural, I`d expect something like $this or $that" etc. |
| else: | ||
| scope = _scope | ||
| if scope is None: | ||
| raise exceptions.StorageException("Storages scope not specified") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| raise exceptions.StorageException("Storages scope not specified") | |
| raise exceptions.StorageException("Storage's scope not specified") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not a mistake (or, better to say, it is a different mistake): the message says that scope of storages, from which should be picked one (or maybe "storage lookup scope"?), is not specified. "Storage scope" is more like a scope of a single storage -- which wouldn`t have much sense in this situation.
Will be changed, thank you!
Sometimes we have to interact with the same storage in different
scripts. When it happens we got to implement same functionality multiple
times: read configuration, check that Python module is available, create
client, ...
This submodule is a place to implement this once and then reuse whenever
it is needed.
Initial module structire:
```
pyDKB
|---> storages
|---> exceptions
| |---> StorageException
| |---> NotFound
|
|---> client
|---> Client
```
Suggested way of the `pyDKB.storages` usage is:
```
from pyDKB import storages
def processA(...):
rucio_client = storages.getClient('rucio', 'atlas')
rucio_client.get_metadata(...)
...
def processB(...):
rucio_client = storages.getClient('rucio', 'atlas')
rucio_client.get_metadata(...)
...
```
The client used in `processA` and `processB` (and at every call of a
function) will be the same instance, initialized at first call. In other
words, client classes for individual storages are "singleton" classes
(as long as used via `getClient()` methods, not directly).
"Scope" abstraction is introduced to keep project-specific fucntionality
localised in a single submodule, preventing it from infiltration into
other, more general modules. Each "scope" is nothing but a submodule of
`pyDKB.storages`. It must contain pre-configured clients for the storages,
used in a given project and support general access method to these
clients (`getClient()` function).
It allows to get clients simply by name, not specifying the scope every
now and again:
```
from pyDKB import storages
storages.setScope('atlas')
def processA(...):
s = storages.getClient('rucio')
s.get_metadata(...)
```
…class.
The idea of the interface to provide common set of methods to interact
with different storages, not to (re-)implement all possible methods of
interaction with storages. But different storages may have specific
methods, and calling them as `my_client.c.client_method_A()` will look a
bit wierd: `my_client.client_method_A()` looks more natural.
To avoid re-implementation of all useful methods in this way:
```
def client_methodA(self, *args, **kwargs):
self.c.client_method_A(*args, **kwargs)
```
multiple inheritance will be used:
```
class MyClient(pyDKB.storages.client.Client, ParentClientClass):
...
```
In case of `ParentClientClass` having same methods as Client, by default
`Client` (interface) method will be used, raising `NotImplementedError`.
If `ParentClientClass` method should be used, it is to be specified
explicitly.
"id" is a reserved word in Python, so it should not be used as variable or parameter name.
Original Rucio `Client` does not have `get()` method.
Currently it is nothing but a wrapper around standard `elasticsearch.Elasticsearch` class, except that its `get()` method is overridden with `Client.get()` that raises `NotImplementedError`.
It allows to pass configuration to the client as hash. `__init__()` accepts parameters in the form consistent with the `elasticsearch.Elasticsearch()` parameters, so it is not possible to pass default index name when the object is created.
Co-Authored-By: Evildoor <[email protected]>
7b74454 to
a349cb6
Compare
...to rebase PR on the current version of the Update:
First push was a bit of a mess, so to check the real diff please use this link: diff between 7b74454 and 37de231. |
a349cb6 to
37de231
Compare
There's only two possibilities: an exception was or was not thrown. Whatever exception it is, it indicates that we failed to import something we wanted to import => so we must return a value that indicates "import has failed" -- `False`.
Who knows what value has this or that attribute one wants to import, right? And `False` looks like a pretty possible one... while string "NOT IMPORTED VALUE" should be less expected and I believe it won't conflict with real attributes values (except itself, but hopefully no one will use this function to import something added as service variable for this function itself).
Rewrites #244
This PR introduces to pyDKB library new submodule storages.
The module is intended to accumulate all the functionality related to the interaction with the storages we use, including both external and internal ones. It basically provides unified way to interact with any of them: object of
Clientclass, that understands basic set of methods, like:get()request (get item by ID);custom (parametric)query.Current modules structure is as follows:
Suggested way to use the new functionality is like this:
For storages like ES will be added following submodules:
pyDKB.storages.client.es-- wrapper for ES client withpyDKB.storages.client.Clientinterface;pyDKB.storages.atlas.chicagoES-- Chicago ES client;pyDKB.storages.atlas.dkb.es-- DKB ES client.