DNF-JSON #30
Thanks for writing this down. My comments follow...
I think that rather than talking about "dependencies of a distribution", we should talk about "package set of an image". In addition, I would not say that we need to perform two solving requests, but it is what we ended up doing.
From a use-case point of view, I would say that we have two cases in composer:
osbuild-composer currently uses approach (1.) for all package sets, while ideally it should use (2.) when handling Blueprint (user-provided) packages. Ideally, while composing the image OS package set (the actual payload), we should do (1.) + (2.) in one step and install the resulting combined package set.
The important piece is that dep-solving the blueprint package set should not end up pulling in packages that are not a transitive dependency of any of the requested packages and that are already covered by the base image package set. The issue is that we currently see this undesired behavior, because we dep-solve blueprint packages as if no other packages were installed in the image. This effectively turns it into a bootstrapping scenario, which by definition it is not. To the point of
Yes. DNF seems to be able to read the already installed packages only from the RPMDB, which is an unfortunate limitation for us. However, it turns out that dep-solving two transactions in a row gives us what we need - osbuild/osbuild-composer#2125 (comment)
DNF-JSON
Intro
The dnf-json tool serves as a gateway between DNF's Python API and a client.
dnf-json is a daemon. It is started by a systemd socket and answers HTTP requests. These requests must contain a JSON payload respecting the format described below. There are two kinds of requests:
- Depsolve request: used to solve the dependencies assuming an empty system.
- Dump request: used to dump all the packages in each repo.
Description of the JSON interface to the service
Sample input for a depsolve request
- package-specs: the list of the packages to install
- exclude-specs: the list of the packages not to install
- repos: the list of repos to load into the cache

Sample output
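As a rough illustration, a depsolve request carrying the fields above might look like the following. All package names and repo fields here are invented for illustration; the actual format used by dnf-json may carry additional fields.

```json
{
  "package-specs": ["vim-enhanced", "tmux"],
  "exclude-specs": ["fedora-release"],
  "repos": [
    {
      "id": "custom-repo",
      "baseurl": "https://example.com/repo"
    }
  ]
}
```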
Output truncated to show only one dependency.
- checksums: the list of checksums of each in-cache repo
- dependencies: the list of packages to install with their location, name and checksum

Interaction between the daemon side and DNF's Python API for a depsolve request
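A minimal sketch of that interaction, assuming the dnf Python bindings are available. The repo ids, cache handling, and returned fields are illustrative assumptions, not the exact dnf-json implementation:

```python
def depsolve(package_specs, exclude_specs, repo_urls):
    """Resolve package_specs against the given repos, pretending the
    system is empty. Sketch only: repo naming is invented."""
    import dnf  # imported lazily; requires the dnf Python bindings

    base = dnf.Base()
    for i, url in enumerate(repo_urls):
        repo = dnf.repo.Repo("repo-%d" % i, base.conf)
        repo.baseurl = [url]
        base.repos.add(repo)

    # load_system_repo=False: do NOT read the host's RPMDB, so the
    # transaction is solved as if nothing were installed.
    base.fill_sack(load_system_repo=False)

    for spec in package_specs:
        base.install(spec)
    for spec in exclude_specs:
        base.sack.add_excludes(base.sack.query().filter(name=spec))

    base.resolve()
    return [
        {"name": p.name, "location": p.location, "checksum": p.chksum}
        for p in base.transaction.install_set
    ]
```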
dnf configuration
Current procedure to depsolve dependencies for a custom image.
Double depsolve
To obtain the list of dependencies of a distribution we need to perform two solving requests.
The first request contains the list of packages needed to create the base of the distribution, minus all the weak dependencies.
We manually specify the list of the packages we don't want to have in the final distribution.
It is followed by a second solving request for the customer's list of packages to install. This second request does not have weak dependencies disabled, so it will pull in all the dependencies possible.
Issues with this approach
The issue with the second depsolve is that it will pull in packages already resolved by the first solving request, so we end up with a lot of duplicates.
And we can't really make only one request, as some dependencies required by the second request might be masked by the excluded packages of the first one.
The ideal would be to trick DNF into thinking that the packages solved by the first request are already installed on the system when performing the second request.
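The idea of feeding the first result back in as "already installed" can be sketched with a toy resolver. This is not DNF; the package names and the dependency map are invented purely to show why the naive second solve produces duplicates:

```python
# Toy dependency map: package -> direct dependencies (all names invented).
DEPS = {
    "base-os": ["libc"],
    "libc": [],
    "vim": ["libc", "vim-common"],
    "vim-common": [],
}

def depsolve(specs, installed=frozenset()):
    """Return the transitive closure of specs, skipping packages
    that are treated as already installed."""
    result = set()
    stack = list(specs)
    while stack:
        pkg = stack.pop()
        if pkg in installed or pkg in result:
            continue
        result.add(pkg)
        stack.extend(DEPS[pkg])
    return result

# Naive double depsolve: the second solve repeats packages from the first.
first = depsolve(["base-os"])
second_naive = depsolve(["vim"])
assert "libc" in first and "libc" in second_naive  # duplicate

# Desired behavior: treat the first result as installed for the second solve.
second = depsolve(["vim"], installed=frozenset(first))
assert "libc" not in second
```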
Hacks and Bodges
Cache cleaning
A lot of customer customizations involve one-shot repositories that will be cached only once. Because of that we have an ever-growing cache directory. We have to run a routine after each solving that cleans cache directories that haven't been used for a long period of time.
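Such a cleaning routine can be sketched as follows. The 30-day threshold and the use of the directory's mtime as the "last used" signal are assumptions for illustration:

```python
import os
import shutil
import time

MAX_AGE = 30 * 24 * 3600  # 30 days; threshold is an invented example value

def clean_stale_caches(cache_root):
    """Remove cache subdirectories whose mtime is older than MAX_AGE."""
    now = time.time()
    for entry in os.scandir(cache_root):
        if entry.is_dir() and now - entry.stat().st_mtime > MAX_AGE:
            shutil.rmtree(entry.path)
```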
Multiprocessing and the global lock
We have noticed that DNF's Python API leaks memory if we create multiple
dnf.Base
objects one after the other. It might be due to our edgy way of using the API, though. To counteract that, each HTTP request is handled in a separate process. This way the memory is automatically freed when the process terminates.
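The per-request process pattern can be sketched like this. The handler body is a placeholder; in the real daemon it would build a dnf.Base and depsolve, but here it just echoes the request back:

```python
import multiprocessing

def handle_request(request, conn):
    """Runs in a child process; any memory leaked here (e.g. by dnf.Base)
    is reclaimed by the OS when the process exits."""
    # ... in the daemon: build a dnf.Base, depsolve, serialize the result ...
    conn.send({"handled": request})  # placeholder result
    conn.close()

def serve_one(request):
    """Spawn a child per request and collect its result over a pipe."""
    parent_conn, child_conn = multiprocessing.Pipe()
    proc = multiprocessing.Process(target=handle_request,
                                   args=(request, child_conn))
    proc.start()
    result = parent_conn.recv()
    proc.join()  # process exit frees all leaked memory
    return result
```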