Support "concurrency groups" for archiver processes #23130

Open
mneudert opened this issue Mar 14, 2025 · 0 comments
Labels
Enhancement For new feature suggestions that enhance Matomo's capabilities or add a new report, new API etc. Stability For issues that make Matomo more stable and reliable to run for sys admins.

Comments

@mneudert
Member

The Matomo CLI archiver provides simple concurrency control through the --concurrent-archivers argument.

When set, it prevents the number of running archiving processes from exceeding the given limit by checking the server's process list for other running archivers. In multi-tenant setups the instance id (--matomo-domain) is taken into account so that archiving processes are grouped per instance.
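To make the current behaviour concrete, here is a minimal Python sketch of a process-list based check. It is not Matomo's PHP implementation: the function names, the exact matching rules, and the example domains are assumptions made for illustration only.

```python
import shlex


def option_value(command_line, name):
    """Return the value of --name=value (or --name value), else None."""
    tokens = shlex.split(command_line)
    for index, token in enumerate(tokens):
        if token.startswith(f"--{name}="):
            return token.split("=", 1)[1]
        if token == f"--{name}" and index + 1 < len(tokens):
            return tokens[index + 1]
    return None


def has_reached_max(other_processes, max_archivers, matomo_domain=None):
    """True if enough other archivers of this instance are already running."""
    running = 0
    for command_line in other_processes:
        if "core:archive" not in shlex.split(command_line):
            continue  # not an archiver process
        if option_value(command_line, "matomo-domain") != matomo_domain:
            continue  # archiver of a different instance
        running += 1
    return running >= max_archivers


# Hypothetical process list as seen by a newly started archiver.
others = [
    "php console core:archive --matomo-domain=a.example --concurrent-archivers=2",
    "php console core:archive --matomo-domain=b.example --concurrent-archivers=2",
    "php console core:clear-caches",
]
print(has_reached_max(others, 1, "a.example"))  # True
print(has_reached_max(others, 2, "a.example"))  # False
```

Note that nothing except the instance id distinguishes one archiver from another here, which is the root of the problem described next.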

Archivers with different configurations

This works well until there is a need to add temporary archivers to work on a short-term backlog, for example after creating several new segments.

When trying to run different configurations for archivers on the same host, the current concurrency handling can also lead to unintended balancing.

For example, the following crontab:

15 * * * * console core:archive --concurrent-archivers=2 --skip-idsites=1
45 * * * * console core:archive --concurrent-archivers=2 --force-idsites=1

It could be created with the intent of having one archiver work on "site 1" and one archiver work on all other sites.

However, both entries share the same limit and every running archiver counts toward it, regardless of its other arguments. Depending on the timing, this will eventually lead to "two archivers work on site 1, none on the rest", or vice versa. And once an archiver is started manually to work on a bigger invalidation backlog, it can prevent the crontab entries from being executed because the concurrency slots are occupied by the manually started process.
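The timing problem can be illustrated with a small Python simulation. The run times and labels below are invented for the example; the only rule taken from the description above is that every running archiver occupies one of the shared slots.

```python
def simulate(events, limit):
    """Replay start/finish events against one shared concurrency limit."""
    running = []
    rejected = []
    for action, name in events:
        if action == "start":
            if len(running) < limit:
                running.append(name)
            else:
                rejected.append(name)  # archiver exits, all slots are taken
        elif action == "finish":
            running.remove(name)
    return running, rejected


# Hypothetical timeline for the crontab above (--concurrent-archivers=2).
events = [
    ("start", "other-sites 01:15"),
    ("finish", "other-sites 01:15"),
    ("start", "site-1 01:45"),         # long-running, still busy hours later
    ("start", "other-sites 02:15"),
    ("finish", "other-sites 02:15"),
    ("start", "site-1 02:45"),         # allowed: only one archiver is running
    ("start", "other-sites 03:15"),    # rejected: both slots work on site 1
]
running, rejected = simulate(events, limit=2)
print(running)   # ['site-1 01:45', 'site-1 02:45']
print(rejected)  # ['other-sites 03:15']
```

Both slots end up on "site 1" and the archiver for all other sites cannot start until one of them finishes.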

Concurrency groups

One option to solve this problem would be to introduce an additional concurrency group flag:

15 * * * * console core:archive --concurrent-archivers=1 --skip-idsites=1 --concurrency-group=other-sites
45 * * * * console core:archive --concurrent-archivers=1 --force-idsites=1 --concurrency-group=primary-site

The internal CronArchive::hasReachedMaxConcurrentArchivers could then check whether the current process was started for a given concurrency group and ignore processes in other groups. This would allow more fine-grained control over what the archiving work is spent on, and would make it possible to raise the number of archivers on one server manually without changing the existing (cron) setup.
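A rough Python sketch of such a group-aware check could look like the following. It is not the actual PHP code of CronArchive; the function names are hypothetical, and treating archivers started without the proposed --concurrency-group flag as one separate default group is an assumption, not something decided in this proposal.

```python
import shlex


def option_value(command_line, name):
    """Return the value of --name=value (or --name value), else None."""
    tokens = shlex.split(command_line)
    for index, token in enumerate(tokens):
        if token.startswith(f"--{name}="):
            return token.split("=", 1)[1]
        if token == f"--{name}" and index + 1 < len(tokens):
            return tokens[index + 1]
    return None


def has_reached_max_in_group(other_processes, max_archivers, group=None):
    """Count only other archivers that belong to the same concurrency group."""
    same_group = [
        command_line
        for command_line in other_processes
        if "core:archive" in shlex.split(command_line)
        # Assumption: a missing flag means the default group (None).
        and option_value(command_line, "concurrency-group") == group
    ]
    return len(same_group) >= max_archivers


# Hypothetical process list: one cron archiver plus a manually started one.
others = [
    "console core:archive --concurrent-archivers=1 --force-idsites=1"
    " --concurrency-group=primary-site",
    "console core:archive --force-idsites=1",
]
print(has_reached_max_in_group(others, 1, "other-sites"))   # False
print(has_reached_max_in_group(others, 1, "primary-site"))  # True
```

With this rule the manually started archiver no longer occupies a slot of either cron entry, and the "other-sites" archiver can start even while "site 1" is being processed.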
