
Schedule Sortinghat jobs using GrimoireLab core scheduler #122

Open
jjmerchante wants to merge 2 commits into chaoss:main from jjmerchante:sortinghat-jobs

Conversation

@jjmerchante jjmerchante commented Nov 11, 2025

This PR adds SortingHat tasks to the GrimoireLab scheduler. This allows SortingHat jobs such as affiliate, unify, genderize, import identities, and generate recommendations to be run from GrimoireLab.

It also adds new generic API endpoints to list, create, retrieve, reschedule, and cancel tasks:

/api/v1/tasks/
/api/v1/tasks/<str:task_type>/
/api/v1/tasks/<str:task_type>/<str:uuid>/
/api/v1/tasks/<str:task_type>/<str:uuid>/reschedule/
/api/v1/tasks/<str:task_type>/<str:uuid>/cancel/
/api/v1/tasks/<str:task_type>/<str:task_id>/jobs/
/api/v1/tasks/<str:task_type>/<str:task_id>/jobs/<str:uuid>/
/api/v1/tasks/<str:task_type>/<str:task_id>/jobs/<str:uuid>/logs

Currently, <task_type> can be 'eventizer', 'recommend_affiliations', 'affiliate', 'recommend_matches', 'unify', 'recommend_gender', 'genderize', or 'import_identities'.
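
As a rough illustration of how the routes above fit together, the following sketch builds the paths for a given task type. The helper names `task_path` and `job_path` are hypothetical and not part of this PR; only the URL patterns and the task type names come from the description above.

```python
# Task types listed in the PR description.
TASK_TYPES = {
    "eventizer", "recommend_affiliations", "affiliate", "recommend_matches",
    "unify", "recommend_gender", "genderize", "import_identities",
}

API_PREFIX = "/api/v1/tasks/"


def task_path(task_type=None, task_uuid=None, action=None):
    """Build the path for the task list, a task type, one task, or an action on it."""
    if task_type is None:
        return API_PREFIX
    if task_type not in TASK_TYPES:
        raise ValueError(f"Unknown task type: {task_type}")
    path = f"{API_PREFIX}{task_type}/"
    if task_uuid is None:
        if action is not None:
            raise ValueError("An action needs a task uuid")
        return path
    path += f"{task_uuid}/"
    if action is not None:
        if action not in ("reschedule", "cancel"):
            raise ValueError(f"Unknown action: {action}")
        path += f"{action}/"
    return path


def job_path(task_type, task_id, job_uuid=None, logs=False):
    """Build the path for the jobs of a task, one job, or the logs of that job."""
    path = task_path(task_type, task_id) + "jobs/"
    if job_uuid is not None:
        path += f"{job_uuid}/"
        if logs:
            # The logs route is listed without a trailing slash.
            path += "logs"
    return path
```

For example, `task_path("unify", "abc", "cancel")` yields `/api/v1/tasks/unify/abc/cancel/`.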

Fixes chaoss/grimoirelab#798

@jjmerchante

Jobs can still be scheduled using the GraphQL API, but there won’t be any worker to process them. We may need a way to disable this in the settings.

@jjmerchante jjmerchante changed the title from "Sortinghat jobs" to "Schedule Sortinghat jobs using GrimoireLab core scheduler" on Nov 11, 2025
@sduenas sduenas left a comment

Please check my comments. Also, update the commit message. It says the API calls are `/api/v1/<str:task_type>/` but they should be `/api/v1/tasks/<str:task_type>/`.

Comment on lines +25 to +36
from django.db.models import CharField, TextChoices
from sortinghat.core.importer.backend import find_import_identities_backends
from sortinghat.core.jobs import (
    recommend_affiliations,
    recommend_matches,
    recommend_gender,
    affiliate,
    unify,
    genderize,
    import_identities,
)
from sortinghat.core.context import SortingHatContext

Suggested change
-from django.db.models import CharField, TextChoices
-from sortinghat.core.importer.backend import find_import_identities_backends
-from sortinghat.core.jobs import (
-    recommend_affiliations,
-    recommend_matches,
-    recommend_gender,
-    affiliate,
-    unify,
-    genderize,
-    import_identities,
-)
-from sortinghat.core.context import SortingHatContext
+from django.db.models import CharField, TextChoices
+from sortinghat.core.context import SortingHatContext
+from sortinghat.core.importer.backend import find_import_identities_backends
+from sortinghat.core.jobs import (
+    recommend_affiliations,
+    recommend_matches,
+    recommend_gender,
+    affiliate,
+    unify,
+    genderize,
+    import_identities,
+)

Comment on lines +191 to +201
class SortingHatTask(Task):
    """Base class for SortingHat tasks."""

    class JobType(TextChoices):
        AFFILIATE = "affiliate", "Affiliate"
        UNIFY = "unify", "Unify"
        GENDERIZE = "genderize", "Genderize"
        RECOMMEND_AFFILIATIONS = "recommend_affiliations", "Recommend affiliations"
        RECOMMEND_MATCHES = "recommend_matches", "Recommend matches"
        RECOMMEND_GENDER = "recommend_gender", "Recommend gender"
        IMPORT_IDENTITIES = "import_identities", "Import identities"

Why did you decide to have just one type of task and several types of jobs? Wouldn't it be better to have several tasks? I think it's easier to handle parameters if the task types are different. They can inherit from SortingHatTask (or IdentitiesTask) if needed.
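
A minimal sketch of what that could look like, using plain dataclasses instead of the Django models so it runs on its own. The subclass names, their parameters (`uuids`, `backend_name`, `url`), the default values, and the `job_args()` helper are all assumptions for illustration, not code from this PR.

```python
from dataclasses import dataclass
from typing import ClassVar, Dict, List, Optional, Type


@dataclass
class SortingHatTask:
    """Stand-in for the real base class, which would extend the scheduler's Task model."""

    TASK_TYPE: ClassVar[str] = ""

    job_interval: int = 86400   # illustrative default
    job_max_retries: int = 3    # illustrative default

    def job_args(self) -> dict:
        """Arguments passed to the job; each subclass returns only its own."""
        return {}


@dataclass
class AffiliateTask(SortingHatTask):
    TASK_TYPE: ClassVar[str] = "affiliate"

    uuids: Optional[List[str]] = None  # hypothetical parameter

    def job_args(self) -> dict:
        return {"uuids": self.uuids}


@dataclass
class ImportIdentitiesTask(SortingHatTask):
    TASK_TYPE: ClassVar[str] = "import_identities"

    backend_name: str = ""  # hypothetical parameters
    url: str = ""

    def job_args(self) -> dict:
        return {"backend_name": self.backend_name, "url": self.url}


# Lookup from the <task_type> URL segment to the class that handles it.
TASK_CLASSES: Dict[str, Type[SortingHatTask]] = {
    cls.TASK_TYPE: cls for cls in (AffiliateTask, ImportIdentitiesTask)
}
```

With one class per task type, each class declares only the parameters its job accepts, so validation does not need to branch on a job type field.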

    task_args, job_interval, job_max_retries, burst=burst, *args, **kwargs
)
try:
    job_type = kwargs["job_type"]

You can pass the type of job directly in create_task as a parameter, so you don't have to get it from kwargs.
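
For illustration, a self-contained sketch of that suggestion. `ExampleTask` is a hypothetical stand-in for the task model, and the parameter order is an assumption based on the fragment above rather than the actual signature in grimoirelab-core.

```python
class ExampleTask:
    """Hypothetical stand-in for a scheduler task model."""

    def __init__(self, task_args, job_interval, job_max_retries, burst, job_type):
        self.task_args = task_args
        self.job_interval = job_interval
        self.job_max_retries = job_max_retries
        self.burst = burst
        self.job_type = job_type

    @classmethod
    def create_task(cls, task_args, job_interval, job_max_retries, job_type, burst=False):
        # job_type is an explicit, required parameter, so there is no
        # kwargs["job_type"] lookup and no KeyError to handle here.
        return cls(task_args, job_interval, job_max_retries, burst, job_type)


task = ExampleTask.create_task({}, 3600, 3, job_type="unify")
```

A caller that forgets `job_type` then gets a `TypeError` at the call site instead of a `KeyError` raised inside the method.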

Comment on lines +49 to +54
"""
Serializer mixin for Task models to be used for get and create views of tasks.

Subclasses should define the `model` in Meta and can extend the `fields` list
and create_scheduler_task_args() method as needed.
"""

Suggested change
-"""
-Serializer mixin for Task models to be used for get and create views of tasks.
-Subclasses should define the `model` in Meta and can extend the `fields` list
-and create_scheduler_task_args() method as needed.
-"""
+"""Serializer mixin for Task models to be used for get and create views of tasks.
+Subclasses should define the `model` in Meta and can extend the `fields` list
+and create_scheduler_task_args() method as needed.
+"""

Would you mind adding some basic documentation to the classes in this file?

self.assertEqual(response.status_code, status.HTTP_400_BAD_REQUEST)
self.assertIn("Unknown task type", str(response.data))

@patch("grimoirelab.core.scheduler.api.schedule_task")

Is this really needed? The mock object is never used.
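
To make the point concrete, here is a generic sketch that is unrelated to the PR's test suite. It patches `os.getcwd` from the standard library: in the first test the mock shapes the result and is asserted on, while in the second the patched function is never reached, so the decorator could be dropped without changing the outcome. A patch can still be justified when it only suppresses a side effect, which is not the case in this toy example.

```python
import os
import unittest
from unittest.mock import patch


def current_dir_name():
    """Return the last component of the current working directory."""
    return os.path.basename(os.getcwd())


class ExampleTest(unittest.TestCase):
    @patch("os.getcwd", return_value="/tmp/project")
    def test_patch_is_used(self, mock_getcwd):
        # The mock controls the return value and is checked afterwards.
        self.assertEqual(current_dir_name(), "project")
        mock_getcwd.assert_called_once()

    @patch("os.getcwd")
    def test_patch_is_not_needed(self, mock_getcwd):
        # Nothing here calls os.getcwd and the mock is never inspected,
        # so removing the decorator would not change the test.
        self.assertEqual(1 + 1, 2)
        mock_getcwd.assert_not_called()
```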

user = get_user_model().objects.create(username="test", is_superuser=True)
self.client.force_authenticate(user=user)

@patch("grimoirelab.core.scheduler.api.reschedule_task")

Is this needed too?

user = get_user_model().objects.create(username="test", is_superuser=True)
self.client.force_authenticate(user=user)

@patch("grimoirelab.core.scheduler.api.cancel_task")

Is this needed?

If a task fails to create a job for any reason, it will
not be executed. Ensure that every task has a related job.

Signed-off-by: Jose Javier Merchante <jjmerchante@bitergia.com>

This commit adds SortingHat tasks to the GrimoireLab
scheduler. This allows SortingHat jobs such as affiliate,
unify, genderize, import identities, and generate
recommendations to be run from GrimoireLab.

It also adds new generic API endpoints to list, create,
retrieve, reschedule, and cancel tasks:
```
/api/v1/tasks/
/api/v1/tasks/<str:task_type>/
/api/v1/tasks/<str:task_type>/<str:uuid>/
/api/v1/tasks/<str:task_type>/<str:uuid>/reschedule/
/api/v1/tasks/<str:task_type>/<str:uuid>/cancel/
/api/v1/tasks/<str:task_type>/<str:task_id>/jobs/
/api/v1/tasks/<str:task_type>/<str:task_id>/jobs/<str:uuid>/
/api/v1/tasks/<str:task_type>/<str:task_id>/jobs/<str:uuid>/logs
```

Currently, <task_type> can be 'eventizer', 'recommend_affiliations',
'affiliate', 'recommend_matches', 'unify', 'recommend_gender',
'genderize', or 'import_identities'.

Signed-off-by: Jose Javier Merchante <jjmerchante@bitergia.com>

# Conflicts:
#	poetry.lock
Development

Successfully merging this pull request may close these issues.

Use grimoirelab-core scheduler for scheduling SortingHat jobs

2 participants