Include proper task overview while running #5

Open
nils-braun opened this issue Aug 5, 2018 · 2 comments
Labels
enhancement (New feature or request), help wanted (Extra attention is needed)

Comments

@nils-braun
Owner

luigi prints a lot of messages to the screen while running, and the important ones (which jobs have finished, which have failed, etc.) may get lost.
It would be nice to have a kind of "console overview" like gridcontrol, showing the current status. The failed tasks could for example be marked with their corresponding log files.
I thought about using a multiprocessing.Queue for sending information on some task.events (e.g. on failure, on creation) as JSON and reading it in later.
One could also use the same mechanism for accessing b2luigi from Jupyter notebooks, which would be a very nice additional feature.
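
A rough sketch of the queue idea, using only the standard library. The function names, the message fields and the event names are placeholders I made up for illustration, not existing b2luigi or luigi API; hooking `report_event` into the actual luigi event handlers is left open here.

```python
import json
import multiprocessing
import queue as queue_module


def report_event(event_queue, event, task_id, log_file=None):
    """Serialize one task event as JSON and put it on the queue.

    The field names (event, task_id, log_file) are hypothetical.
    """
    message = {"event": event, "task_id": task_id, "log_file": log_file}
    event_queue.put(json.dumps(message))


def read_events(event_queue, timeout=0.5):
    """Collect all events currently available on the queue.

    Stops as soon as no new message arrives within `timeout` seconds.
    """
    events = []
    while True:
        try:
            raw = event_queue.get(timeout=timeout)
        except queue_module.Empty:
            break
        events.append(json.loads(raw))
    return events


event_queue = multiprocessing.Queue()
report_event(event_queue, "created", "MyTask_1")
report_event(event_queue, "failed", "MyTask_1", log_file="logs/MyTask_1/stderr")
collected = read_events(event_queue)
```

Because the messages are plain JSON strings, the reader side does not need to import any task classes, which is what would make the same mechanism usable from a notebook.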

nils-braun added the help wanted label Aug 5, 2018
@meliache
Collaborator

meliache commented Sep 24, 2020

This is still labelled help wanted and I'm thinking about playing around with this,
though feature-wise I think we shouldn't try to compete with the web view of
the central scheduler.

Tbh I haven't used gridcontrol in a long time and barely remember what its
overview looked like. But I imagine an htop-like tabular view like

Task | Parameters | Requires | Batch | tstart | tlast update | Status | Progress | Additional info

together with a summary like the part of an htop window where you see the load
and the CPU usage. I'm not sure if this view could somehow represent the task
graph: htop has a tree view, but luigi doesn't support only trees
(arbitrary DAGs?). But maybe I'm totally misunderstanding what you are
imagining, so maybe you can clear that up.

Something that I'd usually want to check is whether we can re-use existing
functionality that is used for the web view, not sure if that would be possible
without running the central scheduler.

> The failed tasks could for example be marked with their corresponding log
> files

You mean the path to the log files?

And I'm not sure how you imagine this should be used. Should the overview console
automatically start if you run luigi tasks (without --test) and thus hide the
normal standard output? Or would the user need to run a separate command to view the task
overview console, which then listens to the main luigi process, which acts as a
server that sends the JSON?

There's especially a lot of output when running gbasf2 tasks, because I only
save the remote log files that gbasf2 creates on the grid in the logs. I
don't log the output of the gbasf2 subprocesses that I run, but just let them
write to stdout. I think I definitely could do that better and maybe somehow
have a better overview of the status of a gbasf2 project and the associated
task. If the overview has some console for batch/additional info, I think I
could accommodate that there somehow.

@nils-braun
Owner Author

Thanks for picking this up.
What I was imagining (but maybe this is not the best thing to do) is, exactly as you said, that users could start b2luigi with an additional option (maybe "monitor") and get only this new view (and not the usual output). Maybe we could put the usual output into a file instead for full reference.
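
Putting the usual output into a file could look roughly like the sketch below, which only uses the standard `logging` module. The helper name `redirect_logger_to_file` and the logger name in the usage line are assumptions for illustration; which loggers luigi and b2luigi actually write to would need to be checked.

```python
import logging


def redirect_logger_to_file(logger_name, log_path):
    """Detach the existing handlers of a logger and log to a file instead."""
    logger = logging.getLogger(logger_name)
    # Remove whatever handlers (e.g. console output) are attached right now.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
    file_handler = logging.FileHandler(log_path)
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(file_handler)
    # Keep the records from reaching the root logger's console handlers.
    logger.propagate = False
    return file_handler
```

In "monitor" mode one would call something like `redirect_logger_to_file("some-logger-name", "full_output.log")` before starting the tasks, so the terminal stays free for the overview. Note this does not capture output that subprocesses write directly to stdout.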

The format you are thinking of is even more complicated than I thought. Having the table "too wide" might again make it unreadable on the terminal, so I thought for the beginning let's start with task family, batch status and submission information: how many jobs are running, how many have failed, how many are scheduled etc. I think the most interesting "feature" of this overview would be that it groups jobs together: currently you get a single output log line per job. If you are starting 1000 batch jobs that can be quite confusing :-)
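
To make the grouping concrete, here is a minimal sketch of what I mean. The status names and the input format (pairs of task family and status) are made up for this example and do not correspond to any existing b2luigi data structure.

```python
from collections import Counter, defaultdict

# Hypothetical status names, only for this sketch.
STATUSES = ("scheduled", "running", "done", "failed")


def summarize(jobs):
    """Count jobs per task family and status.

    `jobs` is an iterable of (task_family, status) pairs.
    """
    summary = defaultdict(Counter)
    for family, status in jobs:
        summary[family][status] += 1
    return summary


def format_summary(summary, statuses=STATUSES):
    """Render one line per task family instead of one line per job."""
    lines = [f"{'task family':<20}" + "".join(f"{s:>11}" for s in statuses)]
    for family in sorted(summary):
        counts = summary[family]
        lines.append(
            f"{family:<20}" + "".join(f"{counts[s]:>11}" for s in statuses)
        )
    return "\n".join(lines)


jobs = [
    ("Reconstruction", "running"),
    ("Reconstruction", "running"),
    ("Reconstruction", "failed"),
    ("Merge", "scheduled"),
]
summary = summarize(jobs)
print(format_summary(summary))
```

With 1000 batch jobs of the same task family this would still be a single line, and the table stays narrow enough for a terminal.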

Happy to discuss, if you have a different opinion!

Concerning the re-use: not sure we can do so, as currently the luigi process pushes information to the scheduler, which is then transformed and shown. It might be possible to re-use bits and pieces, but the general logic (the batch stuff) might only be usable with b2luigi.
