This repository has been archived by the owner on Dec 14, 2023. It is now read-only.

Optimize topics/list #777

Open
dsjen opened this issue Mar 8, 2021 · 3 comments

@dsjen
Contributor

dsjen commented Mar 8, 2021

The frontend calls topics/list repeatedly until the list of topics is exhausted, then checks whether any topics are running or queued so that it can tell non-admin users whether they can create a topic (non-admins can only create one topic at a time). This can take long enough that the request times out.
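For illustration, the fetch-everything-then-filter pattern described above can be sketched roughly as follows. This is not the actual web-tools code: `fetch_all_then_filter`, `make_fake_fetcher`, the opaque paging token and the page size of 20 are all hypothetical stand-ins, and the fake fetcher replaces the real topics/list call.

```python
from typing import Callable, Dict, List, Optional, Tuple

Topic = Dict[str, object]
# Hypothetical page fetcher: takes a paging token, returns (topics, next_token).
PageFetcher = Callable[[Optional[str]], Tuple[List[Topic], Optional[str]]]

ACTIVE_STATES = {"running", "queued"}


def fetch_all_then_filter(fetch_page: PageFetcher) -> Tuple[List[Topic], int]:
    """Walk every page, then filter. Returns (active topics, pages fetched)."""
    topics: List[Topic] = []
    calls = 0
    token: Optional[str] = None
    while True:
        page, token = fetch_page(token)
        calls += 1
        topics.extend(page)
        if token is None:  # no more pages: the list is exhausted
            break
    active = [t for t in topics if t.get("state") in ACTIVE_STATES]
    return active, calls


def make_fake_fetcher(all_topics: List[Topic], page_size: int) -> PageFetcher:
    """Stand-in for the real API call, paging over an in-memory list."""
    def fetch_page(token: Optional[str]) -> Tuple[List[Topic], Optional[str]]:
        start = int(token) if token is not None else 0
        end = start + page_size
        next_token = str(end) if end < len(all_topics) else None
        return all_topics[start:end], next_token
    return fetch_page
```

With an assumed page size of 20, a list of 4056 topics takes 203 round trips before the filter even runs, which is where the time goes.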

Are there optimizations that can be made to this call? One idea: it might be more efficient to filter via a parameter so that only topics whose state is "in progress" or "running" are returned.

Any ideas?

@dsjen dsjen added the bug label Mar 8, 2021
@pypt
Contributor

pypt commented Mar 11, 2021

As always with issues like this, would you be able to post a specific API call, complete with the host that you're calling (is it the frontend cache or the backend directly), arguments, limits, expected result (e.g. "should return in x / in x s") and actual result ("doesn't return at all") to make it easier for us to look into it?

Alternatively, a link for us to click on to observe the call's behavior would be tremendously useful and speed up debugging.

More details would potentially reduce the number of "dunno, works for me" responses from us :)

@dsjen dsjen closed this as completed Mar 11, 2021
@dsjen dsjen reopened this Mar 11, 2021
@dsjen
Contributor Author

dsjen commented Apr 12, 2021

Here's the frontend code in question: https://github.com/mediacloud/web-tools/blob/45422d7be1f5e766fe0f865982c70014932451ed/server/views/topics/topiclist.py#L69. What's happening is that topics/list is called in an attempt to find out whether a user has a "running" or a "queued" topic.

@pypt
Contributor

pypt commented Apr 13, 2021

So which user (auth_users_id) is it slow for specifically?

If I read it correctly, does_user_have_a_running_topic() fetches all (user's?) topics, filters them afterwards and returns the list (despite the name, which would suggest that the function returns only a boolean). Currently we have 4056 topics, which normally wouldn't be that many, but the code that fetches the whole list also runs a bunch of other things for each of those 4056 or so topics, hence the slowness.
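As a rough sketch of what a boolean-returning check could look like (an assumption on my part, not what does_user_have_a_running_topic() currently does), the function below stops at the first running or queued topic instead of collecting the whole list. `iter_topics` and `user_has_active_topic` are hypothetical names, and `pages` stands in for whatever lazily yields pages of topics/list results.

```python
from typing import Dict, Iterable, Iterator, List

Topic = Dict[str, object]

ACTIVE_STATES = frozenset({"running", "queued"})


def iter_topics(pages: Iterable[List[Topic]]) -> Iterator[Topic]:
    """Flatten pages lazily, so a later page is only requested when needed."""
    for page in pages:
        yield from page


def user_has_active_topic(pages: Iterable[List[Topic]]) -> bool:
    """Return True as soon as one running or queued topic is seen."""
    # any() short-circuits, so iteration stops at the first match.
    return any(t.get("state") in ACTIVE_STATES for t in iter_topics(pages))
```

If `pages` is a generator that performs one API call per page, the early exit only helps when an active topic shows up early; the worst case (no active topic) still walks every page.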

A natural solution would be to add some sort of filtering capability on the backend; for example, one should be able to call topics/list?state=running and get only the running topics. As for a more immediate, hacky fix (not so much a fix as a way to slightly improve performance), you can call topicList() with a bigger limit (10000) to reduce the number of SQL queries made on the backend while fetching the list of topics.
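A minimal sketch of both ideas, under stated assumptions: the `state` parameter is the proposed filter and does not exist in the API today, the default limit of 20 is a guess, and `build_topics_list_query`, `filter_before_enrichment` and `enrich` are hypothetical names (with `enrich` standing in for the "bunch of other things" the backend runs per topic).

```python
from typing import Callable, Dict, List, Optional
from urllib.parse import urlencode

Topic = Dict[str, object]


def build_topics_list_query(state: Optional[str] = None, limit: int = 20) -> str:
    """Build the request path; `state` is the PROPOSED filter parameter."""
    params: Dict[str, object] = {"limit": limit}
    if state is not None:
        params["state"] = state  # assumption: not supported by the backend yet
    return "topics/list?" + urlencode(params)


def filter_before_enrichment(
    topics: List[Topic],
    state: Optional[str],
    enrich: Callable[[Topic], Topic],
) -> List[Topic]:
    """Backend-side idea: filter by state first, then do per-topic work."""
    selected = [t for t in topics if state is None or t["state"] == state]
    return [enrich(t) for t in selected]


# Bigger limit today, state filter once (if) the backend supports it.
print(build_topics_list_query(limit=10000))
print(build_topics_list_query(state="running", limit=10000))
```

The point of filtering before the per-topic work is that the expensive part then runs only for the handful of matching topics rather than for all of them.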

It may also be worth reviewing what does_user_have_a_running_topic() does and whether it's needed at all: if the user has admin privileges, the function returns an empty list (instead of a list of all 4056 topics, perhaps?), but I don't know enough JavaScript to trace what the frontend then does with this (empty?) list.

@jtotoole jtotoole removed their assignment Dec 1, 2021