Add print-schema option `allow_tables_in_same_query` #4500

sgoll · 2025-02-19T22:04:32Z

This adds a new config option to print-schema, i.e. allow_tables_in_same_query (in diesel.toml) or --allow-tables-in-same-query (in CLI).

Setting this option to all_tables, or leaving it unset, retains the current behavior: a single invocation of the macro allow_tables_to_appear_in_same_query! is generated that lists all tables in the schema.

Setting this option to fk_related_tables changes this behavior: instead of a single invocation, several invocations are issued, one for each group of tables related to each other through foreign keys.

For example, given the following tables with foreign keys:

users (id)
posts (id, user_id -> users(id))
comments (id, post_id -> posts(id))

sessions (id)
transactions (id, session_id -> sessions(id))

cars (id)
bikes (id)

The following macro invocations are generated:

For all_tables (default):

diesel::allow_tables_to_appear_in_same_query!(
    bikes, cars, comments, posts, sessions, transactions, users);

For fk_related_tables:

diesel::allow_tables_to_appear_in_same_query!(comments, posts, users);
diesel::allow_tables_to_appear_in_same_query!(sessions, transactions);
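The grouping described above amounts to finding the connected components of the foreign-key graph, treated as undirected. The following is a minimal sketch of that idea and not the code from this PR: the function name fk_related_groups and the (child, parent) edge representation are assumptions made for illustration, and dropping single-table groups is inferred from the example output, where cars and bikes do not appear.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Groups tables into connected components of the foreign-key graph.
/// Hypothetical helper for illustration; not diesel_cli's implementation.
/// Tables without any foreign-key relation form singleton groups and are
/// dropped, mirroring the example output where `cars` and `bikes` are absent.
fn fk_related_groups<'a>(
    tables: &[&'a str],
    foreign_keys: &[(&'a str, &'a str)],
) -> Vec<Vec<&'a str>> {
    // Undirected adjacency sets; BTree collections keep iteration sorted,
    // which makes the generated output stable.
    let mut adjacency: BTreeMap<&str, BTreeSet<&str>> =
        tables.iter().map(|t| (*t, BTreeSet::new())).collect();
    for &(child, parent) in foreign_keys {
        adjacency.entry(child).or_default().insert(parent);
        adjacency.entry(parent).or_default().insert(child);
    }

    let mut seen: BTreeSet<&str> = BTreeSet::new();
    let mut groups = Vec::new();
    for &start in adjacency.keys() {
        if !seen.insert(start) {
            continue; // already assigned to an earlier group
        }
        // Depth-first traversal collecting everything reachable from `start`.
        let mut group = vec![start];
        let mut stack = vec![start];
        while let Some(table) = stack.pop() {
            for &next in &adjacency[table] {
                if seen.insert(next) {
                    group.push(next);
                    stack.push(next);
                }
            }
        }
        if group.len() > 1 {
            group.sort_unstable();
            groups.push(group);
        }
    }
    groups
}

fn main() {
    let tables = ["users", "posts", "comments", "sessions", "transactions", "cars", "bikes"];
    // (referencing table, referenced table)
    let foreign_keys = [("posts", "users"), ("comments", "posts"), ("transactions", "sessions")];
    for group in fk_related_groups(&tables, &foreign_keys) {
        println!("diesel::allow_tables_to_appear_in_same_query!({});", group.join(", "));
    }
}
```

Sorted collections are used here only so that the sketch produces the groups in the same order as the example; the actual implementation may achieve stable output differently.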

See #4333 for rationale.

sgoll · 2025-02-19T22:05:30Z

@weiznich When this gets merged, I can also provide a backport for #4490, or for a future release.

Ten0

Hey!
Thanks a lot for your contribution! 😊
It looks like this will definitely be useful for people who have very disconnected large graphs.

Unfortunately my 166-table graph is largely connected, so that won't help me personally 😅 - I'm curious, how is your use case so different?

Otherwise from an implementation standpoint this looks good 😊 - although I did find 2-3 things that I would tend to change.

Thanks again for working on this!

Ten0 · 2025-02-21T12:33:53Z

diesel_cli/src/cli.rs

@@ -280,6 +280,14 @@ pub fn build_cli() -> Command {
                .action(ArgAction::Append)
                .value_parser(PossibleValuesParser::new(print_schema::DocConfig::VARIANTS_STR)),
        )
+        .arg(
+            Arg::new("allow-tables-in-same-query")


Naming:

Given the name of this argument, at first glance I would expect to be able to write:

--allow-tables-to-appear-in-same-query table1 table2

and that that would merge the connected components of table1 and table2 together for generation.

Considering that this may well make sense and is something we might want to add later on, how about calling this allow_tables_to_appear_in_same_query_generation_mode, or something along those lines?

You are correct. Initially, I had --allow-tables-to-appear-in-same-query, which suffers from the same problem but seemed to be too verbose to me. Adding a suffix -mode or -generation-mode would make it even longer.

There is precedent for the suffix -config, as in --with-docs-config, so I think I would prefer that over -mode and -generation-mode. What do you think?

The full name then could be --allow-tables-to-appear-in-same-query-config. Still pretty long, but as long as that is okay, I can change it to that.

diesel_cli/src/print_schema.rs

Ten0 · 2025-02-21T22:20:00Z

diesel_cli/tests/print_schema/print_schema_allow_tables_in_same_query/diesel.toml

@@ -0,0 +1,3 @@
+[print_schema]
+file = "src/schema.rs"
+allow_tables_in_same_query = "fk_related_tables"


The test should probably be named to reflect that it specifically tests fk_related_tables, not allow_tables_to_appear_in_same_query itself.

That is correct. I had this originally, but then I ran into PostgreSQL's limit on the length of database names (63 characters) for the databases created by the test suite.

This may be another hint that the option name --allow-tables-to-appear-in-same-query-config is too long 😉
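To make the length constraint concrete, here is a small sketch that checks candidate names against the 63-byte limit PostgreSQL applies to identifiers. The helper name and the longer candidate name are hypothetical, and how the test suite actually derives database names from test names is not shown in this thread, so treat the numbers as illustrative only.

```rust
/// PostgreSQL truncates identifiers, including database names, to 63 bytes.
const POSTGRES_MAX_IDENTIFIER_BYTES: usize = 63;

/// Hypothetical helper: does `name` fit into a PostgreSQL identifier as-is?
fn fits_postgres_identifier(name: &str) -> bool {
    name.len() <= POSTGRES_MAX_IDENTIFIER_BYTES
}

fn main() {
    let candidates = [
        // Test directory name used in this PR.
        "print_schema_allow_tables_in_same_query",
        // Hypothetical, more specific name (an assumption, not from the PR).
        "print_schema_allow_tables_to_appear_in_same_query_fk_related_tables",
    ];
    for name in candidates {
        println!(
            "{name}: {} bytes, fits = {}",
            name.len(),
            fits_postgres_identifier(name)
        );
    }
}
```

Any prefix or suffix the test suite adds to the database name would reduce the remaining budget further.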

sgoll · 2025-02-22T12:12:15Z

Unfortunately my 166-table graph is largely connected, so that won't help me personally 😅 - I'm curious, how is your use case so different?

To be honest, my latest project currently has only a modest 27 tables, so I have not really been affected by the issue outlined in #4333. As in your case, my set of tables is also largely connected, so the changes here only reduce the number of table pairs from 351 to 301 (one connected component of 25 tables plus one of 2 tables).
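The pair counts follow from n * (n - 1) / 2 unordered pairs among n tables, summed per connected component. A quick sketch to reproduce the numbers (the function names are made up for illustration):

```rust
/// Number of unordered table pairs among `n` tables: n * (n - 1) / 2.
fn table_pairs(n: u64) -> u64 {
    n * n.saturating_sub(1) / 2
}

/// Total pairs when tables are split into components of the given sizes;
/// only tables within the same component are paired.
fn pairs_for_components(sizes: &[u64]) -> u64 {
    sizes.iter().map(|&n| table_pairs(n)).sum()
}

fn main() {
    // all_tables: one group containing all 27 tables.
    println!("all_tables: {}", table_pairs(27)); // 351
    // fk_related_tables: components of 25 and 2 tables.
    println!("fk_related_tables: {}", pairs_for_components(&[25, 2])); // 300 + 1 = 301
}
```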

I have been thinking about how else to mitigate the overly exhaustive set of allowed combinations. When splitting into connected components is not sufficient, maybe we can somehow limit the length of join chains (with the rationale that usually there are no more than, say, a handful of tables involved in a single query).

However, since we do not know beforehand which tables will be joined, and the FK-table graph likely has a rather short maximum path length (less than the expected maximum number of table joins, similar to the small-world experiment), this likely would not help reduce the number of table pairs much, if at all.¹

¹ In my current set of 25 connected tables, the maximum path length is 4, i.e. with only four joins every table can reach every other table. So, with such a hypothetical configurable code-generation option, I could only reduce the number of generated table pairs if I only ever had to join across three tables. But in my code base, I already have joins across up to five tables.
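For anyone who wants to check this on their own schema: one way to read "maximum path length" is as the diameter of the undirected foreign-key graph, i.e. the longest shortest path between any two connected tables. The sketch below computes that with a breadth-first search from every table. The function names and the (child, parent) edge list are assumptions for illustration, and the sample edges are the ones from the PR description, not a real 25-table schema.

```rust
use std::collections::{BTreeMap, VecDeque};

/// Breadth-first search: hop count from `start` to every reachable table.
fn hops_from<'a>(
    adjacency: &BTreeMap<&'a str, Vec<&'a str>>,
    start: &'a str,
) -> BTreeMap<&'a str, usize> {
    let mut dist = BTreeMap::from([(start, 0)]);
    let mut queue = VecDeque::from([start]);
    while let Some(table) = queue.pop_front() {
        let d = dist[table];
        for &next in adjacency.get(table).into_iter().flatten() {
            if !dist.contains_key(next) {
                dist.insert(next, d + 1);
                queue.push_back(next);
            }
        }
    }
    dist
}

/// Longest shortest path (in joins) between any two connected tables.
/// Returns 0 when there are no foreign keys at all.
fn max_path_length(foreign_keys: &[(&str, &str)]) -> usize {
    // Treat each foreign key as an undirected edge.
    let mut adjacency: BTreeMap<&str, Vec<&str>> = BTreeMap::new();
    for &(child, parent) in foreign_keys {
        adjacency.entry(child).or_default().push(parent);
        adjacency.entry(parent).or_default().push(child);
    }
    adjacency
        .keys()
        .flat_map(|&table| hops_from(&adjacency, table).into_values())
        .max()
        .unwrap_or(0)
}

fn main() {
    let foreign_keys = [("posts", "users"), ("comments", "posts"), ("transactions", "sessions")];
    // comments -> posts -> users is the longest chain here: 2 joins.
    println!("max path length: {}", max_path_length(&foreign_keys));
}
```

If this value is smaller than the number of joins used in real queries, limiting generated pairs by join distance would not remove any pairs within a component, which matches the conclusion above.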

Ten0 · 2025-02-22T22:34:04Z

However, since we do not know beforehand which tables will be joined, and the FK-table graph likely has a rather short maximum path length (less than the expected maximum number of table joins, similar to the small-world experiment), this likely would not help reduce the number of table pairs much, if at all.

Haha I gave this strategy some thought yesterday as well while looking at the issue and PR and had reached the exact same conclusion 😵‍💫

I also considered doing static analysis on existing files to figure out what tables were used in the same file/function, which while it may work practically for a lot of cases, might not work well with reusable query fragments that may live across multiple files (often via #[dsl::auto_type]). Trying to resolve imports becomes a rather large project that would likely require quite a bit of maintenance, and will always be imperfect if macros are involved, so overall it seems to be a bad complexity/benefit ratio. 😕

Add print-schema option allow_tables_in_same_query

e575f1c

sgoll force-pushed the cli-same-query branch from 1024598 to e575f1c Compare February 19, 2025 22:12

weiznich requested a review from a team February 21, 2025 09:35

Ten0 requested changes Feb 21, 2025

sgoll and others added 4 commits February 22, 2025 12:29

Use return value from insert() directly

8b945ab

Co-authored-by: Thomas B <[email protected]>

Sort foreign key components for stable output

472ed39

Simplify formatting code, avoid trailing , after macro formatting

758b090

Merge remote-tracking branch 'upstream/master' into cli-same-query

1889f31
