Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGSEGV on simple query with LEFT OUTER JOIN #7787

Closed
mbona92 opened this issue Dec 12, 2024 · 4 comments · Fixed by #7901
Closed

SIGSEGV on simple query with LEFT OUTER JOIN #7787

mbona92 opened this issue Dec 12, 2024 · 4 comments · Fixed by #7901

Comments

@mbona92
Copy link

mbona92 commented Dec 12, 2024

Hi,

we are facing a segmentation fault executing one specific query.
The issue seems similar to #7705 but our query does not contains window partition.
Latest version of Postgres 16 server and citus are installed and issue appears in the same way using citus from rhel repo and from latest github release.

Postgres version: PostgreSQL 16.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2), 64-bit
Citus version: Citus 12.1.5 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-2), 64-bit
OS: Rocky Linux 9

Query that always causes SIGSEGV is:

SELECT
    "public"."table1"."col1" AS "42e84ecab692a85e748bdf7ff86a474e",
    "public"."table1"."col2" AS "6a576f53b68ac363812eaf95205439aa",
    "public"."table1"."col3" AS "695e025e83bd013e2bda04762fc753e5",
    "public"."table1"."col4" AS "b1d24f4d668be03b983cf43621068584",
    "public"."table1"."col5" AS "771c5eaa7ed151a8de99a794e8931843",
    SUM("public"."table2"."col1") AS "0888d71bf3ea38d31bbc95cade8f3b90",
    COUNT(DISTINCT "public"."table3"."col1") AS "d748e8927346013d185155c00172b7f7"
FROM
    "public"."table2"
    LEFT OUTER JOIN "public"."table1" ON
        "public"."table2"."col2" = "public"."table1"."col6"
    LEFT OUTER JOIN "public"."table3" ON
        "public"."table2"."col3" = "public"."table3"."col2"
    LEFT OUTER JOIN "public"."table4" ON
        "public"."table2"."col4" = "public"."table4"."col1"
WHERE
    "public"."table2"."col5" BETWEEN '2024-10-01'
    AND '2024-10-31'
GROUP BY
    "public"."table1"."col1",
    "public"."table1"."col2",
    "public"."table1"."col3",
    "public"."table1"."col4",
    "public"."table1"."col5"
ORDER BY
    "0888d71bf3ea38d31bbc95cade8f3b90" DESC NULLS LAST,
    "42e84ecab692a85e748bdf7ff86a474e" ASC NULLS LAST,
    "6a576f53b68ac363812eaf95205439aa" ASC NULLS LAST,
    "695e025e83bd013e2bda04762fc753e5" ASC NULLS LAST,
    "b1d24f4d668be03b983cf43621068584" ASC NULLS LAST,
    "771c5eaa7ed151a8de99a794e8931843" ASC NULLS LAST
LIMIT
    1001;

Backtrace from gdb:

Program received signal SIGSEGV, Segmentation fault.
get_eclass_for_sort_expr (root=root@entry=0x28288e0, expr=0x282b448, expr@entry=0x2829780, opfamilies=0x282b3f8, 
    opcintype=opcintype@entry=25, collation=collation@entry=100, sortref=sortref@entry=1, rel=0x0, create_it=true) at equivclass.c:734
734				Assert(rel->reloptkind == RELOPT_BASEREL);
(gdb) bt
#0  get_eclass_for_sort_expr (root=root@entry=0x28288e0, expr=0x282b448, expr@entry=0x2829780, opfamilies=0x282b3f8, 
    opcintype=opcintype@entry=25, collation=collation@entry=100, sortref=sortref@entry=1, rel=0x0, create_it=true) at equivclass.c:734
#1  0x000000000078bb60 in make_pathkey_from_sortinfo (root=0x28288e0, expr=0x2829780, opfamily=1994, opcintype=25, collation=100, 
    reverse_sort=<optimized out>, nulls_first=false, sortref=1, rel=0x0, create_it=true) at pathkeys.c:233
#2  0x000000000078c81b in make_pathkey_from_sortop (create_it=true, sortref=1, nulls_first=<optimized out>, ordering_op=664, 
    expr=0x2829780, root=0x28288e0) at pathkeys.c:275
#3  make_pathkeys_for_sortclauses_extended (root=root@entry=0x28288e0, sortclauses=sortclauses@entry=0x7ffec84f4d98, tlist=0x28297d0, 
    remove_redundant=remove_redundant@entry=false, sortable=sortable@entry=0x7ffec84f4daf) at pathkeys.c:1190
#4  0x000000000078c8ca in make_pathkeys_for_sortclauses (root=root@entry=0x28288e0, sortclauses=<optimized out>, tlist=<optimized out>)
    at pathkeys.c:1138
#5  0x000000000079d042 in adjust_group_pathkeys_for_groupagg (root=0x28288e0) at planner.c:3284
#6  standard_qp_callback (root=0x28288e0, extra=<optimized out>) at planner.c:3457
#7  0x000000000079b2f1 in query_planner (root=root@entry=0x28288e0, qp_callback=qp_callback@entry=0x79cc50 <standard_qp_callback>, 
    qp_extra=qp_extra@entry=0x7ffec84f4f60) at planmain.c:207
#8  0x00000000007a0d01 in grouping_planner (root=root@entry=0x28288e0, tuple_fraction=<optimized out>, tuple_fraction@entry=0)
    at planner.c:1495
#9  0x00000000007a3a3b in subquery_planner (glob=glob@entry=0x28287d0, parse=parse@entry=0x2826d90, parent_root=parent_root@entry=0x0, 
    hasRecursion=hasRecursion@entry=false, tuple_fraction=tuple_fraction@entry=0) at planner.c:1064
#10 0x00000000007a3dfe in standard_planner (parse=0x2826d90, query_string=<optimized out>, cursorOptions=0, 
    boundParams=<optimized out>) at planner.c:413
#11 0x00007fb904af1660 in BuildSelectStatementViaStdPlanner (combineQuery=0x2826d90, remoteScanTargetList=0x28272e0, 
    remoteScan=0x2826fb0) at planner/combine_query_planner.c:304
#12 0x00007fb904af1072 in PlanCombineQuery (distributedPlan=0x2826ea0, remoteScan=0x2826fb0) at planner/combine_query_planner.c:68
#13 0x00007fb904af4dae in FinalizeNonRouterPlan (localPlan=0x2762540, distributedPlan=0x2826ea0, customScan=0x2826fb0)
    at planner/distributed_planner.c:1489
#14 0x00007fb904af4d64 in FinalizePlan (localPlan=0x2762540, distributedPlan=0x2826ea0) at planner/distributed_planner.c:1470
#15 0x00007fb904af3e91 in CreateDistributedPlannedStmt (planContext=0x7ffec84f54d0) at planner/distributed_planner.c:800
#16 0x00007fb904af3c53 in PlanDistributedStmt (planContext=0x7ffec84f54d0, rteIdCounter=5) at planner/distributed_planner.c:685
#17 0x00007fb904af3447 in distributed_planner (parse=0x2562398, 
    query_string=0x24f95d8 "SELECT\n    \"public\".\"cliente\".\"n_intestazione\" AS \"42e84ecab692a85e748bdf7ff86a474e\",\n    \"public\".\"cliente\".\"k_ndc\" AS \"6a576f53b68ac363812eaf95205439aa\",\n    \"public\".\"cliente\".\"flag_cgb\" AS \"695e02"..., 
    cursorOptions=2048, boundParams=0x0) at planner/distributed_planner.c:277
#18 0x00007fb9069211b2 in pgss_planner (parse=0x2562398, 
    query_string=0x24f95d8 "SELECT\n    \"public\".\"cliente\".\"n_intestazione\" AS \"42e84ecab692a85e748bdf7ff86a474e\",\n    \"public\".\"cliente\".\"k_ndc\" AS \"6a576f53b68ac363812eaf95205439aa\",\n    \"public\".\"cliente\".\"flag_cgb\" AS \"695e02"..., 
    cursorOptions=2048, boundParams=0x0) at pg_stat_statements.c:953
#19 0x000000000086ee58 in pg_plan_query (querytree=0x2562398, 
    query_string=query_string@entry=0x24f95d8 "SELECT\n    \"public\".\"cliente\".\"n_intestazione\" AS \"42e84ecab692a85e748bdf7ff86a474e\",\n    \"public\".\"cliente\".\"k_ndc\" AS \"6a576f53b68ac363812eaf95205439aa\",\n    \"public\".\"cliente\".\"flag_cgb\" AS \"695e02"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:908
#20 0x000000000086ef51 in pg_plan_queries (querytrees=0x26f2c50, 
    query_string=query_string@entry=0x24f95d8 "SELECT\n    \"public\".\"cliente\".\"n_intestazione\" AS \"42e84ecab692a85e748bdf7ff86a474e\",\n    \"public\".\"cliente\".\"k_ndc\" AS \"6a576f53b68ac363812eaf95205439aa\",\n    \"public\".\"cliente\".\"flag_cgb\" AS \"695e02"..., cursorOptions=cursorOptions@entry=2048, boundParams=boundParams@entry=0x0) at postgres.c:1000
#21 0x000000000086f25a in exec_simple_query (
    query_string=0x24f95d8 "SELECT\n    \"public\".\"cliente\".\"n_intestazione\" AS \"42e84ecab692a85e748bdf7ff86a474e\",\n    \"public\".\"cliente\".\"k_ndc\" AS \"6a576f53b68ac363812eaf95205439aa\",\n    \"public\".\"cliente\".\"flag_cgb\" AS \"695e02"...)
    at postgres.c:1197
#22 0x0000000000870e24 in PostgresMain (dbname=<optimized out>, username=<optimized out>) at postgres.c:4725
#23 0x00000000007e2922 in BackendRun (port=0x256d310, port=0x256d310) at postmaster.c:4464
#24 BackendStartup (port=0x256d310) at postmaster.c:4192
#25 ServerLoop () at postmaster.c:1782
#26 0x00000000007e3837 in PostmasterMain (argc=argc@entry=1, argv=argv@entry=0x24f29d0) at postmaster.c:1466
#27 0x0000000000514177 in main (argc=1, argv=0x24f29d0) at main.c:198

If the last count in select (COUNT(DISTINCT "public"."table3"."col1") AS "d748e8927346013d185155c00172b7f7") is commented out, SIGSEGV does not occurs.
Furthermore, if query changes from LEFT OUTER JOIN to INNER JOIN, issue does not occurs:

SELECT
    "public"."table1"."col1" AS "42e84ecab692a85e748bdf7ff86a474e",
    "public"."table1"."col2" AS "6a576f53b68ac363812eaf95205439aa",
    "public"."table1"."col3" AS "695e025e83bd013e2bda04762fc753e5",
    "public"."table1"."col4" AS "b1d24f4d668be03b983cf43621068584",
    "public"."table1"."col5" AS "771c5eaa7ed151a8de99a794e8931843",
    SUM("public"."table2"."col1") AS "0888d71bf3ea38d31bbc95cade8f3b90",
    COUNT(DISTINCT "public"."table3"."col1") AS "d748e8927346013d185155c00172b7f7"
FROM
    "public"."table2"
    INNER JOIN "public"."table1" ON
        "public"."table2"."col2" = "public"."table1"."col6"
    INNER JOIN "public"."table3" ON
        "public"."table2"."col3" = "public"."table3"."col2"
    INNER JOIN "public"."table4" ON
        "public"."table2"."col4" = "public"."table4"."col1"
WHERE                                           
    "public"."table2"."col5" BETWEEN '2024-10-01'
    AND '2024-10-31'
GROUP BY
    "public"."table1"."col1",
    "public"."table1"."col2",
    "public"."table1"."col3",
    "public"."table1"."col4",
    "public"."table1"."col5"
ORDER BY                       
    "0888d71bf3ea38d31bbc95cade8f3b90" DESC NULLS LAST,
    "42e84ecab692a85e748bdf7ff86a474e" ASC NULLS LAST,
    "6a576f53b68ac363812eaf95205439aa" ASC NULLS LAST,
    "695e025e83bd013e2bda04762fc753e5" ASC NULLS LAST,
    "b1d24f4d668be03b983cf43621068584" ASC NULLS LAST,
    "771c5eaa7ed151a8de99a794e8931843" ASC NULLS LAST
LIMIT                                                 
    1001;

Could you help us solving this?
Ask me if you need more info.

Thanks,
Matteo

@colm-mchugh
Copy link
Contributor

Hi Matteo, can you share the schema of the tables, the CREATE TABLE statements (relevant parts) and how the tables are distributed ? This will help us in reproducing the issue.

If the last count in select (COUNT(DISTINCT "public"."table3"."col1") AS "d748e8927346013d185155c00172b7f7") is commented out, SIGSEGV does not occurs

So the issue does not occur if the query is changed to:

SELECT
    "public"."table1"."col1" AS "42e84ecab692a85e748bdf7ff86a474e",
    "public"."table1"."col2" AS "6a576f53b68ac363812eaf95205439aa",
    "public"."table1"."col3" AS "695e025e83bd013e2bda04762fc753e5",
    "public"."table1"."col4" AS "b1d24f4d668be03b983cf43621068584",
    "public"."table1"."col5" AS "771c5eaa7ed151a8de99a794e8931843",
    SUM("public"."table2"."col1") AS "0888d71bf3ea38d31bbc95cade8f3b90"
--    COUNT(DISTINCT "public"."table3"."col1") AS "d748e8927346013d185155c00172b7f7"
FROM
    "public"."table2"
    LEFT OUTER JOIN "public"."table1" ON
        "public"."table2"."col2" = "public"."table1"."col6"
    LEFT OUTER JOIN "public"."table3" ON
        "public"."table2"."col3" = "public"."table3"."col2"
    LEFT OUTER JOIN "public"."table4" ON
        "public"."table2"."col4" = "public"."table4"."col1"
WHERE
    "public"."table2"."col5" BETWEEN '2024-10-01'
    AND '2024-10-31'
GROUP BY
    "public"."table1"."col1",
    "public"."table1"."col2",
    "public"."table1"."col3",
    "public"."table1"."col4",
    "public"."table1"."col5"
ORDER BY
    "0888d71bf3ea38d31bbc95cade8f3b90" DESC NULLS LAST,
    "42e84ecab692a85e748bdf7ff86a474e" ASC NULLS LAST,
    "6a576f53b68ac363812eaf95205439aa" ASC NULLS LAST,
    "695e025e83bd013e2bda04762fc753e5" ASC NULLS LAST,
    "b1d24f4d668be03b983cf43621068584" ASC NULLS LAST,
    "771c5eaa7ed151a8de99a794e8931843" ASC NULLS LAST
LIMIT
    1001;

?

Thanks!

@mbona92
Copy link
Author

mbona92 commented Dec 13, 2024

Hi @colm-mchugh,

I'm able to reproduce the same error with the following instructions:

CREATE TABLE public.table1 (
    col1 character varying(70),
    col2 text,
    col3 text,
    col4 text,
    col5 text,
    k_client text
);
CREATE TABLE public.table3 (
    col1 character varying(200),
    col2 text
);
CREATE TABLE public.table4 (
    col1 text
);
CREATE TABLE public.table2 (
    col1 numeric,
    k_client text,
    col3 text,
    col4 text,
    col5 date
);

CREATE INDEX ON public.table1 USING btree (k_client);
CREATE INDEX ON public.table3 USING btree (col2);  
CREATE INDEX ON public.table4 USING btree (col1);  
CREATE INDEX ON public.table2 USING btree (k_client);
CREATE INDEX ON public.table2 USING btree (col3);  
CREATE INDEX ON public.table2 USING btree (col4);  
SELECT create_distributed_table('table1', 'k_client', shard_count => 8);
SELECT create_distributed_table('table2', 'k_client', colocate_with => 'table1');
SELECT create_reference_table('table3');
SELECT create_reference_table('table4');

I slightly modified the query from the first message to distribute the tables on the k_client column as in the real use case in production.
You can see the changes in bold:

SELECT
    "public"."table1"."col1" AS "42e84ecab692a85e748bdf7ff86a474e",
    "public"."table1"."col2" AS "6a576f53b68ac363812eaf95205439aa",
    "public"."table1"."col3" AS "695e025e83bd013e2bda04762fc753e5",
    "public"."table1"."col4" AS "b1d24f4d668be03b983cf43621068584",
    "public"."table1"."col5" AS "771c5eaa7ed151a8de99a794e8931843",
    SUM("public"."table2"."col1") AS "0888d71bf3ea38d31bbc95cade8f3b90",
    COUNT(DISTINCT "public"."table3"."col1") AS "d748e8927346013d185155c00172b7f7"
FROM
    "public"."table2"
    LEFT OUTER JOIN "public"."table1" ON
        "public"."table2"."k_client" = "public"."table1"."k_client"
    LEFT OUTER JOIN "public"."table3" ON
        "public"."table2"."col3" = "public"."table3"."col2"
    LEFT OUTER JOIN "public"."table4" ON
        "public"."table2"."col4" = "public"."table4"."col1"
WHERE
    "public"."table2"."col5" BETWEEN '2024-10-01'
    AND '2024-10-31'
GROUP BY    
    "public"."table1"."col1",
    "public"."table1"."col2",
    "public"."table1"."col3",
    "public"."table1"."col4",
    "public"."table1"."col5"
ORDER BY    
    "0888d71bf3ea38d31bbc95cade8f3b90" DESC NULLS LAST,
    "42e84ecab692a85e748bdf7ff86a474e" ASC NULLS LAST,
    "6a576f53b68ac363812eaf95205439aa" ASC NULLS LAST,
    "695e025e83bd013e2bda04762fc753e5" ASC NULLS LAST,
    "b1d24f4d668be03b983cf43621068584" ASC NULLS LAST,
    "771c5eaa7ed151a8de99a794e8931843" ASC NULLS LAST
LIMIT
    1001;

I confirm that running the query as:

SELECT      
    "public"."table1"."col1" AS "42e84ecab692a85e748bdf7ff86a474e",
    "public"."table1"."col2" AS "6a576f53b68ac363812eaf95205439aa",
    "public"."table1"."col3" AS "695e025e83bd013e2bda04762fc753e5",
    "public"."table1"."col4" AS "b1d24f4d668be03b983cf43621068584",
    "public"."table1"."col5" AS "771c5eaa7ed151a8de99a794e8931843",
    SUM("public"."table2"."col1") AS "0888d71bf3ea38d31bbc95cade8f3b90"
    -- COUNT(DISTINCT "public"."table3"."col1") AS "d748e8927346013d185155c00172b7f7"
FROM        
    "public"."table2"                              
    LEFT OUTER JOIN "public"."table1" ON           
        "public"."table2"."k_client" = "public"."table1"."k_client"
    LEFT OUTER JOIN "public"."table3" ON           
        "public"."table2"."col3" = "public"."table3"."col2"
    LEFT OUTER JOIN "public"."table4" ON           
        "public"."table2"."col4" = "public"."table4"."col1"
WHERE       
    "public"."table2"."col5" BETWEEN '2024-10-01'
    AND '2024-10-31'                               
GROUP BY    
    "public"."table1"."col1",                      
    "public"."table1"."col2",                      
    "public"."table1"."col3",                      
    "public"."table1"."col4",                      
    "public"."table1"."col5"                       
ORDER BY    
    "0888d71bf3ea38d31bbc95cade8f3b90" DESC NULLS LAST,
    "42e84ecab692a85e748bdf7ff86a474e" ASC NULLS LAST,
    "6a576f53b68ac363812eaf95205439aa" ASC NULLS LAST,
    "695e025e83bd013e2bda04762fc753e5" ASC NULLS LAST,
    "b1d24f4d668be03b983cf43621068584" ASC NULLS LAST,
    "771c5eaa7ed151a8de99a794e8931843" ASC NULLS LAST
LIMIT     
    1001;   

with "COUNT" commented out, runs well without raise SIGSEGV.

The same error occurs with both 1 worker node and multiple worker nodes.
I tested the query up to 4 worker nodes with 32 shards.
With a number of shards > 1 the query crashes, 1 shard does not give the issue.

Thanks,
Matteo

@colm-mchugh
Copy link
Contributor

colm-mchugh commented Dec 13, 2024

Thanks @mbona92 I can confirm that we can reproduce the issue. It is related to #7705; when building a worker subquery the VAR node for the column in the COUNT DISTINCT expression has a non-empty varnullingrels field, this should be empty for the corresponding VAR in the combine query but it is just copied over. We will investigate further and fix. Thanks again for bringing to our attention.

colm-mchugh added a commit that referenced this issue Dec 16, 2024
colm-mchugh added a commit that referenced this issue Dec 19, 2024
Fix the SEGV seen in #7787; it occurs because a column in the targetlist
of a worker subquery can contain a non-empty varnullingrels field if the
column is from the inner side of a left outer join.
The issue can also occur with the columns in the HAVING clause, and this
is also tested in the fix. The issue was triggered by the introduction
of the varnullingrels to Vars in Postgres 16 (2489d76c)

the query tree for the combine query. Here, the issue occurs when
creating a worker subquery. The regress file from #7705 is used (and
renamed) to also test this (#7787). An alternative test output file
is required for Postgres 15 because of an optimization to DISTINCT
in Postgres 16 (1349d2790bf)
colm-mchugh added a commit that referenced this issue Dec 19, 2024
Fix the SEGV seen in #7787; it occurs because a column in the targetlist
of a worker subquery can contain a non-empty varnullingrels field if the
column is from the inner side of a left outer join.
The issue can also occur with the columns in the HAVING clause, and this
is also tested in the fix. The issue was triggered by the introduction
of the varnullingrels to Vars in Postgres 16 (2489d76c)

There is a related issue, #7705, where a non-empty varnullingrels
was incorrectly copied into the query tree for the combine query.
Here, a non-empty varnullingrels field of a var is incorrectly copied
into the query tree for a worker subquery.

The regress file from #7705 is used (and renamed) to also test this
(#7787). An alternative test output file is required for Postgres 15
because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).
colm-mchugh added a commit that referenced this issue Dec 24, 2024
Fix the SEGV seen in #7787; it occurs because a column in the targetlist
of a worker subquery can contain a non-empty varnullingrels field if the
column is from the inner side of a left outer join.
The issue can also occur with the columns in the HAVING clause, and this
is also tested in the fix. The issue was triggered by the introduction
of the varnullingrels to Vars in Postgres 16 (2489d76c)

There is a related issue, #7705, where a non-empty varnullingrels
was incorrectly copied into the query tree for the combine query.
Here, a non-empty varnullingrels field of a var is incorrectly copied
into the query tree for a worker subquery.

The regress file from #7705 is used (and renamed) to also test this
(#7787). An alternative test output file is required for Postgres 15
because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).
@colm-mchugh colm-mchugh linked a pull request Feb 13, 2025 that will close this issue
colm-mchugh added a commit that referenced this issue Feb 14, 2025
Fix the SEGV seen in #7787; it occurs because a column in the targetlist
of a worker subquery can contain a non-empty varnullingrels field if the
column is from the inner side of a left outer join.
The issue can also occur with the columns in the HAVING clause, and this
is also tested in the fix. The issue was triggered by the introduction
of the varnullingrels to Vars in Postgres 16 (2489d76c)

There is a related issue, #7705, where a non-empty varnullingrels
was incorrectly copied into the query tree for the combine query.
Here, a non-empty varnullingrels field of a var is incorrectly copied
into the query tree for a worker subquery.

The regress file from #7705 is used (and renamed) to also test this
(#7787). An alternative test output file is required for Postgres 15
because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).
@colm-mchugh colm-mchugh linked a pull request Feb 14, 2025 that will close this issue
colm-mchugh added a commit that referenced this issue Feb 18, 2025
DESCRIPTION: Fixes a crash in left outer joins that can happen when
there is an an aggregate on a column from the inner side of the join.

Fix the SEGV seen in #7787 and #7899; it occurs because a column in the
targetlist of a worker subquery can contain a non-empty varnullingrels
field if the column is from the inner side of a left outer join. The
issue can also occur with the columns in the HAVING clause, and this is
also tested in the fix. The issue was triggered by the introduction of
the varnullingrels to Vars in Postgres 16 (2489d76c)

There is a related issue, #7705, where a non-empty varnullingrels was
incorrectly copied into the query tree for the combine query. Here, a
non-empty varnullingrels field of a var is incorrectly copied into the
query tree for a worker subquery.

The regress file from #7705 is used (and renamed) to also test this
(#7787). An alternative test output file is required for Postgres 15
because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).
@colm-mchugh
Copy link
Contributor

colm-mchugh commented Feb 18, 2025

Fixed by PR #7901

colm-mchugh added a commit that referenced this issue Feb 18, 2025
DESCRIPTION: Fixes a crash in left outer joins that can happen when
there is an an aggregate on a column from the inner side of the join.

Fix the SEGV seen in #7787 and #7899; it occurs because a column in the
targetlist of a worker subquery can contain a non-empty varnullingrels
field if the column is from the inner side of a left outer join. The
issue can also occur with the columns in the HAVING clause, and this is
also tested in the fix. The issue was triggered by the introduction of
the varnullingrels to Vars in Postgres 16 (2489d76c)

There is a related issue, #7705, where a non-empty varnullingrels was
incorrectly copied into the query tree for the combine query. Here, a
non-empty varnullingrels field of a var is incorrectly copied into the
query tree for a worker subquery.

The regress file from #7705 is used (and renamed) to also test this
(#7787). An alternative test output file is required for Postgres 15
because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).
colm-mchugh added a commit that referenced this issue Feb 18, 2025
DESCRIPTION: Fixes a crash in left outer joins that can happen when
there is an an aggregate on a column from the inner side of the join.

Fix the SEGV seen in #7787 and #7899; it occurs because a column in the
targetlist of a worker subquery can contain a non-empty varnullingrels
field if the column is from the inner side of a left outer join. The
issue can also occur with the columns in the HAVING clause, and this is
also tested in the fix. The issue was triggered by the introduction of
the varnullingrels to Vars in Postgres 16 (2489d76c)

There is a related issue, #7705, where a non-empty varnullingrels was
incorrectly copied into the query tree for the combine query. Here, a
non-empty varnullingrels field of a var is incorrectly copied into the
query tree for a worker subquery.

The regress file from #7705 is used (and renamed) to also test this
(#7787). An alternative test output file is required for Postgres 15
because of an optimization to DISTINCT in Postgres 16 (1349d2790bf).
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants