Issue 7887 Fix insert select planner to exclude identity columns from target list on partial inserts #7911

Draft
wants to merge 3 commits into
base: release-13.0
74 changes: 74 additions & 0 deletions src/backend/distributed/planner/insert_select_planner.c
@@ -1438,6 +1438,80 @@ CreateNonPushableInsertSelectPlan(uint64 planId, Query *parse, ParamListInfo bou
	/* get the SELECT query (may have changed after PrepareInsertSelectForCitusPlanner) */
	Query *selectQuery = selectRte->subquery;

	/*
	 * 1) Open the target relation to inspect its attributes and detect identity columns.
	 */
	Relation targetRel = RelationIdGetRelation(targetRelationId);
	if (RelationIsValid(targetRel))
	{

		ListCell *lcInsert = NULL;
		// ListCell *lcSelect = list_head(selectQuery->targetList);

		/* We'll build new lists for both sides */
		List *newInsertTList = NIL;
		List *newSelectTList = NIL;

		int insertIndex = 0;

		foreach(lcInsert, insertSelectQuery->targetList)
		{
			TargetEntry *insertTle = (TargetEntry *) lfirst(lcInsert);
			insertIndex++;

			/* Get the corresponding TLE from the SELECT by position or resno */
			TargetEntry *selectTle = NULL;

			/*
			 * If your plan is guaranteed to keep them in the same order, you can
			 * do "selectTle = (TargetEntry *) list_nth(selectQuery->targetList, insertIndex - 1)".
			 *
			 * Alternatively, if you rely on resno alignment, you'd find the TLE with resno==insertTle->resno.
			 * For simplicity, let's assume same ordering:
			 */
			selectTle = (TargetEntry *) list_nth(selectQuery->targetList, insertIndex - 1);

			/*
			 * Check if the insertTle is an identity column that the user didn't supply,
			 * e.g. by checking 'attr->attidentity == ATTRIBUTE_IDENTITY_ALWAYS' etc.
			 * If we skip it, also skip the SELECT TLE at the same position.
			 */

			int attno = insertTle->resno;
			if (attno > 0 && attno <= targetRel->rd_att->natts)
			{
				Form_pg_attribute attr = TupleDescAttr(targetRel->rd_att, attno - 1);

				/*
				 * If 'attr->attidentity' is 'a' or 'd' => It's an identity column.
				 * If the user hasn't explicitly specified a value (which is presumably
				 * indicated by something in the parse tree?), we remove or convert
				 * the TLE to a default.
				 */
				// bool userSpecifiedValue = CheckIfUserSpecifiedValue(tle, parse);
				bool userSpecifiedValue = false;
				if ((attr->attidentity == ATTRIBUTE_IDENTITY_ALWAYS ||
					 attr->attidentity == ATTRIBUTE_IDENTITY_BY_DEFAULT) &&
					!userSpecifiedValue)
				{
					/* Skip adding TLE => effectively uses default identity generation */
					continue;
Contributor:

What is the value of tle->resjunk at this point? Skipping TLEs with resjunk = true may be a more general way to deal with the problem, but please check whether that is actually the case here.

Contributor:

Never mind: this is not the case, so resjunk cannot be used to determine whether a column was specified by the user.

				}
			}

			/* else keep both TLEs */
			newInsertTList = lappend(newInsertTList, insertTle);
			newSelectTList = lappend(newSelectTList, selectTle);
		}

		/* Now we have 1:1 matching lists with the identity column removed from both sides */
		insertSelectQuery->targetList = newInsertTList;
		selectQuery->targetList = newSelectTList;

		RelationClose(targetRel);
	}


	/*
	 * Later we might need to call WrapTaskListForProjection(), which requires
	 * that select target list has unique names, otherwise the outer query
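
Editorial sketch: the loop in this hunk pairs INSERT and SELECT target entries by position and hardcodes userSpecifiedValue to false, so it would also drop an identity column that the statement supplied explicitly. The standalone C model below illustrates a stricter variant of the same idea: partners are looked up by resno, as the in-code comment suggests, and only identity entries without a user-supplied value are skipped. ModelEntry, FindByResno and FilterIdentityPairs are hypothetical names used for illustration, not PostgreSQL or Citus APIs. How the planner would actually learn that a value was user-supplied, and whether resno values line up on both lists at this point in planning, are assumptions left open here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a target list entry; NOT the PostgreSQL TargetEntry. */
typedef struct ModelEntry
{
	int resno;          /* 1-based column number used to pair the two lists */
	bool isIdentity;    /* column is GENERATED ... AS IDENTITY */
	bool userSupplied;  /* the statement explicitly provided a value (assumed known) */
} ModelEntry;

/* Return the index of the entry whose resno matches, or -1 if there is none. */
int
FindByResno(const ModelEntry *entries, size_t count, int resno)
{
	for (size_t i = 0; i < count; i++)
	{
		if (entries[i].resno == resno)
		{
			return (int) i;
		}
	}
	return -1;
}

/*
 * Copy each insert entry that should be kept, together with its partner from
 * the select side, into keptInsert/keptSelect (both sized for insertCount).
 * Identity entries are skipped only when the statement did not supply a value.
 * Returns the number of pairs kept, or -1 if a kept entry has no partner; in
 * that case the output arrays must be discarded.
 */
int
FilterIdentityPairs(const ModelEntry *insertEntries, size_t insertCount,
					const ModelEntry *selectEntries, size_t selectCount,
					ModelEntry *keptInsert, ModelEntry *keptSelect)
{
	assert(keptInsert != NULL && keptSelect != NULL);

	int kept = 0;

	for (size_t i = 0; i < insertCount; i++)
	{
		const ModelEntry *insertEntry = &insertEntries[i];

		/* drop only identity columns that rely on default generation */
		if (insertEntry->isIdentity && !insertEntry->userSupplied)
		{
			continue;
		}

		/* pair by resno instead of by list position */
		int selectIndex = FindByResno(selectEntries, selectCount, insertEntry->resno);
		if (selectIndex < 0)
		{
			return -1;
		}

		keptInsert[kept] = *insertEntry;
		keptSelect[kept] = selectEntries[selectIndex];
		kept++;
	}

	return kept;
}
```

Returning -1 instead of indexing blindly lets a caller leave both original lists untouched when the pairing assumption does not hold, which a positional list_nth() lookup cannot signal.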
2 changes: 1 addition & 1 deletion src/test/regress/multi_schedule
@@ -103,7 +103,7 @@ test: multi_dropped_column_aliases foreign_key_restriction_enforcement
test: binary_protocol
test: alter_table_set_access_method
test: alter_distributed_table
-test: issue_5248 issue_5099 issue_5763 issue_6543 issue_6758 issue_7477
+test: issue_5248 issue_5099 issue_5763 issue_6543 issue_6758 issue_7477 issue_7887
test: object_propagation_debug
test: undistribute_table
test: run_command_on_all_nodes
33 changes: 33 additions & 0 deletions src/test/regress/sql/issue_7887.sql
@@ -0,0 +1,33 @@
CREATE SCHEMA issue_7887;

CREATE TABLE local1 (
    id text not null primary key
);

CREATE TABLE reference1 (
    id int not null primary key,
    reference_col1 text not null
);
SELECT create_reference_table('reference1');

CREATE TABLE local2 (
    id int not null generated always as identity,
    local1fk text not null,
    reference1fk int not null,
    constraint loc1fk foreign key (local1fk) references local1(id),
    constraint reference1fk foreign key (reference1fk) references reference1(id),
    constraint testlocpk primary key (id)
);

INSERT INTO local1(id) VALUES ('aaaaa'), ('bbbbb'), ('ccccc');
INSERT INTO reference1(id, reference_col1) VALUES (1, 'test'), (2, 'test2'), (3, 'test3');

-- The statement that triggers the bug:
INSERT INTO local2(local1fk, reference1fk)
    SELECT id, 1
    FROM local1;

-- To see the error in the regression output, compare against a custom expected-output file,
-- which should then contain the text "invalid string enlargement request size: -4".
Contributor:

nit: can we put the test code into generated_identity.sql instead of creating a new test file? Given that it already tests identity columns in Citus,