Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds REASSIGN OWNED BY propagation #7319

Merged
merged 32 commits into from
Dec 28, 2023
Merged

Adds REASSIGN OWNED BY propagation #7319

merged 32 commits into from
Dec 28, 2023

Conversation

gurkanindibay
Copy link
Contributor

@gurkanindibay gurkanindibay commented Nov 2, 2023

DESCRIPTION: Adds REASSIGN OWNED BY propagation

This pull request introduces the propagation of the "Reassign owned by" statement. It accommodates both local and distributed roles for both the old and new assignments. However, when the old role is a local role, it undergoes filtering and is not propagated. On the other hand, if the new role is a local role, the process involves first creating the role on worker nodes before propagating the "Reassign owned" statement.

Copy link

codecov bot commented Nov 2, 2023

Codecov Report

Merging #7319 (29e77ed) into main (181b8ab) will decrease coverage by 6.43%.
The diff coverage is 94.66%.

❗ Current head 29e77ed differs from pull request most recent head f49d429. Consider uploading reports for the commit f49d429 to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #7319      +/-   ##
==========================================
- Coverage   89.61%   83.19%   -6.43%     
==========================================
  Files         278      350      +72     
  Lines       60129    62749    +2620     
  Branches     7490     8131     +641     
==========================================
- Hits        53882    52201    -1681     
- Misses       4104     8170    +4066     
- Partials     2143     2378     +235     

@onurctirtir onurctirtir self-requested a review November 6, 2023 18:52
Copy link
Member

@onurctirtir onurctirtir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

REASSIGN TO non-distributed role TO distributed role
OWNED BY all-non-distributed roles C1 C2
OWNED BY some-distributed roles C3 C4
OWNED BY all-distributed roles C5 C6

Going through all the possible scenarios, it's obvious that while Citus should not return any DDLJobs for C1, it needs to return a DDLJob for C6.

However, I'm confused about the other scenarios, namely, C2, C3, C4 and C5 i.e., the cases where DDL both references non-distributed roles as well as distributed ones, either in OWNED BY clause or in TO clause.

As an example, for C4, would it really make sense to only reassign the objects owned by distributed ones to the one given in TO clause, or would it be confusing on user's end? We have two options here; either do what this PR proposes (filter the distributed roles in provided in OWNED BY claue and forget about the others), or automatically propagate non-distributed roles before sending REASSIGN OWNED BY command to workers.

I think choosing the first option would cause role objects to get out-of-sync on different nodes so might not be the ideal approach, so I'd like to talk more about this in weekly sync: @gurkanindibay, @JelteF.

Or, for C5, I'd expect the solution proposed in this PR to fail as it would incorrectly search for the role provided in TO clause on worker nodes. Do we have any tests for that?



List *
PreprocessReassignOwnedStmt(Node *node, const char *queryString,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we have this in Postprocess instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason to move this to Postprocess actually. It just prepares the command.
Normally when you check other statements who is in the nature of Alter like PreprocessAlterDatabaseStmt, we use Preprocess
Create needs to have Post since address only exists after local execution, Drop needs to be Pre since address is missing after local drop execution.
For Alter style statements, there is no such requirement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's get this into postprocess as we've discussed in group chat, due to reasons discussed there

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's get this into postprocess as we've discussed in group chat, due to reasons discussed there

(discussion in group chat was still on-going indeed)

@onurctirtir
Copy link
Member

REASSIGN TO non-distributed role TO distributed role
OWNED BY all-non-distributed roles C1 C2
OWNED BY some-distributed roles C3 C4
OWNED BY all-distributed roles C5 C6
Going through all the possible scenarios, it's obvious that while Citus should not return any DDLJobs for C1, it needs to return a DDLJob for C6.

However, I'm confused about the other scenarios, namely, C2, C3, C4 and C5 i.e., the cases where DDL both references non-distributed roles as well as distributed ones, either in OWNED BY clause or in TO clause.

As an example, for C4, would it really make sense to only reassign the objects owned by distributed ones to the one given in TO clause, or would it be confusing on user's end? We have two options here; either do what this PR proposes (filter the distributed roles in provided in OWNED BY claue and forget about the others), or automatically propagate non-distributed roles before sending REASSIGN OWNED BY command to workers.

I think choosing the first option would cause role objects to get out-of-sync on different nodes so might not be the ideal approach, so I'd like to talk more about this in weekly sync: @gurkanindibay, @JelteF.

Or, for C5, I'd expect the solution proposed in this PR to fail as it would incorrectly search for the role provided in TO clause on worker nodes. Do we have any tests for that?

As we've discussed earlier this week, let's not filter out local roles but rely on workers to throw an error if the role doesn't exist on a worker node.

ReassignOwnedStmt *stmt = castNode(ReassignOwnedStmt, node);
List *allReassignRoles = stmt->roles;

List *distributedReassignRoles = FilterDistributedRoles(allReassignRoles);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should remove filtering logic as we've discussed two weeks ago.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We decided to keep the filtering logic



List *
PreprocessReassignOwnedStmt(Node *node, const char *queryString,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

need to move this down to Postprocess phase

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

@onurctirtir onurctirtir requested a review from naisila November 21, 2023 12:46
Copy link
Member

@naisila naisila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we haven't still decided yet on the logic, I only added some comments for the tests. Thank you.


set citus.log_remote_commands to on;
set citus.grep_remote_commands = '%REASSIGN OWNED BY%';
REASSIGN OWNED BY role1 TO role2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Regardless of what we decide to do with the scenarios C1,C2,C3,C4,C5,C6,
    we should have at least one test for each scenario. And also we should ensure to have test cases with >1 role in the BY clause, e.g. 2 roles: REASSIGN OWNED BY role0, role1 TO role2;

  2. As always, let's also add tests where the role names have escaped characters :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tests for distributed and non-distributed roles with escape characters

@gurkanindibay
Copy link
Contributor Author

gurkanindibay commented Nov 27, 2023

Pre Review Request Changes

  • Camel Case naming of variables, functions etc.
  • Includes ordering
  • Single line comments starts with lower case
  • Order of function implementations and their declarations in a file
  • Make sure that there is no typo in comments
  • PR description is clear enough.
  • Description should exist if a user facing issue
  • If a test check is reusable then define in multi_tester_helper.sql
  • If you're adding a new file either use create_test.py or create a test file creating its own schema and release all the sources you created
  • If you need to execute an sql statement in all of the nodes, then use run_command_on_all_nodes instead of connecting to each nodes
  • Check for codecov outputs and see that if all blocks are covered (I suppose there is a problem with Codecov since the lines that it shows uncovered is in the main execution blocks)

Copy link
Member

@naisila naisila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the changes, I realize that it is much better to open a PR against this branch for the changed EnsureDependencies function. It was hard to follow in this PR. I already left comments here, but can you consider that? Especially if we will ask for an extra review from Nils or someone else.


static void EnsureDependenciesCanBeDistributed(const ObjectAddress *relationAddress);
static void ErrorIfCircularDependencyExists(const ObjectAddress *objectAddress);
static int ObjectAddressComparator(const void *a, const void *b);
static void EnsureDependenciesExistOnAllNodes(const ObjectAddress *target);
static void EnsureRequiredObjectExistOnAllNodes(const ObjectAddress *target,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Naming suggestion: Let's use the name of the struct RequiredObjectSet inside the name of this function

Suggested change
static void EnsureRequiredObjectExistOnAllNodes(const ObjectAddress *target,
static void EnsureRequiredObjectSetExistsOnAllNodes(const ObjectAddress *target,

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@onurctirtir what's your opinion?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the suggestion makes sense to me

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

List *dependenciesWithCommands = NIL;
Assert(requiredObjectSet == REQUIRE_ONLY_DEPENDENCIES ||
requiredObjectSet == REQUIRE_OBJECT_AND_DEPENDENCIES);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before doing anything else, here we can first check if the target object is in fact distributed: if it is we can return early here, regardless of the RequiredObjectSet

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested to put IsAnyObjectDistributed method however got test failures so I removed

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe lastly we agreed on adding the following check?

Suggested change
if (IsAnyObjectDistributed(list_make1((ObjectAddress *) target)))
{
return;
}

Copy link
Contributor Author

@gurkanindibay gurkanindibay Dec 25, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the test error when we use IsAnyObjectDistributed(list_make1((ObjectAddress *) target))
https://github.com/citusdata/citus/actions/runs/7321494377

Comment on lines +140 to +143
int saveNestLevel = NewGUCNestLevel();
set_config_option("citus.enable_create_role_propagation", "on",
(superuser() ? PGC_SUSET : PGC_USERSET), PGC_S_SESSION,
GUC_ACTION_LOCAL, true, 0, false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We never do anything when calling EnsureDependencies function, and sometimes we may need to propagate roles in that function. Whatever we did before, the same can be applied here. So, I think we don't need these lines.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block is required to set the GUC into previous state. I've used it in citus_internal_database_command

int saveNestLevel = NewGUCNestLevel();

@onurctirtir onurctirtir mentioned this pull request Nov 29, 2023
57 tasks
appendStringInfo(buf, "REASSIGN OWNED BY ");

AppendRoleList(buf, stmt->roles);
char const *newRoleName = RoleSpecString(stmt->newrole, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor;

Suggested change
char const *newRoleName = RoleSpecString(stmt->newrole, true);
char *newRoleName = RoleSpecString(stmt->newrole, true);

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

without const I'm getting compile warning below

deparser/deparse_owned_stmts.c:94:22: warning: initialization discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
94 | char *newRoleName = RoleSpecString(stmt->newrole, true);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's do const char instead of char const as we usually do then?

Copy link
Member

@onurctirtir onurctirtir Dec 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please let's not forget changing this, see #7319 (comment)

List *dependenciesWithCommands = NIL;
Assert(requiredObjectSet == REQUIRE_ONLY_DEPENDENCIES ||
requiredObjectSet == REQUIRE_OBJECT_AND_DEPENDENCIES);

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe lastly we agreed on adding the following check?

Suggested change
if (IsAnyObjectDistributed(list_make1((ObjectAddress *) target)))
{
return;
}

@onurctirtir
Copy link
Member

onurctirtir commented Dec 25, 2023

(TODO for myself: Test the PR manually once #7319 (comment) is resolved.)

Copy link
Member

@onurctirtir onurctirtir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks very good to me, please consider those last two minor comments too, thanks!

Comment on lines +115 to +117
set citus.enable_create_role_propagation to off;
set citus.enable_alter_role_propagation to off;
set citus.enable_alter_role_set_propagation to off;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

awesome test!

Comment on lines +92 to +94
show citus.enable_create_role_propagation;
show citus.enable_alter_role_propagation;
show citus.enable_alter_role_set_propagation;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

show stmts are not needed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm rollbacking GUC state to previous one so I wanted to test GUC state if it is rollbacked

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh that really makes sense

appendStringInfo(buf, "REASSIGN OWNED BY ");

AppendRoleList(buf, stmt->roles);
char const *newRoleName = RoleSpecString(stmt->newrole, true);
Copy link
Member

@onurctirtir onurctirtir Dec 28, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

please let's not forget changing this, see #7319 (comment)

@gurkanindibay gurkanindibay merged commit c3579ee into main Dec 28, 2023
@gurkanindibay gurkanindibay deleted the reassign_owned_prop branch December 28, 2023 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants