Commit
Avoid re-assigning the global pid for client backends and bg workers when the application_name changes (#7791)

DESCRIPTION: Fixes a crash that happens because of unsafe catalog access when re-assigning the global pid after application_name changes.

When application_name changes, we don't need to re-assign the global pid for external client backends, because application_name doesn't affect the global pid for such backends. Moreover, re-assigning the global pid for external client backends would unnecessarily trigger a catalog access when the cached local node id has been invalidated, and accessing catalog tables is dangerous in certain situations, such as when we're not in a transaction block.

For the other types of backends, i.e., the Citus internal backends, we do need to re-assign the global pid when application_name changes. For such backends we simply extract the global pid inherited from the originating backend from the application_name (which the originating backend specifies when opening that connection), and this doesn't require catalog access.

(cherry picked from commit 7341191)
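The decision described above can be modelled with a small sketch. This is not the Citus source, which is C and is not shown on this page. The `Backend` class, the function names, and the `citus_internal gpid=` prefix are assumptions made for illustration only. The sketch shows that internal backends take the global pid straight from the new application_name, while external client backends keep the one they already have, so no catalog lookup is needed in either case.

```python
from dataclasses import dataclass

# Assumption: internal connections embed the originating backend's global pid
# in application_name behind a prefix like this one (illustrative only).
INTERNAL_PREFIX = "citus_internal gpid="

# Sentinel meaning "no global pid could be extracted" (hypothetical).
INVALID_GPID = 0


def extract_global_pid(application_name: str) -> int:
    """Return the gpid embedded in an internal application_name, else 0."""
    if not application_name.startswith(INTERNAL_PREFIX):
        return INVALID_GPID
    digits = application_name[len(INTERNAL_PREFIX):]
    # Accept plain ASCII digits only, so int() below cannot raise.
    if digits.isascii() and digits.isdigit():
        return int(digits)
    return INVALID_GPID


@dataclass
class Backend:
    """Hypothetical stand-in for the per-backend state discussed above."""
    application_name: str
    global_pid: int


def on_application_name_change(backend: Backend, new_name: str) -> None:
    """Model of the behavior after the fix when application_name changes."""
    backend.application_name = new_name
    inherited = extract_global_pid(new_name)
    if inherited != INVALID_GPID:
        # Internal backend: the gpid comes from the name itself, which is
        # pure string parsing and requires no catalog access.
        backend.global_pid = inherited
    # External client backend: application_name does not affect the gpid,
    # so it is left untouched and no local node id lookup is attempted.
```

Under these assumptions, a client session that runs `SET application_name TO 'new_app_name'` and later has that change reverted during transaction abort never reaches a code path that would need the local node id.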
1 parent c44682a, commit cb31a64
Showing 6 changed files with 111 additions and 9 deletions.
@@ -5,6 +5,37 @@ SELECT master_remove_node('localhost', :master_port);

(1 row)

-- to silence -potentially flaky- "could not establish connection after" warnings in below test
SET client_min_messages TO ERROR;
-- to fail fast when the hostname is not resolvable, as it will be the case below
SET citus.node_connection_timeout to '1s';
BEGIN;
SET application_name TO 'new_app_name';
-- that should fail because of bad hostname & port
SELECT citus_add_node('200.200.200.200', 1, 200);
ERROR: connection to the remote node [email protected]:1 failed
-- Since above command failed, now Postgres will need to revert the
-- application_name change made in this transaction and this will
-- happen within abort-transaction callback, so we won't be in a
-- transaction block while Postgres does that.
--
-- And when the application_name changes, Citus tries to re-assign
-- the global pid but it does so only for Citus internal backends,
-- and doing so for Citus internal backends doesn't require being
-- in a transaction block and is safe.
--
-- However, for the client external backends (like us here), Citus
-- doesn't re-assign the global pid because it's not needed and it's
-- not safe to do so outside of a transaction block. This is because,
-- it would require performing a catalog access to retrive the local
-- node id when the cached local node is invalidated like what just
-- happened here because of the failed citus_add_node() call made
-- above.
--
-- So by failing here (rather than crashing), we ensure this behavior.
ROLLBACK;
RESET client_min_messages;
RESET citus.node_connection_timeout;
-- restore coordinator for the rest of the tests
SELECT citus_set_coordinator_host('localhost', :master_port);
citus_set_coordinator_host
@@ -1,5 +1,41 @@
-- removing coordinator from pg_dist_node should update pg_dist_colocation
SELECT master_remove_node('localhost', :master_port);

-- to silence -potentially flaky- "could not establish connection after" warnings in below test
SET client_min_messages TO ERROR;

-- to fail fast when the hostname is not resolvable, as it will be the case below
SET citus.node_connection_timeout to '1s';

BEGIN;
SET application_name TO 'new_app_name';

-- that should fail because of bad hostname & port
SELECT citus_add_node('200.200.200.200', 1, 200);

-- Since above command failed, now Postgres will need to revert the
-- application_name change made in this transaction and this will
-- happen within abort-transaction callback, so we won't be in a
-- transaction block while Postgres does that.
--
-- And when the application_name changes, Citus tries to re-assign
-- the global pid but it does so only for Citus internal backends,
-- and doing so for Citus internal backends doesn't require being
-- in a transaction block and is safe.
--
-- However, for the client external backends (like us here), Citus
-- doesn't re-assign the global pid because it's not needed and it's
-- not safe to do so outside of a transaction block. This is because,
-- it would require performing a catalog access to retrive the local
-- node id when the cached local node is invalidated like what just
-- happened here because of the failed citus_add_node() call made
-- above.
--
-- So by failing here (rather than crashing), we ensure this behavior.
ROLLBACK;

RESET client_min_messages;
RESET citus.node_connection_timeout;

-- restore coordinator for the rest of the tests
SELECT citus_set_coordinator_host('localhost', :master_port);