how to forcefully drop/remove a node with citus. #7457

Open
shankarmn94 opened this issue Jan 25, 2024 · 2 comments
Comments

@shankarmn94

In my 8-node Citus cluster, one of the nodes is having problems loading or returning data from specific tables.

So we tried the following.

As I mentioned in #7424:

We removed the nodes one by one, starting from the last, and added them back, so we could check whether the shards on a given node were causing the problem.
When we tried to remove worker-3, we got stuck moving shards and creating tables on the cluster DB.
When I tried to drain this node, it took 2 days to move a shard, and as I kept trying it moved 2 shards later.

Since this was taking so long, I tried to remove the node directly:

ccnsapp=# select citus_remove_node('10.104.0.5',5432);
ERROR: cannot remove or disable the node 10.104.0.5:5432 because because it contains the only shard placement for shard 102017
DETAIL: One of the table(s) that prevents the operation complete successfully is public.companies
HINT: To proceed, either drop the tables or use undistribute_table() function to convert them to local tables
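
For reference, a minimal sketch of how the shards behind this error could be listed from the Citus metadata tables (pg_dist_placement, pg_dist_node, pg_dist_shard). It assumes these catalogs have their usual columns and that the query is run on the coordinator; the node address is the one from the command above.

```sql
-- List shards whose only placement lives on the node being removed.
-- Assumption: run on the coordinator against the standard Citus catalogs.
SELECT s.logicalrelid::regclass AS table_name,
       p.shardid
FROM pg_dist_placement p
JOIN pg_dist_node  n ON n.groupid = p.groupid
JOIN pg_dist_shard s ON s.shardid = p.shardid
WHERE n.nodename = '10.104.0.5'
  AND n.nodeport = 5432
  -- keep only shards that have no placement on any other node group
  AND NOT EXISTS (
        SELECT 1
        FROM pg_dist_placement other
        WHERE other.shardid = p.shardid
          AND other.groupid <> p.groupid)
ORDER BY 1, 2;
```

Every row returned is a shard that would have to be moved, or whose table would have to be dropped, before citus_remove_node can succeed.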

So I started to move this shard to a different node:
SELECT citus_move_shard_placement( 102017, '10.104.0.5', 5432,'10.104.0.10', 5432);

I have been trying this for two days. I am not able to understand where it is stuck or hung.
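
For context, citus_move_shard_placement accepts an optional sixth argument, shard_transfer_mode ('auto', 'force_logical' or 'block_writes'). Below is a sketch of the same move with 'block_writes', which copies the shard without logical replication but blocks writes to it while the move runs. Whether this avoids the hang described here is an assumption, not something confirmed in this thread.

```sql
-- Same shard and nodes as above; only the transfer mode differs.
-- 'block_writes' blocks writes to the shard for the duration of the move.
SELECT citus_move_shard_placement(
    102017,
    '10.104.0.5', 5432,
    '10.104.0.10', 5432,
    'block_writes');   -- shard_transfer_mode
```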

Here is what I see when I check the progress:

 sessionid |           table_name            | shardid | shard_size | sourcename | sourceport | targetname  | targetport | progress | source_shard_size | target_shard_size | operation_type |  source_lsn  | target_lsn |   status
-----------+---------------------------------+---------+------------+------------+------------+-------------+------------+----------+-------------------+-------------------+----------------+--------------+------------+------------
    201523 | companies                       |  102017 |     155648 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |            155648 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | agents                          |  102049 |   18087936 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |          18087936 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | assets                          |  102081 |  121077760 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |         121077760 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | deprecation_status              |  102113 |      32768 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |             32768 |                 0 | move           | 8FE/751164B8 |            | Setting Up
    201523 | windows_specifics               |  102145 |      32768 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |             32768 |                 0 | move           | 8FE/751164B8 |            | Setting Up
    201523 | mac_specifics                   |  102177 |      40960 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |             40960 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | linux_specifics                 |  102209 |      32768 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |             32768 |                 0 | move           | 8FE/751164B8 |            | Setting Up
    201523 | network_devices_specifics       |  102241 |      32768 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |             32768 |                 0 | move           | 8FE/751164B8 |            | Setting Up
    201523 | asset_windows_reboot_required   |  102305 |      73728 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |             73728 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | asset_windows_security_products |  102337 |    2760704 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |           2760704 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | asset_firewall_rules            |  102369 |  576430080 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |         576430080 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | asset_unqouted_services         |  102401 |     688128 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |            688128 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | asset_msdt                      |  102433 |    1089536 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |           1089536 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | asset_registry_misconfiguration |  102465 |   27328512 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |          27328512 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | remediated                      |  102497 |     802816 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |            802816 |              8192 | move           | 8FE/751164B8 |            | Setting Up
    201523 | software                        |  102529 |  119169024 | 10.104.0.5 |       5432 | 10.104.0.10 |       5432 |        1 |         119169024 |              8192 | move           | 8FE/751164B8 |            | Setting Up

...
...
(116 rows)
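
The columns above match what get_rebalance_progress() returns. Assuming that is where the output came from, a rough per-status summary of all 116 rows can be sketched as follows. The column aliases are illustrative only.

```sql
-- Summarize the ongoing move: shard count and bytes per status.
-- Assumption: the listing above was produced by get_rebalance_progress().
SELECT status,
       count(*)                               AS shards,
       pg_size_pretty(sum(source_shard_size)) AS source_total,
       pg_size_pretty(sum(target_shard_size)) AS copied_so_far
FROM get_rebalance_progress()
GROUP BY status
ORDER BY status;
```

If copied_so_far stays the same across repeated runs, that would suggest the move is not copying data yet rather than copying slowly.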

How long will it take? It keeps saying "Setting Up" and I don't know when it will finish.

Could someone let me know what's happening here?

@cyraid

cyraid commented May 28, 2024

I also would like to know how to remove a node? The physical node isn't running anymore, and it's giving me the error:

ERROR: cannot remove or disable the node <name>:5432 because it contains the only shard placement for shard 102073
Detail: One of the table(s) that prevents the operation complete successfully is <table>
Hint: To proceed, either drop the tables or use undistribute_table() function to convert them to local tables

I can't undistribute that table as the node isn't running anymore, and I can't remove the node because it's not running. I need to remove that table and only one worker is running, so how can I remove the table?

@Eugene-M

Eugene-M commented Sep 10, 2024

I faced the same problem. Is there any solution? I understand that it's because Citus prevents data loss.


3 participants