DOCTEAM-1962: Changing SBD config #558
base: main
<para>
If you need to replace an &sbd; device, you can use <command>crm sbd device add</command>
to add the new device and <command>crm sbd device remove</command> to remove the old device.
If the cluster has two &sbd; devices, you can run these commands in any order. However, if
the cluster has one or three &sbd; devices, you must run these commands in a specific order:
</para>
<itemizedlist>
<listitem>
<para>
One device: <command>crm sbd device remove</command> cannot remove the only device,
so you must add the new device before you can remove the old device.
</para>
</listitem>
<listitem>
<para>
Three devices: <command>crm sbd device add</command> cannot add a fourth device,
so you must remove the old device before you can add the new device.
</para>
</listitem>
Currently the procedure is a bit bloated because of having to swap the order of the steps depending on the number of devices. Did I overthink this, and there's actually a simpler way to do it?
If not, would it be possible to simplify it in a future release so you don't have to consider the order? 🙏
Maybe something like `crm sbd device replace OLD-DEVICE NEW-DEVICE`?
Yes. If users want to replace a device, they must consider the current number of devices, and the cluster must be restarted twice:
- If there is only one device, you must `add` first, then `remove`, and restart.
- If there are three devices, you must `remove` first, then `add`, and restart.

@zzhou1 @gao-yan, what do you think about adding a new option, `crm sbd device replace`?
> Yes, If the user wants to replace the device, he must have to consider the current device number, and the cluster must be restarted twice:

Ahhh, thank you for pointing this out. It seemed to work okay for me just doing `crm maintenance on` / `crm cluster restart --all` / `crm maintenance off` once, after doing both `add` and `remove`. But I only have a basic cluster with one IP address resource, so I can't say what effect it would have on a proper working cluster.

To restart the cluster twice, can I put the cluster in maintenance mode first, then do the add/restart/remove/restart, then take the cluster out of maintenance only at the end?

Or does each change only apply after you remove maintenance mode (so you have to do `crm maintenance on` / `crm cluster restart --all` / `crm maintenance off` twice)?
E.g. here is what happened when I tried doing maintenance mode first with both step orders:
Add then remove:
- Put cluster in maintenance mode
- Added a new device
- The cluster restarted automatically
- Removed old device
- Had to restart the cluster manually
- Exited maintenance mode
Remove then add:
- Put cluster in maintenance mode
- Removed old device
- Had to restart the cluster manually
- Added a new device
- The cluster restarted automatically
- Exited maintenance mode
This seemed to work fine in both cases, but again I only have a minimal test cluster.
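The two sequences above can be written out as a dry-run plan. This is only a minimal sketch: it builds the commands as strings and runs nothing. The function name `replacement_plan` and the device paths are hypothetical, and planning a manual restart only after `remove` reflects what was observed on this minimal test cluster, not documented behavior.

```python
def replacement_plan(old_device, new_device, add_first=True):
    """Return the commands for one maintenance-mode replacement, in order.

    Nothing is executed. Per the observations above, the cluster restarted
    automatically after `add`, so a manual restart is planned only after
    `remove` (an assumption taken from that single test).
    """
    add = [f"crm sbd device add {new_device}"]
    remove = [
        f"crm sbd device remove {old_device}",
        "crm cluster restart --all",  # manual restart was needed after remove
    ]
    middle = add + remove if add_first else remove + add
    return ["crm maintenance on", *middle, "crm maintenance off"]


# Hypothetical device paths, for illustration only.
for command in replacement_plan("/dev/old-sbd", "/dev/new-sbd"):
    print(command)
```

Passing `add_first=False` gives the "Remove then add" sequence with the same maintenance-mode wrapper.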
From my impression, I would propose a syntax like `crm sbd device add xxx remove yyy`, where the order is interchangeable with `crm sbd device remove yyy add xxx`. Let crmsh ensure the correct order internally.
And, I would prefer to avoid adding another subcommand, as I feel the crmsh options are already quite crowded. However, if the consensus is to implement a 'replace' command, I suggest a self-explanatory syntax like `crm sbd device replace xxx by yyy`. Bash completion can help to display "by" to guide the sysadmin.
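To illustrate what "ensure the correct order internally" could mean, here is a minimal sketch of the ordering rule from the procedure text. It is not crmsh code: `replacement_order` and `MAX_SBD_DEVICES` are hypothetical names, and defaulting to add-first for two devices is an assumption, since either order is allowed in that case.

```python
MAX_SBD_DEVICES = 3  # `crm sbd device add` cannot add a fourth device


def replacement_order(device_count):
    """Return the order of the add/remove steps for a device replacement."""
    if not 1 <= device_count <= MAX_SBD_DEVICES:
        raise ValueError(f"expected 1 to {MAX_SBD_DEVICES} devices, got {device_count}")
    if device_count == 1:
        # The only device cannot be removed, so add the new one first.
        return ("add", "remove")
    if device_count == MAX_SBD_DEVICES:
        # A fourth device cannot be added, so remove the old one first.
        return ("remove", "add")
    # Two devices: either order works; add-first is this sketch's choice.
    return ("add", "remove")


print(replacement_order(1))  # ('add', 'remove')
print(replacement_order(3))  # ('remove', 'add')
```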
> can I put the cluster in maintenance mode first, then do the add/restart/remove/restart, then take the cluster out of maintenance only at the end?

This is a good idea indeed! With maintenance mode on, restarting the cluster twice will be very quick, and sbd will pick up the latest configuration just as quickly.
I've updated this procedure, and the PDF in the description. Let me know if it still needs any adjustments :)
Hi @tahliar
Hi Tahlia,
LGTM. Thanks!
LGTM, just minor nits, really. Thanks!
&sbd; relies on multiple different timeout settings to manage node fencing. When you
configure &sbd; using the &crmshell;, these timeouts are automatically calculated and
adjusted. The automatic values are sufficient for most use cases, but if you need to
change them you can use the <command>crm sbd configure</command> command.
- change them you can use the <command>crm sbd configure</command> command.
+ change them, you can use the <command>crm sbd configure</command> command.
<para>
When you change a timeout with <command>crm sbd configure</command>, the global
&stonith; timeouts are also adjusted automatically. The automatic values are
sufficient for most use cases, but if you need to change them you can use the
- sufficient for most use cases, but if you need to change them you can use the
+ sufficient for most use cases, but if you need to change them, you can use the
If you change one timeout, the other timeout is automatically adjusted so the
<literal>msgwait-timeout</literal> is double the <literal>watchdog-timeout</literal>. You
- If you change one timeout, the other timeout is automatically adjusted so the
- <literal>msgwait-timeout</literal> is double the <literal>watchdog-timeout</literal>. You
+ If you change one timeout, the other timeout is automatically adjusted so that the
+ <literal>msgwait-timeout</literal> is twice the <literal>watchdog-timeout</literal>. You
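The relationship this sentence describes can be sketched as a small calculation. This is illustrative only and not how crmsh implements it: `pair_timeouts` is a hypothetical helper, and rejecting an odd `msgwait-timeout` is an assumption of the sketch, because the text does not say how that case is handled.

```python
def pair_timeouts(watchdog_timeout=None, msgwait_timeout=None):
    """Given one timeout in seconds, return (watchdog_timeout, msgwait_timeout)
    such that msgwait-timeout is twice watchdog-timeout."""
    if (watchdog_timeout is None) == (msgwait_timeout is None):
        raise ValueError("set exactly one of the two timeouts")
    if watchdog_timeout is not None:
        return watchdog_timeout, 2 * watchdog_timeout
    if msgwait_timeout % 2:
        # Assumption: the sketch refuses values that cannot be halved exactly.
        raise ValueError("msgwait-timeout must be even in this sketch")
    return msgwait_timeout // 2, msgwait_timeout


print(pair_timeouts(watchdog_timeout=15))  # (15, 30)
print(pair_timeouts(msgwait_timeout=60))   # (30, 60)
```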
</step>
<step>
<para>
Check the status of &sbd; to make sure the device was removed from all of the nodes:
- Check the status of &sbd; to make sure the device was removed from all of the nodes:
+ Check the status of &sbd; to make sure the device was removed from all the nodes:
PR creator: Description
This article describes various ways to change an existing SBD configuration. It is almost entirely brand new content for the new `crm sbd` command.
PDF:
HA-sbd-changing-configuration_en.pdf (Updated 14 Oct)
PR creator: Are there any relevant issues/feature requests?
PR reviewer: Checklist for editorial review
Apart from the usual checks, please double-check also the following: