After completing this section, you should be able to describe the prerequisites for SAP HANA, operating system, and cluster updates.
Updates are an integral part of any safe, secure, and stable environment. It is therefore highly recommended to keep your systems updated. Update processes are generally smooth, and in many cases can be performed online on standalone systems. However, updating High Availability cluster environments generally involves more complexity than updating standalone environments.
Given that one of the primary responsibilities of a High Availability cluster is to provide continuous service for applications or resources, it is especially important to apply updates in a systematic and consistent fashion to avoid any potential disruption to the availability of those critical services. This chapter aims to outline Red Hat recommended practices for applying updates to the cluster software itself and to the software that comprises the base RHEL operating system, libraries, and utilities.
When performing software update procedures for RHEL High Availability and Resilient Storage clusters, it is critical to ensure that any node that will undergo updates is not an active member of the cluster before those updates are initiated; see https://access.redhat.com/solutions/2059493. Swapping out software that the cluster stack relies on while it is in use can lead to various problems and unexpected behaviors, including issues that can cause complete outages of the cluster and the services that it manages.
Red Hat does not support in-place upgrades or rolling upgrades of cluster nodes from one major release of RHEL to another. For example, no supported method exists for updating some nodes in a cluster from RHEL 7 to RHEL 8, introducing them into the cluster with existing RHEL 7 nodes to take over resources from them, and then updating the remaining RHEL 7 nodes. Upgrades between major RHEL releases must be done either for the entire cluster at once, or by migrating services from a running cluster on the earlier release to another cluster that runs the later release.
In other words:
Upgrade of systems by using the High Availability add-on from RHEL 6 to RHEL 7 is unsupported: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/upgrading_from_rhel_6_to_rhel_7/index#planning-an-upgrade-from-rhel-6-to-rhel-7upgrading-from-rhel-6-to-rhel-7
Upgrade of systems by using the High Availability add-on from RHEL 7 to RHEL 8 is unsupported: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/8/html-single/upgrading_from_rhel_7_to_rhel_8/planning-an-upgrade_upgrading-from-rhel-7-to-rhel-8
Upgrade of systems by using the High Availability add-on from RHEL 8 to RHEL 9 is unsupported.
Red Hat does not support rolling upgrades of shared storage that is exported with samba+ctdb; see Does ctdb Shared Storage Support Rolling Upgrades?, https://access.redhat.com/solutions/2803441
While an update is in process, do not change your cluster configuration. For example, do not add or remove resources or constraints.
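To help verify that the configuration stays unchanged, you can capture a backup of the cluster configuration before the update begins. This is a simple precautionary sketch using the backup subcommand that pcs provides; the file name shown here is only an example:

[root@node ~]# pcs config backup before-update

The resulting backup can be compared against the live configuration later, or restored with pcs config restore if needed.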
Although it is not required, when upgrading a Pacemaker cluster, it is a good practice to upgrade all cluster nodes before upgrading any Pacemaker remote nodes or Docker containers that are used in bundles.
Contact Red Hat Global Support Services at https://access.redhat.com/support/contact/technicalSupport for any help with planning any kind of update, upgrade, or migration. Proper planning and risk mitigation are key to a successful update or migration, and Red Hat experts can help to ensure a smooth process.
Use one of the following general ways to update packages that make up the RHEL High Availability and Resilient Storage Add-Ons, either individually or as a whole:
Rolling updates: The basic idea is to take a fully formed and active cluster, remove one node from service by stopping its relevant services and daemons, update its software, and then reintegrate it into the cluster before repeating the procedure on another node. In this way, the cluster can continue to provide service and manage resources while each node is updated, and the updated nodes can provide service while the remaining nodes are brought up to the same software level. The node that undergoes an update at each stage must not be a member of the cluster while the update is ongoing.
Entire cluster update: When a cluster can undergo a complete outage, it can simplify update procedures greatly. Such situations allow for stopping the entire cluster, applying updates to all nodes simultaneously (or one after another, if preferred), and then starting the cluster back up together. An important benefit of such a procedure is that nodes should not at any time be running separate versions of the software, thereby eliminating any risk of incompatibilities or unexpected behavior due to such mismatches. This option also eliminates any complexity with repeatedly moving resources around in the cluster to accommodate each node stopping and then rejoining.
Risks and Considerations
When performing a rolling update, the presence of different versions of the High Availability and Resilient Storage software within the same cluster introduces a risk of unexpected behavior. Although Red Hat seeks to eliminate any known incompatibilities between different releases within the same major RHEL release, its testing of different versions of the software operating simultaneously is limited. Some previously unforeseen incompatibility between versions might cause unexpected behavior, and the only way to eliminate this risk is to use the entire cluster update method.
New software versions always bring the potential for unexpected behavior, functional changes that might require advance preparation, or in rare cases, bugs that could impact the operation of the product. Red Hat strongly recommends configuring a test, development, or staging cluster identically to any production clusters, and using such a cluster to roll out any updates first for thorough testing before the rollout in production.
Performing a rolling update necessarily means reducing the overall capacity and redundancy within the cluster. The size of the cluster dictates whether the absence of a single node poses a significant risk, with larger clusters able to absorb more node failures before reaching the critical limit, and with smaller clusters less capable or not capable at all of withstanding the failure of another node while one is missing. It is important to consider and account for the potential for failure of additional nodes during the update procedure. If possible, taking a complete outage and updating the cluster entirely might be the preferred option, to avoid leaving the cluster to operate in a state where additional failures could lead to an unexpected outage.
The specific steps to follow differ depending on the RHEL release and the style of cluster in use.
RHEL 6, 7, and 8 Clusters with Pacemaker
Perform the following steps to update the base RHEL packages, High Availability Add-On packages, and Resilient Storage Add-On packages on each node in a rolling fashion:
Choose a single node on which to update the software. Carry out any needed preparations now, before stopping or moving the resources or the running software on that node.
Move any managed resources off this node as needed. If there are specific requirements or preferences for where to relocate the resources, then consider creating location constraints to place them on the correct node: https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/high_availability_add-on_reference/index#s1-locationconstraints-HAAR. Strategically chosen resource locations can minimize the number of moves throughout the rolling update procedure, rather than moving resources to prepare for every node update. Otherwise, if it is acceptable to let the cluster manage the placement of resources on its own, then the next step takes care of it automatically.
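As an illustrative sketch, assuming a hypothetical resource named my_resource that should run on node2.example.com while node1.example.com is being updated, a temporary location constraint could be created as follows:

[root@node ~]# # Syntax: pcs constraint location <resource> prefers <node>[=<score>]
[root@node ~]# pcs constraint location my_resource prefers node2.example.com

Such a temporary constraint can be adjusted or removed again after the rolling update completes, as described later in this procedure.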
Place the chosen node in standby mode, https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/high_availability_add-on_reference/index#s2-standbymode-HAAR, to ensure that it is not considered in service, and to relocate any remaining resources elsewhere, or stop them, according to the configuration.
[root@node ~]# # Syntax: pcs node standby [<node>]
[root@node ~]# # Example:
[root@node ~]# pcs node standby node1.example.com

Stop the cluster software on the chosen node by using pcs:
[root@node ~]# # Syntax: pcs cluster stop [<node>]
[root@node ~]# # Example:
[root@node ~]# pcs cluster stop node1.example.com

Update any software as needed on the chosen node. Various methods to do so are outside the scope of this lecture. Consult the general instructions for installing High Availability software, https://access.redhat.com/solutions/45930, Knowledge Content in the Customer Portal, https://access.redhat.com/search/, or the product documentation, https://access.redhat.com/documentation/en/
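As a minimal sketch, assuming the node has access to the appropriate update repositories, all installed packages could be updated with the standard package manager for the release:

[root@node ~]# # RHEL 7:
[root@node ~]# yum update
[root@node ~]# # RHEL 8 and later:
[root@node ~]# dnf update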
If any software was updated that necessitates a reboot, prepare for that reboot. It is recommended to disable cluster software from starting on boot, to verify that the host is fully functional on its new software versions before bringing it into the cluster. To disable the cluster stack from starting on boot on this chosen node, use this command:
[root@node ~]# # Syntax: pcs cluster disable [<node>]
[root@node ~]# # Example:
[root@node ~]# pcs cluster disable node1.example.com

Reboot when ready. When complete, ensure that the host is fully functional and is using the correct software in any relevant areas, such as having booted into the latest kernel. If anything does not seem correct, then do not proceed until the situation is resolved. If help is needed, contact Red Hat Global Support Services, https://access.redhat.com/support/contact/technicalSupport
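For example, to confirm that the node booted into the expected kernel, compare the running kernel with the installed kernel packages:

[root@node ~]# uname -r
[root@node ~]# rpm -q kernel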
After everything appears to be set up correctly, re-enable the cluster software on this chosen node if it was previously enabled:
[root@node ~]# # Syntax: pcs cluster enable [<node>]
[root@node ~]# # Example:
[root@node ~]# pcs cluster enable node1.example.com

Rejoin the updated node into the cluster:
[root@node ~]# # Syntax: pcs cluster start [<node>]
[root@node ~]# # Example:
[root@node ~]# pcs cluster start node1.example.com

Review the pcs status output to determine whether everything appears as it should.
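For example:

[root@node ~]# pcs status

Verify in the output that the updated node is reported as online and that the resources are running on their expected nodes.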
When the node seems to be functioning properly, reactivate it for service by taking it out of standby mode:
[root@node ~]# # Syntax: pcs node unstandby [<node>]
[root@node ~]# # Example:
[root@node ~]# pcs node unstandby node1.example.com

If any temporary location constraints were created in step 2 to control the placement of resources, then adjust or remove them so that resources can return to their normally preferred locations. See https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html-single/high_availability_add-on_reference/index#s1-locationconstraints-HAAR
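As a sketch, continuing the earlier hypothetical example with the my_resource constraint, you can display all constraints with their IDs and then remove the temporary one by its ID:

[root@node ~]# pcs constraint --full
[root@node ~]# pcs constraint remove location-my_resource-node2.example.com-INFINITY

The constraint ID shown here only illustrates the pattern that pcs generates automatically; use the actual ID from the pcs constraint --full output.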
Repeat the previous steps for each remaining node.
The process for updating an entire cluster at once is nearly identical to the previously described rolling update procedure, except that you perform each step on all nodes before moving on to the next step. For example, stop the cluster daemons on every node before moving on to updating the software, and reboot every node before moving on to re-enabling the cluster software, and so on. The ultimate goal is to stop the cluster software on all nodes, update those nodes, and then restart the cluster software. You can use these steps as a guide, or simplify them to skip some of the preparation steps if they are not required.
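As a minimal sketch of the entire cluster update, assuming a complete outage is acceptable:

[root@node ~]# pcs cluster stop --all
[root@node ~]# # Update the software on every node, rebooting where needed.
[root@node ~]# pcs cluster start --all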
You must follow the SAP guidelines for updating software releases, and also review the SAP Product Availability Matrix (PAM) for the correct combination of releases.
Ensure that the combination of SAP HANA and RHEL versions before and after the upgrade is supported. Otherwise, you must upgrade to a supported combination of versions.
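For example, the installed RHEL release can be checked as root, and the SAP HANA version as the <sid>adm user (here assuming a hypothetical SID of RH1):

[root@node ~]# cat /etc/redhat-release
[rh1adm@node ~]$ HDB version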
Put the cluster in Maintenance mode, and then start the update process of your SAP HANA-related software or packages. This way, the cluster does not respond to any monitor operation failures during the SAP update process.
[root@node ~]# pcs property set maintenance-mode=true
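After the SAP update completes and the resources are verified to be healthy, take the cluster out of maintenance mode again:

[root@node ~]# pcs property set maintenance-mode=false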
Another option is to stop the cluster completely, and to manually start the SAP software if it is required for the update process. This option is especially useful when reboots are required.

[root@node ~]# pcs cluster stop --all
[root@node ~]# pcs cluster disable --all
If the cluster manages NFS mounts and you stop the cluster, then you must mount them manually before you upgrade any SAP software.
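As an illustrative sketch, assuming a hypothetical NFS server nfs.example.com exporting a share for /hana/shared, the mount could be performed manually before the SAP upgrade, and the cluster restarted and re-enabled after the update completes:

[root@node ~]# mount -t nfs nfs.example.com:/export/hana_shared /hana/shared
[root@node ~]# pcs cluster start --all
[root@node ~]# pcs cluster enable --all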