We are trying to upgrade lustre-server from 2.10.4 (SL7.6, zfs-0.7.9) to 2.12.3 (SL7.7, zfs-0.7.13); see the compatibility matrix here.
Before updating, unmount lustre on all clients first, then create the lustre.repo file.
For details, see /atlas/data08/manage/lustre/
#cat /atlas/data08/manage/lustre/lustre.repo
[lustre-server2123]
name=lustre-server2123
baseurl=http://umt3int05.aglt2.org/repo/lustre-server
...
Copy this file to /etc/yum.repos.d/
# cp /atlas/data08/manage/lustre/lustre.repo /etc/yum.repos.d/
#yum update --skip-broken
#yum --enablerepo=*lustre*2123* update kernel kernel-devel kernel-headers kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel
#yum --enablerepo=*lustre*2123* update lustre-dkms lustre-osd-zfs-mount lustre lustre-resource-agents zfs
#modprobe -v lustre
#modprobe -v zfs
#zpool status
If the zpool is not imported, try to import it
#zpool import ost-001
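The conditional import above could be scripted roughly as follows. This is only a sketch: pool_imported is a hypothetical helper name, and it assumes that `zpool list -H -o name` prints one imported pool name per line.

```shell
# Hypothetical helper: succeeds if the pool named in $1 appears on stdin,
# which is expected to hold one imported pool name per line.
pool_imported() {
    grep -Fxq -- "$1"
}

# Intended use on a server (not run here):
#   zpool list -H -o name | pool_imported ost-001 || zpool import ost-001
```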
List the available lustre file systems
#zfs list
The above procedure can be run in parallel on all mdtmgs and OSS servers. Then:
On the mdtmgs server
#mount -t lustre ost-001/mdtmgs /lustre/mgt/
After mdtmgs is up, on each OSS (OST) server
#mount -t lustre ost-001/ost0001 /mnt/ost-001/
After all OSTs are mounted, on the client
#mount -t lustre 10.10.2.120@tcp:/t3test /lustre/t3test/
On all lustre servers/clients:
Copy the lustre repository file
# cp /atlas/data08/manage/lustre/lustre.repo /etc/yum.repos.d/
Update to SL 7.7
#yum update --skip-broken
Install the most recent kernel (Lustre 2.12.3 works on the most recent kernel)
#yum --enablerepo=*2123* update kernel kernel-devel kernel-headers kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel;
On all lustre clients (wn, umint03XX, lustre-nfs)
#systemctl stop lustre_umt3
Or
#umount -l /lustre/umt3
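One way to combine the two alternatives above is to try the service stop first and fall back to the lazy unmount only if that fails. A minimal sketch, where stop_lustre_client is a hypothetical helper name:

```shell
# Hypothetical helper: stop the lustre client unit ($1); if that fails,
# fall back to a lazy unmount of the mount point ($2).
stop_lustre_client() {
    systemctl stop "$1" || umount -l "$2"
}

# Intended use on a client (not run here):
#   stop_lustre_client lustre_umt3 /lustre/umt3
```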
On all lustre servers
#reboot
Check that all servers have booted with the new kernel. If so, install the zfs and lustre rpms. Please note that a zfs update from SL6 to SL7 does not work; the previous zfs rpms need to be removed and then reinstalled.
#screen -dm bash -c "yum --enablerepo=sl -y install asciidoc audit-libs-devel automake bc binutils-devel bison device-mapper-devel elfutils-devel elfutils-libelf-devel expect flex gcc gcc-c++ git glib2 glib2-devel hmaccalc keyutils-libs-devel krb5-devel ksh libattr-devel libblkid-devel libselinux-devel libtool libuuid-devel libyaml-devel lsscsi make ncurses-devel net-snmp-devel net-tools newt-devel numactl-devel parted patchutils pciutils-devel perl-ExtUtils-Embed pesign python-devel redhat-rpm-config rpm-build systemd-devel;yum -y remove kmod-spl;yum -y --enablerepo=*2123* install lustre-dkms lustre-osd-zfs-mount lustre lustre-resource-agents zfs"
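The kernel check that should precede the install above could be sketched as follows. The helper name running_expected_kernel is hypothetical, and the usage assumes that the most recently installed kernel package is the one you want to be running, which may not hold if an older kernel was reinstalled later.

```shell
# Hypothetical helper: succeeds when the running kernel release ($1, as
# printed by `uname -r`) matches the kernel package name in $2.
running_expected_kernel() {
    [ "kernel-$1" = "$2" ]
}

# Intended use (not run here); `rpm -q --last kernel` lists installed kernel
# packages with the most recently installed one first:
#   newest=$(rpm -q --last kernel | head -n 1 | awk '{print $1}')
#   running_expected_kernel "$(uname -r)" "$newest" || echo "old kernel still running"
```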
Start the lustre service
#mount /mnt/mdtmgs
Check if the service is running; lctl dl should show devices
#lctl dl
Import the zpools
(repeat this for all zpools listed in /etc/fstab)
#zpool import ost-001
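Repeating the import for every zpool in /etc/fstab could be sketched as below. fstab_lustre_pools is a hypothetical helper, and it assumes the server entries have the form "pool/dataset mountpoint lustre ..."; adjust the pattern if the local fstab looks different.

```shell
# Hypothetical helper: print the unique zpool names used by lustre entries
# in the fstab-style file given as $1. Comment lines and client entries
# (whose device field contains ":") are skipped.
fstab_lustre_pools() {
    awk '$1 !~ /^#/ && $1 !~ /:/ && $3 == "lustre" {
             split($1, parts, "/"); print parts[1]
         }' "$1" | sort -u
}

# Intended use on an OSS (not run here):
#   for pool in $(fstab_lustre_pools /etc/fstab); do zpool import "$pool"; done
```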
Check if all zfs are present
#zfs list
Start all OSTs
#mount /mnt/ost-001
Check if all the OSTs on this OSS are up
#lctl dl | grep obdf
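The OST check above can also be done without reading the list by eye. A rough sketch, where obdfilter_not_up is a hypothetical helper and the column layout of the `lctl dl` output (index, state, type, name, UUID, refcount) is an assumption to verify on the actual servers:

```shell
# Hypothetical helper: read `lctl dl` output on stdin and print how many
# obdfilter (OST) devices are not in the UP state.
obdfilter_not_up() {
    awk '$3 == "obdfilter" && $2 != "UP" { n++ } END { print n + 0 }'
}

# Intended use on an OSS (not run here):
#   [ "$(lctl dl | obdfilter_not_up)" -eq 0 ] && echo "all OSTs up"
```

Note that this only counts devices that are listed but not UP; an OST that was never mounted does not appear in the list at all.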
Update the kernel
#yum -y --enablerepo=*2123* install kernel kernel-devel kernel-headers kernel-abi-whitelists kernel-tools kernel-tools-libs kernel-tools-libs-devel;
Reboot to the new kernel
#reboot
Update the lustre-client software. I also tried lustre-client-dkms, but it does not work, so I had to use kmod-lustre-client instead.
#yum -y remove lustre-client kmod-lustre-client;yum -y --enablerepo=*2123* install lustre-client kmod-lustre-client
#systemctl start lustre_umt3
So far only umt3int02/03/04/05 have been updated. umt3int01 is SL6, and all the worker nodes still need to be updated and rebooted into the new kernel in batches.
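Splitting the worker nodes into batches could be sketched like this. batch_nodes is a hypothetical helper, and the node names and batch size in the usage line are placeholders, not the real inventory.

```shell
# Hypothetical helper: print the node names given after $1 in groups of
# at most $1 names per line, one batch per line.
batch_nodes() {
    size="$1"; shift
    count=0
    line=""
    for node in "$@"; do
        line="${line:+$line }$node"
        count=$((count + 1))
        if [ "$count" -eq "$size" ]; then
            printf '%s\n' "$line"
            line=""
            count=0
        fi
    done
    # Emit the final, possibly shorter, batch.
    if [ -n "$line" ]; then
        printf '%s\n' "$line"
    fi
}

# Intended use (placeholder node names):
#   batch_nodes 2 wn01 wn02 wn03 wn04 wn05
```

Each output line is one batch, so the update and reboot can be applied to one line of nodes at a time while the rest keep running jobs.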
-- WenjingWu - 07 Feb 2020