Lustre 2.10 with ZFS 0.7.1 from standard repo
This page documents building the Lustre 2.10 RPMs on CentOS 7.3+ using the default yum install of ZFS 0.7.1. The steps followed are primarily from
http://wiki.lustre.org/Lustre_with_ZFS_Install
The following is valid for CentOS 7.3+
Index of Sections
Building Lustre RPMs
- Prepare System
- Disable SELinux for older clients
- Install the kernel development tools
- Install additional dependencies
- Install ZFS 0.7.1 RPMs
- EPEL release
- For the newest Lustre releases, change /etc/yum.repos.d/zfs.repo to switch from dkms to kmod (more info here; a sketch of this edit follows the list)
- Install ZFS and its associated SPL packages
- kmod packages for newer releases
- Build Lustre RPMs
- Get Lustre source code
- Configure (--disable-ldiskfs for ZFS backend, --without-server for client only)
- Make and optionally install rpms (NOTE: make sure you have enough space in /tmp to do the build; it needs about 3 GB)
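The dkms-to-kmod switch mentioned above amounts to flipping the 'enabled' flags in the zfsonlinux repo file. A minimal sketch, assuming the stock section names in /etc/yum.repos.d/zfs.repo:
# In /etc/yum.repos.d/zfs.repo, set enabled=0 under [zfs] and enabled=1 under [zfs-kmod],
# or do the equivalent with yum-utils installed:
yum-config-manager --disable zfs
yum-config-manager --enable zfs-kmod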
Notes on building all the rpms for Lustre 2.10.4
Note, this is all verbatim, as I just want the notes recorded for now; it will be cleaned up afterwards. BB, 5/24/2018
[root@umdist10 ~]# rpm -qa|grep -i lustre
lustre-2.10.1_dirty-1.el7.x86_64
lustre-osd-zfs-mount-2.10.1_dirty-1.el7.x86_64
kmod-lustre-2.10.1_dirty-1.el7.x86_64
lustre-resource-agents-2.10.1_dirty-1.el7.x86_64
kmod-lustre-osd-zfs-2.10.1_dirty-1.el7.x86_64
[root@umdist10 ~]# yum erase lustre lustre-osd-zfs-mount kmod-lustre lustre-resource-agents kmod-lustre-osd-zfs
[root@umdist10 yum.repos.d]# rpm -qa|grep -e zfs -e spl|sort
kmod-spl-0.7.7-1.el7.x86_64
kmod-spl-devel-0.7.7-1.el7.x86_64
kmod-zfs-0.7.7-1.el7.x86_64
kmod-zfs-devel-0.7.7-1.el7.x86_64
libzfs2-0.7.7-1.el7.x86_64
libzfs2-devel-0.7.7-1.el7.x86_64
spl-0.7.7-1.el7.x86_64
spl-debuginfo-0.7.7-1.el7.x86_64
spl-kmod-debuginfo-0.7.7-1.el7.x86_64
zfs-0.7.7-1.el7.x86_64
zfs-debuginfo-0.7.7-1.el7.x86_64
zfs-dracut-0.7.7-1.el7.x86_64
zfs-kmod-debuginfo-0.7.7-1.el7.x86_64
zfs-release-1-5.el7_4.noarch
zfs-test-0.7.7-1.el7.x86_64
-------- Also have libzpool2 that is updating --------
See this URL for info on installing zfsonlinux
https://github.com/zfsonlinux/zfs/wiki/RHEL-and-CentOS
[root@umdist10 yum.repos.d]# rpm -ql zfs-release
/etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux
/etc/yum.repos.d/zfs.repo
Erasing : zfs-release-1-5.el7_4.noarch
yum install http://download.zfsonlinux.org/epel/zfs-release.el7_5.noarch.rpm
Installing:
zfs-release noarch 1-5.el7.centos /zfs-release.el7_5.noarch 2.9 k
yum --enablerepo=zfs-kmod update (all zfs repos were disabled)
reboot
Saved all the zfs rpms at /atlas/data08/ball/admin/LustreSL7/SL7.5/zfs
cd /root
git clone git://git.hpdd.intel.com/fs/lustre-release.git -b v2_10_4
cd lustre-release
sh ./autogen.sh
yum install yaml-cpp yaml-cpp-devel (this was not needed)
yum install libyaml libyaml-devel
./configure --with-spec=redhat
make
make rpms
Now,
mkdir ../rpmbuild
mkdir ../rpmbuild/SRPMS
mkdir ../rpmbuild/RPMS
mkdir ../rpmbuild/SOURCE
mkdir ../rpmbuild/SPECS
mkdir ../rpmbuild/BUILD
mkdir ../rpmbuild/BUILDROOT
cp lustre-2.10.4-1.src.rpm ../rpmbuild/SRPMS
rpm -ivh lustre-2.10.4-1.src.rpm
rpmbuild --ba --with zfs --with servers --without ldiskfs ~/rpmbuild/SPECS/lustre.spec
The rpmbuild came up with the same rpms as in the lustre-release directory.
It seems that the kernel source RPM must be installed in order to build "--with ldiskfs", or even with that option left off. Also, for ldiskfs you need the debuginfo rpms for the current kernel.
rpm -ivh kernel-3.10.0-862.3.2.el7.src.rpm
yum --enablerepo=sl-debuginfo update kernel-debuginfo kernel-debuginfo-common-x86_64
rpmbuild --ba --with zfs --with servers --with ldiskfs ~/rpmbuild/SPECS/lustre.spec
This worked.
Now, make client rpms
cd ~/lustre-release
./configure --disable-server --enable-client
make rpms
cd /atlas/data08/ball/admin/LustreSL7
mkdir SL7.5
cd SL7.5
mkdir zfs
mkdir server
mkdir client
cd zfs
scp root@umdist10.local:/var/cache/yum/zfs-kmod/packages/*.rpm .
cd ../server
scp root@umdist10.local:/root/rpmbuild/RPMS/x86_64/*.rpm .
cd ../client
scp root@umdist10.local:/root/lustre-release/*client*.rpm .
Now, set up umdist10 to be a 2.10.4 OSS
cd /atlas/data08/ball/admin/LustreSL7/SL7.5/server
yum localinstall lustre-2.10.4-1.el7.x86_64.rpm lustre-osd-zfs-mount-2.10.4-1.el7.x86_64.rpm kmod-lustre-2.10.4-1.el7.x86_64.rpm \
lustre-resource-agents-2.10.4-1.el7.x86_64.rpm kmod-lustre-osd-zfs-2.10.4-1.el7.x86_64.rpm
Mounted pretty much without issues.
On bl-11-3, try out the 2.10.3 client on both the test Lustre and the production Lustre
[root@bl-11-3 ~]# mount -o localflock,lazystatfs -t lustre 10.10.1.140@tcp:/T3test /lustre/umt3
[root@bl-11-3 tmp]# cd /tmp
[root@bl-11-3 tmp]# time cp -ar /lustre/umt3/copiedTo10G .
real 20m59.407s
user 0m2.044s
sys 5m48.584s
[root@bl-11-3 tmp]# du -s -x -h copiedTo10G/
20G copiedTo10G/
[ball@umt3int01:~]$ ll /lustre/umt3/copiedTo10G_d/|wc -l
71177
-----------
Now do the same on the production instance
[root@bl-11-3 tmp]# umount /lustre/umt3
[root@bl-11-3 tmp]# systemctl start lustre_umt3
[root@bl-11-3 tmp]# time cp -ar /lustre/umt3/copiedTo10G_d .
real 61m2.582s
user 0m3.239s
sys 9m38.740s
-------------- Move to the SL7.5 kernel and 2.10.4 Lustre on both bl-11-3 and dc40-4-34
[root@bl-11-3 tmp]# time cp -ar /lustre/umt3/copiedTo10G_d .
real 104m22.910s
user 0m3.397s
sys 5m10.465s
[root@c-4-34 tmp]# time cp -ar /lustre/umt3/copiedTo10G .
real 20m43.214s
user 0m2.661s
sys 3m52.731s
Now, switch the data sources between these two. From the first test, access to our production Lustre instance from the SL7.5 client is slower BY A LOT than access from SL7.4 clients.
[root@bl-11-3 tmp]# time cp -ar /lustre/umt3/copiedTo10G .
real 21m30.744s
user 0m1.939s
sys 3m21.433s
[root@c-4-34 tmp]# time cp -ar /lustre/umt3/copiedTo10G_d .
real 10m6.469s
user 0m2.012s
sys 3m19.323s
---------------------- Now update the production Lustre servers --------------------------
Stop lustre on all clients. On lustre-nfs, stop nfs, then stop lustre. Wait 5 minutes.
umount all ost on all OSS. Wait 5 minutes
umount /mnt/mdtmgs on mdtmgs.aglt2.org
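A minimal command sketch of that shutdown order, assuming the mount unit and mount points used elsewhere on this page (adjust per host):
# on each client (the worker nodes here use a systemd unit for the production mount)
systemctl stop lustre_umt3          # or: umount /lustre/umt3
# on each OSS, unmount every lustre-type mount listed in fstab
umount -a -t lustre
# last, on mdtmgs.aglt2.org
umount /mnt/mdtmgs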
Back up the Lustre metadata
[root@mdtmgs ~]# time dd if=/dev/sdb of=/atlas/data19/zzzNoCopy/mdtmgs_dd_Jun.5.2018.dat bs=4096
262144000+0 records in
262144000+0 records out
1073741824000 bytes (1.1 TB) copied, 3084.21 s, 348 MB/s
real 51m24.242s
user 0m37.941s
sys 20m19.820s
-------------------------------------------------
[root@umdist01 ~]# cd /etc
[root@umdist01 etc]# cp -p fstab fstab.save
[root@umdist01 etc]# vi /etc/fstab
Remove ost entries for now
yum erase kmod-lustre kmod-lustre-osd-zfs lustre lustre-osd-zfs-mount lustre-resource-agents
Edit sl.repo and sl-security.repo to remove exclusion on "kernel*"
yum update
(check the available space in /boot; may need to delete old kernel rpms to make room)
yum erase kmod-spl kmod-zfs libnvpair1 libuutil1 libzfs2 libzpool2 spl zfs zfs-dracut
[root@umdist01 ~]# cd /atlas/data08/ball/admin/LustreSL7/SL7.5/zfs
[root@umdist01 zfs]# yum localinstall kmod-spl-0.7.9-1.el7_5.x86_64.rpm kmod-zfs-0.7.9-1.el7_5.x86_64.rpm libnvpair1-0.7.9-1.el7_5.x86_64.rpm libuutil1-0.7.9-1.el7_5.x86_64.rpm libzfs2-0.7.9-1.el7_5.x86_64.rpm libzpool2-0.7.9-1.el7_5.x86_64.rpm spl-0.7.9-1.el7_5.x86_64.rpm zfs-0.7.9-1.el7_5.x86_64.rpm zfs-dracut-0.7.9-1.el7_5.x86_64.rpm
reboot
cd /atlas/data08/ball/admin/LustreSL7/SL7.5/server
[root@umdist01 server]# yum localinstall kmod-lustre-2.10.4-1.el7.x86_64.rpm kmod-lustre-osd-zfs-2.10.4-1.el7.x86_64.rpm lustre-2.10.4-1.el7.x86_64.rpm lustre-osd-zfs-mount-2.10.4-1.el7.x86_64.rpm lustre-resource-agents-2.10.4-1.el7.x86_64.rpm
Update firmware (but not on the PE2950), then reboot after the update completes
Put /etc/fstab.save back in place and mount all of the ost
mount -av
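For reference, the restored OST entries are of this general form for a ZFS-backed OSS (the pool and dataset names below are illustrative, not copied from umdist01):
# /etc/fstab -- one line per OST; the zpool/dataset is mounted with filesystem type lustre
ost0pool/ost0  /mnt/ost-001  lustre  defaults,_netdev  0 0
ost1pool/ost1  /mnt/ost-002  lustre  defaults,_netdev  0 0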
-----------
on mdtmgs do....
[root@mdtmgs ~]# rpm -qa|grep -e lustre
lustre-osd-ldiskfs-mount-2.10.1_dirty-1.el7.x86_64
kmod-lustre-osd-ldiskfs-2.10.1_dirty-1.el7.x86_64
kmod-lustre-2.10.1_dirty-1.el7.x86_64
lustre-2.10.1_dirty-1.el7.x86_64
lustre-resource-agents-2.10.1_dirty-1.el7.x86_64
Comment out mdtmgs in /etc/fstab
Modify sl.repo and sl-security.repo to allow kernel updates
yum update
cd /atlas/data08/ball/admin/LustreSL7/SL7.5/server
yum localupdate lustre-osd-ldiskfs-mount-2.10.4-1.el7.x86_64.rpm kmod-lustre-osd-ldiskfs-2.10.4-1.el7.x86_64.rpm \
kmod-lustre-2.10.4-1.el7.x86_64.rpm lustre-2.10.4-1.el7.x86_64.rpm lustre-resource-agents-2.10.4-1.el7.x86_64.rpm
reboot
Upon reboot, uncomment the fstab entry, and then
mount /mnt/mdtmgs
------------
Also updated lustre-nfs in the usual way that a client is updated, and rebooted it.
Testing that mgs.aglt2.org can be built from scratch using the 2.7.58 file suite
mgs.aglt2.org has a valid, test Lustre system on it. We took the combined ldiskfs mgs and, after umount, saved it via dd to an NFS storage location. We want to be able to restore it to where it came from, the former /home partition, but the Cobbler build wipes that partition. So, this tests that we are able both to install the system from scratch and to recover this dd copy of the mgs.
First, get the rpm list right, starting with fixing cfe for a few "issues".
Hmmm, the issue is that we have ONLY kernel and kernel-firmware following the Cobbler build, so first we get the kernel we want in place.
Force install of kernel-headers and kernel-devel
- service cfengine3 stop
- remove kernel* exclusions in sl and sl-security repo file
- yum install kernel-headers kernel-devel
Check if cf3 will now install dkms and fusioninventory-agent
Make new rpm list and compare to old; looks good modulo some non-relevant rpms.
Reboot to be clean, followed by
- Comment out /home partition in /etc/fstab, and umount it.
- Stop and chkconfig off cfe.
Now, go to our LustreZFS Wiki page and try to install the Lustre kernel.
- yum erase kernel-firmware (had to do this first)
- Follow balance of directions
- afs will not start, not surprising, ignore for now.
- stop short of doing the mkfs on the former /home partition, and
- instead dd back the previously saved partition content.
- cd /atlas/data19/zzzNoCopy
- dd if=mgs_dd.dat of=/dev/mapper/vg0-lv_home bs=4096
- mkdir /mnt/mgs
- Add fstab entry that was saved
- mount /mnt/mgs
Test via umdist10 and some WN
- Use T3test instead of umt3B in the mount, and the correct IP, being 10.10.1.140
- WORKED!
So, unwind mounts, and do a new dd to data19 in prep for upgrading to SL7 on this test machine.
- dd if=/dev/mapper/vg0-lv_home of=mgs_dd_new.dat bs=4096
Building mdtmgs and the Production Lustre OSS
For the metadata server, mdtmgs, we do the following.
cd /atlas/data08/ball/admin/LustreSL7/download_orig/e2fsprogs
yum localinstall e2fsprogs-1.42.13.wc6-7.el7.x86_64.rpm e2fsprogs-libs-1.42.13.wc6-7.el7.x86_64.rpm libcom_err-1.42.13.wc6-7.el7.x86_64.rpm libcom_err-devel-1.42.13.wc6-7.el7.x86_64.rpm libss-1.42.13.wc6-7.el7.x86_64.rpm libss-devel-1.42.13.wc6-7.el7.x86_64.rpm
cd ../..
yum localinstall kmod-lustre-2.10.1_dirty-1.el7.x86_64.rpm kmod-lustre-osd-ldiskfs-2.10.1_dirty-1.el7.x86_64.rpm lustre-2.10.1_dirty-1.el7.x86_64.rpm lustre-osd-ldiskfs-mount-2.10.1_dirty-1.el7.x86_64.rpm lustre-resource-agents-2.10.1_dirty-1.el7.x86_64.rpm
OSS Servers
For the umdist0N, this becomes:
yum -y install --nogpgcheck http://download.zfsonlinux.org/epel/zfs-release.el7_4.noarch.rpm
Edit zfs.repo to choose the zfs-kmod files
yum install kmod-zfs kmod-spl spl zfs libnvpair1 libuutil1 libzfs2 libzpool2
cd /atlas/data08/ball/admin/LustreSL7
yum localinstall kmod-lustre-2.10.1_dirty-1.el7.x86_64.rpm kmod-lustre-osd-zfs-2.10.1_dirty-1.el7.x86_64.rpm lustre-2.10.1_dirty-1.el7.x86_64.rpm lustre-osd-zfs-mount-2.10.1_dirty-1.el7.x86_64.rpm lustre-resource-agents-2.10.1_dirty-1.el7.x86_64.rpm
modprobe zfs
zpool import -a
zpool upgrade -a
Disable the zfs repo
systemctl enable zfs.target
systemctl enable zfs-import-cache
systemctl enable zfs-mount
systemctl enable zfs-share
systemctl enable zfs-zed
Add back the saved fstab entries
mkdir /mnt/ost-001 (etc)
mount -av
/root/tools/configure_dell_alerts.sh
Note, on the PE1950/PE2950 the monitor hardware is too old and generates a continuous stream of errors (every 10 seconds) on the console and in the logs. To work around this, edit /etc/default/grub and add the parameter "nomodeset" to the line where GRUB_CMDLINE_LINUX is defined. Then
- grub2-mkconfig -o /boot/grub2/grub.cfg
Following the next reboot, the problem will no longer occur.
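The /etc/default/grub change is just an append to the existing kernel command line; the other parameters shown here are placeholders for whatever is already present on the host:
# /etc/default/grub -- add nomodeset to the end of the existing GRUB_CMDLINE_LINUX line
GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet nomodeset"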
"The newest kernels have moved the video mode setting into the kernel. So all the programming of the hardware specific clock rates and registers on the video card happen in the kernel rather than in the X driver when the X server starts.. This makes it possible to have high resolution nice looking splash (boot) screens and flicker free transitions from boot splash to login screen. Unfortunately, on some cards this doesn't work properly and you end up with a black screen. Adding the nomodeset parameter instructs the kernel to not load video drivers and use BIOS modes instead until X is loaded."
NFS re-export of Lustre
On lustre-nfs, we need to install it as a Lustre client.
cd /atlas/data08/ball/admin/LustreSL7/client
yum localinstall kmod-lustre-client-2.10.1_dirty-1.el7.x86_64.rpm lustre-client-2.10.1_dirty-1.el7.x86_64.rpm
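Once the client packages are on, the re-export itself is the usual Lustre mount plus an NFS export. A hedged sketch (the mount line is the test-instance mount from earlier on this page; the export subnet and options are assumptions):
# 1) mount Lustre on lustre-nfs
mount -o localflock,lazystatfs -t lustre 10.10.1.140@tcp:/T3test /lustre/umt3
# 2) add an export of the Lustre mount point to /etc/exports, e.g. (placeholder subnet/options):
#      /lustre/umt3  10.10.0.0/16(rw,no_root_squash,no_subtree_check)
# 3) reload the export table and start NFS
exportfs -ra
systemctl start nfs-server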