Lustre 2.10 with ZFS 0.7.1 from standard repo

This page documents building the Lustre 2.10 RPMs on CentOS 7.3+ using the ZFS 0.7.1 packages from the standard yum repository. The steps followed are primarily from http://wiki.lustre.org/Lustre_with_ZFS_Install

The following is valid for CentOS 7.3 and later.

Index of Sections

Building Lustre RPMs

  1. Prepare System
    1. Disable SELinux for older clients
    2. Install the kernel development tools
    3. Install additional dependencies
  2. Install ZFS 0.7.1 RPMs
    1. EPEL release
    2. For the newest Lustre releases change /etc/yum.repos.d/zfs.repo to switch from dkms to kmod (more info here)
    3. Install ZFS and its associated SPL packages
      • kmod packages for newer releases
  3. Build Lustre RPMs
    1. Get Lustre source code
    2. Configure (--disable-ldiskfs for ZFS backend, --without-server for client only)
    3. Make and optionally install rpms (NOTE: Make sure you have enough space in /tmp to do the build...needs about 3GB)

Notes on building all the rpms for Lustre 2.10.4

Note: this is all verbatim, as I just want the notes recorded for now; it will be cleaned up afterwards. BB, 5/24/2018

[root@umdist10 ~]# rpm -qa|grep -i lustre
lustre-2.10.1_dirty-1.el7.x86_64
lustre-osd-zfs-mount-2.10.1_dirty-1.el7.x86_64
kmod-lustre-2.10.1_dirty-1.el7.x86_64
lustre-resource-agents-2.10.1_dirty-1.el7.x86_64
kmod-lustre-osd-zfs-2.10.1_dirty-1.el7.x86_64
[root@umdist10 ~]# yum erase lustre lustre-osd-zfs-mount kmod-lustre lustre-resource-agents kmod-lustre-osd-zfs

[root@umdist10 yum.repos.d]# rpm -qa|grep -e zfs -e spl|sort
kmod-spl-0.7.7-1.el7.x86_64
kmod-spl-devel-0.7.7-1.el7.x86_64
kmod-zfs-0.7.7-1.el7.x86_64
kmod-zfs-devel-0.7.7-1.el7.x86_64
libzfs2-0.7.7-1.el7.x86_64
libzfs2-devel-0.7.7-1.el7.x86_64
spl-0.7.7-1.el7.x86_64
spl-debuginfo-0.7.7-1.el7.x86_64
spl-kmod-debuginfo-0.7.7-1.el7.x86_64
zfs-0.7.7-1.el7.x86_64
zfs-debuginfo-0.7.7-1.el7.x86_64
zfs-dracut-0.7.7-1.el7.x86_64
zfs-kmod-debuginfo-0.7.7-1.el7.x86_64
zfs-release-1-5.el7_4.noarch
zfs-test-0.7.7-1.el7.x86_64
-------- Also have libzpool2 that is updating --------

See this URL for info on installing zfsonlinux
     https://github.com/zfsonlinux/zfs/wiki/RHEL-and-CentOS

[root@umdist10 yum.repos.d]# rpm -ql zfs-release
/etc/pki/rpm-gpg/RPM-GPG-KEY-zfsonlinux
/etc/yum.repos.d/zfs.repo

Replace the old zfs-release package with the el7_5 version:
  Erasing    : zfs-release-1-5.el7_4.noarch
  yum install http://download.zfsonlinux.org/epel/zfs-release.el7_5.noarch.rpm
  Installing : zfs-release    noarch    1-5.el7.centos     /zfs-release.el7_5.noarch    2.9 k

yum --enablerepo=zfs-kmod update (all zfs repos were disabled)
reboot

Saved all the zfs rpms at /atlas/data08/ball/admin/LustreSL7/SL7.5/zfs

cd /root
git clone git://git.hpdd.intel.com/fs/lustre-release.git  -b v2_10_4
cd lustre-release
sh ./autogen.sh
yum install yaml-cpp yaml-cpp-devel (this was not needed)
yum install libyaml libyaml-devel

./configure --with-spec=redhat

make
make rpms

Now, set up an rpmbuild tree:
  mkdir ../rpmbuild
  mkdir ../rpmbuild/SRPMS
  mkdir ../rpmbuild/RPMS
  mkdir ../rpmbuild/SOURCES
  mkdir ../rpmbuild/SPECS
  mkdir ../rpmbuild/BUILD
  mkdir ../rpmbuild/BUILDROOT
cp lustre-2.10.4-1.src.rpm ../rpmbuild/SRPMS
rpm -ivh lustre-2.10.4-1.src.rpm
rpmbuild --ba --with zfs --with servers --without ldiskfs ~/rpmbuild/SPECS/lustre.spec
The rpmbuild came up with the same rpms as in the lustre-release directory

It seems that the kernel source code rpm must be installed in order to build with "--with ldiskfs", or even with that flag left off.  Also, building ldiskfs needs the debuginfo rpms for the current kernel.

rpm -ivh kernel-3.10.0-862.3.2.el7.src.rpm
yum --enablerepo=sl-debuginfo update kernel-debuginfo kernel-debuginfo-common-x86_64

rpmbuild --ba --with zfs --with servers --with ldiskfs  ~/rpmbuild/SPECS/lustre.spec
This worked. 

Now, make client rpms
cd ~/lustre-release
./configure  --disable-server --enable-client
make rpms 
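If the build succeeds the client rpms land at the top of the lustre-release tree; a quick check (the names shown are examples matching what gets copied off below):
  ls ~/lustre-release/*client*.rpm
  (expect kmod-lustre-client-2.10.4-*.rpm and lustre-client-2.10.4-*.rpm)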

cd /atlas/data08/ball/admin/LustreSL7
mkdir SL7.5
cd SL7.5
mkdir zfs
mkdir server
mkdir client
cd zfs
scp root@umdist10.local:/var/cache/yum/zfs-kmod/packages/*.rpm .
cd ../server
scp root@umdist10.local:/root/rpmbuild/RPMS/x86_64/*.rpm .
cd ../client
scp root@umdist10.local:/root/lustre-release/*client*.rpm .

Now, set up umdist10 to be a 2.10.4 OSS
cd /atlas/data08/ball/admin/LustreSL7/SL7.5/server

yum localinstall lustre-2.10.4-1.el7.x86_64.rpm lustre-osd-zfs-mount-2.10.4-1.el7.x86_64.rpm kmod-lustre-2.10.4-1.el7.x86_64.rpm \
lustre-resource-agents-2.10.4-1.el7.x86_64.rpm kmod-lustre-osd-zfs-2.10.4-1.el7.x86_64.rpm

Mounted pretty much without issues.

On bl-11-3, try out the 2.10.3 client on both the test Lustre and the production Lustre

[root@bl-11-3 ~]# mount -o localflock,lazystatfs -t lustre 10.10.1.140@tcp:/T3test /lustre/umt3

[root@bl-11-3 tmp]# cd /tmp
[root@bl-11-3 tmp]# time cp -ar /lustre/umt3/copiedTo10G .

real    20m59.407s
user    0m2.044s
sys     5m48.584s

[root@bl-11-3 tmp]# du -s -x -h copiedTo10G/
20G     copiedTo10G/

[ball@umt3int01:~]$ ll /lustre/umt3/copiedTo10G_d/|wc -l
71177

-----------
Now do the same, but on the production instance
[root@bl-11-3 tmp]# umount /lustre/umt3
[root@bl-11-3 tmp]# systemctl start lustre_umt3

[root@bl-11-3 tmp]# time cp -ar /lustre/umt3/copiedTo10G_d .

real    61m2.582s
user    0m3.239s
sys     9m38.740s

--------------  Move to the SL7.5 kernel and 2.10.4 Lustre on both bl-11-3 and on dc40-4-34

[root@bl-11-3 tmp]# time cp -ar /lustre/umt3/copiedTo10G_d .

real    104m22.910s
user    0m3.397s
sys     5m10.465s

[root@c-4-34 tmp]# time cp -ar /lustre/umt3/copiedTo10G .

real    20m43.214s
user    0m2.661s
sys     3m52.731s

Now, switch the data sources between these two.  From the first test, access to our production Lustre
instance from the SL7.5 client is a LOT slower than access from SL7.4 clients.

[root@bl-11-3 tmp]# time cp -ar /lustre/umt3/copiedTo10G .

real    21m30.744s
user    0m1.939s
sys     3m21.433s

[root@c-4-34 tmp]# time cp -ar /lustre/umt3/copiedTo10G_d .

real    10m6.469s
user    0m2.012s
sys     3m19.323s

---------------------- Now update the production Lustre servers --------------------------

Stop lustre on all clients.  On lustre-nfs, stop nfs, then stop lustre.  Wait 5 minutes.
umount all OSTs on all OSS servers.  Wait 5 minutes.
umount /mnt/mdtmgs on mdtmgs.aglt2.org
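Roughly, the shutdown sequence is (unit and mount names here are examples taken from elsewhere on this page; adjust to the actual hosts):
  On every client:            systemctl stop lustre_umt3   (or umount /lustre/umt3)
  On lustre-nfs:              systemctl stop nfs-server ; umount /lustre/umt3
  On each OSS, 5 min later:   umount /mnt/ost-001   (repeat for each ost-NNN)
  On mdtmgs, 5 min later:     umount /mnt/mdtmgs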

Back up the Lustre metadata

[root@mdtmgs ~]# time dd if=/dev/sdb of=/atlas/data19/zzzNoCopy/mdtmgs_dd_Jun.5.2018.dat bs=4096
262144000+0 records in
262144000+0 records out
1073741824000 bytes (1.1 TB) copied, 3084.21 s, 348 MB/s

real    51m24.242s
user    0m37.941s
sys     20m19.820s

-------------------------------------------------

[root@umdist01 ~]# cd /etc
[root@umdist01 etc]# cp -p fstab fstab.save
[root@umdist01 etc]# vi /etc/fstab
   Remove ost entries for now

yum erase kmod-lustre kmod-lustre-osd-zfs lustre lustre-osd-zfs-mount lustre-resource-agents

Edit sl.repo and sl-security.repo to remove exclusion on "kernel*"
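Concretely, that means deleting (or commenting out) the exclude line in each of those repo files; it looks something like this, though the exact pattern may differ:
  exclude=kernel*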

yum update
(check the available space in /boot; may need to delete old kernel rpms to make room)

yum erase kmod-spl kmod-zfs libnvpair1 libuutil1 libzfs2 libzpool2 spl zfs zfs-dracut

[root@umdist01 ~]# cd /atlas/data08/ball/admin/LustreSL7/SL7.5/zfs
[root@umdist01 zfs]# yum localinstall kmod-spl-0.7.9-1.el7_5.x86_64.rpm kmod-zfs-0.7.9-1.el7_5.x86_64.rpm libnvpair1-0.7.9-1.el7_5.x86_64.rpm libuutil1-0.7.9-1.el7_5.x86_64.rpm libzfs2-0.7.9-1.el7_5.x86_64.rpm libzpool2-0.7.9-1.el7_5.x86_64.rpm spl-0.7.9-1.el7_5.x86_64.rpm zfs-0.7.9-1.el7_5.x86_64.rpm zfs-dracut-0.7.9-1.el7_5.x86_64.rpm

reboot

cd /atlas/data08/ball/admin/LustreSL7/SL7.5/server
[root@umdist01 server]# yum localinstall kmod-lustre-2.10.4-1.el7.x86_64.rpm kmod-lustre-osd-zfs-2.10.4-1.el7.x86_64.rpm lustre-2.10.4-1.el7.x86_64.rpm lustre-osd-zfs-mount-2.10.4-1.el7.x86_64.rpm lustre-resource-agents-2.10.4-1.el7.x86_64.rpm

Update firmware!  But not on the PE2950.  Then reboot after the update completes.


Put /etc/fstab.save back in place and mount all of the OSTs
mount -av 
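(Restoring the saved copy is just cp -p /etc/fstab.save /etc/fstab, reversing the cp -p fstab fstab.save made earlier.)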

-----------
on mdtmgs do....
[root@mdtmgs ~]# rpm -qa|grep -e lustre
lustre-osd-ldiskfs-mount-2.10.1_dirty-1.el7.x86_64
kmod-lustre-osd-ldiskfs-2.10.1_dirty-1.el7.x86_64
kmod-lustre-2.10.1_dirty-1.el7.x86_64
lustre-2.10.1_dirty-1.el7.x86_64
lustre-resource-agents-2.10.1_dirty-1.el7.x86_64

Comment out mdtmgs in /etc/fstab

Modify sl.repo and sl-security.repo to allow kernel updates

yum update

cd /atlas/data08/ball/admin/LustreSL7/SL7.5/server
yum localupdate lustre-osd-ldiskfs-mount-2.10.4-1.el7.x86_64.rpm kmod-lustre-osd-ldiskfs-2.10.4-1.el7.x86_64.rpm \
kmod-lustre-2.10.4-1.el7.x86_64.rpm lustre-2.10.4-1.el7.x86_64.rpm lustre-resource-agents-2.10.4-1.el7.x86_64.rpm

reboot

Upon reboot, uncomment the fstab entry, and then
mount /mnt/mdtmgs

------------
Also updated lustre-nfs in the usual way that a client is updated, and rebooted it.


Testing that mgs.aglt2.org can be built from scratch using the 2.7.58 file suite

mgs.aglt2.org has a valid test Lustre system on it. We took the ldiskfs combined MGS and, after umounting it, saved it via dd to an NFS storage location. We want to be able to restore it to where it came from, the former /home partition, but the Cobbler build wipes that partition. So this tests that we are able both to install the system from scratch and to recover the dd copy of the MGS.

First, get the rpm list right, starting with fixing cfe for a few "issues".

Hmmm, the issue is that we have ONLY kernel and kernel-firmware following the Cobbler build, so first we get the kernel we want in place.

Force install of kernel-headers and kernel-devel

  • service cfengine3 stop
  • remove kernel* exclusions in sl and sl-security repo file
  • yum install kernel-headers kernel-devel

Check if cf3 will now install dkms and fusioninventory-agent

  • Yes, this now worked

Make new rpm list and compare to old; looks good modulo some non-relevant rpms.
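A minimal way to make that comparison (file names are just placeholders):
  rpm -qa | sort > /tmp/rpmlist.new
  diff /tmp/rpmlist.old /tmp/rpmlist.new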

Reboot to be clean, followed by

  • Comment out /home partition in /etc/fstab, and umount it.
  • Stop and chkconfig off cfe.

Now, go to our LustreZFS Wiki page and try to install the Lustre kernel.

  • yum erase kernel-firmware (had to do this first)
  • Follow balance of directions
  • afs will not start, not surprising, ignore for now.
  • stop short of doing the mkfs on the former /home partition, and
    • instead dd back the previously saved partition content.
      • cd /atlas/data19/zzzNoCopy
      • dd if=mgs_dd.dat of=/dev/mapper/vg0-lv_home bs=4096
  • mkdir /mnt/mgs
  • Add fstab entry that was saved
  • mount /mnt/mgs
    • YES. IT WORKED!

Test via umdist10 and some WN

  • Use T3test instead of umt3B in the mount, with the correct IP, 10.10.1.140
  • WORKED!
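For reference, the test mount command as used on the clients above:
  mount -o localflock,lazystatfs -t lustre 10.10.1.140@tcp:/T3test /lustre/umt3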

So, unwind mounts, and do a new dd to data19 in prep for upgrading to SL7 on this test machine.

  • dd if=/dev/mapper/vg0-lv_home of=mgs_dd_new.dat bs=4096

Building mdtmgs and the Production Lustre OSS

The metadata server mdtmgs

For the metadata server, mdtmgs, we do the following.
cd /atlas/data08/ball/admin/LustreSL7/download_orig/e2fsprogs
yum localinstall e2fsprogs-1.42.13.wc6-7.el7.x86_64.rpm e2fsprogs-libs-1.42.13.wc6-7.el7.x86_64.rpm libcom_err-1.42.13.wc6-7.el7.x86_64.rpm libcom_err-devel-1.42.13.wc6-7.el7.x86_64.rpm libss-1.42.13.wc6-7.el7.x86_64.rpm libss-devel-1.42.13.wc6-7.el7.x86_64.rpm

cd ../..
yum localinstall kmod-lustre-2.10.1_dirty-1.el7.x86_64.rpm kmod-lustre-osd-ldiskfs-2.10.1_dirty-1.el7.x86_64.rpm lustre-2.10.1_dirty-1.el7.x86_64.rpm lustre-osd-ldiskfs-mount-2.10.1_dirty-1.el7.x86_64.rpm lustre-resource-agents-2.10.1_dirty-1.el7.x86_64.rpm

OSS Servers

For the umdist0N, this becomes:
yum -y install --nogpgcheck http://download.zfsonlinux.org/epel/zfs-release.el7_4.noarch.rpm
Edit zfs.repo to choose the zfs-kmod files
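A sketch of the relevant part of /etc/yum.repos.d/zfs.repo after the edit (only the enabled flags matter here; the other fields stay as shipped by zfs-release):
  [zfs]
  ...
  enabled=0
  ...
  [zfs-kmod]
  ...
  enabled=1
  ...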

yum install kmod-zfs kmod-spl spl zfs libnvpair1 libuutil1 libzfs2 libzpool2

cd /atlas/data08/ball/admin/LustreSL7
yum localinstall kmod-lustre-2.10.1_dirty-1.el7.x86_64.rpm kmod-lustre-osd-zfs-2.10.1_dirty-1.el7.x86_64.rpm lustre-2.10.1_dirty-1.el7.x86_64.rpm lustre-osd-zfs-mount-2.10.1_dirty-1.el7.x86_64.rpm lustre-resource-agents-2.10.1_dirty-1.el7.x86_64.rpm

modprobe zfs
zpool import -a
zpool upgrade -a
Disable the zfs repo
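One way to do that, assuming yum-utils is installed (alternatively set enabled=0 in /etc/yum.repos.d/zfs.repo):
  yum-config-manager --disable zfs zfs-kmod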

systemctl enable zfs.target
systemctl enable zfs-import-cache
systemctl enable zfs-mount
systemctl enable zfs-share
systemctl enable zfs-zed

Add back the saved fstab entries
mkdir /mnt/ost-001 (etc)
mount -av
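For a ZFS-backed OST the fstab entries look roughly like this (pool and dataset names below are hypothetical, not our real ones):
  ostpool-001/ost0001   /mnt/ost-001   lustre   defaults,_netdev   0 0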

/root/tools/configure_dell_alerts.sh

Note: on the PE1950/PE2950 the monitor hardware is too old and generates a continuous stream of errors (every 10 seconds) on the console and in the logs. To get rid of this, edit the line in /etc/default/grub where GRUB_CMDLINE_LINUX is defined to add the parameter "nomodeset". Then
  • grub2-mkconfig -o /boot/grub2/grub.cfg
Following the next reboot, the problem will be gone.
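For reference, the edited line ends up looking something like this (the other parameters shown are placeholders; keep whatever is already there):
  GRUB_CMDLINE_LINUX="crashkernel=auto rhgb quiet nomodeset"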

"The newest kernels have moved the video mode setting into the kernel. So all the programming of the hardware specific clock rates and registers on the video card happen in the kernel rather than in the X driver when the X server starts.. This makes it possible to have high resolution nice looking splash (boot) screens and flicker free transitions from boot splash to login screen. Unfortunately, on some cards this doesn't work properly and you end up with a black screen. Adding the nomodeset parameter instructs the kernel to not load video drivers and use BIOS modes instead until X is loaded."

NFS re-export of Lustre

On lustre-nfs, we need to install it as a Lustre client.

cd /atlas/data08/ball/admin/LustreSL7/client
yum localinstall kmod-lustre-client-2.10.1_dirty-1.el7.x86_64.rpm lustre-client-2.10.1_dirty-1.el7.x86_64.rpm
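After the client rpms are in place, lustre-nfs mounts the Lustre filesystem and re-exports it over NFS. A sketch of the export, with illustrative options and host pattern (not our actual config):
  /etc/exports:   /lustre/umt3   *.aglt2.org(rw,no_root_squash)
  then reload the export table:   exportfs -ra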
