Build a ZFS raidz2 pool, share the ZFS storage as an iSCSI volume or NFS export, and tune I/O performance for ESXi access.

  • 2015/8/11: utilize the server's RAM for the ZFS ARC by writing to /sys/module/zfs/parameters/zfs_arc_max

Install ZFS

Before we can start using ZFS, we need to install it. Simply add the PPA and install the packages with the following commands:

apt-get install --yes software-properties-common
apt-add-repository --yes ppa:zfs-native/stable
apt-get update
apt-get install ubuntu-zfs

Reboot.
Now, let's see whether the module has been correctly compiled and loaded by the kernel:

dmesg | grep ZFS

You should get output like this:

# dmesg | grep ZFS
[ 5.979569] ZFS: Loaded module v0.6.4.1-1~trusty, ZFS pool version 5000, ZFS filesystem version 5

Creation of a RAID-Z2 disk array using 7 disks

Here is my server's disk layout; sdb through sdh will become the ZFS RAID pool:

root@nfs1:~$ lsblk
NAME   MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
sda      8:0    0   2.7T  0 disk
├─sda1   8:1    0     1M  0 part
├─sda2   8:2    0   2.7T  0 part /
└─sda3   8:3    0    32G  0 part [SWAP]
sdb      8:16   0   2.7T  0 disk
sdc      8:32   0   2.7T  0 disk
sdd      8:48   0   2.7T  0 disk
sde      8:64   0   2.7T  0 disk
sdf      8:80   0   2.7T  0 disk
sdg      8:96   0   2.7T  0 disk
sdh      8:112  0   2.7T  0 disk

Create a ZFS pool from the 7 disks

sudo zpool create -f datastore1 raidz2 /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh
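
A quick sanity check of my own (not part of the original output) to confirm all seven disks landed in a single raidz2 vdev:

sudo zpool status datastore1
sudo zpool list datastore1
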
  • We will use the full pool capacity, around 13 TB, for datastore1/iscsi or datastore1/nfs
    root@nfs1:~# zfs list
    NAME         USED  AVAIL  REFER  MOUNTPOINT
    datastore1  94.9K  13.3T  34.1K  /datastore1
    

Create an iSCSI volume using the Linux SCSI Target (targetcli)

1. Create a ZFS volume (zvol) for iSCSI

sudo zfs create -o compression=off -o dedup=off -o volblocksize=32K -V 13000G datastore1/iscsi
sudo zfs set sync=disabled datastore1/iscsi
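
Before handing the zvol to targetcli, it doesn't hurt to confirm the properties really took (my own check; these are all standard ZFS properties):

sudo zfs get volsize,volblocksize,compression,dedup,sync datastore1/iscsi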

2. Create the iSCSI adapter on ESXi:

vCenter > Host > Configuration > Storage Adapters > Add iSCSI adapter, then note the adapter's iSCSI (initiator) name

We will use the zvol's block device as the iSCSI backstore:

root@iscsi-storage-2:~# ll /dev/zvol/datastore1/iscsi
lrwxrwxrwx 1 root root 9 May 26 00:41 /dev/zvol/datastore1/iscsi -> ../../zd0

3. Create iSCSI target using ZFS

root@iscsi-storage-2:~# targetcli
/> cd backstores
/backstores> iblock/ create name=block_backend dev=/dev/zvol/datastore1/iscsi

... create iscsi target ...

/> cd /iscsi/iqn.2003-01.org.linux-iscsi.iscsi-storage-2.x8664:sn.f017b570b1d2/tpgt1/
luns/ create /backstores/iblock/block_backend

... Add portals ...
... Add ACLs for the ESXi hosts' initiator IQN names ...

/iscsi/iqn.20...570b1d2/tpgt1> / saveconfig
/iscsi/iqn.20...570b1d2/tpgt1> exit
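
Once the config is saved, I like to review the whole tree and make sure the portal is actually listening on TCP 3260, the default iSCSI port (a quick check of mine, not part of the original session):

root@iscsi-storage-2:~# targetcli
/> ls
/> exit
root@iscsi-storage-2:~# netstat -ltn | grep 3260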

4. Mount iSCSI on ESXi

  • On ESXi
    1. Add the storage server's iSCSI IP:port to the iSCSI adapter
    2. Rescan All
    3. Add the iSCSI volume as a datastore in Host > Configuration > Storage (an esxcli sketch of these steps follows below)
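
For reference, steps 1 and 2 can also be done from the ESXi shell with esxcli; a sketch, assuming the software iSCSI adapter came up as vmhba33 and the storage box answers on 192.168.1.10 (both placeholders). Creating the VMFS datastore itself is still done in the vSphere Client:

esxcli iscsi adapter discovery sendtarget add --adapter=vmhba33 --address=192.168.1.10:3260
esxcli storage core adapter rescan --adapter=vmhba33
esxcli storage core device list | grep -i LIO    # the new LUN should report vendor LIO-ORG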

Create an NFS share

1. Install NFS service

$ sudo apt-get install nfs-kernel-server
$ sudo reboot
  • I commented out && grep -q '^[[:space:]][^#]/' $export_files in /etc/init.d/nfs-kernel-server because the service won't start with an empty /etc/exports file

2. Start NFS service

$ sudo service nfs-kernel-server start
 * Exporting directories for NFS kernel daemon... [ OK ]
 * Starting NFS kernel daemon                     [ OK ]

3. Create a ZFS dataset for NFS

sudo zfs create -o compression=off -o dedup=off -o mountpoint=/nfs -o sharenfs=on datastore1/nfs
sudo zfs set sync=disabled datastore1/nfs
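
sharenfs=on exports the dataset to everyone (the /nfs * line in the next step). If you'd rather limit it to the ESXi network, the sharenfs property also takes an option string; a hedged example, with 192.168.1.0/24 standing in for your VMkernel subnet (the exact syntax varies between ZFS-on-Linux releases, so check zfs(8); ESXi needs no_root_squash):

sudo zfs set sharenfs="rw=@192.168.1.0/24,no_root_squash" datastore1/nfs
sudo zfs get sharenfs datastore1/nfs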

4. Test that the NFS share is exported

root@nfs1:~# apt-get install nfs-common
root@nfs1:~# showmount -e
Export list for nfs1:
/nfs *
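
Before pointing ESXi at the export, it can also be mounted from any Linux client that has nfs-common installed to confirm reads and writes work (my own quick test; swap nfs1 for the server's IP if the name doesn't resolve):

sudo mkdir -p /mnt/nfstest
sudo mount -t nfs nfs1:/nfs /mnt/nfstest
sudo dd if=/dev/zero of=/mnt/nfstest/ddtest bs=1M count=1024 conv=fdatasync
sudo umount /mnt/nfstest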

5. Auto-export the NFS folder at boot

Because the export lives in the ZFS sharenfs property rather than in /etc/exports, re-share the dataset from /etc/rc.local so it comes back after every reboot:

root@nfs1:~# vim /etc/rc.local
#!/bin/sh -e
#
# rc.local
#
# This script is executed at the end of each multiuser runlevel.
# Make sure that the script will "exit 0" on success or any other
# value on error.
#
# In order to enable or disable this script just change the execution
# bits.
#
# By default this script does nothing.
zfs unshare datastore1/nfs
zfs share datastore1/nfs
exit 0

Tuning for performance and I/O latency

1. Set the I/O Scheduler to noop (echo noop > /sys/block/sdb/queue/scheduler).

Skip zd0 if you are only using NFS (zd0 is the zvol block device created for iSCSI)
# for i in zd0 sdb sdc sdd sde sdf sdg sdh; \
do echo noop > /sys/block/$i/queue/scheduler; cat /sys/block/$i/queue/scheduler; done
[noop] deadline cfq
[noop] deadline cfq
[noop] deadline cfq
[noop] deadline cfq
[noop] deadline cfq
[noop] deadline cfq
[noop] deadline cfq
[noop] deadline cfq

2. Change the I/O size to 32 KB

Skip zd0 if using NFS
# for i in zd0 sdb sdc sdd sde sdf sdg sdh; \
do echo 32 > /sys/block/$i/queue/max_sectors_kb; echo 4 > /sys/block/$i/queue/nr_requests; done
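
Both echo loops above only last until the next reboot. A simple way to make them stick, in the same spirit as the rc.local trick used for the NFS share, is to add the loop to /etc/rc.local above the exit 0 line (a sketch that assumes the disk names stay sdb-sdh; add zd0 to the list for the iSCSI setup):

# in /etc/rc.local, before "exit 0"
for i in sdb sdc sdd sde sdf sdg sdh; do
    echo noop > /sys/block/$i/queue/scheduler
    echo 32   > /sys/block/$i/queue/max_sectors_kb
    echo 4    > /sys/block/$i/queue/nr_requests
done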

ZFS iSCSI Benchmark Tests on ESX

...I tested with both 64KB and 32KB; for me 32KB worked out a little better.
...We can see that the avgrq-sz changed to 64 sectors (32 KB), which is good, and we now see that the avg wait time went down to ~80ms (from ~1000ms). Lowering the number of requests to 4 lowered the DAVG to practically nothing, but the speed wasn't that great.

3. Enable Disk write-back caching

# for i in sdb sdc sdd sde sdf sdg sdh; do hdparm -W1 /dev/$i; done

Improve hard drive write speed with write-back caching
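
hdparm -W1 only asks the drive to turn write caching on; running -W with no argument afterwards shows whether each drive actually kept it, since some disks and controllers silently ignore the request:

# for i in sdb sdc sdd sde sdf sdg sdh; do hdparm -W /dev/$i; done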

4. Increase the ZFS read cache (ZFS ARC) size to 50 GB

The default ZFS ARC here is limited to 32 GB. Using a larger cache size raises the cache hit rate.

root@nfs1:~# echo $((50*1024*1024*1024)) >> /sys/module/zfs/parameters/zfs_arc_max
root@nfs1:~# echo options zfs zfs_arc_max=$((50*1024*1024*1024)) > /etc/modprobe.d/zfs.conf
root@nfs1:~# cat /etc/modprobe.d/zfs.conf
options zfs zfs_arc_max=53687091200
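
Whether the new limit is actually in force can be read back from the kernel's ARC stats; after the echo above, c_max should report 53687091200:

root@nfs1:~# grep '^c_max' /proc/spl/kstat/zfs/arcstats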

root@nfs1:~# git clone --depth 1 https://github.com/frankuit/zfs-arcstats-bash.git
root@nfs1:~# zfs-arcstats-bash/arc2
This will display the cache hit and miss ratio's.
for a time limited run (in seconds) add a number of seconds behind this command
|--------------------------------------------------|
|l1reads    l1miss     l1hits     l1hit%     size  |
|--------------------------------------------------|
|175        10         165        94.285%    50 GB  |
|84         0          84         100.000%   49 GB  |
|110        8          102        92.727%    50 GB  |
|100        14         86         86.000%    50 GB  |
|362        14         348        96.132%    50 GB  |
|75         3          72         96.000%    50 GB  |

Benchmark Result

(Left) NFS Storage with 10GbE Network (Right) Local Disk Storage
