RDMA 101 - Building a virtual setup
We are going to explore RDMA and its applications in a series of tutorials, starting with this one.
The first step is to build a virtual environment where we can run our applications. RDMA communication usually requires a special RDMA-capable NIC in your server. Many vendors, such as Mellanox, sell this hardware, but for our development effort we are going to build a virtual environment based on qemu and the SoftiWARP software RDMA stack.
Virtual environment
Base image
We start with an ubuntu 16.04.2 qemu image that has Linux kernel 4.8 (from the HWE stack) and basic development tools installed:
$ sudo apt-get install linux-generic-hwe-16.04 build-essential \
automake autoconf libtool
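After rebooting into the HWE kernel you can verify the version; it should report a 4.8 series kernel:
$ uname -r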
Let’s call this image ubuntu-16.04.2-dev. We are going to create another image based on it, with the whole RDMA stack installed, called ubuntu-rdma.qcow2:
$ qemu-img create -f qcow2 -b ubuntu-16.04.2-dev ubuntu-rdma.qcow2
Formatting 'ubuntu-rdma.qcow2' ...
Let’s start a guest and install the RDMA stack inside it.
kvm -m 1G \
    -netdev user,hostfwd=tcp::5555-:22,id=net0 \
    -device virtio-net-pci,netdev=net0 \
    -drive if=virtio,file=ubuntu-rdma.qcow2,cache=unsafe
We can ssh to the guest through local port 5555, which is redirected to port 22 on the guest.
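For example (assuming your account on the guest is named ubuntu; substitute your own user):
$ ssh -p 5555 ubuntu@localhost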
Install RDMA libraries and tools
$ sudo apt-get install libibverbs-dev librdmacm-dev \
rdmacm-utils perftest ibverbs-utils
Install SoftiWARP
$ git clone https://github.com/zrlio/softiwarp.git
...
$ pushd softiwarp/kernel
$ make
$ sudo mkdir -p /lib/modules/`uname -r`/kernel/extra
$ sudo cp siw.ko /lib/modules/`uname -r`/kernel/extra
$ sudo depmod -a
$ popd
$ pushd softiwarp/userlib
$ ./autogen.sh && ./configure --prefix= && make && sudo make install
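Before shutting down, it is worth a quick sanity check that the freshly built module loads:
$ sudo modprobe siw
$ lsmod | grep siw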
Shutdown
sudo shutdown -h now
Create two guests
Based on the ubuntu-rdma.qcow2 image we just created, let’s build two images, vm1.qcow2 and vm2.qcow2:
$ for i in {1..2}; do qemu-img create -f qcow2 -b ubuntu-rdma.qcow2 vm${i}.qcow2; done
$ ls vm*
vm1.qcow2 vm2.qcow2
Now we are going to run both guests. Each of them will have two network interfaces. The first interface allows the guest to access the internet and allows the host to reach the guest’s ssh service by connecting to local port 5551 (or 5552 for the second vm). The second interface will be used for RDMA traffic; here we use qemu’s ability to create a network over a UDP multicast socket.
for i in {1..2}
do
    kvm -name vm${i} -m 1G \
        -netdev user,hostfwd=tcp::555${i}-:22,id=net0 \
        -device virtio-net-pci,netdev=net0 \
        -netdev socket,mcast=230.0.0.1:1234,id=net1 \
        -device virtio-net-pci,mac=52:54:00:12:34:0${i},netdev=net1 \
        -drive if=virtio,file=vm${i}.qcow2,cache=unsafe &
done
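Both guests boot in the background. A quick way to check that their ssh ports are answering (assuming nc is installed on the host):
$ for i in 1 2; do nc -z localhost 555${i} && echo vm${i} is up; done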
Now log in to each of the machines and set the hostname and the IP address of the RDMA NIC:
- vm1
ssh -p 5551 localhost
vm1 $ sudo bash -c 'echo 127.0.0.1 vm1 >> /etc/hosts'
vm1 $ sudo hostnamectl set-hostname vm1
vm1 $ sudo su
vm1 $ cat << EOF >> /etc/network/interfaces
auto ens4
iface ens4 inet static
address 10.0.0.1
netmask 255.255.255.0
EOF
vm1 $ ifup ens4
vm1 $ exit
vm1 $ exit
- vm2
ssh -p 5552 localhost
vm2 $ sudo bash -c 'echo 127.0.0.1 vm2 >> /etc/hosts'
vm2 $ sudo hostnamectl set-hostname vm2
vm2 $ sudo su
vm2 $ cat << EOF >> /etc/network/interfaces
auto ens4
iface ens4 inet static
address 10.0.0.2
netmask 255.255.255.0
EOF
vm2 $ ifup ens4
vm2 $ exit
vm2 $ exit
Make sure that your host firewall is not blocking the UDP multicast traffic on port 1234:
$ sudo iptables -A INPUT -p udp --dport 1234 -j ACCEPT
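To confirm the rule is in place:
$ sudo iptables -L INPUT -n | grep 1234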
Check IP connectivity
vm1 $ ping 10.0.0.2
PING 10.0.0.2 (10.0.0.2) 56(84) bytes of data.
64 bytes from 10.0.0.2: icmp_seq=1 ttl=64 time=0.671 ms
64 bytes from 10.0.0.2: icmp_seq=2 ttl=64 time=0.776 ms
Check RDMA connectivity
- vm1
vm1 $ sudo modprobe -a siw rdma_ucm
vm1 $ rping -s -a 10.0.0.1 -v
server ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
server ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
server DISCONNECT EVENT...
wait for RDMA_READ_ADV state 10
- vm2
vm2 $ sudo modprobe -a siw rdma_ucm
vm2 $ rping -c -a 10.0.0.1 -C 2 -v
ping data: rdma-ping-0: ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqr
ping data: rdma-ping-1: BCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrs
Check RDMA bandwidth and latency
Bandwidth
- vm1
vm1 $ ibv_devices
device node GUID
------ ----------------
siw_ens3 5254001234560000
siw_lo 7369775f6c6f0000
siw_ens4 5254001234010000
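SoftiWARP exposes one RDMA device per network interface; ibv_devinfo (also from ibverbs-utils) can be used to inspect the device backing our RDMA network:
vm1 $ ibv_devinfo -d siw_ens4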
vm1 $ sudo su
vm1 $ ulimit -l unlimited
vm1 $ ib_write_bw -R -d siw_ens4 -i 1 -D 10 -F
- vm2
vm2 $ sudo su
vm2 $ ulimit -l unlimited
vm2 $ ib_write_bw -R -d siw_ens4 -i 1 -D 10 -F 10.0.0.1
...
#bytes #iterations BW peak[MB/sec] BW average[MB/sec] MsgRate[Mpps]
65536 3200 0.00 33.33 0.000533
The numbers are modest, but remember that this is a fully virtualized setup: traffic goes through emulated NICs and a software RDMA stack, so poor performance is expected.
Latency
- vm1
vm1 $ ib_write_lat -R -d siw_ens4 -i 1 -D 10 -F
- vm2
vm2 $ ib_write_lat -R -d siw_ens4 -i 1 -D 10 -F 10.0.0.1
...
#bytes #iterations t_avg[usec]
2 4316 695.11
The latency numbers are equally poor, again due to the nature of our virtual setup and the use of SoftiWARP instead of real RDMA-capable hardware.
Summary
We now have a working setup of two VMs that can communicate using RDMA. In the next posts we are going to explore more interesting parts of the RDMA stack.