IB tweaks Mellanox:

Default ko2iblnd.conf:

[root@pg-mds01 ~]# cat /etc/modprobe.d/ko2iblnd.conf
alias ko2iblnd-opa ko2iblnd
options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4
install ko2iblnd /usr/sbin/ko2iblnd-probe

ko2iblnd-opa is the 'wrong' interface for us: the options line applies to the ko2iblnd-opa alias (Omni-Path), but we use Mellanox cards, so the tuned values do not take effect.

To check, for example, the peer_credits setting with this configuration:

[root@pg-mds01 ~]# cat /proc/sys/lnet/peers
nid                    refs state last  max  rtr  min   tx  min queue
0@lo                      1    NA   -1    0    0    0    0    0     0
172.23.52.179@o2ib        1    NA   -1    8    8    8    8   -8     0
172.23.52.35@o2ib         1    NA   -1    8    8    8    8   -4     0
172.23.52.124@o2ib        1    NA   -1    8    8    8    8  -11     0
172.23.52.69@o2ib         1    NA   -1    8    8    8    8  -11     0
172.23.52.158@o2ib        1    NA   -1    8    8    8    8  -11     0
172.23.52.14@o2ib         1    NA   -1    8    8    8    8  -11     0
172.23.52.103@o2ib        1    NA   -1    8    8    8    8  -11     0
172.23.52.192@o2ib        1    NA   -1    8    8    8    8  -10     0
172.23.52.48@o2ib         1    NA   -1    8    8    8    8  -11     0
172.23.52.137@o2ib        1    NA   -1    8    8    8    8   -9     0
172.23.54.2@o2ib          1    NA   -1    8    8    8    8   -8     0
172.23.52.82@o2ib         1    NA   -1    8    8    8    8  -11     0
172.23.55.212@o2ib        1    NA   -1    8    8    8    8    8     0

The max column shows that peer_credits is 8 for every peer, which is the Lustre default.
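To spot this kind of mismatch without reading the file by eye, here is a minimal sketch that lists which module name each options line in a modprobe file applies to. The helper names (parse_modprobe_options, has_options_for) and the shortened DEFAULT_CONF sample are assumptions made for illustration; they are not part of Lustre or modprobe.

```python
# Sketch only: helper names and the shortened sample text are illustrative.
DEFAULT_CONF = """\
alias ko2iblnd-opa ko2iblnd
options ko2iblnd-opa peer_credits=128 peer_credits_hiw=64 credits=1024
install ko2iblnd /usr/sbin/ko2iblnd-probe
"""


def parse_modprobe_options(conf_text):
    """Return {module_name: {param: value}} for every 'options' line."""
    options = {}
    for raw_line in conf_text.splitlines():
        line = raw_line.split("#", 1)[0].strip()  # ignore comments
        fields = line.split()
        if len(fields) < 2 or fields[0] != "options":
            continue  # skip alias/install lines and blanks
        params = options.setdefault(fields[1], {})
        for item in fields[2:]:
            key, sep, value = item.partition("=")
            if sep:
                params[key] = value
    return options


def has_options_for(conf_text, module="ko2iblnd"):
    """True if at least one options line names the given module directly."""
    return module in parse_modprobe_options(conf_text)


for name, params in parse_modprobe_options(DEFAULT_CONF).items():
    print(name, params)
print("options for ko2iblnd present:", has_options_for(DEFAULT_CONF))
```

On a real node the text would come from /etc/modprobe.d/ko2iblnd.conf instead of the sample string.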
The file should look like this, with the options applied to ko2iblnd directly:

[root@dh2-mds01 ~]# cat /etc/modprobe.d/ko2iblnd.conf
options ko2iblnd peer_credits=128 peer_credits_hiw=64 credits=1024 concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4

After a reboot or a reload of the LND module you can check the peer_credits:

[root@dh2-mds01 ~]# cat /proc/sys/lnet/peers
nid                    refs state last  max  rtr  min   tx   min queue
0@lo                      1    NA   -1    0    0    0    0     0     0
172.23.53.156@o2ib2       1    NA   -1  128  128  128  128   -19     0
172.23.57.43@tcp12        1    NA   -1    8    8    8    8     6     0
172.23.53.4@o2ib2         2    NA   -1  128  128  128  127   -49   672
172.23.53.9@o2ib2         1    NA   -1  128  128  128  128   126     0
172.23.53.1@o2ib2         1    NA   -1  128  128  128  128   102     0
172.23.53.158@o2ib2       1    NA   -1  128  128  128  128 -1601     0
172.23.57.45@tcp12        1    NA   -1    8    8    8    8     6     0
172.23.53.6@o2ib2         1    NA   -1  128  128  128  128   121     0
172.23.53.155@o2ib2       1    NA   -1  128  128  128  128   -92     0
172.23.57.42@tcp12        1    NA   -1    8    8    8    8     6     0
172.23.57.47@tcp12        1    NA   -1    8    8    8    8     6     0
172.23.53.8@o2ib2         1    NA   -1  128  128  128  128   126     0
172.23.57.52@tcp12        1    NA   -1    8    8    8    8     6     0
172.23.53.157@o2ib2       1    NA   -1  128  128  128  128    32     0
172.23.57.44@tcp12        1    NA   -1    8    8    8    8     6     0
172.23.53.204@o2ib2       1    NA   -1  128  128  128  128   127     0
172.23.53.5@o2ib2         1    NA   -1  128  128  128  128   100     0
172.23.57.49@tcp12        1    NA   -1    8    8    8    8     6     0
172.23.53.10@o2ib2        1    NA   -1  128  128  128  128   126     0
172.23.57.41@tcp12        1    NA   -1    8    8    8    8     6     0
172.23.53.2@o2ib2         1    NA   -1  128  128  128  128   103     0
172.23.57.46@tcp12        1    NA   -1    8    8    8    8     6     0

On the IB interfaces (o2ib2) the peer_credits are now 128; on the tcp interfaces (tcp12), which were not changed, they are still at the default of 8.
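With many peers it can be easier to summarise the max column per LNet network than to scan the listing. The following is a rough sketch, assuming the ten-column layout shown above; the function name max_credits_by_network and the three-row SAMPLE are illustrative and not part of any Lustre tool.

```python
from collections import defaultdict

# Sketch only: a shortened sample in the same column layout as the listing.
SAMPLE = """\
nid                    refs state last  max  rtr  min   tx   min queue
0@lo                      1    NA   -1    0    0    0    0     0     0
172.23.53.156@o2ib2       1    NA   -1  128  128  128  128   -19     0
172.23.57.43@tcp12        1    NA   -1    8    8    8    8     6     0
"""


def max_credits_by_network(peers_text):
    """Map each LNet network (e.g. 'o2ib2') to the sorted 'max' values seen."""
    seen = defaultdict(set)
    for line in peers_text.splitlines():
        fields = line.split()
        # Data rows have 10 columns and a nid of the form address@network;
        # the header row has no '@' in its first column and is skipped.
        if len(fields) != 10 or "@" not in fields[0]:
            continue
        network = fields[0].rsplit("@", 1)[1]
        try:
            seen[network].add(int(fields[4]))  # fifth column is 'max'
        except ValueError:
            continue
    return {net: sorted(values) for net, values in seen.items()}


for net, values in sorted(max_credits_by_network(SAMPLE).items()):
    print(net, values)
```

On a live system the input would be the contents of /proc/sys/lnet/peers. A network that reports more than one value suggests that not every peer picked up the same setting.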
Further explanation of the settings:

peer_credits=128 - the number of concurrent sends to a single peer
peer_credits_hiw=64 - the credit high-water mark: when to eagerly return credits
credits=1024 - the number of concurrent sends (to all peers)
concurrent_sends=256 - send work-queue sizing
ntx=2048 - the number of message descriptors that are pre-allocated when the ko2iblnd module is loaded in the kernel
map_on_demand=32 - the number of noncontiguous memory regions that will be mapped into a virtual contiguous region
fmr_pool_size=2048 - the size of the Fast Memory Registration (FMR) pool (must be >= ntx/4)
fmr_flush_trigger=512 - the dirty FMR pool flush trigger
fmr_cache=1 - enable FMR caching
conns_per_peer=4 - create multiple queue pairs per peer to allow higher throughput from a single client

conns_per_peer is of most benefit to OPA interfaces, when coupled with the krcvqs parameter of the OPA hfi1 kernel driver. The hfi1 driver option krcvqs must also be set; krcvqs=4 is recommended. In some cases krcvqs=8 yields better IO performance, but this can impact other workloads, especially on clients. If queue-pair memory usage becomes excessive, reduce conns_per_peer to 2 and set krcvqs=2.

The default values used by Lustre if no parameters are given are:

peer_credits=8 peer_credits_hiw=8 concurrent_sends=8 credits=64
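The one hard constraint stated above (fmr_pool_size >= ntx/4) and the difference from the listed defaults can be checked mechanically before a reboot. This is a minimal sketch, not a Lustre tool: the function names are hypothetical, and LUSTRE_DEFAULTS contains only the four defaults given in these notes.

```python
# Sketch only: defaults copied from the notes above, helper names are made up.
LUSTRE_DEFAULTS = {
    "peer_credits": 8,
    "peer_credits_hiw": 8,
    "concurrent_sends": 8,
    "credits": 64,
}

TUNED = ("peer_credits=128 peer_credits_hiw=64 credits=1024 "
         "concurrent_sends=256 ntx=2048 map_on_demand=32 fmr_pool_size=2048 "
         "fmr_flush_trigger=512 fmr_cache=1 conns_per_peer=4")


def parse_params(options_text):
    """Turn 'a=1 b=2' into {'a': 1, 'b': 2}; non-integer items are ignored."""
    params = {}
    for item in options_text.split():
        key, sep, value = item.partition("=")
        if sep and value.lstrip("-").isdigit():
            params[key] = int(value)
    return params


def check_params(params):
    """Return a list of problems; only the fmr_pool_size rule is checked."""
    problems = []
    if "fmr_pool_size" in params and "ntx" in params:
        # fmr_pool_size >= ntx/4, written without division
        if params["fmr_pool_size"] * 4 < params["ntx"]:
            problems.append("fmr_pool_size must be >= ntx/4")
    return problems


def changed_from_defaults(params):
    """Return {name: (default, new)} for settings that differ from defaults."""
    return {
        name: (LUSTRE_DEFAULTS[name], value)
        for name, value in params.items()
        if name in LUSTRE_DEFAULTS and value != LUSTRE_DEFAULTS[name]
    }


tuned = parse_params(TUNED)
print("problems:", check_params(tuned))
print("changed:", changed_from_defaults(tuned))
```

Any further consistency rules between the credit settings would have to be added by hand; none are assumed here beyond what the notes state.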