AN!Cluster Tutorial 2 - Performance Tuning

Warning: This is little more than my raw notes. I plan to clean this up and turn it into a better tutorial later.

Now that you've built your Anvil!, you might want to tune it. How you tune it will depend largely on your anticipated work load.

Note: The optimal/most realistic test is: drbdadm adjust all; sync; dd if=/dev/zero of=/dev/drbd0 bs=4M count=80000 conv=fdatasync oflag=direct,dsync; sync. The tests below need to be re-run with it; for now, I am leaving the old numbers here until I have new ones.

Tuning For Maximum Sequential Write Performance

=====================

All tests below (until noted otherwise) use back-to-back Intel 82599ES controllers
with 3m active twinax SFP+ cables. The MTU is set to 9126 (and verified with do-not-fragment pings).
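
For reference, a quick way to re-verify the MTU end to end (the peer IP is a placeholder; 9098 = 9126 minus 28 bytes of IP and ICMP headers):

# ping -c 3 -M do -s 9098 10.10.10.2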

All tests were done using the ixgbe driver from ELRepo, version 3.18.7:
# modinfo ixgbe | grep version
version:        3.18.7
srcversion:     B03433222E04D753B357EFB
vermagic:       2.6.32-358.el6.x86_64 SMP mod_unload modversions 

Node hardware is:
- 2x Fujitsu RX200 S8
  - 2x Xeon E5-2637 v2 (3.5 GHz, 4c/8t)
  - 64 GB of usable RAM (128 GB installed, running in mirrored mode)
  - 8x 146 GB 15krpm SAS drives on a D3116C (LSI 2208) controller w/ 1 GB of FBWC, in RAID level 5
  - 2x Intel 82599ES dual-port 10 Gbps (3x recommended for production)
- 2x Brocade ICX6610 switches, stacked according to https://alteeve.ca/w/Brocade_Notes
- HA Environment built following https://alteeve.ca/w/AN!Cluster_Tutorial_2

=====================

irqbalance running, no IRQ affinity set
 - 297527148544 bytes (298 GB) copied, 537.226 s, 554 MB/s
 - 294935519232 bytes (295 GB) copied, 545.153 s, 541 MB/s

irqbalance stopped, no IRQ affinity set
 - 281672679424 bytes (282 GB) copied, 542.691 s, 519 MB/s
 - 282184634368 bytes (282 GB) copied, 543.16 s, 520 MB/s

irqbalance running, './set_irq_affinity eth1' (script from Intel's ixgbe driver source)
 - 290241642496 bytes (290 GB) copied, 543.348 s, 534 MB/s
 - 289436336128 bytes (289 GB) copied, 542.737 s, 533 MB/s

irqbalance stopped, './set_irq_affinity eth1' (script from Intel's ixgbe driver source)
 - 291361521664 bytes (291 GB) copied, 542.621 s, 537 MB/s
 - 293686243328 bytes (294 GB) copied, 547.416 s, 536 MB/s

Decision: leave irqbalance enabled, with no affinity set.
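
For future reference, a minimal sketch of checking and pinning a queue IRQ by hand (the IRQ number and CPU mask are examples only, and irqbalance must be stopped first or it will rewrite the mask):

# grep eth1 /proc/interrupts
# echo 4 > /proc/irq/74/smp_affinity     (pins example IRQ 74 to CPU 2; the value is a hex CPU bitmask)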

=====================

Now using: 'drbdadm adjust all; sync; dd if=/dev/zero of=/dev/drbd0 bs=4M count=80000; sync' to get consistent run times

Stock sysctl
 - 335544320000 bytes (336 GB) copied, 663.575 s, 506 MB/s
 - 335544320000 bytes (336 GB) copied, 671.298 s, 500 MB/s

Using AndreasK's[1] sysctl values
 - 335544320000 bytes (336 GB) copied, 669.496 s, 501 MB/s
 - 335544320000 bytes (336 GB) copied, 670.589 s, 500 MB/s

Decision: these values alone do not deliver enough benefit to be worth setting. Worth investigating other values later.
 
=====================

No DRBD net { } tuning
- 335544320000 bytes (336 GB) copied, 639.747 s, 524 MB/s
- 335544320000 bytes (336 GB) copied, 639.603 s, 525 MB/s
- 335544320000 bytes (336 GB) copied, 645.19 s, 520 MB/s

Setting net { max-buffers 8000; max-epoch-size 8000; }
- 335544320000 bytes (336 GB) copied, 671.794 s, 499 MB/s
- 335544320000 bytes (336 GB) copied, 668.843 s, 502 MB/s
- 335544320000 bytes (336 GB) copied, 674.815 s, 497 MB/s

Setting net { max-buffers 131072; max-epoch-size 20000; }
- 335544320000 bytes (336 GB) copied, 674.334 s, 498 MB/s
- 335544320000 bytes (336 GB) copied, 668.228 s, 502 MB/s
- 335544320000 bytes (336 GB) copied, 674.852 s, 497 MB/s

Setting net { max-buffers 65536; max-epoch-size 10000; }
- 335544320000 bytes (336 GB) copied, 676.738 s, 496 MB/s
- 335544320000 bytes (336 GB) copied, 676.847 s, 496 MB/s
- 335544320000 bytes (336 GB) copied, 675.442 s, 497 MB/s

Setting net { max-buffers 4096; max-epoch-size 625; }
- 335544320000 bytes (336 GB) copied, 612.377 s, 548 MB/s
- 335544320000 bytes (336 GB) copied, 614.833 s, 546 MB/s
- 335544320000 bytes (336 GB) copied, 618.547 s, 542 MB/s

Setting net { max-buffers 2048; max-epoch-size 312; }
- 335544320000 bytes (336 GB) copied, 530.02 s, 633 MB/s
- 335544320000 bytes (336 GB) copied, 531.918 s, 631 MB/s
- 335544320000 bytes (336 GB) copied, 536.644 s, 625 MB/s

Setting net { max-buffers 1024; max-epoch-size 156; }
- 335544320000 bytes (336 GB) copied, 426.228 s, 787 MB/s
- 335544320000 bytes (336 GB) copied, 437.364 s, 767 MB/s
- 335544320000 bytes (336 GB) copied, 439.179 s, 764 MB/s

Setting net { max-buffers 512; max-epoch-size 75; }
- 335544320000 bytes (336 GB) copied, 487.553 s, 688 MB/s
- 335544320000 bytes (336 GB) copied, 488.904 s, 686 MB/s
- 335544320000 bytes (336 GB) copied, 494.6 s, 678 MB/s

Setting net { max-buffers 1024; max-epoch-size 150; }
- 335544320000 bytes (336 GB) copied, 460.909 s, 728 MB/s
- 335544320000 bytes (336 GB) copied, 456.231 s, 735 MB/s
- 335544320000 bytes (336 GB) copied, 461.91 s, 726 MB/s

Setting net { max-buffers 1024; max-epoch-size 156; }
- 335544320000 bytes (336 GB) copied, 440.039 s, 763 MB/s
- 335544320000 bytes (336 GB) copied, 435.829 s, 770 MB/s
- 335544320000 bytes (336 GB) copied, 434.672 s, 772 MB/s

Setting net { max-buffers 1024; max-epoch-size 165; }
- 335544320000 bytes (336 GB) copied, 466.579 s, 719 MB/s
- 335544320000 bytes (336 GB) copied, 456.53 s, 735 MB/s
- 335544320000 bytes (336 GB) copied, 464.027 s, 723 MB/s

Setting net { max-buffers 1024; max-epoch-size 160; }
- 335544320000 bytes (336 GB) copied, 455.77 s, 736 MB/s
- 335544320000 bytes (336 GB) copied, 449.779 s, 746 MB/s
- 335544320000 bytes (336 GB) copied, 454.109 s, 739 MB/s

Setting net { max-buffers 1024; max-epoch-size 158; }
- 335544320000 bytes (336 GB) copied, 467.917 s, 717 MB/s
- 335544320000 bytes (336 GB) copied, 470.458 s, 713 MB/s
- 335544320000 bytes (336 GB) copied, 453.925 s, 739 MB/s

Setting net { max-buffers 1024; max-epoch-size 157; }
- 335544320000 bytes (336 GB) copied, 445.643 s, 753 MB/s
- 335544320000 bytes (336 GB) copied, 442.671 s, 758 MB/s
- 335544320000 bytes (336 GB) copied, 439.446 s, 764 MB/s

Setting net { max-buffers 1024; max-epoch-size 155; }
- 335544320000 bytes (336 GB) copied, 460.006 s, 729 MB/s
- 335544320000 bytes (336 GB) copied, 459.722 s, 730 MB/s
- 335544320000 bytes (336 GB) copied, 457.967 s, 733 MB/s

Setting net { max-buffers 1024; max-epoch-size 156; }
- 335544320000 bytes (336 GB) copied, 437.448 s, 767 MB/s
- 335544320000 bytes (336 GB) copied, 429.44 s, 781 MB/s
- 335544320000 bytes (336 GB) copied, 428.458 s, 783 MB/s

Setting net { max-buffers 2048; max-epoch-size 156; }
- 335544320000 bytes (336 GB) copied, 439.519 s, 763 MB/s
- 335544320000 bytes (336 GB) copied, 436.31 s, 769 MB/s
- 335544320000 bytes (336 GB) copied, 430.178 s, 780 MB/s

Setting net { max-buffers 2048; max-epoch-size 312; }
- 335544320000 bytes (336 GB) copied, 490.959 s, 683 MB/s
- 335544320000 bytes (336 GB) copied, 484.596 s, 692 MB/s
- 335544320000 bytes (336 GB) copied, 481.975 s, 696 MB/s

Setting net { max-buffers 512; max-epoch-size 156; }
- 335544320000 bytes (336 GB) copied, 499.501 s, 672 MB/s
- 335544320000 bytes (336 GB) copied, 501.526 s, 669 MB/s
- 335544320000 bytes (336 GB) copied, 494.954 s, 678 MB/s

Result: use 'net { max-buffers 1024; max-epoch-size 156; }'
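
In drbd.conf, these land in the resource's net section; a minimal sketch (the resource name 'r0' is illustrative):

resource r0 {
        net {
                max-buffers     1024;
                max-epoch-size  156;
        }
}

The change can be pushed into a running resource with 'drbdadm adjust r0' (or 'drbdadm adjust all', as in the test command above).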

=====================

Setting net { sndbuf-size 0; }
- 335544320000 bytes (336 GB) copied, 452.487 s, 742 MB/s
- 335544320000 bytes (336 GB) copied, 438.927 s, 764 MB/s
- 335544320000 bytes (336 GB) copied, 445.601 s, 753 MB/s

Setting net { sndbuf-size 512k; }
- 335544320000 bytes (336 GB) copied, 467.295 s, 718 MB/s
- 335544320000 bytes (336 GB) copied, 446.848 s, 751 MB/s
- 335544320000 bytes (336 GB) copied, 436.689 s, 768 MB/s

Setting net { sndbuf-size 1024k; }
- 335544320000 bytes (336 GB) copied, 451.897 s, 743 MB/s
- 335544320000 bytes (336 GB) copied, 447.314 s, 750 MB/s
- 335544320000 bytes (336 GB) copied, 447.185 s, 750 MB/s

Setting net { sndbuf-size 2048k; }
- 335544320000 bytes (336 GB) copied, 445.728 s, 753 MB/s
- 335544320000 bytes (336 GB) copied, 441.118 s, 761 MB/s
- 335544320000 bytes (336 GB) copied, 447.029 s, 751 MB/s

Setting net { sndbuf-size 256k; }
- 335544320000 bytes (336 GB) copied, 444.423 s, 755 MB/s
- 335544320000 bytes (336 GB) copied, 434.528 s, 772 MB/s
- 335544320000 bytes (336 GB) copied, 436.324 s, 769 MB/s

Setting net { sndbuf-size 128k; }
- 335544320000 bytes (336 GB) copied, 710.52 s, 472 MB/s
- 335544320000 bytes (336 GB) copied, 706.443 s, 475 MB/s
- 335544320000 bytes (336 GB) copied, 709.618 s, 473 MB/s

Setting net { sndbuf-size 4096k; }
- 335544320000 bytes (336 GB) copied, 443.138 s, 757 MB/s
- 335544320000 bytes (336 GB) copied, 438.445 s, 765 MB/s
- 335544320000 bytes (336 GB) copied, 439.949 s, 763 MB/s

Setting net { sndbuf-size 2048k; rcvbuf-size 2048k; }
- 335544320000 bytes (336 GB) copied, 454.607 s, 738 MB/s
- 335544320000 bytes (336 GB) copied, 424.698 s, 790 MB/s
- 335544320000 bytes (336 GB) copied, 428.138 s, 784 MB/s

Setting net { sndbuf-size 4096k; rcvbuf-size 4096k; }
- 335544320000 bytes (336 GB) copied, 446.872 s, 751 MB/s
- 335544320000 bytes (336 GB) copied, 438.232 s, 766 MB/s
- 335544320000 bytes (336 GB) copied, 447.842 s, 749 MB/s

Setting net { sndbuf-size 1024k; rcvbuf-size 2048k; }
- 335544320000 bytes (336 GB) copied, 413.269 s, 812 MB/s
- 335544320000 bytes (336 GB) copied, 404.55 s, 829 MB/s
- 335544320000 bytes (336 GB) copied, 408.739 s, 821 MB/s

Setting net { sndbuf-size 2048k; rcvbuf-size 4096k; }
- 335544320000 bytes (336 GB) copied, 444.51 s, 755 MB/s
- 335544320000 bytes (336 GB) copied, 444.578 s, 755 MB/s
- 335544320000 bytes (336 GB) copied, 448.775 s, 748 MB/s

Setting net { sndbuf-size 512k; rcvbuf-size 1024k; }
- 335544320000 bytes (336 GB) copied, 420.52 s, 798 MB/s
- 335544320000 bytes (336 GB) copied, 412.969 s, 813 MB/s
- 335544320000 bytes (336 GB) copied, 414.31 s, 810 MB/s

Setting net { sndbuf-size 512k; rcvbuf-size 2048k; }
- 335544320000 bytes (336 GB) copied, 415.999 s, 807 MB/s
- 335544320000 bytes (336 GB) copied, 419.996 s, 799 MB/s
- 335544320000 bytes (336 GB) copied, 410.118 s, 818 MB/s

Setting net { sndbuf-size 2048k; rcvbuf-size 1024k; }
- 335544320000 bytes (336 GB) copied, 418.846 s, 801 MB/s
- 335544320000 bytes (336 GB) copied, 412.257 s, 814 MB/s
- 335544320000 bytes (336 GB) copied, 417.106 s, 804 MB/s

Setting net { sndbuf-size 1024k; rcvbuf-size 2048k; } (possibly forgot to restart DRBD)
- 335544320000 bytes (336 GB) copied, 424.612 s, 790 MB/s
- 335544320000 bytes (336 GB) copied, 415.264 s, 808 MB/s
- 335544320000 bytes (336 GB) copied, 414.391 s, 810 MB/s

Setting net { sndbuf-size 1024k; rcvbuf-size 2048k; }
- 335544320000 bytes (336 GB) copied, 420.197 s, 799 MB/s
- 335544320000 bytes (336 GB) copied, 414.892 s, 809 MB/s
- 335544320000 bytes (336 GB) copied, 407.837 s, 823 MB/s

Decision: 'net { sndbuf-size 1024k; rcvbuf-size 2048k; }'
 *NOTE*: This comes with a slight risk that up to 1 MiB of in-flight data could be lost if the source node dies, so it is
         only acceptable in restricted cases. In all other cases, either 'net { sndbuf-size 2048k; rcvbuf-size 2048k; }' or
         leaving these values at their defaults is best.
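
Pulling the decisions so far together, the net section now reads (sketch):

        net {
                max-buffers     1024;
                max-epoch-size  156;
                sndbuf-size     1024k;
                rcvbuf-size     2048k;
        }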

=====================

This tests adjusting how DRBD handles write-after-write dependencies. Check the current setting with:
# cat /proc/drbd |grep wo |awk '{print $12}'
- b (barrier)
    This requires that the driver of the backing storage device support barriers (called 'tagged command queuing' in SCSI and 
    'native command queuing' in SATA speak). The use of this method can be disabled with the --no-disk-barrier option. 
- f (flush)
    This requires that the backing device support disk flushes (called 'force unit access' in drive-vendor speak). The use
    of this method can be disabled with the --no-disk-flushes option. 
- d (drain) - ONLY SAFE ON CONTROLLERS WITH BBWC/FBWC
    This simply lets write requests drain before write requests of a new reordering domain are issued. This was the
    only implementation before DRBD 8.0.9. The use of this method can be disabled with the --no-disk-drain option. 
- n (none) - ONLY SAFE ON CONTROLLERS WITH BBWC/FBWC
    This does not express write-after-write dependencies to the backing store at all. 

All previous tests were run with 'flush'.

Setting "drain" (disk { no-disk-barrier; no-disk-flushes; }) alone, not on MD
- 335544320000 bytes (336 GB) copied, 424.131 s, 791 MB/s
- 335544320000 bytes (336 GB) copied, 417.188 s, 804 MB/s
- 335544320000 bytes (336 GB) copied, 417.087 s, 804 MB/s

Setting "drain" (disk { no-disk-barrier; no-disk-flushes; no-md-flushes; }), setting MD to "drain"
- 335544320000 bytes (336 GB) copied, 419.565 s, 800 MB/s
- 335544320000 bytes (336 GB) copied, 418.416 s, 802 MB/s
- 335544320000 bytes (336 GB) copied, 419.97 s, 799 MB/s

Setting "none" (disk { no-disk-barrier; no-disk-flushes; no-disk-drain; no-md-flushes; }), setting MD to "drain"
- 335544320000 bytes (336 GB) copied, 556.246 s, 603 MB/s
- 335544320000 bytes (336 GB) copied, 534.294 s, 628 MB/s
- 335544320000 bytes (336 GB) copied, 553.462 s, 606 MB/s

Setting "none" (disk { no-disk-barrier; no-disk-flushes; no-disk-drain; }), alone, MD to "flush"
- 335544320000 bytes (336 GB) copied, 557.405 s, 602 MB/s
- 335544320000 bytes (336 GB) copied, 547.835 s, 612 MB/s
- 335544320000 bytes (336 GB) copied, 553.121 s, 607 MB/s

Setting "drain" (disk { no-disk-barrier; no-disk-flushes; }) alone, not on MD
- 335544320000 bytes (336 GB) copied, 416.224 s, 806 MB/s
- 335544320000 bytes (336 GB) copied, 415.484 s, 808 MB/s
- 335544320000 bytes (336 GB) copied, 417.682 s, 803 MB/s

Decision: leave these out; the default values are fine.

=====================

Testing 'net { no-tcp-cork; }'
- 335544320000 bytes (336 GB) copied, 418.392 s, 802 MB/s
- 335544320000 bytes (336 GB) copied, 415.665 s, 807 MB/s
- 335544320000 bytes (336 GB) copied, 413.712 s, 811 MB/s

Decision: no difference; leave it out.

=====================

Testing 'net { unplug-watermark 1024; }'
- 335544320000 bytes (336 GB) copied, 409.745 s, 819 MB/s
- 335544320000 bytes (336 GB) copied, 409.444 s, 820 MB/s
- 335544320000 bytes (336 GB) copied, 400.931 s, 837 MB/s

Testing 'net { unplug-watermark 16; }'
- 122838581248 bytes (123 GB) copied, 278.723 s, 441 MB/s (aborted, was running under 500 MB/sec)

Testing 'net { unplug-watermark 131072; }'
- 335544320000 bytes (336 GB) copied, 413.226 s, 812 MB/s
- 335544320000 bytes (336 GB) copied, 412.276 s, 814 MB/s
- 335544320000 bytes (336 GB) copied, 409.347 s, 820 MB/s

Decision: for now, set it to the same value as 'max-buffers', but a wider range of testing is needed later.
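
That is, for now (a sketch, matching the 'max-buffers 1024' chosen above):

        net { unplug-watermark 1024; }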

=====================

Enabling mode=1 bonding:

First round of testing was done after bringing NICs up and down, changing IPs, etc. The results were discouraging:
- 335544320000 bytes (336 GB) copied, 513.835 s, 653 MB/s
- 335544320000 bytes (336 GB) copied, 509.762 s, 658 MB/s
- 335544320000 bytes (336 GB) copied, 504.624 s, 665 MB/s

Decided to reboot to get a fresh setup on the bond:
- 335544320000 bytes (336 GB) copied, 381.467 s, 880 MB/s
- 335544320000 bytes (336 GB) copied, 367.688 s, 913 MB/s
- 335544320000 bytes (336 GB) copied, 371.974 s, 902 MB/s
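
(For reference, the bond here is the standard mode=1 / active-backup setup from the main tutorial. A minimal RHEL 6 sketch of such a bond config follows; the device names and option values are illustrative, not the exact files used:)

# /etc/sysconfig/network-scripts/ifcfg-bond1
DEVICE=bond1
BOOTPROTO=none
ONBOOT=yes
MTU=9126
BONDING_OPTS="mode=1 miimon=100 use_carrier=1"

(The slave NICs point back at it with MASTER=bond1 and SLAVE=yes in their own ifcfg files.)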

=====================

Returning to stock ixgbe driver:

[root@an-c07n01 ~]# modinfo ixgbe | grep version
version:        3.15.1-k
srcversion:     A333AC564E95CA461F3205A
vermagic:       2.6.32-431.1.2.el6.x86_64 SMP mod_unload modversions 

- 335544320000 bytes (336 GB) copied, 467.843 s, 717 MB/s
- 335544320000 bytes (336 GB) copied, 448.798 s, 748 MB/s
- 335544320000 bytes (336 GB) copied, 464.088 s, 723 MB/s

Returning to the ELRepo driver:

[root@an-c07n01 ~]# modinfo ixgbe | grep version
version:        3.18.7
srcversion:     B03433222E04D753B357EFB
vermagic:       2.6.32-358.el6.x86_64 SMP mod_unload modversions 

- 335544320000 bytes (336 GB) copied, 453.439 s, 740 MB/s
- 335544320000 bytes (336 GB) copied, 456.027 s, 736 MB/s
- 335544320000 bytes (336 GB) copied, 476.184 s, 705 MB/s

Removed the bond; testing eth1 -> eth1 directly.

- 335544320000 bytes (336 GB) copied, 370.527 s, 906 MB/s

Switched to eth4 -> eth4

- 335544320000 bytes (336 GB) copied, 375.535 s, 894 MB/s

Back to bond1:

- 335544320000 bytes (336 GB) copied, 364.506 s, 921 MB/s
- 335544320000 bytes (336 GB) copied, 361.989 s, 927 MB/s
- 335544320000 bytes (336 GB) copied, 364.65 s, 920 MB/s

Reboot to retest.

(Performance stopped being deterministic for reasons as yet unknown, but seemingly related to bonding. I saw
fluctuations as low as 650 MB/s, averaging around 740 MB/s. Decided to move on for now, but it is worth
spending time on bonding tuning later.)

=====================

Testing clvmd/LVM overhead:

* Clustered LVM seems to add no discernible overhead. (Ran the same 'dd' test directly
against the clustered LV; see the command sketch below the results.)
- 335544320000 bytes (336 GB) copied, 399.889 s, 839 MB/s
- 335544320000 bytes (336 GB) copied, 417.482 s, 804 MB/s
- 335544320000 bytes (336 GB) copied, 411.391 s, 816 MB/s
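
For reference, the command was of the same form as before, pointed at the LV (the VG/LV names here are hypothetical):

drbdadm adjust all; sync; dd if=/dev/zero of=/dev/an-c07n01_vg0/test_lv bs=4M count=80000; sync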

=====================

Inside the VM! (48 GB of RAM, 4x vCPUs, 500 MB /boot, 40 GB /, 4 GB swap, RHEL 6 minimal, no selinux/iptables)
Write to raw partition using: sync; dd if=/dev/zero of=/dev/vda5 bs=4M count=80000; sync 
- 335544320000 bytes (336 GB) copied, 302.885 s, 1.1 GB/s
- 335544320000 bytes (336 GB) copied, 293.905 s, 1.1 GB/s
- 335544320000 bytes (336 GB) copied, 289.298 s, 1.2 GB/s

Write to ext4 partition using: sync; dd if=/dev/zero of=/mnt/data/zeros.out bs=4M count=80000; sync 
- 335544320000 bytes (336 GB) copied, 290.855 s, 1.2 GB/s
- 335544320000 bytes (336 GB) copied, 347.901 s, 964 MB/s
- 335544320000 bytes (336 GB) copied, 317.877 s, 1.1 GB/s

LUKS partition details (passphrase: 'supersecret'):
[root@vm01-ng-rhel01 ~]# cryptsetup -v status vda5
/dev/mapper/vda5 is active.
  type:  LUKS1
  cipher:  aes-cbc-essiv:sha256
  keysize: 256 bits
  device:  /dev/vda5
  offset:  4096 sectors
  size:    1729118208 sectors
  mode:    read/write
Command successful.

Write to raw LUKS partition using: sync; dd if=/dev/zero of=/dev/mapper/vda5 bs=4M count=80000; sync 
- 335544320000 bytes (336 GB) copied, 1157.27 s, 290 MB/s
  - LUKS gained SMP support in kernel 2.6.37, so RHEL 7 might be a good test
  - These CPUs support AES-NI; I am not sure how efficiently that is exposed to the VM

Decision: at this time, LUKS is not viable. If RHEL 7 is an option, we can re-test threaded performance.
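
If this is revisited, it is worth first confirming the guest actually sees AES-NI before blaming single-threaded dm-crypt. A quick check from inside the VM:

# grep -c aes /proc/cpuinfo      (a non-zero count means the 'aes' CPU flag is exposed to the guest)
# lsmod | grep aesni             (shows whether the accelerated aesni_intel module is loaded)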

Final testing:

Write to raw partition using: sync; dd if=/dev/zero of=/dev/vda5 bs=4M count=80000; sync 

Below shows the performance with the server being live-migrated during the write test (trimmed the 'records {in,out}' lines):
----
300668682240 bytes (301 GB) copied, 248.127 s, 1.2 GB/s
369392353280 bytes (369 GB) copied, 308.636 s, 1.2 GB/s
401281646592 bytes (401 GB) copied, 368.621 s, 1.1 GB/s
443954495488 bytes (444 GB) copied, 429.036 s, 1.0 GB/s
507187822592 bytes (507 GB) copied, 489.653 s, 1.0 GB/s
576528056320 bytes (577 GB) copied, 550.282 s, 1.0 GB/s
645255921664 bytes (645 GB) copied, 610.992 s, 1.1 GB/s
712578695168 bytes (713 GB) copied, 671.673 s, 1.1 GB/s
778806755328 bytes (779 GB) copied, 732.41 s, 1.1 GB/s
842757308416 bytes (843 GB) copied, 793.103 s, 1.1 GB/s
----

Standard tests, writing to all the free space on the partition:
- 885310619648 bytes (885 GB) copied, 768.78 s, 1.2 GB/s
- 885310619648 bytes (885 GB) copied, 770.564 s, 1.1 GB/s
- 885310619648 bytes (885 GB) copied, 768.709 s, 1.2 GB/s

=====================

1. AndreasK's sysctl values:
sysctl -w net.core.netdev_max_backlog="300000"
sysctl -w net.core.rmem_max="20971520"
sysctl -w net.core.wmem_max="20971520"
sysctl -w net.ipv4.tcp_rmem="2097152 20971520 20971520"
sysctl -w net.ipv4.tcp_sack="0"
sysctl -w net.ipv4.tcp_timestamps="0"
sysctl -w net.ipv4.tcp_wmem="2097152 20971520 20971520"

   Stock/default sysctl values:
sysctl -w net.core.netdev_max_backlog="1000"
sysctl -w net.core.rmem_max="124928"
sysctl -w net.core.wmem_max="124928"
sysctl -w net.ipv4.tcp_rmem="4096 87380 4194304"
sysctl -w net.ipv4.tcp_sack="1"
sysctl -w net.ipv4.tcp_timestamps="1"
sysctl -w net.ipv4.tcp_wmem="4096 16384 4194304"

FS Tuning

All tests use:

dd if=/dev/zero of=/bulk/zero bs=4M count=50000 conv=fdatasync oflag=direct,dsync

Basics

960 MB/sec raw to /dev/vda5
815 MB/sec to ext4 on /dev/vda5
897 MB/sec to xfs on /dev/vda5

ext4 tuning

815 MB/sec to ext4 on /dev/vda5 with all defaults

cat /sys/block/vda/queue/scheduler
noop anticipatory deadline [cfq] 
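
The elevator is another knob that could be tested here; it can be switched at runtime (non-persistent). 'deadline' below is just an example, not a tested result:

# echo deadline > /sys/block/vda/queue/scheduler
# cat /sys/block/vda/queue/scheduler
noop anticipatory [deadline] cfq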

mount /dev/vda5 /bulk -o commit=30
209715200000 bytes (210 GB) copied, 252.504 s, 831 MB/s

mount /dev/vda5 /bulk -o commit=90
209712062464 bytes (210 GB) copied, 367.739 s, 570 MB/s

mount /dev/vda5 /bulk -o commit=30
209715200000 bytes (210 GB) copied, 256.487 s, 818 MB/s

mount /dev/vda5 /bulk -o commit=15
209715200000 bytes (210 GB) copied, 258.536 s, 811 MB/s

mount /dev/vda5 /bulk -o commit=30
209715200000 bytes (210 GB) copied, 256.167 s, 819 MB/s

mount /dev/vda5 /bulk
209715200000 bytes (210 GB) copied, 256.516 s, 818 MB/s

Decision: 'commit=X' is not worth setting.

mount /dev/vda5 /bulk -o barrier=0
209715200000 bytes (210 GB) copied, 251.55 s, 834 MB/s

Retrying with DRBD set to:
       disk {
                fencing                 resource-and-stonith;
                no-disk-barrier;
                no-disk-flushes;
        }
209715200000 bytes (210 GB) copied, 259.52 s, 808 MB/s

Trying again with the above removed.


 
