Replacing Junos Pulse with OpenConnect
In an attempt to avoid using the Juniper Pulse (now Pulse Secure) VPN client we tried OpenConnect, but found that DNS did not work correctly when connected to the VPN. This bug has recently been resolved but has not made its way into a new build; in fact, there have been no releases for 6 months. Luckily, OpenConnect was not too difficult to build from source.

Build OpenConnect on OSX

Remove the old openconnect and install the build dependencies:

```shell
brew remove openconnect
brew install libxml2 lzlib openssl libtool libevent
```

Build openconnect:

```shell
wget git.infradead.org/users/dwmw2/openconnect.git/snapshot/0f1ec30d17aa674142552e275bf3fac30d891b39.tar.gz
tar zxvf 0f1ec30d17aa674142552e275bf3fac30d891b39.tar.gz
cd openconnect-0f1ec30
LIBTOOLIZE=glibtoolize ./autogen.sh
PATH=/usr/local/opt/gettext/bin:$PATH ./configure
make
make install
```

To connect:

```shell
sudo openconnect --juniper -u myusername www.myserver.com
```

If you're comfortable with allowing admin users to run openconnect without entering a sudo password, add the following using sudo visudo: ...
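A rule along these lines should do it (a sketch only; the binary path assumes the default `make install` prefix, so verify it with `which openconnect`):

```shell
# Hypothetical sudoers entry: let the admin group run openconnect without
# a password prompt. The path is an assumption - check `which openconnect`.
%admin ALL=(ALL) NOPASSWD: /usr/local/sbin/openconnect
```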
SSD Storage - Two Months In Production
Over the last two months I've been running selected IO-intensive servers off the SSD storage cluster. These hosts include (among others) our:

- Primary Puppetmaster
- Gitlab server
- Redmine app and database servers
- Nagios servers
- Several Docker database host servers

Reliability

We haven't had any software or hardware failures since commissioning the storage units. During this time we have had 3 disk failures on our HP StoreVirtual SANs that required us to call the supporting vendor and replace the failed disks. ...
OS X Software Update Channels For Betas
Set the update channel to receive developer beta updates:

```shell
sudo softwareupdate --set-catalog https://swscan.apple.com/content/catalogs/others/index-10.11seed-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz
```

Set the update channel to receive public beta updates:

```shell
sudo softwareupdate --set-catalog https://swscan.apple.com/content/catalogs/others/index-10.11beta-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz
```

List available updates:

```shell
sudo softwareupdate --list
```

Set the update channel back to the default, stable updates:

```shell
sudo softwareupdate --clear-catalog
```

Show the current settings:

```shell
defaults read /Library/Preferences/com.apple.SoftwareUpdate.plist
```

Write the setting manually:

```shell
defaults write /Library/Preferences/com.apple.SoftwareUpdate CatalogURL https://swscan.apple.com/content/catalogs/others/index-10.11beta-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz
```
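To avoid copy-pasting the long catalog URLs, the commands above can be wrapped in a small helper; a minimal sketch (the script name and variable names are my own, not part of softwareupdate):

```shell
#!/bin/sh
# update-channel.sh - hypothetical convenience wrapper around softwareupdate.
# Usage: sudo ./update-channel.sh [dev|public|stable]
SEED_URL="https://swscan.apple.com/content/catalogs/others/index-10.11seed-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz"
BETA_URL="https://swscan.apple.com/content/catalogs/others/index-10.11beta-10.11-10.10-10.9-mountainlion-lion-snowleopard-leopard.merged-1.sucatalog.gz"

case "$1" in
  dev)    softwareupdate --set-catalog "$SEED_URL" ;;
  public) softwareupdate --set-catalog "$BETA_URL" ;;
  stable) softwareupdate --clear-catalog ;;
  *)      echo "usage: $0 [dev|public|stable]" >&2; exit 1 ;;
esac

# Show the catalog we ended up on.
defaults read /Library/Preferences/com.apple.SoftwareUpdate.plist
```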
iSCSI Benchmarking
The following are benchmarks from our testing of our iSCSI SSD storage.

67,300 read IOP/s on a VM on iSCSI (Disk -> LVM -> MDADM -> DRBD -> iSCSI target -> Network -> XenServer iSCSI Client -> VM). Per VM, and scales to 1,000,000 IOP/s total:

```shell
root@dev-samm:/mnt/pmt1 128 # fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=128 --size=2G --readwrite=read
test: (g=0): rw=read, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
2.0.8
Starting 1 process
bs: 1 (f=1): [R] [55.6% done] [262.1M/0K /s] [67.3K/0 iops] [eta 00m:04s]
```

38,500 random 4k write IOP/s on a VM on iSCSI (Disk -> LVM -> MDADM -> DRBD -> iSCSI target -> Network -> XenServer iSCSI Client -> VM). Per VM, and scales to 700,000 IOP/s total:

```shell
root@dev-samm:/mnt/pmt1 # fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 --name=test --filename=test --bs=4k --iodepth=128 --size=2G --readwrite=randwrite
test: (g=0): rw=randwrite, bs=4K-4K/4K-4K, ioengine=libaio, iodepth=128
2.0.8
Starting 1 process
bs: 1 (f=1): [w] [26.3% done] [0K/150.2M /s] [0 /38.5K iops] [eta 00m:14s]
```

Raw device latency on the storage units.

Intel DC3600 1.2T PCIe NVMe:

```shell
root@s1-san6:/proc # ioping /dev/nvme0n1p1
4.0 KiB from /dev/nvme0n1p1 (device 1.1 TiB): request=1 time=104 us
4.0 KiB from /dev/nvme0n1p1 (device 1.1 TiB): request=2 time=83 us
4.0 KiB from /dev/nvme0n1p1 (device 1.1 TiB): request=3 time=51 us
4.0 KiB from /dev/nvme0n1p1 (device 1.1 TiB): request=4 time=71 us
```

SanDisk SDSSDXPS960G SATA:

```shell
root@pm-san5:/proc # ioping /dev/sdc
4.0 KiB from /dev/sdc (device 894.3 GiB): request=1 time=4.2 ms
4.0 KiB from /dev/sdc (device 894.3 GiB): request=2 time=4.1 ms
4.0 KiB from /dev/sdc (device 894.3 GiB): request=3 time=4.1 ms
4.0 KiB from /dev/sdc (device 894.3 GiB): request=4 time=4.1 ms
```

Micron_M600_MTFDDAK1T0MBF SATA:

```shell
root@pm-san5:/proc # ioping /dev/sdf
4.0 KiB from /dev/sdf (device 953.9 GiB): request=1 time=157 us
4.0 KiB from /dev/sdf (device 953.9 GiB): request=2 time=190 us
4.0 KiB from /dev/sdf (device 953.9 GiB): request=3 time=65 us
4.0 KiB from /dev/sdf (device 953.9 GiB): request=4 time=181 us
```

## Latency on a VM - (Disk -> LVM -> MDADM -> DRBD -> iSCSI target -> Network -> XenServer iSCSI Client -> VM)

```shell
root@dev-samm:/mnt 127 # ioping pmt1/
4096 bytes from pmt1/ (ext4 /dev/xvdb1): request=1 time=0.6 ms
4096 bytes from pmt1/ (ext4 /dev/xvdb1): request=2 time=0.7 ms
4096 bytes from pmt1/ (ext4 /dev/xvdb1): request=3 time=0.7 ms

--- pmt1/ (ext4 /dev/xvdb1) ioping statistics ---
3 requests completed in 2159.1 ms, 1508 iops, 5.9 mb/s
min/avg/max/mdev = 0.6/0.7/0.7/0.1 ms

root@dev-samm:/mnt # ioping pmt2/
4096 bytes from pmt2/ (ext4 /dev/xvdc1): request=1 time=0.6 ms
4096 bytes from pmt2/ (ext4 /dev/xvdc1): request=2 time=0.8 ms

--- pmt2/ (ext4 /dev/xvdc1) ioping statistics ---
2 requests completed in 1658.4 ms, 1470 iops, 5.7 mb/s
min/avg/max/mdev = 0.6/0.7/0.8/0.1 ms

root@dev-samm:/mnt # ioping pmt3/
4096 bytes from pmt3/ (ext4 /dev/xvde1): request=1 time=0.6 ms
4096 bytes from pmt3/ (ext4 /dev/xvde1): request=2 time=0.9 ms
4096 bytes from pmt3/ (ext4 /dev/xvde1): request=3 time=0.9 ms
```
...
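For anyone wanting to sanity-check figures like these on their own hardware, two follow-up commands are useful; a sketch reusing the device path from the runs above (ioping's -R flag runs a rate test rather than one request per second, and random reads against the raw device are non-destructive):

```shell
# Rate test: issue requests back to back to get an IOPS figure for the
# raw device, rather than a single-request latency.
ioping -R /dev/nvme0n1p1

# Same 4k job shape as the fio runs above, but random reads against the
# raw device instead of a file on a mounted volume.
fio --randrepeat=1 --ioengine=libaio --direct=1 --gtod_reduce=1 \
    --name=rawtest --filename=/dev/nvme0n1p1 --bs=4k --iodepth=128 \
    --size=2G --readwrite=randread
```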
Delayed Serial STONITH
A modified version of John Sutton's rcd_serial cable, coupled with our Supermicro reset switch hijacker. This works with the rcd_serial fence agent plugin.

Reasons rcd_serial makes for a very good STONITH mechanism:

- It has no dependency on power state.
- It has no dependency on network state.
- It has no dependency on node operational state.
- It has no dependency on external hardware.
- It costs less than $5 + time to build.
- It is incredibly simple and reliable.

Essentially, the most common STONITH agent type in use is probably the one that controls a UPS / PDU; while this sounds like a good idea in theory, there are a number of issues with relying on a UPS / PDU: ...
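For context, wiring rcd_serial into a Pacemaker cluster looks roughly like the following crmsh snippet. This is a sketch only: the parameter names are assumptions from memory, so confirm the real ones with `stonith -t rcd_serial -n` before using it.

```shell
# Hypothetical Pacemaker/crmsh STONITH primitive for rcd_serial.
# Parameter names are assumptions - list the real ones with:
#   stonith -t rcd_serial -n
crm configure primitive fence-node2 stonith:rcd_serial \
    params hostlist="node2" ttydev="/dev/ttyS0" msduration="1000" \
    op monitor interval="3600s"

# Never let a node run its own fencing device.
crm configure location fence-node2-not-on-node2 fence-node2 -inf: node2
```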
Video - Cluster Failover Performance Demo
CentOS 7 and HA
First some background… One of the many lessons I've learnt from my Linux HA / Storage clustering project is that the Debian HA ecosystem is essentially broken. We reached the point where packages were too old, too buggy or, in Debian 8's case, outright missing. In the past I was very disappointed with RHEL/CentOS 5 / 6 and (until now) have been quite satisfied with Debian as a stable server distribution with historically more modern packages and kernels. ...
SSD Storage Cluster - Update and Diagram
Due to several recent events beyond my control I'm a bit behind on the project - hence the lack of updates, which I apologise for. The good news is that I'm back working to finish off the clusters and I'm happy to report that all is going to plan. Here is the final diagram of the two-node cluster design: Plain text version available here. This was generated from the LCMC tool (beware - it's Java!). ...
Video - Storage Cluster Failover Demo
A brief demonstration of the failover and recovery process on the storage clusters I’ve been building.