for some years now i have been working 95% remotely, visiting the office only a few times a year. this means i want my network connection to be as stable as it can be. i do have fiber, and its availability is almost perfect, but 1-2 times a year there are moments when the internet is down… and with my luck, this can strike at a very bad moment.
to combat that i decided to get a redundant network connection. my 2nd ISP is a mobile provider. therefore i have 2 WANs: the main one via my PC-router, and a backup one via a RasPi with a USB modem attached. on the home router i have a simple script that pings a set of predefined IPs to check whether the internet connection is there or not. if the check fails on the primary link, it moves to the secondary one. it goes like this:
#!/bin/bash
set -eu -o pipefail
app=$(basename "$0")

function is_alive # <0:iface>
{
  local i=1
  local n=5
  for addr in $(sort -R /etc/startup/allowed_external_hosts.ips | head "-$n")
  do
    timeout 4 ping -c 2 -I "$1" "$addr" > /dev/null 2>&1 && return 0
    echo "$app: ping $addr via $1 failed ($i/$n)"
    ((++i))
  done
  echo "$app: looks like $1 is dead..."
  return 1
}

function set_default # <0:metric_pri> <1:metric_sec>
{
  echo "$app: setting $if_pri=$1 vs. $if_sec=$2"
  (
    set +e
    set -x
    ip r del "${rule_pri[@]}"
    ip r del "${rule_sec[@]}"
    ip r add "${rule_pri[@]}" metric "$1"
    ip r add "${rule_sec[@]}" metric "$2"
    systemctl stop openvpn
    systemctl start openvpn
    sleep 2 # give VPN some time to reconnect, so that the routing table is shown properly
    ip r
  )
  echo "$app: switchover completed"
}

function set_pri
{
  set_default 2000 3000
}

function set_sec
{
  set_default 3000 2000
}

if_pri=wan0
if_sec=wan1
rule_pri=( $(ip r | grep "^default .* $if_pri\>" | sed -e 's/ metric .*//' -e 's/ *$//') )
rule_sec=( $(ip r | grep "^default .* $if_sec\>" | sed -e 's/ metric .*//' -e 's/ *$//') )

current=unknown
trap set_pri EXIT

while sleep 1
do
  if is_alive "$if_pri"
  then
    if [ "$current" = pri ]
    then
      sleep 3
    else
      set_pri
      current=pri
    fi
    continue
  fi

  if is_alive "$if_sec"
  then
    if [ "$current" = sec ]
    then
      sleep 11
    else
      set_sec
      current=sec
    fi
    continue
  fi

  echo "$app: both interfaces seem to be dead - that's bad..."
done
the script is meant to be run as a systemd service.
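for reference, a unit file for it could look roughly like the sketch below. this is not my actual unit, just a minimal example under a few assumptions: the script path `/usr/local/bin/wan-failover` and the unit name `wan-failover.service` are made up, and the restart policy is simply one reasonable choice.

```ini
# /etc/systemd/system/wan-failover.service (hypothetical name and location)
[Unit]
Description=WAN failover: switch default route between wan0 and wan1
After=network-online.target
Wants=network-online.target

[Service]
# hypothetical install path of the script shown above
ExecStart=/usr/local/bin/wan-failover
# assumption: restart the watchdog if it ever exits
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
```

note that the script installs `trap set_pri EXIT`, so whenever the service stops, the routing is put back on the primary link.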
since pinging is possible via an explicitly-specified interface (`ping -I`), we can check if a given link is up or not, w/o changing any network settings. then, if a link switch is required, it's enough to swap the metrics of the 2 default routes (here: via wan0 and wan1) and restart the VPN, so that it connects via the new interface… and that's it. it has worked like a charm for a long time now, with fairly fast switching in case of a problem. it's also quite noise resistant, as the IPs to check are picked at random from a larger list, and up to 5 of them must fail in a row before a link is considered dead.
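to make the metric swap a bit more tangible, here is a small standalone sketch of how the script derives a "route without metric" for each interface. it uses the same grep/sed pipeline as the script, but feeds it a made-up sample of `ip r` output instead of the live routing table, so it is safe to run anywhere. the helper name `default_route_for` and all addresses in the sample are assumptions made for illustration only.

```shell
#!/bin/bash
set -eu -o pipefail

# print the default route for the given interface, with the metric stripped.
# reads `ip r`-style text from stdin (same grep/sed as in the script above).
default_route_for()
{
  grep "^default .* $1\>" | sed -e 's/ metric .*//' -e 's/ *$//'
}

# made-up sample of what `ip r` might print with both links up
sample='default via 192.0.2.1 dev wan0 proto static metric 2000
default via 198.51.100.1 dev wan1 proto static metric 3000
192.0.2.0/24 dev wan0 proto kernel scope link src 192.0.2.10'

route_pri=$(printf '%s\n' "$sample" | default_route_for wan0)
route_sec=$(printf '%s\n' "$sample" | default_route_for wan1)

# failing over to the secondary link means re-adding both routes
# with the metrics swapped (lower metric wins):
echo "ip r add $route_pri metric 3000"
echo "ip r add $route_sec metric 2000"
```

since the metric is stripped once at startup, the very same route description can later be deleted and re-added with whichever metric is needed, which is all that `set_pri` and `set_sec` do.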