2026-01-11 - redundant internet at home

for some years now i have been working 95% remotely, visiting the office only a few times a year. this means i want my network connection to be as stable as it can be. while i do have fiber optics, and while it's almost perfect wrt availability, 1-2 times a year there are moments when the internet is down… and with my luck, this can strike at a very bad moment.

to combat that i decided to get a redundant network connection. my 2nd ISP is a mobile provider. therefore i have 2 WANs: the main one, via my PC-router, and a secondary one, via a RasPi with a USB modem attached. on the home router i have a simple script that pings a set of predefined IPs to check if the internet connection is there or not. if the check fails, the script moves to the secondary link. it goes like this:

#!/bin/bash
set -eu -o pipefail
app=$(basename "$0")
 
function is_alive # <0:iface>
{
  local i=1
  local n=5
  for addr in $(sort -R /etc/startup/allowed_external_hosts.ips | head -n "$n")
  do
    timeout 4 ping -c 2 -I "$1" "$addr" > /dev/null 2>&1 && return 0
    echo "$app: ping $addr via $1 failed ($i/$n)"
    ((++i))
  done
  echo "$app: looks like $1 is dead..."
  return 1
}
 
function set_default # <0:metric_pri> <1:metric_sec>
{
  echo "$app: setting $if_pri=$1 vs. $if_sec=$2"
  (
    set +e
    set -x
    ip r del "${rule_pri[@]}"
    ip r del "${rule_sec[@]}"
    ip r add "${rule_pri[@]}" metric "$1"
    ip r add "${rule_sec[@]}" metric "$2"
    systemctl stop  openvpn
    systemctl start openvpn
    sleep 2 # give VPN some time to reconnect, so that the routing table is shown properly
    ip r
  )
  echo "$app: switchover completed"
}
 
function set_pri
{
  set_default 2000 3000
}
 
function set_sec
{
  set_default 3000 2000
}
 
if_pri=wan0
if_sec=wan1
 
rule_pri=( $(ip r | grep "^default .* $if_pri\>" | sed -e 's/ metric .*//' -e 's/ *$//') )
rule_sec=( $(ip r | grep "^default .* $if_sec\>" | sed -e 's/ metric .*//' -e 's/ *$//') )
 
current=unknown
 
trap set_pri EXIT
 
while sleep 1
do
  if is_alive "$if_pri"
  then
    if [ "$current" = pri ]
    then
      sleep 3
    else
      set_pri
      current=pri
    fi
    continue
  fi
 
  if is_alive "$if_sec"
  then
    if [ "$current" = sec ]
    then
      sleep 11
    else
      set_sec
      current=sec
    fi
    continue
  fi
 
  echo "$app: both interfaces seem to be dead - that's bad..."
done

the script is meant to be run as a systemd service.
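
for reference, a minimal sketch of what such a service could look like. the unit name (wan-failover.service), the install path of the script and the restart policy are my assumptions for illustration only, not the exact setup used here. the helper just writes a unit file into a given directory:

```shell
# sketch only: unit name, script path and restart settings are assumptions
function write_unit # <1:unit_dir> <2:script_path>
{
  local unit_file="$1/wan-failover.service"   # hypothetical unit name
  cat > "$unit_file" <<EOF
[Unit]
Description=switch default route between WAN links
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
ExecStart=$2
Restart=always
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF
  echo "$unit_file"   # print the path of the file that was written
}
```

it could be used as root like this: write_unit /etc/systemd/system /usr/local/sbin/wan-failover (again, a hypothetical path), followed by systemctl daemon-reload and systemctl enable --now wan-failover.service.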

since pinging is possible via an explicitly-specified interface, we can check if a given link is up or not, w/o changing any network settings. then, if a link switch is required, it's enough to swap the metrics of the 2 interfaces (here: wan0 and wan1) and restart the VPN, so that it connects via the new interface… and that's it. it has been working like a charm for a long time now, with fairly fast switching in case of a problem. it's also quite noise resistant, as many different IPs are used randomly and multiple IPs are checked before deciding to fail over.
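
the metric swap works because, among several default routes, linux prefers the one with the lowest metric. a small sketch of a helper (active_iface is a name i made up for this example, it is not part of the script above) that reads `ip r`-style output and prints the interface that currently wins:

```shell
# print the interface of the default route with the lowest metric.
# reads `ip r`-style lines on stdin; a route without a metric counts as 0.
function active_iface
{
  awk '
    $1 == "default" {
      dev = ""; metric = 0
      for (i = 2; i < NF; ++i) {
        if ($i == "dev")    dev = $(i+1)
        if ($i == "metric") metric = $(i+1)
      }
      # keep the first route seen, or any later one with a strictly lower metric
      if (dev != "" && (best == "" || metric + 0 < best_metric)) {
        best = dev; best_metric = metric + 0
      }
    }
    END { if (best != "") print best }
  '
}
```

running `ip r | active_iface` after a switchover should print wan1 while the secondary link is in use, and wan0 once the primary one is back.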