recently, when working on YALS (more info on the project here), i found myself in an odd situation. on the last day of coding SW for it, as the very last feature, i added watchdog support. the device has a real-time processing cycle of 4ms, so i decided that the watchdog should fire if we miss it 10x (i.e. after 40ms), as this should be fast enough to prevent any observable effects, yet long enough to make sure a micro glitch would not throw things off course.
after writing a small HAL, i wrote a manual test application, just to see if it worked. the app just printed how long it would sleep before resetting the watchdog… and finally, when the sleep reached 40ms, the watchdog kicked in, and the application started all over again. brain-dead simple… and a bit of a trap! now it reset all the time, w/o a way to stop it… and i was not able to flash it again, since OpenOCD flashing required more time to start the process than i had available between watchdog resets.
but wait – you may know that RP2040 is unbrickable by design, right? you can always press the boot button and upload a different *.uf2 file to unbrick it. easy as that… except that YALS, due to space constraints, does not have USB. whoopsee… so the fun began!
the first attempt was simple – just run:
while ! rp2040_openocd -c "program hello_world.elf verify reset exit"
do
  date
done
…and pray. while it was running i started to read and think about a different take on my problem.
it turned out that i was fortunate enough to have set the watchdog timeout just long enough that i finally managed to grab a window of opportunity and flash a small program. it took a while, but proved it's possible. that kept my mind at peace through the journey to find a better way out.
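the retry loop above runs forever if flashing never succeeds. a bounded variant could look like the sketch below; the retry helper and its attempt limit are made up for illustration, and the real flashing command is only shown in a comment, since it needs the hardware attached.

```shell
# retry MAX_ATTEMPTS COMMAND [ARGS...]
# runs COMMAND until it succeeds, giving up after MAX_ATTEMPTS tries.
retry() {
  max=$1
  shift
  attempt=1
  while ! "$@"; do
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after $attempt attempts" >&2
      return 1
    fi
    attempt=$((attempt + 1))
  done
  echo "succeeded on attempt $attempt"
}

# hypothetical usage with the flashing command from above:
#   retry 1000 rp2040_openocd -c "program hello_world.elf verify reset exit"
```

the limit is arbitrary; the point is only that the loop eventually reports failure instead of spinning silently.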
btw: i'm using rp2040 SDK project for all the steps here.
while the bruteforce-flashing hack turned out to be exceptionally difficult, it also turned out that it's ok to connect openocd to a running system, and not lose the connection when the watchdog resets the CPU. that was a big foot in the door. the next good news was that stopping the CPU by attaching to it also stopped the watchdog timer. the bad news was that this mode also prevented flashing…
so the next life hack was to start gdb, then load a new program into RAM (note that flash is still unchanged). the bad news was that the watchdog still kept running…
that's where rp2040 HAL sources came in handy! the watchdog is controlled by a memory-mapped register at address 0x40058000, with the enable flag at bit 30. so the way to stop the watchdog (during this run) is to open a terminal and run openocd:
rp2040_openocd
and then in another one launch gdb and type in the following commands:
target remote localhost:3333
p (*(unsigned*)0x40058000)=0
load "hello_world.elf"
continue &
what it does is: connect to the running openocd, write 0 to the watchdog register (which turns the watchdog off), load some non-watchdog-enabled program, and continue it in the background (so that gdb can be stopped).
now i was able to disconnect and use regular rp2040_flash to write a new program (w/o watchdog reset loop) into flash.
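side note on the =0 write in the gdb session: it zeroes the whole register, not just bit 30. a more surgical variant would clear only that one bit. the bit arithmetic can be sketched in plain shell; the clear_bit helper and the sample register value are made up for illustration, and nothing here is read from a real device.

```shell
# clear_bit VALUE BIT
# prints VALUE with bit number BIT cleared, as a 32-bit hex number.
clear_bit() {
  printf '0x%08x\n' $(( $1 & ~(1 << $2) & 0xffffffff ))
}

# mask for bit 30 alone
printf '0x%08x\n' $(( 1 << 30 ))   # prints 0x40000000

# made-up register value with bit 30 set; only that bit gets cleared
clear_bit 0x40ffffff 30            # prints 0x00ffffff
```

for getting out of a reset loop, writing plain 0 is simpler and does the job, which is presumably why it's good enough here.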
for a brief moment i thought this was going to be my safe-house, and wrote a helper rp2040_load that automates all of this in a single, simple script, that's just parametrized with a new *.elf file to flash.
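as a rough idea of what the core of such a helper could look like (a hypothetical sketch, not the actual rp2040_load): a function that prints the gdb commands from the session above for a given *.elf file. the gdb_commands name is made up, and the sketch assumes openocd is already running and listening on localhost:3333.

```shell
# gdb_commands ELF
# prints the gdb command sequence: connect, disable watchdog,
# load ELF into RAM and let it run in the background.
gdb_commands() {
  elf=$1
  cat <<EOF
target remote localhost:3333
p (*(unsigned*)0x40058000)=0
load "$elf"
continue &
EOF
}

gdb_commands hello_world.elf
```

the output can be saved to a file and passed to gdb with -x; which gdb binary to use depends on the toolchain, so it's left out here.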
so i had a working solution that was fast and reliable… yet it still smelled like a big hack from a mile away. can i make it less of a hack? what if openocd had sth that could help with that… turned out that it does! there's a command that can write to an arbitrary memory location… so we can now turn the watchdog off faster, with a simple:
rp2040_openocd \
  -c "init" \
  -c "rp2040.core0 mww 0x40058000 0 ; exit"
for convenience rp2040_disable_watchdog was added to the SDK, too. it's now also a default part of rp2040_flash, to make sure the watchdog never bothers a developer again.