The Robot Rises Again – Adding Vision Processing

Posted 04 April 2026

Well today I exceeded my boredom threshold, so something had to give. I worked out on my basketball court for an hour, flew a glider race in the Condor3 Soaring Flight Simulator, played an online bridge tournament on BBO, and I still hadn’t used up half the day. My workbench is empty, my two 3D printers are silent, and I can’t think of a thing to do. However, while I paced around with my hands in my pockets feeling sorry for myself, I caught sight of Wall-E3, my autonomous wall-following robot, gathering dust in its charging cradle in the corner. That robot has gone through a number of iterations, starting with this post in early 2015 and ending with this post in late 2024. The goal of this almost decade-long project was to create a robot that could autonomously navigate around our house, scaring cats and humans alike and recharging itself whenever it got hungry. It did a really good job at the scaring part, but unfortunately I never got it to the point where it wouldn’t get stuck in one of our junk-filled guest bedrooms; close, but no cigar 😒.

Looking at my robot, I realized that in the intervening decade, a lot had changed in the indoor navigation realm, in particular the ability to exploit video camera images for navigation in reasonably constrained environments. So, I decided to revisit my robot project to see if I could utilize the new technology to get closer to my goal of true autonomous navigation.

Part of my interest in starting a new run at the household navigation project has been my positive experience over the last year with Grok, X’s AI. Grok, despite some real issues that need to be kept in mind, is a GREAT research assistant. In the past, I might spend most of a week working my way through internet links and going down various rabbit holes trying to solve a subtle software or hardware problem but Grok can do the same thing in a few seconds. At this point I knew next to nothing about vision-based navigation systems, but I knew for sure that Grok would be able to distill the netiverse down to reasonably digestible chunks pretty quickly.

I started with the following prompt for Grok:

In less than thirty seconds, Grok produced the following:

Absolutely—switching to vision processing is a smart move for handling those tricky “special cases” where pure distance sensors (like your LIDAR + gyro combo) fall short. Cameras give you way more information: color, texture, edges, and even semantic understanding (e.g., “that’s a chair leg” vs. just “something 30cm away”). This helps with low obstacles, reflective surfaces, dynamic items, or cluttered home environments that confuse LIDAR.

Since you already have a solid 6DOF gyro (IMU), you can fuse it with vision for excellent Visual-Inertial Odometry (VIO) or Visual SLAM (vSLAM). This is basically what modern robot vacuums do (vSLAM), but fully customizable for your wheeled robot.

Quick Reality Check Before Diving In

Vision isn’t perfect either:

  • Sensitive to lighting changes (solution: good indoor lights or IR-capable cameras).
  • Needs decent compute (Raspberry Pi 5 or equivalent works; older Pis may struggle with heavy SLAM).
  • Depth from vision is usually “good enough” indoors but not as precise as LIDAR in perfect conditions.

Hybrid tip: You can keep your existing LIDAR/gyro as a backup while testing vision.

Recommended Hardware to Get Started (2026-Friendly Options)

You’ll replace/add a camera that provides both color and depth (RGB-D) or stereo for 3D perception. Top practical picks for small wheeled robots:

  1. Best overall: Luxonis OAK-D series (OAK-D Lite or Pro)
  2. Budget depth option: Arducam T2 RGBD ToF Camera (~$100)
  3. Classic: Intel RealSense D435/D455 (if you can find one) or similar RGB-D.

Mount the camera forward-facing (maybe add a second for 360° if budget allows). Keep your gyro—it will make odometry rock-solid when fused.

What’s your current brain? (Raspberry Pi 4/5, Jetson, Arduino + companion computer, etc.?) Budget for the camera? That’ll let me refine this exactly.

Software Stack – The Modern Way (ROS2 Recommended)

The gold standard for this is ROS 2 (Robot Operating System) + Nav2 navigation stack. It handles mapping, localization, path planning, and obstacle avoidance out of the box.

Core vision libraries:

  • OpenCV → Basic start (edge detection, simple avoidance).
  • RTAB-Map or ORB-SLAM3 → Full visual SLAM (builds a 3D map while localizing).
  • VINS-Fusion or OpenVINS → Perfect VIO that fuses your existing 6DOF gyro with camera data (highly accurate odometry).

All run great on Pi 5 with Ubuntu 24.04 + ROS 2 Jazzy (or newer).
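
Just to give a flavor of the ‘basic start’ OpenCV level from that list, here’s roughly what a first edge-detection experiment looks like in C++ (my own illustrative sketch, not something Grok supplied; the camera index and Canny thresholds are arbitrary placeholders to tune):

// Minimal OpenCV edge-detection loop: the "basic start" level from the list above.
#include <opencv2/opencv.hpp>

int main() {
    cv::VideoCapture cap(0);                  // 0 = first attached camera (assumption)
    if (!cap.isOpened()) return 1;            // bail out if no camera found
    cv::Mat frame, gray, edges;
    while (cap.read(frame)) {
        cv::cvtColor(frame, gray, cv::COLOR_BGR2GRAY);
        cv::GaussianBlur(gray, gray, cv::Size(5, 5), 1.5);  // denoise before edge detection
        cv::Canny(gray, edges, 50, 150);      // thresholds are tuning knobs
        cv::imshow("edges", edges);           // live edge view on the attached monitor
        if (cv::waitKey(1) == 27) break;      // Esc to quit
    }
    return 0;
}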

This was a lot like drinking from a firehose, but within an hour or so I had a Raspberry Pi 5 kit and two different cameras (a cheap one arriving tomorrow, and a not-so-cheap one arriving in another week or so). In the meantime I asked Grok to hunt up some tutorials and examples so I could start getting up to speed on all things ROS 2 Jazzy (Robot Operating System) and SLAM (Simultaneous Localization and Mapping).

Hopefully in the next week or so I will have some hardware & software to play with, and maybe take a step or two in the direction of upgrading WallE3 to WallE4.

08 April 2026 Update:

Worked with Grok to start adding vision capability to my autonomous 4-wheel wall-tracking robot. The plan is to use a Luxonis OAK-D-Lite Fixed-Focus camera and a Raspberry Pi5 running Ubuntu 24.04 and ROS (Robot Operating System) to add vision processing to the robot. Because there was a long lead time for the Luxonis camera, we started with a cheaper (and faster delivery) alternative, the Arducam ToF Camera. Grok insisted the Arducam would work as a near-term solution until the Luxonis camera arrived, but it didn’t work out quite that way. The Arducam worked OK with the default Pi5 OS without ROS, but when we tried to add ROS and the drivers for the Arducam, we ran into a dead end. Then Grok was *sure* that we could simply replace the Pi5 OS with Ubuntu 24.04 LTS and everything would work – – – NOT! So now I’m waiting for the Luxonis so we can move forward again.

When I braced Grok about the readily apparent difference between what it thought was going to work and the actual result, I asked it to search for ANY successful implementation using the Pi5, ROS and the Arducam, and it could not come up with a single confirmed success. This is a real problem with Grok, one that the devs should be concerned about, but nobody seems to care.

Grok now uses several ‘agents’ to independently research user questions, and the eventual answer is a consensus (I think) of their findings. I have suggested that at least one of the agents should be dedicated to attempting to falsify the results of the others, but I have seen nothing to indicate this has actually gotten through to the devs. In this case, all the ‘red team’ agent would have to do is perform the above search for successful implementations and come up with ZERO, thereby making it obvious that starting with the Arducam instead of waiting for the Luxonis was probably not going to end well. On a positive note, I asked Grok to do the same search with the Luxonis instead of the Arducam, and it came back with several confirmed successful implementations.

Until Grok implements the ‘red team’ facility, users need to be aware that Grok will quite happily drive a user off a cliff, insisting all the way down that “this next fix will surely solve the problem”. Has anyone else had this happen?

16 April 2026 Update:

A lot has happened since my last post on this subject. Working with Grok, we finally got the OAK-D camera working with ROS, OpenCV and Ubuntu 24.04 LTS running on the Pi5, with real-time imagery displayed on a monitor connected to the Pi5. Of course, this is a LOOOONG way from actually implementing visual navigation on a robot, but ya gotta start somewhere!
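
For the record, the bare-bones version of that first milestone looks something like the following (a minimal sketch patterned after the Luxonis depthai-core C++ examples, without the ROS layer; the stream name and preview size are just illustrative choices, not the exact code we ran):

// Minimal DepthAI pipeline: stream the OAK-D color camera preview to a window.
#include <depthai/depthai.hpp>
#include <opencv2/opencv.hpp>

int main() {
    dai::Pipeline pipeline;
    auto cam  = pipeline.create<dai::node::ColorCamera>();
    auto xout = pipeline.create<dai::node::XLinkOut>();
    xout->setStreamName("rgb");               // arbitrary stream label
    cam->setPreviewSize(640, 400);            // small preview keeps the Pi5 happy
    cam->preview.link(xout->input);           // camera -> host link

    dai::Device device(pipeline);             // connect to the OAK-D and start the pipeline
    auto q = device.getOutputQueue("rgb");
    while (true) {
        auto frame = q->get<dai::ImgFrame>(); // blocks until a frame arrives
        cv::imshow("rgb", frame->getCvFrame());
        if (cv::waitKey(1) == 'q') break;
    }
    return 0;
}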

When considering how to move the Pi5 and OAK-D camera setup onto the robot, I decided to refresh my old 2-wheel (plus castering wheel) robot rather than trying to shoehorn the Pi5 and camera onto my 4-wheel robot. The 2-wheel robot already has a set of nice wheel motors, a pair of DRV8871 motor drivers and a battery/charger setup, and it isn’t being used for anything else at the moment. So, I stripped off the Arduino Mega 2560 controller, installed the Pi5 and the camera at one end, and a Teensy 4.1 MCU on the other. I also installed an 8V-to-5V LDO voltage regulator that I had hanging around from a previous project. Here are some photos:

Then I started working on the idea of installing magnets on the wheels and Hall-effect sensors on the chassis to produce wheel rotation rate information to integrate with vision processing. Based on a conversation with Grok, I ordered some A3144 Hall-effect sensors from Amazon. When they came in, I implemented a small plugboard circuit to test the sensor, as shown in the following short video:

After I got this working, I created a printed magnet carrier for the wheels, and a small perfboard circuit mount for the sensor. I tested the setup by running the small motor test program Grok wrote for the Teensy 4.1, and the results are shown in the following video:

LED illuminates when the sensor is triggered by a wheel magnet. Scope in the background shows the detection waveform
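
The test program itself is nothing fancy; a sketch of the idea (my reconstruction, not Grok’s actual code, with placeholder pin assignments) looks like this:

// Count pulses and flash an LED each time a wheel magnet passes the A3144.
const int HALL_PIN = 2;                  // sensor output (placeholder pin)
const int LED_PIN  = 13;                 // Teensy 4.1 onboard LED
volatile uint32_t pulseCount = 0;

void hallISR() {
    pulseCount++;                        // one count per magnet pass
}

void setup() {
    pinMode(HALL_PIN, INPUT_PULLUP);     // A3144 is open-collector, so it needs a pull-up
    pinMode(LED_PIN, OUTPUT);
    // A3144 output goes LOW when a magnet pole gets close enough
    attachInterrupt(digitalPinToInterrupt(HALL_PIN), hallISR, FALLING);
    Serial.begin(115200);
}

void loop() {
    digitalWrite(LED_PIN, !digitalRead(HALL_PIN)); // LED on while a magnet is present
    static uint32_t lastReport = 0;
    if (millis() - lastReport >= 1000) {           // report the pulse rate once a second
        Serial.printf("pulses/sec: %lu\n", pulseCount);
        pulseCount = 0;
        lastReport = millis();
    }
}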

Here’s a scope trace grab of a typical half-speed wheel motor sequence:

From the above trace, the measured frequency is 4.823Hz. There are four magnets per wheel, which implies that the wheel is turning at about 1.2 rev/sec, or about 72 RPM. The wheel diameter is about 67mm, which gives a wheel circumference of about 21cm. At the 1.2 rev/sec rate, this works out to about 0.25m/s.
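
For later odometry use, that conversion boils down to a one-liner on the Teensy (my sketch; the constants just echo the numbers above):

// Convert Hall pulse frequency to linear wheel speed (m/s).
const float MAGNETS_PER_REV = 4.0f;      // four magnets on the carrier
const float WHEEL_DIAM_M    = 0.067f;    // ~67 mm wheel diameter

float pulseFreqToSpeed(float pulseHz) {
    float revPerSec = pulseHz / MAGNETS_PER_REV;   // 4.823 Hz -> ~1.2 rev/sec
    float circumM   = 3.14159265f * WHEEL_DIAM_M;  // ~0.21 m per revolution
    return revPerSec * circumM;                    // ~0.25 m/s at half speed
}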

Of course I’m not anywhere near done yet, as I need two sensors per wheel to be able to determine rotation direction, so that means three more sensor circuits, plus a second wheel magnet carrier and magnets for the other wheel. Still though, it’s a good start!

22 April 2026 Update:

I finally got all four A3144 Hall-effect sensor modules and the second wheel magnet carrier mounted and tested. As shown in the following photos, the Hall-effect sensor modules on each wheel are mounted approximately 90 degrees apart to provide effective direction sensing.
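
With the two sensors in quadrature, the wheel direction falls out of which sensor changes first. A minimal decode on the Teensy might look like this (again my illustrative sketch with placeholder pins; a proper quadrature-decoder library would work too):

// Quadrature-style direction sensing from two Hall sensors mounted 90 deg apart.
const int HALL_A_PIN = 2;                 // placeholder pins
const int HALL_B_PIN = 3;
volatile int32_t ticks = 0;               // signed tick count: + forward, - reverse

void hallA_ISR() {
    // On each A edge, B's level tells us which way the wheel is turning.
    if (digitalRead(HALL_A_PIN) == digitalRead(HALL_B_PIN)) ticks++;
    else ticks--;
}

void setup() {
    pinMode(HALL_A_PIN, INPUT_PULLUP);    // A3144 outputs are open-collector
    pinMode(HALL_B_PIN, INPUT_PULLUP);
    attachInterrupt(digitalPinToInterrupt(HALL_A_PIN), hallA_ISR, CHANGE);
    Serial.begin(115200);
}

void loop() {
    static int32_t last = 0;
    if (ticks != last) {                  // print only when the count changes
        Serial.printf("ticks: %ld\n", (long)ticks);
        last = ticks;
    }
}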

23 April 2026 Update:

After finishing the Hall-effect sensor project, I worked on implementing a wireless serial connection to the robot for remote status monitoring and for remote ‘Over The Air’ (OTA) programming. I have had this capability for years on the 4-wheel robot and it really paid off, so I didn’t want to get too far down the road with the 2-wheel version without having it. As Grok pointed out, I could use the wireless ethernet capability of the Raspberry Pi5 for this; wire a serial port on the Pi to a hardware serial port on the Teensy 4.1 and send updated .HEX files to the Teensy that way. However, I decided not to do this for three reasons: first and foremost, I disliked the idea of tying Teensy 4.1 program updates to a working Pi5, as that added another huge set of variables; secondly, the Pi5 draws a LOT of power, so I didn’t want to have it running until absolutely required; and thirdly, I completely solved the Teensy 3.x/4.x OTA problem some five years back (see this post, this post and this post) and can readily port the solution to the current configuration.

I mounted one of my spare HC-05 Bluetooth serial modules to the back of my new camera support structure with hot glue and connected it to Serial2 (pins 7 & 8) on the Teensy 4.1. After the usual cussing and screwups getting the Tx & Rx lines connected properly, I was able to demonstrate simple pass-through serial-to-serial capability, so then I moved on to getting OTA working. First I created an Arduino project called ‘2WheelRobot_HC05_OTA1’. Then I cloned Joe Pasquariello’s FlasherX repository to my PC and copied the latest versions of FlashTxx.h/.cpp, FXUtil.h/.cpp, and FlasherX.ino from my local repo to the local folder of that project. Next I ported the ‘CheckForUserInput()’ function from one of my 4-wheel robot projects into the OTA1 project, removing everything but the ‘U’ (Update Firmware) case. After the usual number of mistakes and cussing, I got this working to the point where I could manually send a new .HEX file to the Teensy 4.1 using the HC-05 Bluetooth serial link and TeraTerm, as shown in the following screengrab:
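
Stripped to its essentials, the ‘U’ case ends up looking something like this (a sketch following the pattern of FlasherX’s demo program; I’m assuming its firmware_buffer_init()/update_firmware() helpers here, so check against your local copy of the library):

// 'U' = Update Firmware: hand the Bluetooth serial stream to FlasherX.
#include "FXUtil.h"               // update_firmware() and hex-file helpers
extern "C" {
  #include "FlashTxx.h"           // firmware_buffer_init/free() flash primitives
}

void CheckForUserInput() {
    if (!Serial2.available()) return;     // HC-05 on Serial2 (pins 7 & 8)
    char c = toupper(Serial2.read());
    if (c == 'U') {
        uint32_t buffer_addr, buffer_size;
        // Reserve a flash (or RAM) buffer to hold the incoming .HEX image
        if (firmware_buffer_init(&buffer_addr, &buffer_size) == 0) {
            Serial2.println("unable to create firmware buffer");
            return;
        }
        // Receive the hex file over Bluetooth, verify it, then reflash & reboot.
        // update_firmware() only returns on error or abort.
        update_firmware(&Serial2, &Serial2, buffer_addr, buffer_size);
        firmware_buffer_free(buffer_addr, buffer_size);
    }
}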

The next step is to automate this procedure as I did back in 2021. I should be able to use the same ‘board.txt’ and ‘TeensyOTA1.ttl’ files from back then. At this point I decided to try my ‘symlink’ trick instead of copying the files directly. Here is the relevant text from ‘Symbolic Link HOWTO.txt’:

//04/19/24 Arduino projects now use an extra folder layer, so instead of

mklink C:\Users\Frank\Documents\Arduino\230918_WallE3_MoveTo_Test_V1\TeensyOTA1.ttl “C:\Users\Frank\Documents\Arduino\Robot Common Files\TeensyOTA1.ttl”
mklink C:\Users\Frank\Documents\Arduino\230918_WallE3_MoveTo_Test_V1\board.txt “C:\Users\Frank\Documents\Arduino\Robot Common Files\board.txt”

The commands need to be:

mklink C:\Users\Frank\Documents\Arduino\230918_WallE3_MoveTo_Test_V1\230918_WallE3_MoveTo_Test_V1\TeensyOTA1.ttl “C:\Users\Frank\Documents\Arduino\Robot Common Files\TeensyOTA1.ttl”
mklink C:\Users\Frank\Documents\Arduino\230918_WallE3_MoveTo_Test_V1\230918_WallE3_MoveTo_Test_V1\board.txt “C:\Users\Frank\Documents\Arduino\Robot Common Files\board.txt”

I opened a command prompt and pasted the first mklink command in, and then edited it to point to my ‘2WheelRobot_HC05_OTA1’ folder. When I was done, I had these command lines:

mklink C:\Users\Frank\Documents\Arduino\2WheelRobot_HC05_OTA1\2WheelRobot_HC05_OTA1\TeensyOTA1.ttl “C:\Users\Frank\Documents\Arduino\Robot Common Files\TeensyOTA1.ttl”
mklink C:\Users\Frank\Documents\Arduino\2WheelRobot_HC05_OTA1\2WheelRobot_HC05_OTA1\board.txt “C:\Users\Frank\Documents\Arduino\Robot Common Files\board.txt”

After executing these commands, I confirmed that the ‘board.txt’ and ‘TeensyOTA1.ttl’ files now appear in the ‘C:\Users\Frank\Documents\Arduino\2WheelRobot_HC05_OTA1\2WheelRobot_HC05_OTA1’ folder – yay!

To confirm proper operation, I changed the ‘LED_ONOFF_MSEC’ parameter from 250 to 1000 in the .ino file, changed the COM port to COM12 and hit F5.

Rats! Close – but no Cigar! It started off OK and got to the point where TeraTerm should send the HEX file, and then I got this error message:

Hmm, turns out it actually worked – now the LED is blinking at 1000mSec ON, 1000mSec OFF instead of 250 ON/OFF. So I tried again, and this time I got “unable to connect to Teensy on COM12”.

Stay tuned,

Frank
