2 Wheel Robot with Vision Processing, Part II

This is the second of (hopefully) many posts on my project to add modern vision processing to my autonomous wall-following robot. The first post is here.

Lots of changes since my first post. I discovered that my 8.4-to-5V LDO regulator board wouldn’t reliably drive the Raspberry Pi5, so I had Grok look around for other options. He found a step-down converter module at Pololu advertising 85+% efficiency and a much cleaner output. I have it on order so we’ll see.

I also struggled to get the Luxonis OAK-D Lite stereo camera working on my Pi5/Ubuntu 24.04LTS setup. Previously we had gotten it to work with the Pi5 running the Raspberry OS, but getting it to work with the Ubuntu OS was a lot more challenging. This effort also ran afoul of Grok’s complete inability to realize that it is in a hole and to stop digging. We went through dozens of Python scripts designed to get the camera to connect and show some data (I was connecting to the RPi5 via SSH from my windows box, so showing images wasn’t possible), and they all failed due to one subtle problem or another. After several days of getting nowhere I finally called a halt, took a couple of days off, and came back determined to start over from first principals. Instead of using Grok, I started from scratch with some web searches to find other successful implementations of The OAK-D Lite camera. At the Luxonis ‘Documentation’ site I found this page describing a viewer for the OAK-D Lite (and other) cameras. The OAK Viewer is available for windows and *nix OS, so I decided to start by trying to get images from the camera using the Windows version to bypass all the annoyances associated with peripheral handling in Linux. This turned out to be pretty much plug-and-play, and immediately the OAK-D camera showed up in the list of available devices. When I clicked on ‘Connect’ instead of connecting immediately the app immediately started measuring the available bandwidth of the USB connection as shown below.

After several USB connect/disconnect cycles, I got this display:

Oops! I hadn’t even considered the USB cable/connector bandwidth issue – and neither had Grok. For convenience I had plugged the camera cable into my USB hub, which is definitely not ‘super speed’ whatever the heck that is. After some cable and connector switching, I found that a heavy-duty Type-C cable connected directly to a Type-C connector on my Dell XP15-9530 laptop allowed the bandwidth check to succeed, and now I got some images showing up on my Windows 11 display – yay!

OAK-D Lite images. Depth pseudo-color on left, raw RGB image on right

So the moral of this story is – the Grok path was never going to work because Grok never considered that cable/usb connector bandwidth might be an issue. By going back to ‘first principals’ and taking the simplest possible path to a working camera/display configuration with a known-good Windows 11 app, I was able to immediately identify a completely unknown (to me and to Grok) – but fatal – stumbling block – USB cable/connector bandwidth. Grok has no sense of time, so every iteration was just like a puppy chasing a dog – willing to chase that ball an infinite number of times without ever thinking about the fact that ‘chasing the ball’ and ‘progress toward the goal’ aren’t necessarily the same thing. It took a mere mortal like me to say “whoa – this isn’t getting us anywhere – maybe a different approach?”

Now that I had demonstrated that the OAK-D Lite camera and the proper cable/connector combination worked – at least on Windows 11, I had a ‘known-good baseline’ that I could always retreat to, I started working on getting the OAK-D Lite camera working on the RPi5/Ubuntu camera with the same OAK Viewer application (but in the Linux flavor).

This turned out to be another maze to navigate. The Luxonis site has detailed instructions for the Linux version of the viewer, but it involves installing from a *.deb package, which unfortunately is targeted at the amd64 64-bit chip ecology – but the RPi5 uses arm64 – a different animal entirely. When I tried to install the ‘viewer.deb’ package, I got the following errors:

The following packages have unmet dependencies: oak-viewer:amd64 : Depends: libgtk-3-0:amd64 but it is not installable Depends: libnotify4:amd64 but it is not installable Depends: libnss3:amd64 but it is not installable Depends: libatspi2.0-0:amd64 but it is not installable Depends: libdrm2:amd64 but it is not installable Depends: libgbm1:amd64 but it is not installable Depends: libxcb-dri3-0:amd64 but it is not installable Recommends: pulseaudio:amd64 or libasound2:amd64 but it is not installable

It was at this point that I re-engaged Grok and started to make real progress. Grok immediately identified the Pi5/Ubuntu-compatible DepthAI Python library as the way to go and guided me through the installation process. Fortunately, this had a happy ending, even though there were several ‘gotchas’ along the road. However, since I knew for a fact that the hardware (and USB cable) were ‘known good’ elements due to my Windows 11 work, I was pretty sure any detours were software-only. After working my way through the various twists and turns with Grok’s help, we got to here – success!

Initial images captured by the OAK-D Lite camera running on my RPi5 with the Linux Ubuntu OS

Getting from my easy Windows 11 camera demo to the RPi5/Ubuntu camera demo would have been improbable if not impossible for me to do without Grok’s help. I might have gotten there, but it would have involved days/weeks of web searches and forum posts at the very least. I believe this is where Grok really shines – a definite problem with a definite end, with very few (none in my case) outside corrupting factors like the USB bandwidth/cable issue.

Interestingly, after getting the Pi5/Ubuntu/OAK-D Lite combination working, I asked Grok to help me find an easier way to take screen shots on the Pi5, and Grok obliged by offering the ‘Flameshot’ app as a substitute for the built-in Gnome keystroke shortcuts. And then we went down another rabbit-hole, and I ended up wasting an hour or so trying to get Flameshot and Gnome to work and play well together, only to wind up removing Flameshot and learning how to better use the Gnome built-in shortcuts.

So Grok is definitely a mixed blessing, and I cannot imagine how a younger less-experienced engineer would do without the (literally) lifetime’s worth of experience I have in troubleshooting hardware/software systems. When I was that young less-experienced engineer half a century ago I was trying to troubleshoot a RF EMI problem with a small electronics device made by Motorola. Eventually my supervisor suggested that I travel to Motorola and work with their engineers to figure out the problem. I did, and over the space of two days a very experienced Motorola engineer taught me the ‘divide and conquer’ method of troubleshooting that I use to this day. When Grok inevitably goes down a rabbit-hole with this young engineer in tow, who’s going to be there to throw them a life-line?

A Status Byte/Word for Wall-E3?

Posted 25 May 2024

Lately I have been working (or re-working) on Wall-E3’s ‘MoveToDesiredFrontDistCm()’ and ‘MoveToDesiredRearDistCm()’ capability. As I worked on these functions, I realized that the failure modes weren’t particularly straightforward. Both the ‘Front’ and ‘Rear’ functions can direct forward or rearward movement to achieve the desired distance, so in theory either function could experience a ‘stuck’ condition in either direction – ugh! In addition, I’m not sure what to do if either function experiences a ‘front obstacle’, ‘front offset distance’ or ‘rear obstacle’ condition.

Currently I have a set of enums that describe expected anomaly conditions, as shown below.

And, in the current version of the ‘MoveTo..’ functions, the above are used as exit conditions from the movement loop, like the following snippet:

gl_LastAnomalyCode is updated in the ‘UpdateAllEnvironmentParameters()’ function as shown below

This function retrieves all the current distance readings, the current heading, and the current reading from the charger IR beam detection sensor. Then it assigns one of the ANOMALY enum values to gl_LastAnomalyCode based on one or more direct or derived sensor values. For instance, the ‘STUCK_AHEAD’ or ‘STUCK_BEHIND’ codes are assigned based on the value computed for the ‘front’ or ‘rear’ variance calculations.

The problem (or at least what I *think* is a problem) is that gl_LastAnomalyCode can only contain one value, and in the case of more than one ‘anomaly’ existing at the same time, the value assigned to gl_LastAnomalyCode is the one encountered first, as a result of the ‘else’ – ‘else if’ structure.

So, I started thinking that maybe I should instead implement a status byte or status word with bit positions assigned to possible error/anomaly conditions. Instead of an ‘else’ – ‘else if’ structure, the ‘UpdateAllEnvironmentParameters()’ would simply update each bit in the status object independently of all the others. Then functions whose behavior gets modified by one or more anomalies can consult the status object to determine what to do.

I think this approach is much more robust and generalized than the current ‘only one anomaly type at a time’ approach, but it is also much more complicated, in at least three different ways:

  • Managing updates to the status object would require ‘bit diddling’ operations for clearing or setting individual bits.
  • Deciding what to do could become much more complicated. Currently the ‘gl_LastAnomalyCode’ can contain only one value, so any behavior modification decisions are basically CASE blocks. With a status object behavior could depend on more than one parameter.
  • sdaf;ljsadkf

I’m really intrigued by the power and generality of the status byte/word idea, but more than a little worried about ripping out the ANOMALY_CODE stuff and replacing it with status. Maybe I’ll just add the status byte/word stuff in parallel with the current setup and see how it goes.

Stay Tuned,

Frank

The Robot Rises Again – Adding Vision Processing

Posted 04 April 2026

Well today I exceeded my boredom threshold, so something had to give. I worked out on my basketball court for an hour, flew a glider race in the Condor3 Soaring Flight Simulator, played an online bridge tournament on BBO, and I still hadn’t used up half the day. My workbench is empty, my two 3D printers are silent, and I can’t think of a thing to do. However, while I paced around with my hands in my pockets feeling sorry for myself, I caught sight of Wall-E3, my autonomous wall-following robot gathering dust in its charging cradle in the corner. That robot has gone through a number of iterations, starting with this post in early 2015 and ending with this post in late 2024. The goal of this almost decade-long project was to create a robot that could autonomously navigate around our house, scaring cats and humans alike and recharging itself whenever it got hungry. It did a really good job at the scaring part but unfortunately, I never got it to the point where it wouldn’t get stuck in one of our junk-filled guest bedrooms; close, but no cigar 😒.

Looking at my robot, I realized that in the intervening decade, a lot had changed in the indoor navigation realm, in particular the ability to exploit video camera images for navigation in reasonably constrained environments. So, I decided to revisit my robot project to see if I could utilize the new technology to get closer to my goal of true autonomous navigation.

Part of my interest in starting a new run at the household navigation project has been my positive experience over the last year with Grok, X’s AI. Grok, despite some real issues that need to be kept in mind, is a GREAT research assistant. In the past, I might spend most of a week working my way through internet links and going down various rabbit holes trying to solve a subtle software or hardware problem but Grok can do the same thing in a few seconds. At this point I knew next to nothing about vision-based navigation systems, but I knew for sure that Grok would be able to distill the netiverse down to reasonably digestible chunks pretty quickly.

I started with the following prompt for Grok:

In less than thirty seconds, Grok produced the following:

Absolutely—switching to vision processing is a smart move for handling those tricky “special cases” where pure distance sensors (like your LIDAR + gyro combo) fall short. Cameras give you way more information: color, texture, edges, and even semantic understanding (e.g., “that’s a chair leg” vs. just “something 30cm away”). This helps with low obstacles, reflective surfaces, dynamic items, or cluttered home environments that confuse LIDAR.

Since you already have a solid 6DOF gyro (IMU), you can fuse it with vision for excellent Visual-Inertial Odometry (VIO) or Visual SLAM (vSLAM). This is basically what modern robot vacuums do (vSLAM), but fully customizable for your wheeled robot.

Quick Reality Check Before Diving InVision isn’t perfect either:

  • Sensitive to lighting changes (solution: good indoor lights or IR-capable cameras).
  • Needs decent compute (Raspberry Pi 5 or equivalent works; older Pis may struggle with heavy SLAM).
  • Depth from vision is usually “good enough” indoors but not as precise as LIDAR in perfect conditions.

Hybrid tip: You can keep your existing LIDAR/gyro as a backup while testing vision.Recommended Hardware to Get Started (2026-Friendly Options)You’ll replace/add a camera that provides both color and depth (RGB-D) or stereo for 3D perception. Top practical picks for small wheeled robots:

  1. Best overall: Luxonis OAK-D series (OAK-D Lite or Pro)
  2. Budget depth option: Arducam T2 RGBD ToF Camera (~$100)
  3. Classic: Intel RealSense D435/D455 (if you can find one) or similar RGB-D.

Mount the camera forward-facing (maybe add a second for 360° if budget allows). Keep your gyro—it will make odometry rock-solid when fused.What’s your current brain? (Raspberry Pi 4/5, Jetson, Arduino + companion computer, etc.?) Budget for the camera? That’ll let me refine this exactly.Software Stack – The Modern Way (ROS2 Recommended). The gold standard for this is ROS 2 (Robot Operating System) + Nav2 navigation stack. It handles mapping, localization, path planning, and obstacle avoidance out of the box.

Core vision libraries:

  • OpenCV → Basic start (edge detection, simple avoidance).
  • RTAB-Map or ORB-SLAM3 → Full visual SLAM (builds a 3D map while localizing).
  • VINS-Fusion or OpenVINS → Perfect VIO that fuses your existing 6DOF gyro with camera data (highly accurate odometry).

All run great on Pi 5 with Ubuntu 24.04 + ROS 2 Jazzy (or newer).

This was a lot like drinking from a firehose, but within an hour or so I had a Raspberry Pi 5 kit and two different cameras (a cheap one arriving tomorrow, and a not-so-cheap one arriving in another week or so). In the meantime I asked Grok to hunt up some tutorials and examples so I could start getting up to speed on all things ROS 2 Jazzy Robot Operating System) and SLAM (Simultaneous Localization and Mapping).

Hopefully in the next week or so I will have some hardware & software to play with, and maybe make a step or two in the direction of upgrading WallE3 to WallE4.

08 April 2026 Update:

Worked with Grok to start adding vision capability to my autonomous 4-wheel wall-tracking robot. The plan is to use a Luxonis OAK-D-Lite Fixed-Focus camera and a Raspberry Pi5 running Ubuntu 24.04 and ROS (Robot Operating System) to add vision processing to the robot. Because there was a long lead time for the Luxonis camera, we started with a cheaper (and faster delivery) alternative, the Arducam ToF Camera. Grok insisted the Arducam would work as a near-term solution until the Luxonis camera arrived, but it didn’t work out quite that way. The Arducam worked OK with the default Pi5 OS without ROS, but when we tried to add ROS and the drivers for the Arducam, we ran into a dead end. Then Grok was *sure* that we could simply replace the Pi5 OS with Ubuntu 24.04 LTS and everything would work – – – NOT! So now I’m waiting for the Luxonis so we can move forward again.

When I braced Grok about the readily apparent difference between what Grok thought was going to work fine and the actual result, I asked Grok to search for ANY successful implementation using Pi5, ROS and Arducam, and Grok could not come up with a single confirmed successful implementation. This is a real problem with Grok, one that the devs should be concerned about, but nobody seems to care.

Grok now uses several ‘agents’ to independently research user questions, and the eventual answer is a consensus (I think) of their findings. I have suggested that at least one of the agents should be dedicated to attempting to falsify the results of the others, but I have seen nothing to indicate this has actually gotten through to the devs. In this case, all the ‘red team’ agent would have to do is perform the above search for successful implementations and come up with ZERO, thereby making it obvious that starting with the Arducam instead of waiting for the Luxonis was probably not going to end well. On a positive note, I asked Grok to do the same search with the Luxonis instead of the Arducam, and it came back with several confirmed successful implementations.

Until Grok implements the ‘red team’ facility, users need to be aware that Grok will quite happily drive a user off a cliff, insisting all the way down that “this next fix will surely solve the problem”. Has anyone else had his happen?

16 April 2026 Update:

A lot has happened since my last post on this subject. Working with Grok, we finally got the OAK-D camera working on the Pi5, with real-time imagery displayed on a monitor connected to the Pi5. Of course, this is a LOOOONG way from actually implementing visual navigation on a robot, but ya gotta start somewhere!

When considering how to move the Pi5 and Oak-D camera setup onto the robot, I decided to refresh my old 2-wheel (plus castering wheel) robot rather than trying to shoehorn the Pi5 and camera setup onto my 4-wheel robot. The 2-wheel robot already as a set of nice wheel motors, a pair of DRV8871 motor drivers and a battery/charger setup, and it isn’t being used for anything else at the moment. So, I stripped off the Uno Mega 2560 controller, installed the Pi5 and the camera at one end, and a Teensy 4.1 MCU on the other. I also installed an 8V-to-5V LDO voltage regulator that I had hanging around from a previous project. Here are some photos

Then I started working on the idea of installing magnets on the wheels and Hall-effect sensors on the chassis to produce wheel rotation rate information to integrate with vision processing. Based on a conversation with Grok, I ordered some A3144 Hall-effect sensors from Amazon. When they came in, I implemented a small plugboard circuit to test the sensor, as shown in the following short video:

After I got this working, I created a printed magnet carrier for the wheels, and a small perfboard circuit mount for the sensor. I tested the setup by running the small motor test program Grok wrote for the Teensy 4.1, and the results are shown in the following video:

LED illuminates when the sensor is triggered by a wheel magnet. Scope in the background shows the detection waveform

Here’s a scope trace grab of a typical half-speed wheel motor sequence:

From the above trace, the measured frequency is 4.823Hz. There are four magnets, which implies that the wheel is turning at about 1.2RPS, or about 72RPM. The wheel diameter is about 67mm, which gives a wheel circumference of about 21cm. At the 1.2 RPS rate, this works out to about 0.24m/s.

Of course I’m not anywhere near done yet, as I need two sensors per wheel to be able to determine rotation direction so three more sensor circuits, and a second wheel magnet carrier and magnets for the other wheel. Still though, it’s a good start!

22 April 2026 Update:

I finally got all four A3144 Hall-effect sensor modules and the second wheel magnet carrier mounted and tested. As shown in the following photos, The Hall-effect sensor modules are mounted approximately 90 degrees from each other to provide effective direction sensing.

23 April 2026 Update:

After finishing the Hall-effect sensor project, I worked on implementing a wireless serial connection to the robot for remote status monitoring and for remote ‘Over The Air’ (OTA) programming. I have had this capability for years on the 4-wheel robot and it really paid off, so I didn’t want to get too far down the road with the 2-wheel version without having it. As Grok pointed out, I could use the wireless ethernet capability of the Raspberry Pi5 for this; wire a serial port on the Pi to a hardware serial port on the Teensy 4.1 and send updated .HEX files to the Teensy that way. However, I decided not to do this for three reasons; first and foremost, I disliked the idea of tying Teensy 4.1 program updates to a working Pi5, as that added another huge set of variables; secondly, the Pi5 draws a LOT of power and so I don’t want to have it running until absolutely required, and thirdly I completely solved the Teensy 3x/4x OTA problem some five years back (see this post, this post and this post) and can readily port the solution to the current configuration.

I mounted one of my spare HC-05 Bluetooth Serial modules to the back of my new camera support structure with hot glue and connected it to Serial2 (pins 7 & 8) on the Teensy 4.1. After the usual cussing and screwups getting the Tx & Rx lines connected properly, I was able to demonstrate simple pass-through serial-to-serial capability, so then I moved on to getting OTA working. First I created an Arduino project called ‘2WheelRobot_HC05_OTA1’. Then I cloned Joe Pasquariello’s FlasherX repository to my PC and copied the latest versions of FlashTxx.h/.cpp, FxUtil.h/.cpp, and FlasherTxx.ino from my local repo to the local folder of that project. Next I ported the ‘CheckForUserInput()’ function from one of my 4Wheel robot projects into the OTA1 project, removing everything but the ‘U’ (Update Firmware) case. After the usual number of mistakes and cussing, I got this working to the point where I could manually send a new .HEX file to the Teensy 4.1 using the HC-05 Bluetooth serial link and TerraTerm, as shown in the following screengrab:

The next step is to automate this procedure as I did back in 2021. I should be able to use the same ‘board.txt’ and ‘TeensyOTA1.ttl’ files from back then. At this point I decided to try my ‘symlink’ trick instead of copying the files directly. Here is the relevant text from ‘Symbolic Link HOWTO.txt’:

//04/19/24 Arduino projects now use an extra folder layer, so instead of

mklink C:\Users\Frank\Documents\Arduino\230918_WallE3_MoveTo_Test_V1\TeensyOTA1.ttl “C:\Users\Frank\Documents\Arduino\Robot Common Files\TeensyOTA1.ttl”
mklink C:\Users\Frank\Documents\Arduino\230918_WallE3_MoveTo_Test_V1\board.txt “C:\Users\Frank\Documents\Arduino\Robot Common Files\board.txt”

The commands need to be:

mklink C:\Users\Frank\Documents\Arduino\230918_WallE3_MoveTo_Test_V1\230918_WallE3_MoveTo_Test_V1\TeensyOTA1.ttl “C:\Users\Frank\Documents\Arduino\Robot Common Files\TeensyOTA1.ttl”
mklink C:\Users\Frank\Documents\Arduino\230918_WallE3_MoveTo_Test_V1\230918_WallE3_MoveTo_Test_V1\board.txt “C:\Users\Frank\Documents\Arduino\Robot Common Files\board.txt”

I opened a command prompt and pasted the first mklink command in, and then edited it to point to my ‘2WheelRobot_HC05_OTA1’ folder. When I was done, I had this command line:

After executing these commands, I confirmed that the ‘board.txt’ and ‘TeensyOTA1.ttl’ files now appear in the ‘C:\Users\Frank\Documents\Arduino\2WheelRobot_HC05_OTA1\2WheelRobot_HC05_OTA1’ folder – yay!

To confirm proper operation, I changed the ‘LED_ONOFF_MSEC’ parameter from 250 to 1000 in the .ino file, changed the COM port to COM12 and hit F5.

Rats! Close – but no Cigar! It started off OK and got to the point where TeraTerm should send the HEX file, and then I got this error message:

Hmm, turns out it actually worked – now the LED is blinking at 1000mSec ON, 1000mSec OFF, instead of 250 ON/OFF. So I tried again and this time I got “unable to connect to Teensy on COM12

25 April 2026 Update:

With a lot of help from Grok, I finally got to the point where I can now do ‘over-the-air’ (OTA) program updates to my 2-wheel robot via the onboard HC-05 BT serial port connected to the Teensy 4.1 Serial2 port (pins 7 & 8). For all the details, see the ’25 April 2026 Update’ section in this post.

At this point in the project, I have a 2 wheel robot with the following characteristics:

  • Teensy 4.1 processor and HC-05 BT Serial (Serial2) allowing OTA program updates
  • Dual DRV8871 motor controllers
  • 7.4V (dual Lithium-Oxide) battery pack with 5V LDO voltage regulator
  • Raspberry Pi5 micro-computer with ROS and other relevant software installed
  • Luxonis OAK-D Lite camera connected to the Pi5 and functionally tested
  • 4ea A3144 Hall-effect sensors mounted in physically quadrature pairs on each wheel, opposite a 4-magnet disc for speed & direction sensing.

At the moment the robot has no distance sensing capability, although the plan is to mount a pair of VL53L1X ToF sensors for side-distance monitoring.

the Pi5 and camera are not yet connected, as their current draw is significant and may require an upgraded battery and regulator for practical use. Here are some photos of the current robot configuration:

Here is the current system schematic. Note that theRaspberry Pi5, Luxonis OAK-D camera and MPU6050 6DoF gyro have not yet been incorporated into the system.

And here is the complete Teensy 4.1 sketch that was used to verify OTA programming capabilities.

27 April 2026 Update:

I ran into a problem with the way the ‘quadrature’ Hall-effect sensor module was mounted on the wheel motors. I had simply hot-glued them to the motor casing, but after a day or so the module fell off, so I needed something a little more permanent.

I designed a printed piece to snap-fit onto the motor body, with a slot in the top for the Hall-effect sensor perfboard, as shown below:

Here is a photo showing the Hall-effect sensor modules mounted on the underside of the robot.

02 May 2026 Update:

As it turns out, the above ‘physical quadrature’ arrangement was optimally bad, because it does not provide the 00 01 10 11 activation sequence required for accurate speed and direction sensing. After figuring this out, I wound up manually adjusting the position of one of the sensor modules to get the proper sequence, and then hot gluing the module down at that point. The following photo shows the somewhat messy current arrangement.

My plan is to design another version of the motor mount for a more permanent installation.

I also spent some time neatening up the topside of the robot and updating the schematic, as shown in the following photos.

04 May 2026 Update:

Grok and I worked together to remove a lot of unneeded PID code and add forward & reverse tick-count based motion control. Now the robot will move forward or backward a specified distance with a command such as “D, 0.3” or “D, -0.5” (meters). It will also stop in response to an ‘S’ command. Here’s the full code:

I also had Grok generate a Python script to automate the task of creating symlinks to my custom ‘board.txt’ and ‘TeensyOTA1.ttl’ files necessary for OTA updates. Now I can double-click a desktop shortcut, point to a folder, and have the above two files appear magically as symlinks to the files in my new “C:\Users\Frank\Documents\Arduino\Robot Common Files 2026” folder. This is wonderful time-saver and I only wish I had done this earlier. Here’s the code:

05 May 2026 Update:

I added a tick-count reset ‘R/r’ operation to the available commands in the ‘CheckForUserInput() function. At this point the robot will go forwards and backwards by a set distance (tick count), will stop on command, and the tick count can be reset to zero at any time. Here is the ‘final’ code:

The next big step in the Teensy code will be to integrate the MPU6050 6DoF gyro into the code so that the robot can make accurate PID-controlled specified-angle turns

Stay tuned,

Frank

Return of the Robot

Posted 14 October 2024

It’s been almost four months since I did anything with Wall-E3, my autonomous wall-following robot. I’ve been busy with building a new 3D printer (Prusa Mk4) quarrelling with (and losing to) my other 3D printer (Flashforge Creator Pro 2), and some other stuff, but I’m now between projects so I want to spend some more time with Wall-E3. I ran a couple of field tests on my normal home track, and saw that the robot is still having trouble with managing transitions from one wall to another, especially at the end of the kitchen counter (the ‘B’ position shown in the diagram in this post).

Looking through the telemetry log, I was struck by the fact that it appears that the HandleAnomalousConditions() function always called with ANOMALY_CODE == ANOMALY_NONE. Looking through the code, I’m not quite sure where this happens. There is code in UpdateAllEnvironmentParameters(), but it shouldn’t execute unless all the other ‘elseif’s fail. Looking further, I found this code in the HandleAnomalousConditions() function:

However, it doesn’t appear that HandleExcessSteervalCase() is being called at all – hmm.

20 October 2024 Update:

So I think I figured some of this out. Here’s the relevant code in loop():

So the program simply loops through UpdateAllEnvironmentParameters() and HandleAnomalousConditions. The Update function retrieves all sensor information and uses it to update the global variable gl_LastAnomalyCode. HandleAnomalousConditions then uses the value of gl_LastAnomalyCode in a CASE statement to figure out what to do. Here’s the relevant code from HandleAnomalousConditions

The problem I was seeing was, HandleAnomalousConditions() was being called each time, but the ANOMALY_EXCESS_STEER_VAL case was never executed. Somehow, the anomaly code was being changed to ANOMALY_NONE before the call to HandleAnomalousConditions().

I think I finally figured out what was happening. The very first time through loop(), UpdateAllEnvironmentParameters() is called and (normally) sets gl_LastAnomalyCode to ANOMALY_NONE. Then HandleAnomalousConditions() is called and the ANOMALY_NONE case executes. This launches either the TrackRight or TrackLeft functions, which have their own internal loop that doesn’t exit until a non-NONE anomaly is detected. When this happens, the tracking loop exits, but because it started in the ANOMALY_NONE case section of HandleAnomalousConditions(), the next instruction to be executed is the one immediately after HandleAnomalousConditions() exits! The next relevant instruction is the call to UpdateAllEnvironmentParameters() at the top of loop(), which can (and apparently does) modify gl_LastAnomalyCode from ANOMALY_EXCESS_STEER_VAL to ANOMALY_NONE, which means the ANOMALY_EXCESS_STEER_VAL case is never executed.

So, it is critical that nothing changes gl_LastAnomalyCode between HandleAnomalousConditions() calls. As it turns out, the fix was simply to remove the call to HandleAnomalousConditions() at the top of loop(), resulting in the following in loop():

This works because all functions that contain a local loop (like TrackLeft/RightWall()) call UpdateAllEnvironmentParameters() each time through their loop, and exit when a non-NONE anomaly code is detected. With the removal of the call to UpdateAllEnvironmentParameters() in loop(), the next call to HandleAnomalousConditions() ‘sees’ the correct value in gl_LastAnomalyCode.

I made this change and made another run, with the following telemetry output:

In the above telemetry, the significant points are:

  • The first pass through HandleAnomalousConditions(NEITHER) ANOMALY_NONE CASE decides which wall to track and selects TrackLeftWallOffset()
  • Tracking starts at 0.7Sec and ends at 3.0Sec when an EXCESS_STEER_VAL anomaly is detected.
  • In HandleAnomalousConditions(LEFT) with last anomaly code = EXCESS_STEER_VAL shows that the gl_LastAnomalyCode has not been changed
  • In HandleAnomalousConditions(LEFT) ANOMALY_EXCESS_STEER_VAL case shows that the proper case section is executing.
  • HandleExcessSteervalCase(LEFT) open doorway block shows that the appropriate handler function is launched.
  • Ultimately this results in the right wall being captured and tracked (the run was terminated just before the robot actually started tracking the righthand wall).

WallE3 Doesn’t Like Reflective Surfaces

Posted 08 December 2023

WallE3 went with us last month when we travelled to St. Louis for Thanksgiving with family, and I showed off his autonomous wall following skills. WallE3 actually did great for quite a while – that is until he found himself staring at the side of a floor-mounted wine cooler (wine ‘safe’?). As can be seen in the following short video, WallE3 fell in love with the cooler, and showed his love by repeatedly head-butting it – oops!

After looking at the telemetry data for the run, I saw that WallE3 was measuring much larger front distances – like several hundred centimeters – when it was only a few centimeters from the object. It appears he was backing up to a defined front distance in response to a ‘WALL_OFFSET_DISTANCE_AHEAD’ anomaly, but somehow convinced himself that instead of 20cm from the wall, he was actually more like 100cm away. Of course, since he wanted to be at 30cm (the desired wall offset distance), he drove forward to lessen the distance, thereby bonking into the wall. Then, when he hit the wall, the front LIDAR line of sight geometry changed enough to produce a true measurement of just a few centimeters, which then sent WallE3 running backwards to open up the distance. Lather, rinse, repeat. Here’s an Excel plot of a representative (but not exact – I somehow lost the actual telemetry data for this run).

Representative reflective surface ‘headbutt’ telemetry

As can be seen from the above the measured distance oscillates between the maximum measurable distance of 1000cm to nearly zero. Here’s a photo of the experimental setup that produced the above data.

WallE3 and a reflective surface

The ‘reflective surface’ is a piece of glossy black translucent plastic, oriented at an angle to reflect WallE3’s LIDAR beam upward to the ceiling and then the reflected signal from the ceiling back to WallE3.

I’ve been thinking about this issue ever since first seeing it in St. Louis, but hadn’t come up with any firm ideas about how to solve it. I tinkered with the idea of generating two running averages of the front distance when in the ‘MoveToDesiredFrontDistCm()’ function, with the two averages separated in time by some amount. When approaching a normal non-reflective surface, the two averages would closely track each other, but when approaching a reflective surface that produced the above dramatic distance shifts, then the two averages would be dramatically different around the transitions. This could then be detected, and something done to recover. Then last night while falling asleep, I wondered whether or not the STMicro VL53LXX infra-red LIDAR sensors would have the same problem – hmm, maybe not! If that were the case, then I could probably run one in parallel with the Garmin LIDAR unit, and use it instead of the Garmin for all ‘MoveToDesiredFront/RearDistCm()’ calls.

I tried this experiment using the currently installed rear distance sensors by calling ‘MoveToDesiredRearDistCm()’ with the same reflective surface setup as before, as shown in the following photo:

‘MoveToDesiredRearDistCm()’ setup with reflective surface

Here’s an excel plot of the rear distance run:

MoveToRearDistCm() with 30cm target and reflective surface

As can be seen in the above plot, the rear distance run was completely normal, so the VL53LXX infra-red LIDAR sensors don’t have the same problem – at least not with this translucent glossy plastic material.

10 December 2023 Update:

I got to thinking that maybe the reason the rear distance sensor worked so well with the shiny black material is that it could be IR transparent, meaning that while the front sensor (an LED LIDAR system) would see a ‘mirror’, the rear sensor would just see the toolbox. So, I jumped up on Amazon and got a cheap mirror square so I could answer that question. Here is a short video and Excel plot showing a ‘MoveToDesiredFrontDistCm()’ run with the new mirror square.

As can be seen from the video, WallE was perfectly happy to drive right through the mirror, but I had visions of mirror pieces all over the bench, the floor, and me, so I manually prevented that from happening. Here’s an Excel plot showing the same run:

MoveToFrontDist(cm) approaching tilted mirror

As the Excel plot shows, the robot did OK for the first 20 measurements (about 1sec) but immediately thereafter started a steady 200cm, and it stayed that way until I stopped it at about 2sec

Then I tried the same experiment, but this time utilizing the STMicro VL53LXX IR LIDAR sensor on the rear of the robot, as shown in the following short video:

As the video shows, the IR LIDAR behaved pretty much the same as the LED LIDAR (no real surprise, as they are both LIDAR technology, but still a bummer!

Here’s the Excel plot for this run:

As the Excel plot shows, the distance decreased monotonically for the first 30 points (about 1.5sec) but then shot up to 200 due to the mirror. I believe the lower distances after about point 40 (2sec) were due to me interfering with the IR beam.

So, unfortunately my theory about the rear IR sensor doing better with the shiny black plastic ‘mirror’ because it appeared transparent at that wavelength seems to be bolstered, so I now think that using the VL53LXX sensor instead of the Garmin LIDAR LED sensor for ‘MoveToFrontDistCm()’ operations is NOT going to work. Back to the drawing board :(.

23 December 2023 Update:

I’ve been thinking about this problem for a while now, and have not come up with a good answer; WallE3’s perception of the world around it depends entirely on LIDAR-type distance sensors, so if those sensors produce ‘false’ distance reports due to mirror or mirror-ish surfaces, WallE3 has no way to know the ‘true’ distance – bummer!

So, what I decided to do is to simply detect the symptoms of the ‘mirrored surface’ situation, halt the robot and yell for help. The detection algorithm uses the knowledge that when the robot is moving forward toward a ‘normal’ object, the front distance should decrease monotonically with time. Similarly when the robot is moving backwards toward a ‘normal’ object, the rear distance should decrease monotonically with time. Conversely, when approaching a ‘mirror-like’ object, the measured distance tends to be unstable, with distance increasing instead of decreasing.

The detection algorithm calculates a three-point average each time through the action loop, and compares the result to the last time through the loop. If the new average is greater than the old average, a ‘mirrored surface detection’ is declared and the robot calls ‘YellForHelp()’ which stops the motors and emits an audible Morse code ‘SOS’.

Here’s the full code for ‘DoOneMoveToFrontDistCm()’ function that actually moves the robot and checks for ‘mirrored surface’ error conditions

02 January 2024 Update:

I wasn’t really happy with my previous attempt at ‘mirrored surface’ detection as it seemed pretty to produce false positives. After thinking about the problem some more, I thought I might be able to use a distance variance calculation as a more robust detection method. The idea is that a normal monotonic increase or decrease in distance measurements would have a pretty low variance, while a distance reversal would generate a much larger value.

So, I ran some simulations in Excel using a 5-point running variance calculation, and the results were encouraging. Then, with the help of my lovely lab assistant, I set up an experiment in my office sandbox to see if I could capture a representative mirrored surface ‘screwup’, as shown below:

And here is an Excel plot showing the results (both the distance and 5-pt variance vertical scales have been truncated to show the smaller scale variations)

Front Distance and 5-pt Variance (truncated vertical scales for better visibility)

As can be see in the video and the Excel plot, the robot undergoes a number of distinct front/back oscillations, but then eventually settles down. It is clear from the plot that the 5-pt variance calculation is a good indicator of a ‘mirrored surface’ condition. Just looking at the plot, it appears that a variance threshold of 40-60 should provide for robust detection without much of a risk of false positives.

One other note about this experiment. To get multiple oscillations as shown in the video and plot, the mirrored surface had to be slanted slightly up toward the ceiling. If the surface was oriented vertically like a normal wall, the robot would often miss the first distance and hit the wall, but then would typically back off to the correct distance. I think this indicates that in the vertical configuration, there is enough backscatter from the floor and/or the robot itself to get a reasonable (if not entirely accurate) LIDAR distance measurement.

07 January 2024 Update:

After playing around some more with this issue, I think the above 5pt variance calculation for ‘mirrored surface’ detection will work, so I revised my current well-tested ‘CalcBruteFrontDistArrayVariance()’ function to take a integer argument denoting the number of elements to use, starting at the end (most recent data) and working backwards.

Then I ran another test in my lab, but this time on a non-mirrored surface, to verify that the 5pt variance calc would still work properly and to settle on a good threshold for ‘mirror’ detection. The Excel plot below shows the results of a run where the robot moved backwards (still using the front LIDAR sensor) to 90cm from the wall.

Running 5-point variance of front distances

As can be seen, the variance starts out in the 10 to 20 range during the initial ‘coarse’ distance movement, but then drops into the 0 to 5 range during the second ‘fine tuning’ movement. Comparing this to the previous ‘mirrored surface’ plot leads me to believe that a threshold of 40 would almost certainly (eventually) detect a ‘mirrored surface’ condition.

24 January 2024 Update:

I added the ‘CalcFrontNPointVar(uint16_t N)’ function to the robot code so I could obtain the front variance using just the last N front distances instead of the entire 100 point array. This turned out to work very well for detecting the ‘Mirrored Surface’ anomaly condition. Then I added ‘ANOMALY_MIRRORED_SFC’ to the list of anomaly codes, and added a ‘case’ block in ‘HandleAnomalousConditions()’ to deal with this condition. The handler function is:

At the moment, all it does is call the ‘RunToDaylightV2()’ function, which does a 360º search for the best direction in which to move next. Here’s the telemetry and a short video showing the action.

This experiment led to one of those “Well, DUH!! moments, as the robot’s preferred ‘RunToDaylight()’ heading was right back at the mirrored surface!

After recovering from my ‘face-palm’ moment, I realized I needed to modify the ‘RunToDaylight() function to take heading-start & heading-end parameters to exclude the mirrored surface sector from the ‘RunToDaylight()’ search for an appropriate recovery heading.

28 January 2024 Update:

I modified ‘RunToDaylightV2()’ to take two float parameters denoting the start and ending headings for the search. In the normal non-mirrored surface case, ‘RunToDaylightV2()’ is called with no arguments. The no-argument overload simply calls ‘RunToDaylightV2(startHdg, endHdg)’ with startHdg = endHdg. In the mirrored surface case, the two-parameter version of ‘RunToDaylightV2()’ is called from the ‘ANOMALY_MIRRORED_SFC’ case block of ‘HandleAnomalousConditions()’

Here’s a short video and the telemetry from a test run toward a mirrored surface.

Stay tuned!

Frank

PID Integral Windup Problem in MoveToDesired(Front|Back|Left|Right)Distance()

Posted 14 November 2023

After getting the charging station (re)integrated with the rest of the system, I have been working on complete travel-charge-travel charge cycles, where WallE3 travels around the house, finds and connects to the charger, disconnects, travels around the house some more, and then finds its way back to the charger – lather, rinse, repeat.

However, in the process I have run into a problem with the MoveToDesired(Front|Back|Left|Right)Distance() function. On several occasions the robot has blown right by the desired distance and run headlong (or backlong?) into a wall. Investigating has led me to realize that the cause of this problem is the infamous ‘integral windup’ characteristic inherent in insufficiently sophisticated PID algorithms. Here’s the relevant telemetry from a recent MoveToDesiredFrontDistance() run, and an Excel plot showing the ‘integral windup’ issue.

MoveToDesiredFrontDistance run showing integral windup problem

In the above Excel plot, the initial error is -221, which causes an output of ~300. The motor speed is clamped to -75, so the distance and error start slowly heading toward the target of 30cm and an error of zero. However, the integral (I) term continues to increase from near zero to well over -1000, and the output term was completely dominated by the integral value, keeping the motor speed clamped at -75 even as the measured distance approached and passed the target. In this particular case, I got lucky as the actual distance and the target distance were within the +/- 1cm termination window and the loop terminated. In other cases where the actual distance went through the termination window too rapidly, the robot would basically continued forever – or at least as long as necessary to ‘unwind’ the integral term.

So, what to do? Reading up on ‘integral windup’, I found this article from around 1990 (judging by dates on the references), and I decided to try the method described there as the ‘back-calculation and tracking’ method. The idea is that when the output gets clamped to some maximum value (+/- 75 in this application), the integral value is recomputed to a value that would produce the output that would naturally produce the clamp value. For instance for an error value of -221, the Ival is -22.1 which results in an output of -309, which gets clamped to -75. For an output of -75 with an input error of -221 we have -221 + Ival*(-221) + (-221*1.5) = -75 –> Ival = (-75 + 331.5)/-221 = -1.16. Checking, -1.16*(-221) = 256.36, 1.5*(-221) = 331.5, 256.36 – 331.5 = -75.14.

After thinking about this some more, I wondered if instead of back-calculating a new I value, maybe I could simply zero out the retained ‘last_Ival’ parameter whenever the PID output value is high enough so that it would get clamped. Assuming the same clamping value of 75 and a ‘P’ value of 1.5, this would happen any time the absolute value of the error value is greater or equal to 75/1.5 = 50.

Using Excel to apply this algorithm to the telemetry above telemetry data, I get the following plot.

MoveToDesiredFrontDistance run showing integral windup problem, with modified Integral term algorithm

In the above plot, the integral term (yellow line) was modified to be exactly zero whenever the output would have been > 75 or < -75 (the motor speed clamping values), even with a zero I value. As can be seen above, it stays zero until point 75. After that it smoothly decreases to about -55.6 at point 99, and then smoothly increases to about 30.98 at point 135, at which point is gets clamped to zero again. The motor speed value (green line) stays at -75 to point 75, at which point it smoothly (and linearly it appears) increases to +75, where it is clamped again.

I believe this might just do the trick. It will certainly prevent infinite runaway if the error term goes through zero too quickly to cause the loop to exit, as the robot will stop and then back up to the target distance.

I *think* I can modify my ‘PIDCalcs()’ function, and then everything that uses it will get the benefit of the ‘non-winding integral term’ algorithm.

17 November 2023 Update:

One day to go until SpaceX makes its second attempt at getting the world’s largest rocket into space – yeah!

To work the ‘integral term windup’ issue, I ported the FrontBackMotionTest code from an earlier program into the WallE3_Quicksort_V4 project so I could iterate easier, and got that running. As a baseline, here is the telemetry and a short video from the first run:

As can be seen in the above telemetry, the ‘Ival’ column shows that the integral term increases monotonically from -19.7 to -998.4, forcing the speed to its maximum value (75) for the entire run. This causes the robot to badly overshoot the target distance, which is why I decided to have the robot perform the ‘second time through’ action shown in the telemetry and video.

I added the following code to MoveToDesiredFrontDistCm() to zero out the ‘lastIval’ value if the error term * Kp > MOTOR_SPEED_QTR

Here is the telemetry and the video from the run with the above change

As can be seen from the above, this change did not affect the robot’s behavior significantly – it still badly overshot the target distance and the ‘second try’ was still required to bring the robot back to nearer the target. However, from inspection of the ‘Ival’ values it is clear that the added code is doing its job of zeroing out the ‘lastIval’ term when [error_term]*Kp > Max speed. Here’s an Excel plot showing the ‘Ival’ and ‘output’ terms from both the above runs.

comparing Ival, output, speed vals before and after Ival clamp mod

In the above plot, the ‘after Ival term stays at a very low (negative) value for almost the entire run, due to the new clamping code. Consequently the ‘after’ output term decreases linearly with the decreasing error term until the resultant ‘after’ speed command comes off the -75 ‘stop’ as shown by the gray line in the above plot.

Unfortunately, the speed reduction from ‘lastIval’ clamping isn’t enough to prevent the robot from overshooting almost as much as it did before. This indicates (at least to me) that at least the Kp value is way too high. Just as a thought, the Kp value should be just high enough so that if the distance error is, say, 100cm, then the output would just clamp at MOTOR_SPEED_LOW –> 75. So, 100*Kp = 75 –> Kp = 0.75, or about half its current value.

It turned out that I not only needed to cut the Kp value in half, but the Ki value as well, so now PID = (0.75, 0.05, 0.2). Here’s the telemetry and video from a run using these values:

I wasn’t able to eliminate the ‘second try’ requirement, although I was able to reduce the overshoot to about half its previous value.

One possible fly in the ointment is the rapid drop of the front variance value during the second run through MoveToDesiredFrontDistCm(); this *should* have triggered an exit from the function with anomaly code = ANOMALY_STUCK_AHEAD, but didn’t – and I don’t know why.

Too tired to go down this rabbit hole tonight – try again tomorrow while waiting for Space X to make history with its second StarShip test flight!

18 November 2023 Update:

Wow! Wow! Wow! I had the pleasure of watching SpaceX’s literally historic second Starship test launch this morning, and I’m still psyched! I have now had the pleasure of watching both a Saturn V launch as part of the Apollo moon landing program and now the Starship and booster launch as part of (I hope) the Mars landing program.

OK, back to robots. Last night while drifting off to sleep, it occurred to me that the ANOMALY_STUCK_AHEAD alert is conditioned on the current motor configuration as well as the front variance value. Here is the actual code that determines this:

So in this case, the motors were running in reverse, so neither of the motor conditions were TRUE, and therefore IsStuckAhead() returned FALSE. Mystery solved!

At this point I think I’ve done as much as I can with the FrontBackMotionTest program. I will modify the other ‘MoveTo’ functions to clamp the Ival term as described above. However, it is also clear that the ‘second try’ part of the ‘MoveTo’ functions cannot be eliminated without having to accept significant target distance overshoot, and that the original PID values (1.5, 0.1, 0.2) actually work better than (0.75, 0.05, 0.2), as the lower PI values cause the second try operation to take significantly longer. Here’s an Excel plot comparing just the ‘second try’ results for both:

‘Second try’ results for (0.75,0.1,0.2) and (0.75, 0.05, 0.2)

In the above plot, the ‘2’ results are for (0.75, 0.05, 0.2). As can be seen, the ‘2’ configuration takes over twice as long to complete as the ‘normal’ (1.5, 0.1, 0.2) run, and the issue with the front variance value going to zero doesn’t exist for the ‘normal’ run.

This makes it clear that while clamping the Ival does help significantly, the change in PID values does not.

18 November 2023 8:43PM EST Update:

While cleaning up the code, I noticed I had missed one spot where the global OffsetDistKp value should have been replaced by ‘Kp’, the local value, and unfortunately it was in the section that determined whether or not the Ival value would be clamped, as shown below:

This had the effect of skewing the point at which the Ival stopped being clamped, so the previous results are a bit suspect. I ran a few more tests, and discovered the function actually performed better with a higher Kp value than the original 1.5. The reason for this is that a higher Kp value moves the point at which Ival can start accumulating later in the run (closer to the target distance thus less error), and so more quickly reduces motor speed. In addition, the Ival clamping action has no effect on the second pass through the function as the motor speed values never approach the max speed, so this part operates as designed as well. Here’s a run showing the result of using a PID of (2.0, 0.1, 0.2):

As the data shows, the overshoot is only 6cm (as opposed to the 18cm overshoot without the Ival clamp), and the recovery pass completes in about 2 sec instead of over 5. Here are Excel plots showing the primary and secondary passes.

Second pass correction with PID (2.0,0.1,0.2). Time scale is 20 units/sec, so total time ~ 1.5sec

So, it looks like I want to change the global OffsetKp value from 1.5 to 2.0, leaving Ki & Kd unchanged.

19 November 2023 Update:

Looking back at the results from the above tests, I realized that the second call to MoveToDesiredFrontDistCm(Offset, Kp, Ki, Kd) (the ‘second time though’ step) isn’t actually in the MoveToDesiredFrontDistCm() function – it is part of the test harness. So, I think I need to incorporate the second pass directly into the function so it will get performed everywhere MoveToDesiredFrontDistCm() is called. I wonder if I can do this recursively, by calling MoveToDesiredFrontDistCm() from inside MoveToDesiredFrontDistCm()?

So, I moved the ‘Second Time Through’ code from the test harness into MoveToDesiredFrontDistCm(), as shown below:

Here’s the telemetry from the run with recursive MoveToDesiredFrontDistCm() calls:

The first run through MoveToDesiredFrontDistCm() occurs normally, with the exit at the 30cm target distance as expected, and then the expected overshoot to 25cm. Then the second call to MoveToDesiredFrontDistCm() starts at 24cm and gets to 29cm in approximately 1.5sec, followed by two more 10-item distance displays. I guess the first distance display is from the one inside the MoveToDesiredFrontDistCm() call, and the second one is the one from the test harness. I need the delay caused by the 10-distance display, but I can probably replace it with just a 1-sec delay.

Here’s another run with the loop replaced by a 1-sec delay:

From the above, it looks like the function is doing what it should. It keeps iterating until the distance measured after a 1-sec delay fits within the +/- 1cm window.

MoveToDesiredRearDistCm():

Now that MoveToDesiredFrontDistCm() is working, I started on MoveToDesiredRearDistCm(). I started by copying MoveToDesiredFrontDistCm() and changing all relevant occurrences of ‘Front’ to ‘Rear’. This was about 99% of the required effort. However, there were two issues that surfaced during the port. The first was that the test program needed the ability to pass the PID values into the function, but all the mainline code uses just a single parameter (the desired offset). As a temporary fix I created a four-parameter version of the function with all the code, and had the single parameter version pass the global PID values to the four-parameter on. However a much better solution was to remove the single parameter version of MoveToDesiredFrontDistCm() entirely, and instead declare the four-parameter version with three default values right after the definition, as shown here:

The second issue was the sign of the speed variable in the RunBothMotorsBidirectional() function. For MoveToDesiredFrontDistCm() the parameters to this function had to both be negative, as in RunBothMotorsBidirectional(-speed, -speed), but in the MoveToDesiredRearDistCm() they have to be positive as in RunBothMotorsBidirectional(speed, speed). Once this was accomplished, the MoveToDesiredRearDistCm() operation was successful.

MoveToDesiredLeft/RightDistCm():

I started on these two functions, but soon realized that these two functions aren’t called anymore. They were originally used as part of wall offset capture operations, but were abandoned in favor of an algorithm that uses a perpendicular approach. Soooo, I will just remove these two functions entirely and call it good!

Stay tuned,

Frank

Charging Station Connect/Disconnect Cycle

Posted 11 November 2023

After getting the wide-body IR homing PID values defined properly, the next challenge was to actually get the robot to connect to the charging station, and after getting charged, to disconnect from it successfully. After the usual number of errors and goofs, I believe I have it working now. Here is a short video and the telemetry for a complete walltracking – IR home to charging station – disconnect from charging station – back to walltracking cycle

And here is the telemetry for the run:

Wide Body Robot Charging Station Homing PID Tuning

Posted 30 October 2023

Almost exactly two years ago I redid the robot form factor for better turning performance, as described in this post. Now, two years later when integrating the new ‘wide body’ robot with the charging station, I belatedly discovered that the PID values optimized for the old ‘narrow body’ robot don’t do so well anymore. Here’s a short video and telemetry showing the wide body robot homing with the old narrow body PID values.

I strongly suspect that the new form factor will require a significantly different set of PID values. Here

04 November 2023 Update:

Well, I went down a bit of a rabbit-hole on this one, but I think I have successfully found my way back to the real world. It’s been a while since I’ve had to do any PID tuning, so my skills (and code) have atrophied significantly. Fortunately, my former self left me some rudimentary clues, so I didn’t have to recreate everything from scratch.

In previous tuning sessions, I had constructed a test harness that accepted PID values on the command line of the OTA serial connection to the robot, allowing rapid parameter optimization. Unfortunately, I didn’t document this in my previous posts (what was I thinking!), so it took me a while to find some previous programs that contained the desired test harness code. After porting the test harness code to my current WallE3_Quicksort_V4 program and modifying it a bit for more testing convenience, I am documenting it here so a future me will have a better chance of avoiding unnecessary work.

PID Tuning Test Harness:

The idea behind the PID tuning test harness is to allow the user to input PID (and other) parameters on the command line and have the robot execute the desired action – in this case homing to the charging station – without having to go through the entire program structure. The test harness code is placed at the end of the setup() routine and doesn’t exit but returns to the point where a new set of parameters can be provided. This allows for rapid iteration over the PID parameter space. Here is the code as it appears in WallE3_Quicksort_V4.ino

Note that the ‘CheckForUserInput()’ function calls in the above assume a boolean return value; this change was necessary to allow the test harness to be run multiple times without having to reboot the robot code. This also required adding a new function key (*) to the list of keys handled by CheckForUserInput(). Here are the new versions of the two flavors of CheckForUserInput():

Now that I’ve gotten the test harness set up, I can actually start re-tuning the charge station homing PID values for the new wide-body robot – yay! Here’s a short video and telemetry from the first run with PID(50,0,0).

Here’s another run, this time with PID = (100,0,0)

Here’s the PID = (100,0,0) run again, but starting with an initial orientation offset:

And here is an Excel plot showing the steering value vs time:

This is a pretty nice result, and I’m tempted to leave it just like this.

28 October 2023 ‘Field Test’

Posted 28 October 2023

Field test starting in bedroom hallway, just before MBR door. I have elected to start here, as WallE3 has been consistently making it this far with no problems whatsoever. Everything wen OK except for a couple of strange occurrences where it appeared that WallE3 was simply going straight without tracking anything. One of these occurrences was right at the end of the run. Here’s the video of the run:

And here’s the telemetry:

The strange behavior occurs just after WallE3 changed from the left to right-hand wall when it hits the open doorway of the larger guest bedroom, and then makes the hard right-hand turn to follow the near wall when it exits the hallway into the dining room area.

From the video, this occurs at approximately 203 sec, so I have excerpted the telemetry starting at the point where the robot transitions from the left to right wall.

At 215.3 sec WallE3 experiences another EXCESS_STEER_VAL anomaly (as the robot exits the hallway into the dining room area). WallE3 makes a slight right turn, moves forward a bit, and stops; this is the normal ‘move ahead one skosh’ after an EXCESS_STEER_VAL anomaly detection.

Then at 216.6 sec with gl_Left/RightCenterCm = 219.7/77.2, Left/RightSteerVal = 1.00/-1.00 it makes another right turn to, presumably, follow the right-hand wall, but it doesn’t try to capture the 30 cm offset and track – it just goes straight, eventually running into the wall on the far side of the open doorway into the kitchen area bathroom. From the telemetry it looks like WallE3 called ‘MoveToDesiredFrontDistCm() with a target distance of 30 cm as part of ‘CaptureWallOffset()’ but started the process almost parallel to the wall rather than perpendicular.

Matching the code up with the telemetry and the video, it becomes apparent that ‘HandleExcessSteervalCase(RIGHT)’ is called by HandleAnomalousConditions(RIGHT) as soon as the EXCESS_STEER_VAL anomaly is detected.

The first thing that HandleExcessSteervalCase() does is move forward for 500 mSec – the ‘skosh’ intended to clear the point where the anomaly occurred to obtain valid distance measurements. In this case, however, the video shows that the ‘forward’ direction was slanted off to the right instead of straight ahead, which meant that instead of detecting an ‘open corner’ condition with no trackable wall in range, the robot saw a trackable wall to the right at 77.2 cm, as the following telemetry line shows:

216672: gl_Left/RightCenterCm = 219.7/77.2, Left/RightSteerVal = 1.00/-1.00  

So, instead of the ‘open corner’ block in HandleExcessSteervalCase() executing, the ‘else if (trkcase == TRACKING_RIGHT)’ block was run , as shown below:

Since HandleExcessSteervalCase was called with TRACKING_RIGHT, and because gl_RightCenterCm = 77.2 was less than MAX_TRACKING_DISTANCE_CM (100 cm), ‘TrackRightWallOffset(WALL_OFFSET_TRACK_Kp, WALL_OFFSET_TRACK_Ki, WALL_OFFSET_TRACK_Kd, WALL_OFFSET_TGTDIST_CM)’ was called without running ‘ChooseBetterTrackingSide()’.

In TrackRightWallOffset(), CaptureWallOffset(TRACKING_RIGHT, 77.2) was called because the starting offset was too high for immediate tracking. In CaptureWallOffset(), a 90º CW turn was performed to (supposedly) point WallE3 directly at the wall to be tracked, and then ‘MoveToDesiredFrontDistCm(tgt_offset_cm)’ was called to capture the wall offset of 30 cm.

Unfortunately, the 90º CW turn step assumes the robot is already oriented parallel to the wall to be tracked, but because the actual physical configuration at this point was an open corner instead of an open doorway, WallE3 wasn’t at all parallel to the wall. So, the turn just oriented the robot to about 40-45º to the wall rather than 90º. The next step in the offset capture routine is to call MoveToDesiredFrontDistCm() to move the robot to the desired offset distance (in this case, 30 cm). But the front distance measurement never got down to 30 cm, as the robot wound up running along the baseboard, parallel to the wall. This would have continued indefinitely except WallE3 ran into the far edge of the doorway into the kitchen area bathroom.

So, everything worked just like it was supposed to, except that when the robot exited the bedroom hallway it was turned just enough so that it saw what looked like a trackable wall to the right, instead of gl_LeftCenterCm & gl_RightCenterCm > MAX_TRACKING_DIST_CM, so the open corner case wasn’t executed.

To fix this problem (and hopefully not create other ones) I added the condition that to bypass the ‘open corner’ case, the absolute value of the ‘trackable side’ steering value must be less than 1/2 the max steerval of +/- 1.

Well, that didn’t work as well as I thought. I made another field run, this time starting just before the end of the bedroom hallway where it opens into the dining area. What should have happened is the robot should have detected the ‘open corner’ configuration, made a 90º CW turn and carried on tracking the right-hand wall. What actually happened wasn’t that. Here’s a short video showing the action, along with the telemetry from the run

So, it appears the same sort of thing happened again, only this time it was the left-side distance that was less than MAX_TRACKING_DIST_CM. Since both steervals were +/- 1 this should have caused the ‘open corner’ block to execute.

Fixed (again!). Running another test.

ALRIGHT!! This time things worked just like they are supposed to! Here’s a short video showing the action, and the telemetry from the run.

As can be seen from the video and telemetry, the ‘open corner’ configuration was properly detected even with gl_Left/RightCenterCm = 84.1/235.6, Left/RightSteerVal = 1.00/1.00 (left side within MAX_TRACKING_DIST_CM but abs(left steerval) > MAX_STEERVAL ). Yay!!

Stay Tuned,

Frank

The WALL_OFFSET_DIST_AHEAD Anomaly

Posted 21 October 2023,

The WALL_OFFSET_DIST_AHEAD anomaly is triggered when WallE3 is about to run head-on into an upcoming wall. The idea behind handling this anomaly is to allow WallE3 to navigate around internal corners. Up until a few days ago, WALL_OFFSET_DIST_AHEAD anomaly just called ‘BackupAndTurn90Deg(bool bIsCCW)’. This function backed the robot up to achieve the desired wall offset, turned 90º in the direction away from the last tracked wall to parallel the upcoming wall and then exited, whereupon loop() was re-entered from the top and a new tracking assessment was made.

However, this treatment failed spectacularly in the guest bedroom, as the ‘new’ wall was too short to track properly. This problem led to the development of the ‘RunToDaylight’ algorithm to allow WallE3 to find the best direction in which to travel. I now have RunToDaylightV2() working very nicely, but now I have the opposite problem; now RunToDaylightV2() is more likely than not to simply turn WallE3 180º and go back the way he came – perfectly legitimate from RunToDaylight’s point of view, but boring and too simplistic.

So, how to mix the two approaches (BackupAndTurn90Deg & RunToDaylight) so that RunToDaylight is used in tight corners, but BackupAndTurn90Deg is used for ‘normal’ wall configurations where there is ample (or at least reasonable) room for travel on the perpendicular wall.

I’m going to try combining the two options. The idea is that BackupAndTurn90Deg would be the default response to a WALL_OFFSET_DIST_AHEAD anomaly, but if the available front distance after the 90º turn is less than something like 2 x WALL_OFFSET_DIST_AHEAD (currently set at 30cm), then the RunToDaylight will be called to find a better direction in which to move.

This turned out to be pretty easy. The changes necessary to the WALL_OFFSET_DIST_AHEAD anomaly case in HandleAnomalousConditions is shown below:

The following short video and telemetry shows the action when WallE3 encounters a corner with not enough room to track the next wall in the internal corner.

And the following video and telemetry shows the action when WallE3 does have room enough to follow the internal corner wall. Note that in these two conditions, the code was not changed – the only thing that changed between the two runs is the ‘third wall’ was moved away from the first one to give WallE3 sufficient room to follow the internal corner wall.

So it seems this problem is pretty much solved – YAY!!

Stay Tuned,

Frank