Wednesday, January 29, 2025

Fully Local AI Vehicle Detection

Intro

You know those old-school driveway alarms? The ones you put at the end of your driveway that set off a chime inside your house when a vehicle crosses them? You know: the ones that end up going off when a person walks by, a squirrel runs nearby, or the wind blows, seemingly for no reason? Well, this is my take on a modern, high-tech version of that.

Let me start with the fun part: a demo of the finished product (make sure to turn sound on):

What you're seeing here is a TP-Link Tapo security camera, mounted halfway down the driveway, using AI to detect that a vehicle has turned in. The camera sends a push notification to Home Assistant, which then sends a message to a Home Assistant Voice and a Google Nest speaker (cloud dependent, for comparison).


Hardware

  • TP-Link Tapo C325WB - Tapo is fast becoming my favorite camera line. It has local 24/7 recording to a microSD card, supports the open ONVIF camera protocol, implements the RTSP streaming protocol, has a local AI model with person, pet, and vehicle detection, and has an optional subscription service for cloud clip upload/backup.
  • Beelink EQ14 Mini PC - Intel N150, 16GB DDR4, 500GB M.2 SSD - this is running Home Assistant. It's overkill for what I need here, but these N100/N150 boxes are really great!
  • Home Assistant Voice Preview Edition - the recently released smart speaker from Nabu Casa, the company behind Home Assistant. It supports fully local control, which is why you hear the message on this speaker first.
  • Google Nest Mini - the second (lower) speaker, which announces the message after a delay of several seconds. It's the only cloud-dependent (and optional) portion of this automation.
  • Deco X50-Outdoor - the Wi-Fi AP mounted to the outside wall of my garage that the camera connects to. Since the camera is powered from a lamp post halfway down the driveway, I don't have an easy way to run a hard wired data connection. I've found the Deco APs to have great range, and the C325WB's Wi-Fi radio seems to have a fairly stable connection.
  • Vertical Pole Mount - to mount the Tapo camera to the lamp post.
  • 3D Printed Stand for the HA Voice - special shout-out to RuddO for making a great 3D model for the speaker stand.

Here's a view of the camera mounted on the lamp post and the AP mounted in my garage:

Deco X50-Outdoor mounted inside garage

Software

  • Home Assistant is the core of this automation, receiving the events from the camera and sending events out to the speakers. Home Assistant has become incredibly popular and has an integration for almost everything at this point. I highly recommend it as the best home automation platform.
  • Home Assistant ONVIF Integration - Open Network Video Interface Forum (ONVIF for short) is an open industry standard for controlling IP-based cameras. One of its features is event detection, where it has both a "pull" protocol and a "push" protocol for getting events from a camera. In this case, vehicle detection is one of the events sent out by the Tapo cameras.
  • HomeAssistant - Tapo: Cameras Control / pytapo - An integration that adds much more functionality for controlling Tapo devices than the base ONVIF library built into Home Assistant.
  • Piper - the text-to-speech (TTS) system developed by the Open Home Foundation. This is a very fast, fully local TTS engine, which I'm using to convert "vehicle in driveway" to speech that can be sent to a speaker.
  • Chime TTS - An integration that can combine sound effects and TTS audio to create a combined output. I'm using it to add the little "chord" sound that plays just before the "vehicle in driveway" message on the Home Assistant Voice speaker.
  • Google Assistant SDK - An integration for Home Assistant that allows connecting to Google Nest speaker devices. With this integration, you can send a broadcast to one or more speakers. Unfortunately, this is the one component that requires the cloud, which introduces a delay. It also adds an "Incoming broadcast: it says:" prefix before the "vehicle in driveway" message, which can't be removed, and it doesn't allow specifying a volume for the message. That's a shame, because I have several of these speakers around my house.

Coding

It wouldn't have been as fun a project if it all worked out of the box, right? When I pulled my Tapo cameras into Home Assistant, I noticed that they had a binary sensor for motion detection but not the more advanced person, vehicle, and pet detectors. The motion detector also didn't seem to be working properly.

I started looking into the Home Assistant logs and noticed a strange 500 error when communicating with the camera over ONVIF:

illegal status line: bytearray(b'POST /onvif/service\x00HTTP/1.1\x00\x00Host\x00 192.168.56.109:2020\x00\x00Accept\x00 */*\x00\x00Accept-Encoding\x00 gzip, deflate, br\x00\x00Connection\x00 keep-alive\x00\x00User-Agent\x00 ZHTTP/1.1 500 Internal Server Error'

What's going on here? As I mentioned above, ONVIF has two delivery mechanisms in the events spec: pull and push. The pull method is similar to HTTP long polling: you send a request to the camera asking for recent events along with a timeout (e.g. 60 seconds). The camera responds as soon as it has an event to deliver, or with nothing once the 60 seconds are up. With push, you give the camera a URL to send HTTP messages to when an event occurs. Push is faster and more efficient, but requires more setup on the client.
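To make the difference concrete, here is a toy sketch of the two delivery styles. It is not the ONVIF API: `FakeCamera`, `pull`, `subscribe`, and `detect` are hypothetical names for illustration, a Python callback stands in for the URL a real push subscription would use, and the timeout is shortened from 60 seconds so the example runs quickly.

```python
import queue


class FakeCamera:
    """Hypothetical stand-in for a camera's event service (not the ONVIF API)."""

    def __init__(self):
        self._pending = queue.Queue()  # events waiting for the next pull request
        self._subscribers = []         # callbacks registered for push delivery

    def pull(self, timeout):
        """Pull: block until an event is available or the timeout expires."""
        try:
            return self._pending.get(timeout=timeout)
        except queue.Empty:
            return None  # timed out with nothing to deliver

    def subscribe(self, callback):
        """Push: the client registers once (a URL in real ONVIF) and waits."""
        self._subscribers.append(callback)

    def detect(self, event):
        """Simulate the camera detecting something."""
        self._pending.put(event)
        for callback in self._subscribers:
            callback(event)


camera = FakeCamera()
pushed = []
camera.subscribe(pushed.append)        # push: set up once, events arrive on their own

timed_out = camera.pull(timeout=0.05)  # pull: nothing has happened, so this times out
camera.detect("vehicle")
pulled = camera.pull(timeout=0.05)     # pull: an event is waiting, so this returns it
```

With pull, the client has to keep asking, and every quiet period costs a full timeout. With push, the client does the setup work once and the camera calls back the moment something happens.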

The Home Assistant ONVIF integration tries both but prefers push. If it can't set up a push notification, it falls back to pull, and this 500 error was happening while trying to set up a push notification. The camera was responding with a 500 error that wasn't even valid HTTP, containing null bytes as a separator instead of newlines.

To debug this, I installed an independent ONVIF client from Happytimesoft to see if it worked, and it did! I used Wireshark to capture XML payloads from the two clients to compare them and see what might be different that triggered the 500 error. Eventually, I figured out the camera's XML parser was not happy with inline namespaces such as <ns0:Subscribe xmlns:ns0="http://docs.oasis-open.org/wsn/b-2"> vs. defining the namespaces at the top of the document and using shorthand in each element, e.g. <wsnt:Subscribe>.
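For what it's worth, the two styles mean exactly the same thing to a conforming XML parser. The sketch below checks that with Python's standard library. The two documents are trimmed illustrations, not the captured payloads: the SOAP envelope wrapper and its namespace are my assumption here, and the real Subscribe request carries child elements that are left out.

```python
import xml.etree.ElementTree as ET

WSNT = "http://docs.oasis-open.org/wsn/b-2"
SOAP = "http://www.w3.org/2003/05/soap-envelope"  # assumed envelope namespace

# Style the camera rejected: namespace declared inline on the element itself.
inline_doc = (
    f'<soap:Envelope xmlns:soap="{SOAP}"><soap:Body>'
    f'<ns0:Subscribe xmlns:ns0="{WSNT}"/>'
    "</soap:Body></soap:Envelope>"
)

# Style the camera accepted: namespace declared once at the top, prefix reused.
prefixed_doc = (
    f'<soap:Envelope xmlns:soap="{SOAP}" xmlns:wsnt="{WSNT}"><soap:Body>'
    "<wsnt:Subscribe/>"
    "</soap:Body></soap:Envelope>"
)


def qualified_tags(xml_text):
    """Return every element's namespace-qualified tag, in document order."""
    return [element.tag for element in ET.fromstring(xml_text).iter()]


same = qualified_tags(inline_doc) == qualified_tags(prefixed_doc)
print(same)                              # True
print(qualified_tags(prefixed_doc)[-1])  # {http://docs.oasis-open.org/wsn/b-2}Subscribe
```

Both documents resolve to the same qualified element names, so the difference is purely one of serialization. That is why changing only the prefix handling on the client side was enough to keep the camera happy.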

I sent a pull request to set a namespace prefix in python-onvif-zeep-async, the library used by the Home Assistant ONVIF integration, then a pull request to Home Assistant to pull in the updated version of the library.

I'm not sure why, but the Tapo cameras don't seem to send events reliably (or at all) when using a pull subscription. Push subscriptions got me up and running, receiving motion events reliably in Home Assistant. Next up, though, the ONVIF integration has a parser library that parses events to turn them into sensors. The ONVIF events spec is intentionally wide open in terms of the event payloads. I found this message in the log telling me that the event coming from the Tapo camera wasn't recognized:

clipped for brevity
Driveway: No registered handler for event from tapo_controltapo.driveway.c325wb.lan: { ...  'Topic': { '_value_1': 'tns1:RuleEngine/TPSmartEventDetector/TPSmartEvent', ... 'Data': { 'SimpleItem': [ { 'Name': 'IsVehicle', 'Value': 'true' } ... }

Okay, easy enough, I thought. The parser module looked easy to extend with more event types. Unfortunately, it had no unit tests, which made it very difficult to test new parser types. A debugging workflow that includes getting into my car to drive down the driveway clearly wasn't going to work. So I added a new unit test for event types (which took a while to get working!) and sent a pull request to Home Assistant.
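The general shape of a topic-keyed parser plus a test that replays a logged event looks roughly like the sketch below. This is not the Home Assistant code: `PARSERS`, `register`, `parse_event`, and `parse_tp_smart_event` are hypothetical names, and the message dict is a flattened version of the clipped log line above, so the real payload nesting will differ.

```python
PARSERS = {}  # maps an ONVIF topic string to the function that handles it


def register(topic):
    """Decorator that registers a parser function for one topic."""
    def decorator(func):
        PARSERS[topic] = func
        return func
    return decorator


@register("tns1:RuleEngine/TPSmartEventDetector/TPSmartEvent")
def parse_tp_smart_event(message):
    """Turn each SimpleItem into a sensor name and a boolean state."""
    items = message["Data"]["SimpleItem"]
    return {item["Name"]: item["Value"] == "true" for item in items}


def parse_event(message):
    """Look up the handler for the message's topic and run it."""
    topic = message["Topic"]["_value_1"]
    handler = PARSERS.get(topic)
    if handler is None:
        raise LookupError(f"No registered handler for event: {topic}")
    return handler(message)


def test_vehicle_event():
    """Replay a recorded event instead of driving down the driveway."""
    message = {
        "Topic": {"_value_1": "tns1:RuleEngine/TPSmartEventDetector/TPSmartEvent"},
        "Data": {"SimpleItem": [{"Name": "IsVehicle", "Value": "true"}]},
    }
    assert parse_event(message) == {"IsVehicle": True}


test_vehicle_event()
```

The point of the test is that a payload copied from the log becomes a fixture, so a new event type can be added and verified without the camera seeing a single car.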

Conclusion

Yes, it's a little silly, but I think it's a great demonstration of the power of open protocols, open source code, and local processing. Like many in the home automation community, I value having local control of my smart devices. We often hear that cloud services are required to enable powerful features, but by carefully selecting the hardware components you buy and the software components you run, you can unlock impressive building blocks! Most importantly, I had a blast working on this and learned a lot along the way.