In this video, we're diving into setting up the Raspberry Pi AI HAT with YOLO object detection and learning how to integrate it into your projects using Python code. We'll cover installing the HAT hardware and software, and explore demo scripts that perform actions based on object detection, such as identifying specific objects, counting objects, and determining their position on the screen. To follow along, you'll need a Raspberry Pi 5 (preferably a 2GB or larger model), an AI HAT (both the 13 and 26 TOPS versions work), a camera module (we're using the V3 module), and possibly a camera cable adapter. The Pi 5 has a smaller camera connector, so ensure you have the necessary adapter. Links to these items and a written guide with all the commands and code are provided below. Installing the HAT is straightforward, but note that the header extender might not allow the pins to fully protrude; if you need exposed pins for additional hardware, longer extenders are available. To install the HAT, place the header extension on the Pi's GPIO pins, attach the four standoffs, insert the HAT's ribbon cable into the Pi's PCIe connector, and close the locking tab to secure it. Connect the camera cable to the camera and the Pi using the same tab-locking system. Secure the HAT with four screws, and you're ready to proceed.
Next, install Raspberry Pi OS onto the microSD card, insert it into your Pi, and complete the initial setup. Once on the desktop, open a terminal window and update your Pi with the usual update and upgrade commands. Install the necessary drivers and software for the HAT, which may take about five minutes, then restart your Pi. Now, clone the basic Python pipelines from Hailo's GitHub with a single command. Once downloaded, navigate to your Pi's home folder to find the newly created folder, which will house all of our Python scripts, and change the terminal's working directory into it. Run the installation script provided by Hailo to set up the remaining components, which may take another five minutes. If you ever change the HAT model, rerun this command so the Pi recognises the new HAT. After installation, reboot your Pi again. In the Hailo examples folder, you'll find two important folders: the resources folder, containing YOLO models converted to the HEF file format used by the AI HATs, and the basic pipelines folder with the pipeline code and demo scripts. Pipelines simplify interaction with the HAT by letting high-level code hand complex tasks off to code running behind the scenes. Although creating custom pipelines is possible, we'll use Hailo's example object detection pipeline.
To test the setup, we'll run the demo object detection code. These scripts must be run from the command line, so a little terminal setup is needed. Use the change directory command to navigate into the examples folder, then run the source command to activate the virtual environment created by the installation. This setup must be repeated whenever the terminal is closed or the Pi is restarted. Then run the Python script named detection.py from the basic pipelines folder to start object detection. The demo analyses some test footage, demonstrating the HAT's object detection capabilities. To stop the script, click into the terminal window and press Ctrl+C. Next, we'll use the camera as the model's input. Run the same line with `--help` added to view the available options, including using the camera as input with `--input rpi`. We won't cover every option, but some are worth exploring. Run the code with the camera as input and you'll see real-time object detection, identifying objects like a person, a cup, and a keyboard. With the HAT running object detection from the camera, we can explore practical applications. Open the basic pipelines folder and examine detection.py in Thonny. This script demonstrates how to incorporate object detection into your own projects with Python. It consists of three main sections: the library imports, a class definition, and a callback function that extends to the end of the file.
It helps to draw some parallels between this script and a regular Python project. The function named `app_callback` is the equivalent of a `while True` loop: it runs every time a frame from the camera is fed into YOLO. The class at the top of the code acts like the initial setup phase where variables are declared, pins are initialised, and the necessary imports are made. That section runs once, and if a variable isn't declared there, it won't be accessible in `app_callback`, our `while True` loop. This structure forms the foundation for operating the camera and sets the stage for the core functionality of the code. The main work happens when each detected item is processed: the script retrieves its label, bounding box, and confidence rating, and you add logic to act on those detections, such as checking whether a person was identified and then doing something about it. That alone is enough for many people to start coding their own projects, but we've expanded on it with three practical demos that you can customise to your own triggers.
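As a rough picture of that structure, here is a heavily trimmed sketch of what detection.py looks like. It is not the full script, and the exact import paths and helper names depend on the version of the Hailo examples you cloned, but the shape is the same: a setup class that runs once, and a callback that runs for every frame.

```python
import gi
gi.require_version('Gst', '1.0')
from gi.repository import Gst

import hailo
# In the cloned examples these helpers live alongside detection.py
from hailo_rpi_common import app_callback_class
from detection_pipeline import GStreamerDetectionApp


class user_app_callback_class(app_callback_class):
    """Runs once at start-up -- declare any variables you need here."""
    def __init__(self):
        super().__init__()
        self.frame_count = 0  # example variable, reachable later via user_data


def app_callback(pad, info, user_data):
    """Runs for every frame the camera feeds into YOLO -- our 'while True' loop."""
    buffer = info.get_buffer()
    if buffer is None:
        return Gst.PadProbeReturn.OK

    # Pull the list of detections YOLO produced for this frame
    roi = hailo.get_roi_from_buffer(buffer)
    detections = roi.get_objects_typed(hailo.HAILO_DETECTION)

    for detection in detections:
        label = detection.get_label()            # e.g. "person", "cup"
        confidence = detection.get_confidence()  # 0.0 to 1.0
        bbox = detection.get_bbox()              # where it is on screen
        # Your project logic goes here, for example:
        if label == "person" and confidence > 0.4:
            print(f"Person detected ({confidence:.2f})")

    return Gst.PadProbeReturn.OK


if __name__ == "__main__":
    user_data = user_app_callback_class()
    app = GStreamerDetectionApp(app_callback, user_data)
    app.run()
```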
One practical application solves a common issue in the office: not noticing someone approaching from behind while you're wearing headphones. The solution is to use the camera to detect a person behind you and rotate a servo to alert you. Start by creating a new file, `watcher.py`, and opening it in Thonny. The code, available on our written page, builds on the previous example with some refinements for practicality and ease of use. At the top, we import `AngularServo` from the GPIO Zero library, which makes controlling the Pi's GPIO pins straightforward. Within the class, we define variables for various purposes, including debounce, and set up the servo. Each variable is prefixed with `self.` because of how the framework is structured. The `app_callback` function, our `while True` loop, starts off unchanged, as that part handles the camera setup and operation.
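A minimal sketch of that setup section might look like the following. The GPIO pin and servo angles are assumptions for illustration, and the variable names follow the video's description; the exact code is on the written guide.

```python
from gpiozero import AngularServo
from hailo_rpi_common import app_callback_class  # helper class from the Hailo examples


class user_app_callback_class(app_callback_class):
    def __init__(self):
        super().__init__()
        # Debounce counters and a one-shot flag; they need the self. prefix
        # here so they can be reached inside app_callback via user_data
        self.detection_counter = 0
        self.no_detection_counter = 0
        self.is_it_active = False
        # Servo used to wave at you when someone sneaks up behind
        # (assumes the signal wire is on GPIO 18 -- change to suit your wiring)
        self.servo = AngularServo(18, min_angle=-90, max_angle=90)
        self.servo.angle = -90  # resting position
```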
When analyzing detections, the code checks the name and confidence of each detection. If the confidence exceeds a threshold (e.g., 0.4), and the detected item is a person, the `object_detected` flag is set to true. This mechanism allows for customization, such as detecting other objects like a cup or a dog. Upon detecting a person, the detection counter increments. This counter ensures consistent detection over multiple frames before triggering an action, preventing false positives. For instance, if the code is used to open a doggy door, it requires consistent detection of a dog over four frames to avoid mistakenly opening for a cat. Conversely, if no object is detected, the no detection counter increments, and if it exceeds five frames, the servo resets, indicating the absence of the object.
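Put together, the frame-by-frame logic described here (along with the one-shot flag covered next) might look something like this inside `app_callback`. The thresholds and variable names follow the video's description rather than the exact file.

```python
# Inside app_callback, after the detections list has been pulled from the frame
object_detected = False
for detection in detections:
    if detection.get_label() == "person" and detection.get_confidence() > 0.4:
        object_detected = True

if object_detected:
    user_data.detection_counter += 1
    user_data.no_detection_counter = 0
    # Debounce: act only after more than 4 consecutive frames, and only once
    if user_data.detection_counter > 4 and not user_data.is_it_active:
        user_data.servo.angle = 90   # swing the servo to get your attention
        user_data.is_it_active = True
else:
    user_data.no_detection_counter += 1
    if user_data.no_detection_counter > 5:
        user_data.servo.angle = -90  # person has gone, reset
        user_data.is_it_active = False
        user_data.detection_counter = 0
```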
An additional feature is the `is_it_active` variable, which ensures an action is triggered only once per detection cycle. This prevents repetitive actions, such as sending multiple emails when a cup is detected; without it, the system would send an email for every single frame containing the cup, resulting in excessive notifications. In summary: if a person is detected for more than four frames, the servo moves once, and if no person is detected for more than five frames, it resets. Note that when variables are used inside the `while True` loop, they are accessed with the `user_data.` prefix, in contrast to the `self.` prefix used when declaring them. To run a custom Python script like `watcher.py`, the command is the same as before with the script name substituted in. Running the script demonstrates the detection and notification process, with the system indicating when an object enters or leaves the frame. If you'd rather not have the FPS printed to the shell, you can change that in the basic pipelines folder by locating the `hailo_rpi_common.py` file.
Open it up in Thonny and scroll all the way down, looking for the function called `on_fps_measurement`. The line in there that prints out the FPS is the one we want, so comment it out by putting a hash in front of it and save the file. If we run our script again, the FPS is no longer printed to the shell, which makes the output a bit easier to read. No person detected, object gone, object back, object gone, object back, object gone, object back. Please don't put that in, Luke. The next demo code is very similar: instead of 'if an object is detected, do something', it's 'if a certain number of objects are detected, then do something'. This is another solution to a problem I have in my office. I constantly have cups of water; I go and grab one, come back, and they start stacking up on my desk. We're going to build this to say, if three or more cups are detected at my desk, turn on the siren and tell me to go and clean up the cups.
Same deal as before: I've copied the code and saved it into a Python file in our basic pipelines folder. A quick run-through, it's very similar to the last one. Instead of setting up a servo, we set up LEDs, which are the easiest things to control on a GPIO pin, so we're controlling LEDs in this example. At the top, we create a variable for the object we're detecting (just a different way of doing it), and we're obviously going to be looking for a cup. We also set up our green and red LEDs; when you set anything up here, it needs `self.` in front of it. We start with the red one off and the green one on, and any time we use a variable in here it's `self.` again, just trying to drill that into you. The code is pretty much exactly the same all the way through here. We start each frame by saying we've detected zero objects so far, then go through every single thing that's been detected. If the confidence is more than 0.4 and the label is the target object, which is our cup as you can see up here, we increase the count by one.
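The setup portion might look something like this sketch. The GPIO pin numbers here are assumptions for illustration; the real pin assignments are in the written guide.

```python
from gpiozero import LED
from hailo_rpi_common import app_callback_class  # helper class from the Hailo examples


class user_app_callback_class(app_callback_class):
    def __init__(self):
        super().__init__()
        self.target_object = "cup"     # the label we're counting
        self.detection_counter = 0
        self.no_detection_counter = 0
        # Status LEDs (pins are assumed here -- match them to your wiring)
        self.red_led = LED(21)
        self.green_led = LED(20)
        # Start in the "all clear" state
        self.red_led.off()
        self.green_led.on()
```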
So if there are five cups in the frame, this counter ends up at five; four cups, four, and so on. Then we say that if we've counted three or more cups in this frame, we increase the detection counter by one. This is the exact same debounce code as before: we need to see that many cups for more than four frames, because we have to be confident the cups are really there, and it's very easy for a mouse or a bowl or something else to be detected as a cup for one single frame. If the cup count has held, we turn on the red LED and turn off the green LED; if it hasn't, we turn off the red LED and turn on the green LED. So it's pretty much the same as before, but we only do something if a certain number of a given object is counted. You can set this to three cups or 10 horses or 50 cars or whatever your project calls for.
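Inside `app_callback`, that counting and debounce logic might look roughly like this; the thresholds and names follow the description above, not the exact file.

```python
# Inside app_callback, after pulling the detections for this frame
cups_in_frame = 0
for detection in detections:
    if (detection.get_confidence() > 0.4
            and detection.get_label() == user_data.target_object):
        cups_in_frame += 1

if cups_in_frame >= 3:
    user_data.detection_counter += 1
    user_data.no_detection_counter = 0
    # Debounce: three or more cups must be seen for more than 4 frames in a row
    if user_data.detection_counter > 4:
        user_data.red_led.on()
        user_data.green_led.off()
else:
    user_data.no_detection_counter += 1
    if user_data.no_detection_counter > 5:
        user_data.red_led.off()
        user_data.green_led.on()
        user_data.detection_counter = 0
```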
Alright, the last bit of demo code. This one detects where an object is in the frame. It solves another practical problem: behind me is a whole bunch of resin printers, and resin is quite toxic, so you've got to be very careful when you handle it. We're going to use this as a bit of an advanced security system, set up in the corner to watch the printers, and if anyone gets close while they're printing or curing resin, it sets off an alarm. Before we begin, note that we'll be dealing with relative X and Y coordinates. On the X axis, running across the screen, the left edge is 0 and the right edge is 1.0; halfway across is 0.5, and three quarters of the way is 0.75. On the Y axis, the top of the screen is 0 and the bottom is 1.0, so halfway down is 0.5, and so on.
Very similar again: we have the object we're looking for, and here we define the zone we want to treat as the warning area. This says X min is 0.4 and X max is 0.6, meaning the danger zone sits between 40% and 60% of the way across the screen. We also have our Y min and Y max, which put it between 30% and 70% of the way down the screen. Together these create a box, and if the centre of a detected object lands inside it, we do something. Pretty much everything else is the same as before, but instead of tracking whether an object is detected or not, we're tracking whether it's in the zone or out of the zone. Down here we get the name and confidence of each detection, and if the confidence is more than 0.4 and it's a person, we get its location. This takes the data coming out of the YOLO model, which contains the bounding boxes it draws around all of the objects, calculates the centre point of that box with these lines here, and then prints out where on the screen the object has been detected.
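As a sketch, the centre-point and zone check might look like this. The bounding box helpers follow the Hailo Python bindings used by the examples, and the zone values match the ones described above.

```python
# Zone boundaries as fractions of the screen (0.0 - 1.0)
ZONE_X_MIN, ZONE_X_MAX = 0.4, 0.6
ZONE_Y_MIN, ZONE_Y_MAX = 0.3, 0.7


def centre_in_zone(detection):
    bbox = detection.get_bbox()
    # Bounding box values are relative, so the centre is the top-left corner
    # position plus half the width/height
    centre_x = bbox.xmin() + bbox.width() / 2
    centre_y = bbox.ymin() + bbox.height() / 2
    print(f"Person centre at x={centre_x:.2f}, y={centre_y:.2f}")
    return (ZONE_X_MIN <= centre_x <= ZONE_X_MAX
            and ZONE_Y_MIN <= centre_y <= ZONE_Y_MAX)
```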
We use the X min, X max, Y min, and Y max to see whether the centre of the detected object is inside the box we defined at the top. Again we have our debounce, which means the object has to be detected in that box for four frames before anything happens, and here we're just driving lights; you can modify this in exactly the same way as the previous scripts. Well, that about wraps it up. You can find all of those code snippets in our written guide linked below, where you'll also find some other goodies, like how to use a different YOLO model and where to find and download a bigger, even more powerful one. Whatever you want to build, we hope this video has shown you how to set up the AI HAT and use it in your projects, and that one of the demo scripts we've provided helps along the way. If you make anything cool with this, or you just need a hand with something, feel free to head over to our community forums. We're all makers over there, and we're happy to help. Until next time, happy making.