This is the new AI Hat 2, and this is the older AI Hat 1. From an arm's length away, and maybe if you squint a little bit, they look pretty similar. But there are quite a few differences going on under the hood. So let's take a look and see what these boards can do, what makes them different, and more importantly, which one is right for your next project.
Welcome back to Core Electronics. Today we are diving into AI Hats. Now, to learn about the AI Hat 2, we have to first look at the AI Hat 1, as well as explain a bit of technical gobbledygook. If you are a bit of an AI noob, you're going to learn some stuff today. The original AI Hat came out in late 2024, and it comes in two models, the 13 TOPS version and the 26 TOPS version. TOPS is a measurement of processing power: it measures how many tera-operations per second, that is, trillions of operations per second, a chip can compute. Basically, more TOPS is more power, and the 26 TOPS AI Hat will be about twice as fast as the 13 TOPS Hat.
Both of these AI Hat 1 models run on Hailo-8 series chips (the 13 TOPS version uses the Hailo-8L, the 26 TOPS version the full Hailo-8), AI accelerators designed to do AI-related number crunching for only a few watts of power, about 2.5 watts in this case. And this Hailo silicon is optimized for one thing: convolutional neural networks, or CNNs. That's just a fancy name for a type of AI architecture, a specific way that the numbers are being crunched. Some examples of CNNs are computer vision tasks like YOLO object detection, or MobileNet, or ResNet. Most object recognition, pose estimation, license plate reading, handwriting recognition, image segmentation, basically anything that looks for patterns in pixels, is probably going to be a CNN. And that is what the AI Hat 1 does extremely well.
Most use cases for the AI Hat 1 are related to, you know, taking an image or a video feed and looking through it to identify people or cars or cups or keyboards, looking for certain objects in there. Or you might track the position of people moving around in a room. So essentially, we have two boards that can run computer vision tasks: we've got a slow one and a fast one. Now enter the AI Hat 2. The new AI Hat has an upgraded Hailo-10H chip. And if you look on the box, it says 40 TOPS of compute power. And that's much more than 26. So it's going to be a lot faster at computer vision tasks, right?
While it does have 40 TOPS, that's not an apples-to-apples comparison. It's not very straightforward, and we'll get into it later. But that 40 TOPS of compute power is the speed it can achieve when processing something else entirely. In the world of CNNs like YOLO, it actually has half of that, a speed of 20 TOPS. So: AI Hat 2, 20 TOPS of power; old AI Hat, 13 or 26 TOPS. What does that actually mean for your projects? Let's look at an example with a medium-sized YOLOv8 model, a decently powerful object detection model that you might commonly encounter in the wild. Now, there's going to be variation in your results here depending on your setup, but here are some rough numbers.
If you run that YOLO model on the 13 TOPS AI Hat, you can expect about 20 to 25 FPS. On the 26 TOPS Hat, you can expect about 50 FPS. And on the AI Hat 2, you can expect about 45 FPS. And just for a fun little comparison as well, you can actually run that YOLOv8 model on the Pi itself without an AI Hat, but the best you could hope for is about 1 FPS. So you're in slideshow territory here. When you do that, you're also using 100% of your Pi 5's CPU, so you can't really do anything else. If you need something more speedy, you can also run the small version of YOLOv8. On the 13 TOPS Hat, you'll get about 80 FPS; on the 26 TOPS Hat, about 200 FPS; and on the AI Hat 2, you're probably looking at about the 170 FPS mark.
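If you want to see that CPU-only slideshow for yourself, here's a minimal sketch using the Ultralytics Python package to run YOLOv8 on the Pi 5's CPU alone. The model file and test image are just examples; running on the Hats instead goes through Hailo's own tooling and compiled models, which is a separate workflow.

```python
# Minimal sketch: YOLOv8 on the Pi 5's CPU only, no AI Hat involved.
# Assumes the ultralytics package is installed: pip install ultralytics
import time
from ultralytics import YOLO

model = YOLO("yolov8m.pt")  # medium model, like the one in the FPS numbers above

# Time a handful of inferences on a test image to get a rough FPS figure
n_frames = 10
start = time.time()
for _ in range(n_frames):
    model("test.jpg", verbose=False)  # swap in any image or camera frame you like
fps = n_frames / (time.time() - start)
print(f"Roughly {fps:.1f} FPS on the CPU")
```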
Now, looking at these numbers, and if you've seen the cost, you might be wondering: hold up, the AI Hat 2 is more expensive than the 26 TOPS AI Hat 1, and it gets fewer FPS. Why on earth would anyone buy that? Well, it's got a few tricks up its sleeve. We've got a few AI Hat 1 guides, and the most common question we get after somebody watches one is: how do I run an LLM on it? An LLM is a large language model like ChatGPT or Gemini. And we always have to say, the AI Hat cannot run an LLM. Now we have to say, only the AI Hat 1 can't run an LLM, because the AI Hat 2 can. Why? It's built different, and we mean that literally.
The original AI Hat stores the computer vision model in the Raspberry Pi's RAM and sends that data from the RAM to the Hat through the PCIe connection. And that works fine for computer vision models. But LLMs are much more memory intensive, and that PCIe connection can't jam enough data through it to run an LLM. The AI Hat 2, on the other hand, has 8 gigabytes of RAM soldered directly onto the Hat itself, so it doesn't have this PCIe bottleneck issue. And because of that, it can just run LLMs. By the way, this is where that 40 TOPS number comes from on the box: it has 40 TOPS of compute power when running INT4 generative models.
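To get a feel for why that on-board memory matters, here's some rough back-of-the-envelope arithmetic (the 8-billion-parameter figure below is just an illustrative model size, not a spec of the Hat or of any particular model it ships with):

```python
# Rough illustration: why 8 GB of on-hat RAM is enough for a small INT4 LLM.
params = 8e9           # example: an 8-billion-parameter model (illustrative only)
bits_per_weight = 4    # INT4 quantisation, the format behind the 40 TOPS figure
weight_gb = params * bits_per_weight / 8 / 1e9
print(f"Weights alone: ~{weight_gb:.1f} GB")  # ~4 GB, leaving room for activations and context
```

The same model stored at full 16-bit precision would be around 16 GB of weights alone, which is why these little accelerators lean so heavily on quantisation.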
Now, these LLMs are not nearly as powerful as the latest and greatest ChatGPT or Gemini models that you might pay a subscription for. Even on an expensive, you know, $5,000 gaming rig, local models that you run on your own hardware offline are still a little bit behind. And these models are even smaller and more lightweight. So, for example, you can't just ask it to name the capital of every country on earth in the year 1600. It's probably going to hallucinate and make something up. Nor would you be able to ask it the names of all the World Cup winners from the 1960s. However, something you could do with it is give it a paragraph of text and ask it to extract the World Cup winners and format them as a CSV list.
It's like ChatGPT, just a lot less intelligent, but it's still functional enough for some projects. The uses for low-power LLMs like this can get a little bit niche, but there are definitely applications in things like home automation where it's viable. By putting this in your loop somewhere, it has the context to turn something like "it's too dark in here" into a decision to turn the lights up. It can just add a little bit of intelligence and thinking and planning to a project. But I think the coolest thing about the AI Hat 2 is its ability to run VLMs, or vision language models. These are just LLMs that can take in a picture and kind of look at it the way that humans do. It's a little bit spooky, but it is way more applicable to maker projects.
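Before we get to VLMs, here's a rough sketch of that LLM-in-the-loop idea. It assumes you've put the on-hat model behind an OpenAI-style local server, which is a common way to wrap a local LLM but not necessarily how the Hat's own tooling exposes it; the URL and model name below are placeholders.

```python
# Sketch: using a small local LLM to make a simple home-automation decision.
# Assumes an OpenAI-compatible server is running locally in front of the model;
# the endpoint and model name are placeholders, not anything the Hat ships with.
import requests

lux = 12  # pretend reading from a light sensor somewhere in the room

prompt = (
    f"The living room light level is {lux} lux and someone is in the room. "
    "Should the lights be turned up? Answer only YES or NO."
)

reply = requests.post(
    "http://localhost:8000/v1/chat/completions",  # placeholder endpoint
    json={
        "model": "local-llm",  # placeholder model name
        "messages": [{"role": "user", "content": prompt}],
    },
).json()["choices"][0]["message"]["content"].strip().upper()

if reply.startswith("YES"):
    print("Raising the lights")  # hand off to your actual light-control code here
```

Constraining the answer to YES or NO, like in the prompt above, keeps a small model from rambling and makes the reply easy to parse.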
Now, the AI Hat 1 can run object detection models, but that is wildly different to VLMs. When using something like YOLO, you have to go through a lengthy training process to detect a given thing. Let's say we wanted to detect how many bins we've left out on the curb, maybe with a home security camera that we're feeding into that home automation network we're building. With a YOLO model, we'd need to train it to recognize what a bin looks like, so it can look at the pixels and go, "that sort of looks like a bin, and I'm confident enough," and then count how many bins we've left out on the road. With a vision language model like Qwen2, we can just ask it how many bins are on the curb, and there's no training needed, because it kind of looks at the image and understands it.
We could then ask it: is there a bin with a red lid? Is there a garbage truck? Is there a bin with a green lid? Is the red lid open? Is there rubbish in the red-lidded bin? It's really robust and flexible, because it can analyze images far better than something like YOLO; it looks at the image and understands it like a human would. It is really spooky, and it's just such a fun thing to play around with. And this is super helpful in things like home automation, because you can check if there's rain and then use a camera to check if there's washing on the line, or you can see if the chickens are in or out of the pen. You can basically take a photo of something and then get a very simple, you know, "human" to look at it and tell you something about it. It is fantastic.
But I think the thing it does best in, and probably because I did a degree in it, is robotics projects, because you can make your robot start to see the world around it. You could have a Python script that takes a photo, sends the image to your VLM running on the Hat, and asks, "Is there a car parked in the driveway? Answer only yes or no." And then you can use that output text to control some decision making in your Python code, or something like that. Again, there are limitations to it. It's not infinitely powerful, all-seeing magic. Sometimes it can just go a bit off the rails. So there is a bit of experimentation involved in figuring out what prompts work and what it can realistically do in day-to-day use.
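Here's what that driveway check might look like as a sketch, using the same assumed OpenAI-style local server as before (again, the endpoint, model name, and image path are placeholders, not part of the Hat's own software):

```python
# Sketch: photo -> VLM -> yes/no decision, as described above.
# Assumes an OpenAI-compatible server is fronting the VLM;
# the endpoint, model name, and image path are placeholders.
import base64
import requests

with open("driveway.jpg", "rb") as f:  # e.g. a frame saved from the Pi camera
    image_b64 = base64.b64encode(f.read()).decode()

response = requests.post(
    "http://localhost:8000/v1/chat/completions",  # placeholder endpoint
    json={
        "model": "local-vlm",  # placeholder model name
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text",
                 "text": "Is there a car parked in the driveway? Answer only yes or no."},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
            ],
        }],
    },
).json()

answer = response["choices"][0]["message"]["content"].strip().lower()
if answer.startswith("yes"):
    print("Car spotted in the driveway")  # plug in whatever decision making you need
```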
Also worth mentioning here: the Pi 5 without the Hat can actually run LLMs pretty decently, sometimes getting performance comparable to the AI Hat 2. However, if you were to do this, you are again using 100% of your Pi 5's CPU, and you really wouldn't be able to do much else. You would probably struggle to open a text file or execute some other code running on the Pi. The AI Hat 2 offloads this processing so that your Pi 5's CPU is free to do other things while the Hat crunches all the numbers for your LLM or VLM. It's also a bit more power efficient, as it only draws about three watts, while the Pi 5's CPU can easily draw 10 to 12 watts when you really push it.
One more thing before we wrap this all up. The models that the Hat can run right now are the worst they will ever be. Every few months, some new model comes out that's able to think harder and run quicker while using the same amount of system resources. And chances are, you'll be able to get that running on the Hat when it comes out. You will need to go through Hailo's conversion process to convert the model into a format that the Hat can use, which is not a very fun process. It is a little bit involved, but it is possible. Regardless, the models that you run a few years after this video comes out are probably going to be more powerful than what we're using here.
Alrighty, let's wrap this up. Which AI Hat should you get for your project? Well, if you just want to run some computer vision tasks like YOLO and you're on a budget, the 13 TOPS AI Hat 1 is probably going to be good enough. You can run a pretty decent YOLO model at 25-ish FPS and a more lightweight model at 80 FPS if you need that extra frame rate. If you need something a bit more beefy for computer vision, get the 26 TOPS version and you'll get more than double the FPS on these models. However, I would only get the 26 TOPS version if you know for certain that the only thing you'll be doing with it is computer vision.
If you have zero interest in running LLMs or VLMs on your Pi, get the 26 TOPS Hat, save a little bit of money, and get that extra tiny bit of processing performance. If, however, you think you might want to run an LLM or VLM sometime in the future, the AI Hat 2's little bit of extra cost and slight hit to computer vision performance might make it the better deal for your needs. It's a little bit extra for a board that's more versatile and about the same speed. Regardless, we hope this clarified all the differences between the old and the new board. If you have any questions, drop them below or on our community forums. We're all makers here and we're happy to help. Until next time, happy making.
