Stop reading and solve a real world ML problem

One of the best ways to learn is to solve a real-world problem. A machine learning colleague suggested I use machine learning to solve a real problem I was facing. My mind was drawn to a problem on the road where I live: many drivers go too fast down the road in the evening. I wanted to make the police aware of how fast these vehicles were going.

I created a GitHub repo with the work. Please take a look: https://github.com/amplifiedengineering/opencv-radar. The rest of this post describes the process I followed to arrive at that solution.

My first approach was to find a LIDAR sensing device to measure the speed of the vehicles, and to capture the license plate information with a camera. As I began researching LIDAR devices, I realized some of the less expensive models were custom built for a specific purpose. For example, this device's spec sheet shows it takes a single reading, which is meant for measuring depth so that drones can avoid ground collisions. That wouldn't help with my tracking, where I potentially need to measure multiple targets (two cars going the same way, or two cars going different ways).

As I was describing this to a friend who is into robotics, he suggested a different approach. Why not measure the speed using video? I liked this approach, as I could record video from my iPhone, without buying expensive single use hardware, and then run the calculation using software. As I started thinking about how to solve the speed calculation, it seemed straightforward: have a set of lines on the video that mark a well known distance. Then calculate the speed based upon the number of frames it takes for the vehicle to transit between the two lines. 

Speed (miles per hour) = miles / hours, where miles = distance in feet / 5,280 (feet per mile) and hours = total # of frames / frames per second / 3,600 (seconds per hour)
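As a minimal sketch of that calculation in Python: the function name and parameters below are my own illustration, not code taken from the repo. It assumes you already know the distance between the two lines in feet, the number of frames the vehicle took to travel between them, and the frame rate of the video.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def speed_mph(distance_feet: float, frames_elapsed: int, fps: float) -> float:
    """Average speed between the two lines, in miles per hour.

    distance_feet:  measured distance between the start and end lines
    frames_elapsed: frames the vehicle took to go from one line to the other
    fps:            frame rate of the recorded video
    """
    if frames_elapsed <= 0 or fps <= 0:
        raise ValueError("frames_elapsed and fps must be positive")
    miles = distance_feet / FEET_PER_MILE
    hours = frames_elapsed / fps / SECONDS_PER_HOUR
    return miles / hours


# Example: lines 88 feet apart, crossed in 30 frames of 30 fps video (one second).
print(round(speed_mph(88, 30, 30), 1))  # 88 feet per second is 60.0 mph
```

Note that the resolution of the estimate depends on the frame rate: at 30 fps, being off by one frame over a one-second transit changes the result by roughly 3%.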

To be specific, I traded real-time video processing for post-processing. This approach wouldn't work for all applications, but I was more interested in grabbing a video file from my phone, processing it afterwards, and finding out how many people were speeding. This way, I didn't have to worry about the frame rate causing a queue to develop in the stream (which can be solved by sampling frames rather than processing each frame).

The first problem was detecting where the vehicle was in the image. I knew about the OpenCV library, so I started exploring it. I used a Haar classifier to detect the cars and experimented with different hyperparameters of the detectMultiScale API to trade off detection quality against the number of frames per second I could process. The key trade-offs were the size of the video frame to be processed (the smaller the frame, the faster the processing, but the weaker the detection), the scaleFactor parameter, and the minNeighbors parameter. I played around with how much I resized the greyscale image of the video frame before passing it into detectMultiScale. It was working "ok", but I noticed a fair amount of choppiness (jumpiness) in the object detection borders, so I looked for a way to improve it.
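Here is a rough sketch of that Haar detection step, assuming OpenCV's Python bindings (cv2) are installed. The helper names, the default values, and the "cars.xml" cascade path are hypothetical placeholders rather than the settings I ended up with; the OpenCV calls used are cv2.resize, cv2.cvtColor, and CascadeClassifier.detectMultiScale.

```python
def scaled_size(width, height, target_width):
    """Return (w, h) scaled to target_width, preserving the aspect ratio."""
    scale = target_width / width
    return target_width, max(1, round(height * scale))


def detect_cars(frame, cascade, target_width=480, scale_factor=1.1, min_neighbors=3):
    """Run a Haar cascade on a downsized greyscale copy of a BGR frame.

    Returns a list of (x, y, w, h) boxes in the original frame's coordinates.
    """
    import cv2  # imported here so scaled_size() can be used without OpenCV

    height, width = frame.shape[:2]
    new_w, new_h = scaled_size(width, height, target_width)

    # Smaller frames are faster to scan but make distant cars harder to detect.
    small = cv2.resize(frame, (new_w, new_h))
    gray = cv2.cvtColor(small, cv2.COLOR_BGR2GRAY)

    # scaleFactor: how much the image shrinks at each pyramid step.
    # minNeighbors: how many overlapping hits a candidate needs to be kept.
    boxes = cascade.detectMultiScale(
        gray, scaleFactor=scale_factor, minNeighbors=min_neighbors
    )

    # Map the boxes back to the original frame size.
    rx, ry = width / new_w, height / new_h
    return [
        (int(x * rx), int(y * ry), int(w * rx), int(h * ry))
        for (x, y, w, h) in boxes
    ]


# Usage ("cars.xml" is a hypothetical path to a car cascade file):
#   import cv2
#   cascade = cv2.CascadeClassifier("cars.xml")
#   boxes = detect_cars(frame, cascade)
```

Lowering scale_factor toward 1.0 or shrinking the frame less tends to find more cars at the cost of frames per second, which is the trade-off described above.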

I found this blog post informative. It combines Haar classifiers with background averaging: it detects the pixels that differ from the averaged background and then draws bounding boxes around the contours of those differences. This improved the performance slightly. However, I was still seeing very unstable bounding boxes around the cars, which left me wanting a more stable detection mechanism.

Finally, as I was talking with another principal-level developer about an unrelated topic, he told me about some work he had done using OpenCV and YOLO, so I decided to look up YOLO. I was impressed with the improved stability in object detection at similar performance. At first, I was a bit disappointed that the YOLO architecture requires a color image, since I had been using greyscale (I assumed that with fewer channels, inference would be faster). Then I discovered that it would also do all of the resizing for you at inference time, and that the resizing I had been doing was superfluous. Finally, I switched from using the v3 predict function to find the objects to the v8 track function.
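Once each car has a stable track ID, counting the frames between the two lines is simple bookkeeping. The sketch below is an illustration under assumptions, not the code from the repo: it assumes a tracker (such as the YOLOv8 track function) has already produced, for each frame, a mapping of track ID to the x coordinate of the box center, and that the start and end lines are vertical lines at known x positions. The function name and input format are hypothetical.

```python
def crossing_frames(positions_by_frame, start_x, end_x):
    """Count frames each tracked object takes to travel between two lines.

    positions_by_frame: one dict per frame, mapping track_id -> x center.
    start_x, end_x:     x positions of the two vertical lines (in pixels).
    Returns {track_id: frame_count} for objects that crossed both lines,
    in either direction.
    """
    if start_x == end_x:
        raise ValueError("the two lines must be at different positions")
    lo, hi = sorted((start_x, end_x))

    last_x = {}   # track_id -> x center in the previous frame it appeared
    crossed = {}  # track_id -> {line_x: frame index of first crossing}

    for frame_idx, positions in enumerate(positions_by_frame):
        for track_id, x in positions.items():
            prev = last_x.get(track_id)
            if prev is not None:
                for line in (lo, hi):
                    # The line lies between the previous and current position.
                    if min(prev, x) < line <= max(prev, x):
                        crossed.setdefault(track_id, {}).setdefault(line, frame_idx)
            last_x[track_id] = x

    return {
        track_id: abs(hits[hi] - hits[lo])
        for track_id, hits in crossed.items()
        if lo in hits and hi in hits
    }


# Example: car 1 drives left to right, car 2 right to left, lines at x=100 and x=250.
frames = [
    {1: 0, 2: 300},
    {1: 60, 2: 200},
    {1: 120, 2: 90},
    {1: 180},
    {1: 240},
    {1: 300},
]
print(crossing_frames(frames, 100, 250))
```

The resulting frame counts, together with the known distance between the lines and the video's frame rate, are all that the speed calculation needs.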

Figure 1: setup screen to verify the start / end lines for video

Figure 2: tracking cars using the YoloV8 track API function, with tracking trails

Hope you enjoyed the ride. A few potential improvements I'm thinking of now: detecting the speed based upon well-known sizes and distances (specifically, I could use the license plate size), and the ability to collate multiple views to tag a higher resolution view of the license plate.

“founder mode” – a simple rebuttal

Paul Graham, I'm not buying what you're selling. Why are you asking us to believe that management doesn't work and that only founders can lead companies to success?

I read through the founder mode blog post, and it left me profoundly sad. Sad? Yes, because Brian Chesky, at least as Paul Graham portrays the talk (I can't seem to find a video of the talk anywhere), is bucking the conventional wisdom of managing and instead advocating a more hands-on approach as a leader. What they gloss over in dismissing that 'conventional wisdom' is that the hands-on approach doesn't scale and leads to worse outcomes. Propping up "founder mode" as the next great leadership style is a fool's errand.

I'm sad because Paul seems to selectively forget the success cases of companies not run by their founders, e.g. Microsoft (Satya), Apple (Tim), Amazon (Andy). These are not small companies: their combined market cap is ~$8 T. That's $8,000,000,000,000. What are these "manager mode" companies doing from which start-ups could draw inspiration? Since these companies have obviously been successful, what is the logical basis for the founder mode argument?

The key failure in logic in the argument of "founder mode" vs "manager mode" is conflating a company being "run into the ground" with the claim that conventional management simply doesn't work. When a company is failing, a great manager will dig in and investigate why. Is it easy to investigate what's broken? No. However, building a company whose success is inextricably bound to a particular leader (founder mode) is super dangerous (especially to shareholders).

The danger of relying solely on a single leader highlights the need for robust systems that transcend any one individual. A successful way to ensure smooth operation in business is creating auditing mechanisms to roll-up status and risks in various business units. These recurring auditing events help leaders to stay informed and ask questions to tweak execution (a project is taking too long, or is going off the rails) and act as a forcing function to hold the business leaders accountable to the reporting and risks. Unfortunately, it takes courage to hold people accountable. 

A case study: when I worked at Amazon, there was great company lore around how Jeff Bezos would show up to six-page review meetings. After everyone had read for 20 minutes, he would let the least senior people comment on the doc before asking his own questions. This gives oxygen to other ideas and sets an example that Jeff isn't the most important person in the room. Jeff Bezos, at Amazon scale, couldn't possibly dive deep into each business frequently. He was able to scale and trust leaders by establishing mechanisms that improved outcomes (e.g. six-page reviews and operations / budget planning). Examples of great successes that have been driven through the six-page process are Kindle, Amazon Prime, and every Amazon product which has launched since 2004. Because of the rigorous review process, these documents drive clarity of thought in deciding what to build for the customer, how, and why.

The success of Amazon's structured approach raises a critical question: why should we rely on the singular vision of a founder when a well-designed management system can achieve so much more? Why am I skeptical of "founder mode"? Because I have worked at companies where the founder was deeply entrenched in the "founder mode" philosophy. Every new product idea and innovation had to pass through this founder. Being the single, hands-on decision maker certainly didn't improve the outcomes: the company still struggles to this day.

A more nuanced thought around “founder mode” is whether companies are creating the right structures (and mechanisms) to succeed even in the absence of their founders.