“Whether you think you can, or you think you can’t — you’re right.”
— Henry Ford
In some ways, self-driving cars are robots that require solving both hardware and software problems. A self-driving car needs to identify its surrounding environment with cameras, radar, or other instruments. Its software needs to understand what is around it, know the physical location of the car, and plan the next steps that it needs to take to reach its destination.
Many companies already build technology for autonomous cars, and others are just entering the field. Here I focus on the three most transformative players in the space: Tesla, Waymo, and Comma.ai. Each of these companies tackles the problem with a very different approach.
Tesla, founded by Martin Eberhard and Marc Tarpenning in 2003, is known as the Apple of cars because of its revolutionary car design and its outside-the-box thinking when creating vehicles. Tesla develops its cars from first principles, from the air conditioning system that uses perpendicular vents to the way it forms its chassis and suspension. With its innovation and work, the Tesla Model 3 is the safest car in the world, followed by the Tesla Model S and Model X. Tesla is not only innovative with its hardware; it also invests heavily in its Autopilot technology.
In 2014, Tesla quietly installed several pieces of hardware to increase the safety of its vehicles. A few months later, it released a technology package for an additional $4,250 to enable the use of the sensors: 12 ultrasonic sensors, a forward-facing camera, a front radar, a GPS, and digitally controlled brakes. In a rapid release streak, Tesla launched features over the following months and, a year later, rolled them out to 60,000 cars. This was the first version of Autopilot, known as Tesla Version 7.0.
It gave drivers features such as steering within a lane, changing lanes, and self-parking. Other companies, including Mercedes, BMW, and GM, already offered some of these capabilities. But self-steering arrived suddenly, overnight, as a software update, and it was a giant leap toward autonomy. Tesla customers were delighted with the update and posted videos on the internet of the software “driving” their Teslas, hands-free.
Tesla makes not only the software but also the hardware for its cars, enabling it to release new features and update its software over the air (OTA). Because it has shipped cars with the hardware components necessary for self-driving capability since 2014, Tesla has a widely distributed test fleet. Other companies, like Google and GM, have only a small fleet of cars with the hardware required for self-driving.
From the introduction of the Tesla hardware package until October 2018, a total of 32 months, Tesla accrued around 1.5 billion miles driven with the newest hardware. Not only that, but the Tesla servers store the data these cars accumulate so that the Autopilot team can make changes to its software based on what it learns. At the time of this writing, Tesla had around 5.5 million miles of data per day for its newest system, taking only around four hours to collect 1 million miles. For comparison, Waymo has the next most data with about 10 million miles driven in its lifetime, i.e., in 2 days, Tesla acquires more data from its cars than Waymo has in its lifetime.
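The comparison above is simple rate arithmetic, and a short sketch makes it easy to check. The two figures come from the paragraph above; the function and variable names are only illustrative.

```python
# Figures quoted in the text; names are illustrative assumptions.
TESLA_MILES_PER_DAY = 5_500_000
WAYMO_LIFETIME_MILES = 10_000_000


def hours_to_collect(miles: float, miles_per_day: float) -> float:
    """Hours needed to accumulate `miles` at a constant daily rate."""
    return miles / miles_per_day * 24


def days_to_match(target_miles: float, miles_per_day: float) -> float:
    """Days needed to accumulate `target_miles` at a constant daily rate."""
    return target_miles / miles_per_day


hours_per_million = hours_to_collect(1_000_000, TESLA_MILES_PER_DAY)
days_to_match_waymo = days_to_match(WAYMO_LIFETIME_MILES, TESLA_MILES_PER_DAY)

print(f"{hours_per_million:.1f} hours per million miles")
print(f"{days_to_match_waymo:.1f} days to match Waymo's lifetime total")
```

At the quoted rate, a million miles takes a little over four hours, and the fleet passes Waymo's lifetime total in under two days, which matches the claim in the text.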
This rate increases with more cars on the streets, and Tesla has been speeding up the pace at which it makes its cars. Tesla has accumulated more miles than its competitors, but when it reported its self-driving tests to the California Department of Motor Vehicles (DMV), the state agency that regulates the registration of vehicles, Tesla had a much higher count of disengagements than its competitors. A disengagement is an instance in which the test driver of a self-driving car must take over control of the vehicle.
Disengagements are a metric that the average person can use to compare autonomous systems. The count provides a rough measure of how often the car fails so badly that the test driver takes over. It is only a proxy for performance because it does not take into account variables that may affect the vehicle, like the weather, or how and where the problems occurred. An increase in disengagements could mean that a major problem exists or that the company is testing its software in more challenging situations, such as a city.
At the end of 2015, Tesla’s numbers showed that it was far behind its competitors. Normalized to miles per disengagement, Tesla’s software performed about 1,000 times worse than Waymo’s. But Tesla continues to hone its system, year after year. Tesla also has an advantage over other carmakers: it can improve the system through an over-the-air update, without having to sell new cars or have existing ones serviced.
Waymo’s self-driving fleet has the lowest number of disengagements per mile, but even this metric does not approach human performance. Waymo has about 1 disengagement per 1,000 miles. Theoretically, humans have around 100 times fewer, or roughly 1 per 100,000 miles.
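Normalizing by distance is what makes fleets of different sizes comparable. The sketch below shows the calculation using the Waymo figure from the text; the two example fleets are hypothetical numbers chosen only to illustrate the normalization and are not reported data.

```python
def miles_per_disengagement(miles_driven: float, disengagements: int) -> float:
    """Average distance driven between driver takeovers (higher is better)."""
    if disengagements == 0:
        return float("inf")  # no takeovers recorded over this distance
    return miles_driven / disengagements


# Figure from the text: about 1 disengagement per 1,000 miles.
waymo_rate = miles_per_disengagement(1_000, 1)

# The text's rough human benchmark: about 100 times fewer disengagements.
human_equivalent_rate = waymo_rate * 100

# Hypothetical fleets (made-up numbers) with very different mileage.
fleet_a = miles_per_disengagement(50_000, 10)  # 5,000 miles per takeover
fleet_b = miles_per_disengagement(2_000, 4)    # 500 miles per takeover

print(f"Human benchmark: {human_equivalent_rate:,.0f} miles per disengagement")
print(f"Fleet A is {fleet_a / fleet_b:.0f}x better than fleet B")
```

Fleet A has more raw disengagements than fleet B, yet it comes out ten times better once mileage is taken into account, which is why the raw count alone can mislead.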
But Tesla has another advantage: it has a large fleet of cars enabled for testing its newest self-driving software updates. This fleet enables Tesla to develop software in-house and run it in shadow mode for millions of miles before releasing it to the public. Shadow mode allows Tesla to test its algorithms on board silently, which gives it an abundant real-world testbed for its software.
Unlike other companies, Tesla bets that it can build a self-driving car that performs better than a human without a Light Detection and Ranging (LIDAR) device. LIDAR is a sensor similar to radar; its name is a portmanteau of light and radar. It maps physical space by bouncing laser beams off objects. Radar cannot see much detail, and cameras do not perform as well in low light or glare. LIDAR lets a car “see” what is around it in much more detail than other sensors. The problem with LIDAR is that it does not work well in several weather conditions, including fog, rain, and snow.
Another problem is that LIDAR is expensive, originally starting at around $75,000 although the cost is now considerably less, and the hardware is bulky, resembling KFC buckets. LIDAR helps autonomous cars build a 3D model of the world around them and locate themselves within it, a process called Simultaneous Localization and Mapping (SLAM). Still, Tesla continues to improve its software and lower its disengagement rate without such a device. Tesla’s reasoning is that, to perform as well as humans, cars need only the same type of hardware that humans have. Humans drive with their eyes alone. So, it makes sense that self-driving cars could perform as well as humans with cameras alone.
A Tesla vehicle running the Autopilot software ran into a tractor-trailer in May 2016 after its software could not detect the trailer against the bright sky, resulting in the death of its driver. According to some, LIDAR could have prevented that accident. Since then, Tesla has added radars to its cars for these situations. One of the providers of the base software, Mobileye, parted ways with Tesla because of the fatality. Mobileye thought Tesla was too bullish in introducing its software to the masses and that more testing was needed to ensure safety for all. Unfortunately, fatalities with self-driving software will always occur, just as with human drivers. Over time, the technology will improve, and the disengagement rate will decrease. I predict a time when cars will be better and safer drivers than humans. But deaths will inevitably occur.
Before the fatality, Tesla used Mobileye software to detect cars, people, and other objects in the street. Because of the split, Tesla had to develop Autopilot 2 from scratch, meaning it built new software to recognize objects and act on them. It took Tesla two years to reach the state it was in before the breakup. But once it caught up with the old system, it quickly moved past its initial features.
For example, the newest Tesla Autopilot software 9.0 has the largest vision neural network ever trained. They based the neural network on Google’s famous vision neural network architecture Inception. Tesla’s version, however, is 10 times larger than Inception. The number of parameters (weights) in Tesla’s neural network is five times bigger than Inception’s. I expect that Tesla will continue to push the envelope.
But Tesla is not the only self-driving company at the forefront of technology. In fact, Google’s Waymo was one of the first companies to start developing software for autonomous cars. Waymo is a continuation of a project that started in a laboratory at Stanford 10 years before the first release of Tesla Autopilot. The project won the DARPA Grand Challenge for self-driving cars, and because of its prominence, Google acquired it five years later, forming Waymo. Waymo’s cars perform much better than any other self-driving system. What is surprising, however, is that they have driven far fewer miles in the real world than Tesla and other self-driving car makers.
The DARPA Grand Challenge began in 2004 with a 150-mile course through the desert to spur development of self-driving cars. During the first year, no vehicle finished: the one that got farthest completed only about seven of the miles, and every vehicle crashed, failed, or caught fire. The technology required for these first-generation cars was sophisticated, expensive, bulky, and not visually attractive. But over time, the cars needed less hardware and improved each year. While the initial challenge was limited to a single location, it expanded to city courses in later years.
With its roots in the team that won the competition, Waymo became the leader in autonomous cars. Having the lowest disengagement rate per mile of any self-driving car system suggests that it has the best software. Some argue that the primary reason Waymo performs better than the competition is that it tests its software in a simulated world. Waymo, located in a corner of Alphabet’s campus, developed a simulated virtual world called Carcraft, a play on words referring to the popular game World of Warcraft. Originally developed to replay scenes that the car experienced on public roads, this simulated world included the times when the car disengaged. Eventually, Carcraft took on an even larger role in Waymo’s self-driving software development because it simulated thousands of scenarios to probe the car’s capability.
Waymo used this virtual world to test its software before releasing it to its real-world test cars. In the simulation, Waymo created fully modeled versions of cities like Austin, Mountain View, and Phoenix as well as other test track simulations. It tested different scenarios in many simulated cars, around 25,000 at any one time. Collectively, the cars drive about 8 million miles per day in this virtual world. In 2016 alone, the virtual autonomous cars logged approximately 2.5 billion virtual miles, far more than the 3 million miles Waymo’s cars drove on public roads. Its simulated world has logged roughly 1,000 times more miles than its actual cars.
The power of these simulations is that they train and test the models on interesting and difficult interactions, created in software, instead of having the car simply put in miles. For example, Carcraft simulates traffic circles that have many lanes and are hard to navigate. It mimics other vehicles cutting off the simulated car or a pedestrian unexpectedly crossing the street. These situations rarely happen in the real world, but when they do, they can be fatal. This is why Waymo has a leg up on its competitors: it trains and tests its software in situations that competitors without a simulated world cannot reproduce, regardless of how many miles they log. Personally, I believe testing in the simulated world is essential for making a safe system that can perform better than humans.
The simulation makes the software development cycle much, much faster. For developers, the iteration cycle is extremely important. In the early days of Waymo’s software construction, a cycle took weeks; after Carcraft was developed, it shrank to a matter of minutes, meaning engineers can tweak their code and test it quickly instead of waiting long periods of time.
Carcraft helps engineers tweak the software and make it better, but a simulation cannot test for oil slicks on the road, sinkhole-sized potholes, or other weird anomalies that are present in the real world but not part of the virtual one. To test for those, Waymo created an actual test track that reproduces the diverse scenarios these cars can encounter.
As the software improves, Waymo loads it onto its cars and tests it on the test track before deploying it to the cars on real-world roads. To put the results into perspective, Waymo reduced the disengagement rate per mile fourfold from 2015 to 2016. Even though Waymo had a head start in creating a simulated world for testing its software, many other automakers now have programs to create their own simulations and testbeds.
Some report that Waymo’s strategy is to build the operating system for self-driving cars. Google had the same strategy when building Android, the operating system for smartphones: it built the software stack and let other companies, like Samsung and Motorola, build the hardware. For self-driving cars, Waymo is building the software stack and wants the carmakers to build the hardware. It reportedly tried to sell its software stack to automakers but was unsuccessful, because auto companies want to build their own self-driving systems. So, Waymo took matters into its own hands and developed an Early Rider taxi service with about 62,000 minivans. In December 2018, Waymo One launched a 24-hour service in the Phoenix area that opened up its ride-sharing service to a few hundred pre-selected people, expanding its private taxi service. These vans, however, have a Waymo employee in the driver’s seat. This might be the way to run its self-driving cars in the real world, but it is difficult to see that solution scaling up.
Another of the most important players in the self-driving ecosystem is Comma.ai, started in 2015 by George Hotz, a hacker in his mid-twenties. In 2007, at the age of 17, he became famous for being the first person to hack the iPhone for use on networks other than AT&T. In 2010, he was also the first person to hack the Sony PlayStation 3. Before building a self-driving car, Hotz lived in Silicon Valley and worked for a few companies including Google, Facebook, and an A.I. startup called Vicarious.
Hotz started hacking self-driving cars by retrofitting a white 2016 Acura ILX with a LIDAR on the roof and a camera mounted near the rear view mirror. He added a large monitor where the dashboard sits and a wooden box with a joystick, where you typically find the gearshift, that enables the self-driving software to take over the car. It took him about a month to retrofit his Acura and develop the software needed for the car to drive itself. Hotz spent most of his time adding sensors, the computer, and electronics. Once the systems were up and running, he drove the car for two and a half hours to let the computer observe him driving. He returned home and downloaded the data so that the algorithm could analyze his driving patterns.
The software learned that Hotz tended to stay in the middle lane and maintained a safe distance from the car in front of him. Two weeks later, he went for a second drive to provide more hours of training and also to test the software. The car drove itself for long stretches while remaining within the lanes. The two lines on the dash screen, one showing the car’s actual path and the other where the computer wanted to go, overlapped almost perfectly. Sometimes, the Acura seemed to lock onto the car in front of it or take cues from a nearby car. Hotz had not programmed any of these behaviors into the vehicle and could not really explain the reasons for what it did. After automating the steering of the car as well as the gas and brake pedals, Hotz took the car for a third drive. It stayed in the center of the lane perfectly for miles and miles, and when a car in front of it slowed, so did the Acura.
Because most of the software developed by Hotz used Deep Learning, the code is only around 2,000 lines, while most other systems based on if-then statements have hundreds of thousands of lines.
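The contrast with if-then rules can be made concrete with a toy example of learning from demonstration. This is not Hotz's code: his system reportedly used deep neural networks, while the sketch below swaps in a one-variable least-squares fit so that it stays short and runs with the standard library alone. The "recorded drive" is synthetic data, and every name and number in it is an assumption made for illustration.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    var_x = sum((x - mean_x) ** 2 for x in xs)
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = cov_xy / var_x
    return slope, mean_y - slope * mean_x


# Synthetic demonstration data: a hypothetical human driver who steers back
# toward the lane center, with a little noise. Offsets are in meters, steering
# in arbitrary units; negative steering means "steer left".
rng = random.Random(0)
offsets = [rng.uniform(-1.0, 1.0) for _ in range(500)]
steering = [-0.5 * x + rng.gauss(0.0, 0.02) for x in offsets]

# "Training": no steering rule is written by hand, it is fitted from the data.
slope, intercept = fit_line(offsets, steering)


def policy(offset: float) -> float:
    """Steering command predicted for a given lane offset."""
    return slope * offset + intercept


print(f"learned slope {slope:.2f}, intercept {intercept:.2f}")
print(f"car 0.5 m right of center -> steering {policy(0.5):.2f}")
```

The point of the sketch is the shape of the program rather than the model: the behavior lives in a handful of fitted numbers, which is why a learned system can be far shorter than one that spells out every case as a rule.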
The technology he built as an entrepreneur represents a fundamental shift from the expensive systems designed by Google to much cheaper systems that depend on software more than hardware. His work impressed many technology companies, including Tesla. Elon Musk, who joined Tesla after a Series A funding round and is its current CEO, met Hotz at Tesla’s Fremont, California factory, where they discussed Artificial Intelligence. The two settled on a deal where Hotz would create software better than Mobileye’s, and Musk would compensate him with a contract worth about $1 million per year. Unfortunately, Hotz walked away after Musk continually changed the terms of the deal. “Frankly, I think you should just work at Tesla,” Musk wrote to Hotz in an email. “I’m happy to work out a multimillion-dollar bonus with a longer time horizon that pays out as soon as we discontinue Mobileye.” “I appreciate the offer,” Hotz replied, “but like I’ve said, I’m not looking for a job. I’ll ping you when I crush Mobileye.” Musk simply answered, “OK.”
Since then, Hotz has been working on what he calls the Android of self-driving cars, comparing Tesla to the iPhone of autonomous vehicles. He launched a smartphone-like device, which sells for $699 with software installed. The dash cam simply plugs into the most popular cars in the U.S. made after 2012 and provides the equivalent capability of Tesla Autopilot, meaning cars drive themselves on highways from Mountain View to San Francisco with no one touching the wheel.
But soon after the product launched, the National Highway Traffic Safety Administration (NHTSA) sent an inquiry and threatened penalties if Hotz did not submit to oversight. In response, Hotz pulled the product from sale and pursued another path: he decided to market a hardware-only version of the product.
Then in 2016, he open-sourced the software so that anyone could install it on the appropriate hardware. With that, Comma.ai avoided the responsibility of running its software in cars, but consumers still had access to the technology, allowing their cars to drive themselves. Comma.ai continues to develop its software, and drivers can buy and install it in their cars. Some people estimate that around 1,000 of these modified cars run on the streets now.
Three main parts form the brain of an autonomous car: Localization, Perception, and Planning. But even before tackling these three main steps, the software must integrate the data from different sensors, i.e., cameras, radars, LIDAR, and GPS. Different techniques ensure that if a given sensor is noisy, meaning it contains unwanted or unclear data, then other sensors help out with their information. Other methods merge the data from these different sensors.
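The text does not say which fusion techniques are used, so the following is only a minimal sketch of one common idea, inverse-variance weighting, in which a noisier sensor automatically gets less say in the combined estimate. The sensor readings and variances are hypothetical values picked for illustration.

```python
def fuse(measurements):
    """Combine (value, variance) pairs with an inverse-variance weighted average.

    Returns the fused value and its variance. A sensor with a large variance
    (a noisy one) receives a small weight.
    """
    weights = [1.0 / variance for _, variance in measurements]
    total_weight = sum(weights)
    fused_value = sum(w * value for w, (value, _) in zip(weights, measurements)) / total_weight
    return fused_value, 1.0 / total_weight


# Hypothetical distance to the car ahead, in meters, from two sensors.
radar = (25.0, 0.25)   # precise range measurement, low variance
camera = (27.0, 4.0)   # noisier range estimate, high variance

fused_distance, fused_variance = fuse([radar, camera])
print(f"fused distance: {fused_distance:.2f} m (variance {fused_variance:.3f})")
```

The fused estimate sits close to the radar reading because the radar is trusted more, and its variance is lower than that of either sensor alone. Production systems typically go further, for example by tracking estimates over time with a Kalman filter, but the weighting idea is the same.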
The next step is for the software to know where the car is. This process includes finding the physical location of the vehicle and which direction the car needs to head, i.e., which exits it needs to take to deliver the passenger correctly. This is the most straightforward of the three problems, and a few hardware components like GPS can solve it.
The next part of the software stack is harder. Perception basically involves answering the question of what is around the vehicle. For example, a car needs to find traffic lights and decide which color they are showing. It also needs to see where the lane markings are and where cars, trucks, and buses are. Perception includes lane detection, traffic light detection, object detection and tracking, and free space detection.
The hardest part of this problem is in the long tail, which describes the diverse scenarios that show up occasionally. When driving, that means traffic lights with different colors from the standard red, yellow, and green or roundabouts with multiple lanes. These scenarios happen infrequently, but because there are so many different possibilities, it is essential to have a dataset large enough to cover them all.
The last step, path planning, is by far the hardest. Given the car’s location and what is around it, how does it get where it needs to go? The software must calculate the next steps toward the desired place, including route planning, prediction, behavior planning, and trajectory planning. The solution ideally includes mimicking human behavior based on actual data from people driving.
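Of the planning sub-steps listed above, route planning is the easiest to show in a few lines. The sketch below uses Dijkstra's shortest-path algorithm on a tiny road graph. The choice of algorithm is an assumption, since the text does not name one, and the road names and travel costs are invented for the example.

```python
import heapq


def shortest_route(graph, start, goal):
    """Dijkstra's algorithm over a dict-of-dicts graph.

    Returns (total_cost, list_of_nodes), or (inf, []) if the goal is unreachable.
    """
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, edge_cost in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbor, path + [neighbor]))
    return float("inf"), []


# Hypothetical one-way road network; edge weights are travel times in minutes.
ROADS = {
    "home": {"main_st": 2.0, "highway_on_ramp": 5.0},
    "main_st": {"downtown": 6.0, "highway_on_ramp": 1.0},
    "highway_on_ramp": {"exit_12": 3.0},
    "exit_12": {"downtown": 1.0},
    "downtown": {},
}

cost, route = shortest_route(ROADS, "home", "downtown")
print(f"{cost} minutes via {' -> '.join(route)}")
```

Here the planner picks the highway detour through exit_12 because it is a minute faster than staying on main_st. Prediction, behavior planning, and trajectory planning then work at a much finer scale than this, deciding how the car moves along each stretch of the chosen route.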
These three steps combine to determine the actions the car needs to take based on the information given. The system decides whether the vehicle needs to turn left, brake, or accelerate. The instructions are then fed to a control system that ensures the car does not do anything unacceptable. Together, these pieces make cars drive themselves through the streets, and they are the “magic” behind cars driven by Tesla, Waymo, Comma.ai, and many others.