By James R. Doty, MD and Blair LaCorte
For over three decades, I’ve studied and performed surgery on the human brain. I have always been fascinated by the power, plasticity and adaptability of the brain and by how much of this amazing capacity is dedicated to processing and interpreting data we receive from our senses. With the rapid ascension of Artificial Intelligence (AI), I began to wonder how developers would integrate the complex, multiple layers of human perception to enhance AI’s capabilities. I have been especially interested in how this integration would be applied to robots and autonomous vehicles. It became clear that the artificial intelligence needed to drive these vehicles will require artificial perception modeled after the greatest perception engine on the planet: the human visual cortex. These vehicles will need to think like a robot, but perceive like a human.
To learn more and to better understand how this level of artificial perception will be created, I recently became an advisor to AEye, a company developing cutting-edge artificial perception and self-driving technologies, to help them use knowledge of the human brain to better inform their systems. This is known as biomimicry: the concept of learning from and then replicating natural strategies from living systems and beings (plants, animals, humans, etc.) to better adapt design and engineering. Essentially, biomimicry allows us to fit into our existing environment and evolve in the way life has successfully done for the past few billion years. But why is incorporating biomimicry and aspects of human perception integral to the development and success of autonomous vehicles?
Because nothing can take in more information and process it faster and more accurately than the human perception system. Humans classify complex objects at speeds up to 27 Hz, with the brain processing 580 megapixels of data in as little as 13 milliseconds. If we continue using conventional sensor data collection methods, we are more than 25 years away from having AI achieve the capabilities of the human brain in robots and autonomous vehicles. Therefore, to enable self-driving cars to move safely and independently in crowded urban environments or at highway speeds, we must develop new approaches and technologies to meet or exceed the performance of the human brain. The next question is: how?
Orthogonal data matters
(Creating an advanced, multi-dimensional data type)
Orthogonal data refers to complementary data sets which together give you more quality information about an object or situation than each would alone, allowing us to make efficient judgements about what in our world is important, and what is not. Orthogonality concepts for high information quality are well understood and rooted in disciplines such as quantum physics, where linear algebra is employed and orthogonal basis sets are the minimum pieces of information one needs to represent more complex states without redundancy. When it comes to perception of moving objects, two types of critical orthogonal data sets are often required: spatial and temporal. Spatial data specifies where an object exists in the world, while temporal data specifies where an object exists in time. By integrating these data sets along with other complementary data sets such as color, temperature, sound, smell, etc., our brains generate a real-time model of the world around us, defining how we experience it.
The human brain takes in all kinds of orthogonal data naturally, decoupling and reassembling information instantaneously, without us even realizing it. For example, if you see that a baseball is flying through the air towards you, your brain is gathering all types of information about it, such as spatial (the direction of where the ball is headed) and temporal (how fast it’s moving). While this data is being processed by your visual cortex “in the background” all you’re ultimately aware of is the action you need to take, which might be to duck. The AI perception technology that is able to successfully adopt the manner by which the human brain captures and processes these types of data sets will dominate the market.
Existing robotic sensory data acquisition systems have focused only on single sensor modalities (camera, LiDAR, radar) and only with fixed scan patterns and intensity. Unlike humans, these systems have not learned nor have the ability to efficiently process and optimize 2D and 3D data in real-time while both the sensor and detected objects are in motion. Therefore, they cannot use real-time orthogonal data to learn, prioritize, and focus. To effectively replicate the multi-dimensional sensory processing power of the human visual cortex will require a new approach to thinking about how to capture and process sensory data.
AEye is pioneering one such approach. AEye calls its unique biomimetic system iDAR (Intelligent Detection and Ranging). AEye’s iDAR is an intelligent artificial perception system that physically fuses a unique, agile LiDAR with a hi-res camera to create a new data type they call Dynamic Vixels. These Dynamic Vixels are one of the ways in which AEye acquires orthogonal data. By capturing x, y, z, r, g, b data (along with SWIR intensity), these patented Dynamic Vixels are uniquely created to biomimic the data structure of the human visual cortex. Like the human visual cortex, the intelligence of the Dynamic Vixels is then integrated in the central perception engine and motion planning system, which is the functional brain of the vehicle. They are dynamic because they actively interrogate a scene and adjust to changing conditions, for example by increasing the power level of the sensor to cut through rain, or by revisiting suspect objects in the same frame to identify obstacles. Better data drives more actionable information.
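To make the idea of a fused data type concrete, here is a minimal sketch of a record that carries position, color, and SWIR intensity together. It is only an illustration built from the fields named above; the class name `Vixel`, the helper `fuse`, and the field layout are assumptions for this example and do not describe AEye’s actual data format.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Vixel:
    """Hypothetical fused sample: 3D position, color, and SWIR intensity."""
    x: float  # meters, sensor frame (assumed units)
    y: float
    z: float
    r: int    # 0-255 color channels from the camera pixel
    g: int
    b: int
    swir_intensity: float  # LiDAR return intensity (assumed 0.0-1.0 scale)

    def range_m(self) -> float:
        """Straight-line distance from the sensor to this sample."""
        return math.sqrt(self.x ** 2 + self.y ** 2 + self.z ** 2)


def fuse(point, rgb, swir_intensity):
    """Combine one LiDAR point with the camera color at the same location."""
    x, y, z = point
    r, g, b = rgb
    return Vixel(x, y, z, r, g, b, swir_intensity)


sample = fuse((3.0, 4.0, 0.0), (255, 0, 0), 0.8)
print(sample.range_m())
```

Keeping spatial and color data in one record means a downstream step can ask both "where is it?" and "what does it look like?" of the same sample, without aligning two separate streams after the fact.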
Not all objects are created equal
(See everything, and focus on what is important)
Humans continuously analyze their environment, always scanning for new objects, then in parallel and as appropriate focus in on elements that are either interesting, engaging, or potentially pose a threat. We process at the visual cortex fast, with incredible accuracy, and with very little of the brain’s immense processing power. If a human brain functioned as autonomous vehicles do today, we would not have survived as a species.
In his book The Power of Fifty Bits, Bob Nease writes of the ten million bits of information the human brain processes each second, but how only fifty bits are devoted to conscious thought. This is due to multiple evolutionary factors, including our adaptation to ignore autonomic processes like our heart beating, or our visual cortex screening out less relevant information in our surroundings (like the sky) to survive. It is an intelligent system design.
This is the nature of our intelligent vision. So, while our eyes are always scanning and searching to identify new objects entering a scene, we focus our attention on objects that matter as they move into areas of concern, allowing us to track them over time. In short, we search a scene, consciously acquire the objects that matter, and track them as required.
As discussed, current autonomous vehicle sensor configurations utilize a combination of LiDAR, cameras, ultrasonics, and radar as their “senses.” These “senses” collect data serially (one way) and are limited to fixed patterns of search. They collect as much data as possible, which is then aligned, processed, and analyzed long after the fact. This post-processing is slow and does not allow for situational changes to how sensory data is captured in real-time. Because these sensors don’t intelligently interrogate, up to 95% of the sensory data currently being collected is thrown out as it is either irrelevant or redundant at the time it is processed. This act of triage also comes with a latency penalty of its own. At highway speeds, this latency results in a car moving more than 20 feet before the sensor data has been fully processed. Collecting data only to throw it away in the name of efficiency is itself inefficient. A better approach exists.
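The relationship between processing latency and distance traveled is simple arithmetic, sketched below. The 65 mph highway speed is an assumption chosen for illustration; at that speed a car covers roughly 95 feet per second, so 20 feet corresponds to about 0.2 seconds of latency.

```python
FEET_PER_MILE = 5280
SECONDS_PER_HOUR = 3600


def feet_travelled(speed_mph, latency_s):
    """Distance covered (in feet) while the system is still processing."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return feet_per_second * latency_s


def latency_for_distance(speed_mph, distance_ft):
    """Latency (in seconds) that corresponds to a given distance travelled."""
    feet_per_second = speed_mph * FEET_PER_MILE / SECONDS_PER_HOUR
    return distance_ft / feet_per_second


# Assumed highway speed of 65 mph; the 20 ft figure comes from the text above.
print(round(latency_for_distance(65, 20), 2))
```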
The overwhelming task of sifting through this data — every tree, curb, parked vehicle, the sky, the road, leaves on trees, and other static objects — also requires immense power and data processing resources, which slows down the entire system significantly, and introduces risk. These systems’ goal is to focus on everything and then try to analyze each item in their environment, after the fact, at the expense of timely action. This is the exact opposite of how humans process spatial and temporal data in situations that we associate with driving.
AEye’s iDAR teaches autonomous vehicles to “search, acquire, and track” objects as we do, by defining new data and sensor types that more efficiently communicate actionable information while maintaining the intelligence to analyze this data as quickly and accurately as possible. AEye’s iDAR enables this through its unique foundational solid-state agile LiDAR. Unlike standard LiDAR, AEye’s agile LiDAR is situationally adaptive so that it can modify scan patterns and trade resources such as update rate, resolution, and max range detection, among others. This enables iDAR to dynamically adjust as it optimally searches a scene, conserve power and apply that power to efficiently identify and acquire critical objects, and track these objects over time. iDAR’s unique ability to intelligently use power to search, acquire, and track scenes helps identify that the object is a child walking into the street or that it is a car entering the intersection and accelerating at high speed. Doing this in real-time is the difference between a safe journey and an avoidable tragedy.
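One way to picture trading resources between searching and tracking is as a fixed budget that is split by priority. The sketch below is a generic illustration of that idea, not AEye’s algorithm; the function name `allocate_shots`, the notion of a per-frame "shot" budget, and the priority numbers are all assumptions.

```python
def allocate_shots(total_shots, object_priorities, search_fraction=0.5):
    """Split a fixed per-frame budget between background search and tracking.

    object_priorities maps an object label to a non-negative weight;
    higher-weight objects receive a larger share of the tracking budget.
    Shares are rounded down, so a few shots may go unused.
    """
    total_priority = sum(object_priorities.values())
    if total_priority == 0:
        # Nothing worth tracking: spend the whole budget on search.
        return {"search": total_shots}

    search_shots = int(total_shots * search_fraction)
    track_budget = total_shots - search_shots
    allocation = {"search": search_shots}
    for label, priority in object_priorities.items():
        allocation[label] = int(track_budget * priority / total_priority)
    return allocation


# Hypothetical scene: a child near the road outranks a parked car.
print(allocate_shots(1000, {"child": 3, "parked_car": 1}))
```

The point of the sketch is the trade: the scene is still searched on every frame, but the objects that matter receive a disproportionate share of the sensor’s attention.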
Humans Learn Intuitively
(Feedback loops enable intelligence)
As we have discussed, the human visual cortex can scan at 27 Hz (much faster than current sensors on autonomous vehicles, which average around 10 Hz). The brain naturally gathers information from the visual cortex, creating feedback loops that help make each step more efficient. The brain then provides context that directs the eyes to search and focus on certain objects, to identify and prioritize them, and effectively keep track of them while largely ignoring other objects of less importance. This prioritization allows efficiency and increases temporal and spatial sampling, not only scanning smarter, but simply scanning better.
Try it yourself. Look around, and notice how there are a multitude of depths, colors, shadows, and other information to capture with your eyes — and then there is motion. Then, consider what you know from experience about what you are seeing: Is a certain object capable of movement or is it likely to remain static? Could it behave predictably or erratically? Do you find value in the object, or do you consider it disposable? While you don’t consciously make these observations, your brain does.
Current sensor systems on autonomous or semi-autonomous vehicles are optimized for “search,” the results of which are then reported back to a central processor. The search is done with individual passive sensors which apply the same power, intensity, and search pattern everywhere and at all times, regardless of changes in the environment. Even more limiting, data flows one way only, from the passive sensors to the central processor, with no ability to actively adapt or adjust its collection. All intelligence is added after the data has been fused and decimated (90 percent thrown out), when it is too late to learn and adjust in real-time.
AEye’s iDAR system mimics the brain’s feedback loop to allow similar behavior to be learned and trained so that autonomous vehicles can make better, more accurate decisions, faster. This agile, multi-dimensional system relies on feedback loops to efficiently and effectively cycle information to modify reactions appropriately, and in real-time, just like in humans. The camera can talk to the LiDAR and the sensor system can talk to the path planning system simultaneously in real-time.
In addition to improving response time, these feedback loops enable artificial intelligence to be more effectively integrated with artificial perception. Today’s sensor systems passively return the same type of data no matter the situation. Pushing sensory data capture and processing to the sensor rather than the centralized processor enables faster integrated feedback loops to inform and cue actions, such as increasing range in the front sensor when driving on a highway, or foveating the sensor to the right in advance of a right turn. In this way, the iDAR system is able to continually learn and get smarter so that over time it can become even more efficient at identifying and tracking objects and situations that could threaten the safety of the autonomous vehicle, its passengers, other drivers and pedestrians.
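The two examples just mentioned, extending range on a highway and shifting focus ahead of a turn, can be sketched as a small rule that turns driving context into sensor settings. This is a deliberately simplified illustration: the names `ScanPlan` and `plan_scan`, and every threshold and value in it, are hypothetical placeholders rather than anything drawn from a real system.

```python
from dataclasses import dataclass


@dataclass
class ScanPlan:
    max_range_m: float        # how far ahead the sensor is asked to look
    focus_azimuth_deg: float  # 0 = straight ahead, positive = to the right


def plan_scan(speed_mph, upcoming_turn=None):
    """Choose sensor settings from driving context (illustrative values only)."""
    plan = ScanPlan(max_range_m=100.0, focus_azimuth_deg=0.0)
    if speed_mph >= 55:
        # Highway driving: look further ahead.
        plan.max_range_m = 300.0
    if upcoming_turn == "right":
        # Shift attention toward where the vehicle is about to go.
        plan.focus_azimuth_deg = 30.0
    elif upcoming_turn == "left":
        plan.focus_azimuth_deg = -30.0
    return plan


print(plan_scan(70))
print(plan_scan(25, upcoming_turn="right"))
```

In a real feedback loop, the inputs would come from the path planner and the output would change how the next frame is captured, which is what distinguishes it from a one-way pipeline.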
Looking beyond perception
It excites me to witness how visionaries are empowering machines to more closely perceive the environment as a human would: evaluating risk, accurately gathering information, and responding and adapting to constantly changing conditions. This kind of autonomous vehicle and artificial intelligence integration could all but eliminate the woes of modern car culture.
What does it mean to think like a robot, but perceive like a human? It means logically mitigating human foibles like aggressive behavior, and avoiding the human risks of fatigue, distraction, or alcohol, while striving to mimic the processes of the human visual cortex and brain, the most powerful perception engine ever created. By doing this, we will ultimately save time and money, reduce stress, and improve our safety.
For autonomous vehicles, biomimicry informs us that at the very least artificial perception should put more of the perception processing at the sensor in order to function efficiently. By doing just that, AEye’s iDAR system has changed the way robotic vision is created and has defined new benchmarks for performance. In fact, iDAR sensors can achieve a scan rate in excess of 100Hz (3x human vision) with a detection range approaching one kilometer (5x current LiDAR sensors) — shattering records by going further and faster than any other sensor system and “driving” the promise of safe vehicle autonomy.
Through a biomimetic approach, artificial intelligence for autonomous vehicles will move well beyond merely reacting to environmental stimuli. Rather, these vehicles will have the processing capability available for higher-order human decision-making context, such as intuition and empathy. Therefore, the next topic we must address is an AI system’s ability to empathize and create a decision rooted in compassion. For this ethical discussion, I refer you to Part II of this article: Blind Technology Without Compassion is Ruthless.
James R. Doty, MD, is a clinical professor in the Department of Neurosurgery at Stanford University School of Medicine. He is also the founder and director of the Center for Compassion and Altruism Research and Education at Stanford University. He works with scientists from a number of disciplines examining the neural bases for compassion and altruism. He holds multiple patents and is the former CEO of Accuray. Dr. Doty is the New York Times bestselling author of Into the Magic Shop: A Neurosurgeon’s Quest to Discover the Mysteries of the Brain and the Secrets of the Heart. He is also the senior editor of the recently released Oxford Handbook of Compassion Science.
Blair LaCorte is the Chief of Staff at AEye and Chairman of the Advisory Board. He is also on the Board of the Positive Coaching Alliance and the Kairos Foundation for Entrepreneurship, and has formerly served as a Fellow at the Digital Strategy Center at the Tuck School at Dartmouth College; Executive Director of the Strategic Council on Security Technology; and a member of the Senate High Tech Advisory Board. Mr. LaCorte has had a lifelong passion for innovation, has garnered numerous patents across several domains and received numerous industry accolades including being named “Top 10 Marketer of the Year” by Ad Age and Business Marketing; “Innovator of the Year” by NASA; and “Product of the Year” by Industry Week. He has authored numerous business school case studies that are currently taught at the top fifty business schools, and is the co-author of an upcoming book on relevancy and perception: “Relevancy…Rules”.