Boris Sofman (Bedrock Robotics) & Malhar Patel (Applied Intuition) Fireside Chat

As a reminder, Chat8VC is a monthly series hosted at our SF office where we bring together founders and early-stage operators excited about the foundation model ecosystem. We focus on highly technical, implementation-oriented conversations, giving builders the chance to showcase what they’re working on through lightning demos and fireside chats.
If this conversation resonates and you'd like to get involved—or join us at a future event—reach out to Vivek Gopalan (vivek@8vc.com) and Bela Becerra (bela@8vc.com).
Boris Sofman: I'm Boris Sofman, co-founder and CEO of Bedrock Robotics. At Bedrock, we are creating autonomy for specialized heavy machinery. These are the sort of machines that you see in construction like excavators, wheel loaders, and bulldozers; more generally, these machines exist across agriculture, mining, lumber, and garbage movement. We're creating autonomy solutions that will be able to generalize across capabilities and machines.
Malhar Patel: I’m Malhar with Applied Intuition. I lead special projects, which encompasses a lot of strategic work across all engineering functions – simply, how can we make sure we can 10x as a company commercially and organizationally. I started off as one of our early engineers seven years ago. We're a vehicle intelligence company, which means we provide all the software needed for anything really that moves: anything from offline work – whether that's simulation, data management, map management – to autonomy for any of these systems as well as the underlying electrical and compute architecture that's needed to run them. This applies to cars, trucks, construction, defense etc.
Alex Kolicich: You both have been at the forefront of some of the foundational shifts in autonomy, now that the R&D is actually working. Boris, you've seen it from the AV side; Malhar, you've seen it from the platform enablement perspective. Boris, what changed over the last ten years? It seemed with Waymo that a few years ago they had made very little progress and then there were many advancements at once. What was the nature of that?
Boris Sofman: The hockey stick finally kicked in.
Before Bedrock, I was at Waymo for 5 years, leading autonomous trucking and then various technology verticals supporting robotics. It was a really fascinating time. Waymo started in 2009, making it an old startup at this point. There were a handful of pretty massive transitions that eventually enabled this growth. The most pivotal one transpired in 2019/2020, when there was a decision to aggressively shift from engineered systems – rules, heuristics, search – to a very data and ML-driven approach, particularly on the behavioral side.
Perception had been using deep learning and supervised learning for a while, but the idea of going from these engineered solutions to really learning from the data (particularly from human drivers) was new: being able to annotate all the nuances and how they interact with everything they encounter. That was necessary to solve fully autonomous driving on the streets of San Francisco.
Now, what's interesting is that Phoenix went driverless with a system that was safer than humans by a good margin, and it was operating across a non-trivial area. I believe it got up to 100 square miles of territory in the Phoenix suburbs using the old system, but it was brittle.
It was very hard to scale, and you had this almost laughable situation where you'd fix one problem only for another to appear. It really motivated the shift to learning from data, and scale became continually easier. Seeing that we could solve the harder problems in San Francisco, the shift to Los Angeles, Phoenix, and Austin became incredibly easy. Going from a car to a truck was largely subsidized by that work, and data became the language for how you fix those problems. There's a lot of complexity in that evolution, but fundamentally you can argue that from a core technology standpoint, Waymo almost started too early. They had to get to that point in time where it became feasible to move towards these new architectures and really learn from demonstrations to interpret the infinite complexity of the problem.
Alex Kolicich: Was it something you all figured out that was unique? I mean, obviously, it's true that there are a lot of firms now using data. But also I see Waymos on the street, and I don't see very much else.
Boris Sofman: The other thing that became very clear is that the hardest part of the problem, eventually, is not the autonomy, but the evaluation and qualification. And that's a deceptive insight because when you're early on, you're improving quickly. At first, it's getting the core system running, then it's perception, and then you realize behavior is a lot harder because you have this complex problem where your behavior changes and now all of the things you learned are obsolete if you don't have a good behavioral model.
So, you have all these layers of complexity, but eventually you get to the point where you're very good, and you're optimizing for the one-in-a-million-mile issues. And the moment that happens and you're anywhere within its vicinity, you realize it's easy to fix problem after problem, but you can create a half-dozen problems – or a dozen problems – for every one that you fix. You can go for a really long period of time fixing problems that aren't actually improving overall safety. What you realize is that when you're dealing with something as sparse as collisions, you actually need really thoughtful metrics that are correlated with the things you care about. These metrics are way more sensitive and provide good signals, and then you can coordinate a team of 500 or 1,000 people improving a system without thrashing. You ultimately get into an even bigger problem, which is how do you qualify whether it's even a driverless system? I could say, "I think it's good enough," but how do you actually know it's good enough?
So that actually became the challenge the whole industry found in 2015 when AV was so hot and there were a dozen companies that received a ton of funding; however, it was a bit of a false start, because the early demos were not good enough.
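Boris's point about needing sensitive proxy metrics for sparse events like collisions can be sketched in code. Everything below – the field names, thresholds, and metric choices – is a hypothetical illustration of the idea, not any company's actual methodology:

```python
# Sketch: surrogate safety metrics computed from driving logs, standing in
# for sparse events like collisions. All names and thresholds are invented.
from dataclasses import dataclass

@dataclass
class LogFrame:
    speed_mps: float    # ego speed
    accel_mps2: float   # longitudinal acceleration (negative = braking)
    gap_m: float        # distance to nearest lead agent

def time_to_collision(frame: LogFrame, lead_speed_mps: float) -> float:
    """Seconds until contact at constant closing speed; inf if opening."""
    closing = frame.speed_mps - lead_speed_mps
    return frame.gap_m / closing if closing > 1e-6 else float("inf")

def surrogate_metrics(frames, lead_speeds, hard_brake=-3.0, ttc_threshold=2.0):
    """Per-frame rates of hard-braking and low-TTC events.

    These fire orders of magnitude more often than collisions, giving a
    large team a dense signal it can optimize without thrashing.
    """
    n = len(frames)
    hard_brakes = sum(1 for f in frames if f.accel_mps2 <= hard_brake)
    low_ttc = sum(1 for f, v in zip(frames, lead_speeds)
                  if time_to_collision(f, v) < ttc_threshold)
    return {"hard_brake_rate": hard_brakes / n, "low_ttc_rate": low_ttc / n}
```

The value of such proxies depends entirely on how well they correlate with the outcomes you actually care about, which is itself something that has to be measured against real-world data.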
Malhar Patel: It reminds me of humanoids right now. Another way to think about it is that it's really easy to put up a demo of an autonomy stack running in some small, targeted way. Basically every university does this; everyone has a robotics team or a self-driving car project. Getting something that actually works in production, at scale, consistently over long periods of time is the hard nuance at the end. And that consistency is what differentiates the companies that last from those that rise and fall.
Alex Kolicich: Malhar, you sit at an interesting point in the ecosystem where you help a handful of these companies to commercialize autonomy. Are there any patterns you're seeing emerging between ones that are doing well versus not well?
Malhar Patel: The sad thing is most of the companies that started back in 2015 or 2016 are no longer here. We saw a wave of autonomy companies, whether in passenger cars or trucking. Some SPAC'd; most died. Even the ones that SPAC'd are not doing well, which is unfortunate. Many failed to realize that this is not purely a technology problem. If you don't have a real business model on the other side, you can't raise infinite funding to finance it, especially when the numbers are so high – e.g. one to two billion dollars each year. No amount of venture financing can sustain that.
So, that's the failure case. And that usually happens when these companies compete against OEMs. OEMs are original equipment manufacturers – an example being the large automakers. And if you're going head-to-head with them and basically antagonizing them saying we can do it better, we're technologists – that's not the way to work with these companies. At the end of the day, it has to be complementary. They've survived for 50 to 100 years for a reason. That doesn't mean they're right about everything that they do, but you need to find a way to enable them while also making sure they make the right decisions. So the companies that tend to do well don't antagonize them and find ways to work with these OEMs in effective ways.
Boris Sofman: It's really hard to make a vehicle. I mean, I think Waymo had that experience as well when trying to make its own car and then eventually realizing that the partnership route is the better one.
Alex Kolicich: So, the best criteria is on the business model side versus the technology? Are there any elements that make tech work better?
Malhar Patel: The technicals commoditize pretty fast. Even at the leading edge, most approaches come down to having a real data-driven approach done in a nuanced way, but this nuance is quickly shared amongst the industry. So the technical understanding of what needs to happen usually ends up becoming pretty similar, pretty fast. Actually, I don't think that is the differentiator. It's who actually gets things deployed, has the trust of a public entity, and can maintain that over time.
The business side is one element. On the technical side, I view it as more of a break even if we just look at Silicon Valley startups. I think it becomes a very different question when we start talking about OEMs globally, because then you have to start talking about country-by-country nuances. For example, what would happen in Japan versus Europe versus Korea versus Sweden, etc? And in those cases, a lot of it ends up coming down to regulatory work as well. It's a bet that you actually have the right regulatory body that can work with these companies in the right way so they can get the data or licenses they need to actually deploy technology effectively.
Alex Kolicich: Let me ask a question to both of you now in a slightly different way. If you were to start a company over the next year to disrupt the existing players, what would that company look like? What would it be doing differently from the incumbents like Waymo or the other autonomy players?
Boris Sofman: So there's a fundamental property of public-roads autonomy which means you can't build the company in a traditional way: you can't just build a product that's pretty good, ship it even though it's not perfect but still provides value, improve on it over time, and build the business while you're making the product better. You can never decouple your functionality from the safety challenges, and your long tail is always safety, so if you want to ship in SF you have to solve 100% of what SF could throw at you, because you have no control over that scope.
You could make the argument that public-road autonomy just has such a challenging property where no matter how efficient you are, it's still a huge amount of capital. You're taking a big risk on 5 years with a startup without the backing of an Alphabet or the business model of a Tesla that could actually sustain that sort of operation.
There's certain things you can do to make it more efficient. Waymo today would not reinvent every sensor that it reinvented, because back then there were no automotive-grade cameras that were good enough. That had to be done. There was no LIDAR that was sufficient for the control they needed, so they developed their own LIDAR. There was a lot of compute work because there weren't perfect components to integrate together. There are entire teams that got built that are no longer needed. It's intelligent to build a business on things that are not just requiring you to solve Level 4 autonomy from zero to one.
The other approach – this was our lens coming out of Waymo – is thinking, "What are the spaces where you actually have a more incremental way to build a business and you don't have to solve that holy grail problem first to build it?" Spaces like manufacturing and heavy machinery allow you to isolate problems in approachable and reasonable steps – these are reasonably fundable and allow you to start to build a really great business model that subsidizes the development. You then work your way up to the hundreds of billions or trillion dollar opportunity. That's the physics required so you're not vulnerable to the fluctuations that are inevitable in markets or the ups and downs of a startup.
If you're trying those zero to ones on something as big as public-road autonomy, it's just very difficult to do at the start.
Malhar Patel: I wouldn't start working on zero-to-one, full public-road autonomy today. I would instead focus on the auxiliary areas around it – e.g. factories and manufacturing. But I will say, returning to the business model idea: at the end of the day, who is going to buy the thing you're making, and do they actually have enough money to justify you spending eight, nine, or ten figures on R&D? If there aren't buyers on the other end, there's probably not a company there.
Boris Sofman: The trap with autonomy though is thinking that if you start at Level 2 you'll jump quickly from Level 2 to Level 4. So, Level 2 is like autopilot where you still have a driver paying attention. Level 4 is full self-driving. There was a misconception that Level 2 is a pure stepping stone to Level 4 and that it's a very efficient path to go between the two of them. The systems and approach for Level 2 and 4 are fundamentally different.
And so, just like teleop doesn't get you to a full autonomy solution, there's a lot of lost effort when you realize that going from a solution that's correct 99.9% of the time to one that's actually Level 4 may require completely redesigned hardware and software. And so that's the part that is very deceptive in the AV space. There were some approaches that got burned by not realizing this.
Alex Kolicich: What about the question of simulation? There are many divergent views in automation. There are some companies predicated on fully training models in simulation; some use a mix. In physical autonomy, I'm sure it's the same. What are your views on the current state of simulation? How good is the data? How representative can it be of the real world and do you think that changes over the next five years or a decade?
Malhar Patel: I think we're probably both on one side of the spectrum of heavily relying on simulation – Applied started off building simulation software. So, we're not in the other camp, which is "you don't need sim at all".
At the end of the day, you just need a block of representative data from the use cases that you want to target and simulation is one way of achieving that. There are some places where you can get physical accuracy with basic properties of environments. You also need to produce that for all situations, in all ways, in every single domain – this remains an unsolved problem in my mind.
But what people can do is use simulation as a way of improving, for instance, the safety case of a vehicle system, and that's what matters in terms of capabilities today versus, say, five years ago. Five years ago, if you were coming out of an academic lab, you'd probably go to an open-source simulator like CARLA. There wasn't much else you could do, so most companies at the time built sim in-house – all classical, physics-based simulation.
The big change in the problem over the last year or two is that more people are wanting to consider neural-based simulation. This has high technical promise, but it's still very far from actually being used for anything safety-related or for production. A lot of your classic robotics and humanoid companies will use that in abundance but you will always get your real miles on real deployments in order to actually deploy something in production. Sim does have staying power as I don't believe it will ever be fully replaced.
Boris Sofman: Simulators are still flawed. They're useful, but they have some limitations. As much as people wanted to try and get into pure reinforcement learning and leverage simulators, you see a lot of fallback as well. Humanoid companies are puppeting the humanoids to get data to train. And so, it's about understanding where your simulator is strong and where it's weak. Then you need to figure out how to complement it, which involves a mix of approaches.
If you want to create rare scenarios, you can use synthetic data. You can use structured testing to create enough samples equivalent to hundreds of millions of miles. But it's a necessary tool because it's just an unscalable problem for a company to only use real-world data, particularly if you need statistical significance of exposure for safety-critical situations. There's almost a lens where you use the real world to validate the simulator and you use the simulator to validate your actual system, but you're always maintaining your visibility on "How much do you trust it?" by having real-world data to make sure that you're not fooling yourself by testing your system on your training set or overfitting to the strengths and weaknesses of your sim.
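The loop Boris describes – using the real world to validate the simulator, and the simulator to validate the system – can be sketched as a simple consistency check. The function names and the 25% threshold below are invented for illustration; a real pipeline would compare many metrics across many slices of the operating domain:

```python
# Sketch: validating a simulator against real-world logs before trusting it
# to validate the driving system. We compare the rate of some event
# (e.g. hard-braking per mile) between matched real and simulated runs.
def rate_per_mile(event_counts, miles):
    """Events per mile across a set of runs."""
    return sum(event_counts) / sum(miles)

def sim_trust_check(real_counts, real_miles, sim_counts, sim_miles,
                    max_relative_gap=0.25):
    """Flag the simulator as untrusted for this slice if its event rate
    diverges from the real-world rate by more than max_relative_gap."""
    real_rate = rate_per_mile(real_counts, real_miles)
    sim_rate = rate_per_mile(sim_counts, sim_miles)
    gap = abs(sim_rate - real_rate) / max(real_rate, 1e-9)
    return {"real_rate": real_rate, "sim_rate": sim_rate,
            "relative_gap": gap, "trusted": gap <= max_relative_gap}
```

The point of keeping real-world data in the loop is exactly the one Boris makes: without it, you risk overfitting your system to the strengths and weaknesses of your sim.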
Malhar Patel: Boris, what do you think about all these low-cost simulators that promise you can run 10,000 scenarios in parallel on a single machine and call it a day? Do you think that's real?
Boris Sofman: I'm sure you can massively reduce computational complexity if you're okay with giving up certainty on your realism. There might be applications where that's okay. We don't really work in those types of domains, so that's not a trade we'd be willing to make. If you start using simulation approaches where you can't actually understand what you're giving up or what the weaknesses are, it becomes very hard to rely on them for a safety-critical system. There are applications where the average case matters more; if you want to save a huge amount of compute, you go that route. But a lot of these applications are actually very sensitive to the worst case, so there's no free lunch. What's your take?
Malhar Patel: I agree with you. At least what I've seen on some X posts these last few months is someone open-sources a simulator and says, "Hey, we could run 10,000 or 100,000 of these binaries in parallel on a single desktop." It does run, and you can get results from it. The nuance is whether those results are actually useful for anything. There's an argument that it increased some metric that is useful in some way. At the end of the day, if I can't upload this to a car or truck or any physical device that I can then trust not to cause an incident, I have a problem with it. I still don't think it's at that point, but there is an argument that the capabilities will increase and change in the next year or so.
Boris Sofman: That might be the big debate over the next decade or two that opens up a lot more generalization in physical applications. And what's interesting is that at Waymo, we had a statistic where something like 99% of our bugs were found in simulated scenarios, but it wasn't black or white. A lot of those simulated scenarios were evolutions of real scenarios – like changing the motion of a vehicle, or inserting fake obstacles within a real dataset, or taking a real scenario and transplanting the signal from a pedestrian into a log to see if the system would react if a child had gotten behind the car.
And so, most of our discovery of issues did not happen by seeing something on a public road while out testing and causing a collision; it happened in this giant upsampling of exposure, but still rooted in fundamentally real data. This is deceptive, because very little of our simulation was 100% synthetic – there's just too much complexity in the real world. That blur between real and synthetic is actually a really interesting area, where you're leaning into partial simulation on top of real data.
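The log-perturbation style of simulation described here – evolving real recorded scenarios rather than synthesizing whole worlds – might look roughly like this. The data structures are hypothetical simplifications for illustration:

```python
# Sketch: scenario perturbation: transplanting a real agent track (e.g. a
# pedestrian recorded in one log) into another real log, shifted so it
# interacts with the ego vehicle. Structures are invented simplifications.
from dataclasses import dataclass

@dataclass(frozen=True)
class AgentTrack:
    agent_type: str     # "pedestrian", "vehicle", ...
    positions: tuple    # (t, x, y) samples taken from a real log

@dataclass
class Scenario:
    log_id: str
    tracks: list

def transplant(base: Scenario, donor: AgentTrack,
               dx: float, dy: float) -> Scenario:
    """Insert a real agent track into another real log, shifted by (dx, dy)
    so it crosses the ego path, e.g. a child stepping out behind the truck."""
    moved = AgentTrack(
        donor.agent_type,
        tuple((t, x + dx, y + dy) for t, x, y in donor.positions))
    return Scenario(base.log_id, base.tracks + [moved])
```

Because the donor track comes from a real log, the perturbed scenario keeps realistic motion while letting you upsample exposure to rare interactions.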
Alex Kolicich: Boris when you work at Bedrock in construction zones – somewhat structured, somewhat unstructured, constantly changing – Is there anything about the autonomy stack or autonomy assumptions that break in that environment or are unique aspects of the problem where you have to change the approach?
Boris Sofman: There are a few things that get easier and a few things that get harder – overall it's a lot less of a narrow path to build a big business. You can start to scale before you tackle 100% of the problem. So, you're in a semi-structured, controlled environment at really low speeds. That is a big advantage because now you can trade what is effectively a Waymo behavioral qualification of your interaction with infinitely complex scenarios (people, cyclists, cars, crazy occurrences on the road etc.) into an environment where you have much more control of your behavior and safety directly, with clear expectations of the types of interactions that will play out. You always have the minimum safety condition of stopping, which is a pretty big advantage.
At Waymo, safety was always the long tail; here, we're pretty convinced that we will have a clear path to solve safety, and then your autonomy and your productivity will end up being a much bigger portion of what you're optimizing for, unlocking more flexibility.
The challenge is that you're not in a world where there's one clear path; there are a lot of ways to achieve a good solution, so evaluation becomes a little bit trickier. When you think about simulation: at Waymo, you're a participant on the road – you're not actually modifying the world around you. Here, you're actually modifying the world. If you want to do closed-loop simulation, are you going to solve earth physics in a very robust way? How much do you care about realism? It's a different kind of simulation complexity, where now, instead of simulating the behavior of a human or a car, you have a very complicated physics problem. How much do you trust that realism?
And so you have this trade, but there's a big advantage because you can actually become very confident on a subset of what humans might do, which can be the incredibly valuable beginnings of a product that you can then scale as you develop more capabilities. That's a huge advantage to what we were talking about on the business model-side. And while safety is typically unforgiving, productivity is more forgiving and you can actually use that to your advantage. So, it was a great trade. That was pretty intentional because you don't have that flexibility with driving.
Alex Kolicich: Malhar, I know Applied Intuition works in many environments – defense, agriculture – across the board. Could you share more about what have been the areas where deployment has been optimal? And are there any commonalities amongst those domains?
Malhar Patel: We work a lot with people at OEMs in industries like auto, so a lot of the work is educational – "What should we be doing here if our company is going to survive the next 100 years?" – and that often gets framed as just a pure autonomy question. It's very easy for us to say, "Hey, if your vehicle can move itself, that's great." But in reality, there are a lot of underlying software questions that are going to be harder hitting and longer lasting. That's why, as we're building more of the software needed for a vehicle to function, we think about it as, "How do you make sure that the entire electronics architecture is stable and effective?" or "How can you make software updates to a vehicle?" Because if you can't do basic things like that, it doesn't matter how smart your vehicle is – you can't actually use it reliably over a long period of time.
The consistent part across all these verticals is that the message rings true: if you have what's basically a software OS for what people here like to call "physical AI" or "embodied intelligence", then you can actually interact with all these systems in a real way. As advancements in autonomy and intelligence continue, you actually have a way to use them. You have the proper software update infrastructure to make use of all the use cases I mentioned, which are diverse. This applies to a fighter jet. This applies to a new drone. This applies to a heavy-duty truck; a passenger car. At the end of the day, all vehicles, at least in our world view, will need that type of software OS as a base.
From a pure autonomy perspective, each domain has a different trade-off you have to think about, in terms of the safety versus technical exposure. And that's why most of these verticals have their own vertical layer. We think about it from a horizontal perspective: You need to take one software foundation, apply it to everything and everyone, and then use the learnings from one for the other. And hopefully, that works out well. But that's the key learning.
Separately, there are companies that are not from the US specifically, and most of these industries are heavily interrelated. A good example: a lot of automotive companies were once defense companies. If you think back to the World Wars, most auto factories turned into defense manufacturing hubs and vice versa, and they often spun out separate subsidiaries. So they are more related than you'd think, which means if you educate one layer, you often inadvertently educate many others.
Audience: Boris, you mentioned the Level 2 red herring not getting you cleanly to Level 4. What's your guess – what might be the Level 2 red herring in robotics today? And is there a direct lesson you learned that you're applying at Bedrock in thinking about how you get to the "Level 4" of whatever you're working on?
Boris Sofman: The main reason that becomes so pronounced is that in Level 2 you're optimizing for the average case; in Level 4, you're optimizing for the worst case. It completely changes the way you develop and design the architecture, because it drives so much sensitivity. It changes the way you simulate; it changes the way you evaluate.
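Boris's average-case vs. worst-case distinction can be made concrete with a toy example; the margin numbers below are invented purely to show that the two objectives rank systems differently:

```python
# Sketch: average-case vs worst-case objectives pull in different directions.
# Each list holds per-scenario safety margins (higher is better).
def mean_margin(margins):
    """Level 2 style: optimize the average outcome."""
    return sum(margins) / len(margins)

def worst_margin(margins):
    """Level 4 style: the single worst scenario dominates."""
    return min(margins)

policy_a = [0.9, 0.9, 0.9, 0.1]   # great on average, one terrible scenario
policy_b = [0.7, 0.7, 0.7, 0.6]   # worse mean, much better worst case
```

An average-case lens prefers policy A; a worst-case lens prefers policy B – one illustration of why an architecture tuned for the Level 2 objective doesn't transfer cleanly to Level 4.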
Building on what you mentioned earlier about humanoids… if I had to personally guess where we could have a false start that eventually materializes – but where we're off by a generation of development – it might be there, particularly with respect to the solutions that are trying to be generalizable. The thing that made LLMs work so well as a general solution is that you have this gigantic, publicly available base of text that almost anybody can train on, with a very common base of features and context that ties it all together. That doesn't exist in the physical world today, where different solutions have different sensor data and applications. There are no massive, publicly available datasets. If you want a humanoid to operate on anything that could possibly happen in a consumer home – independent of environment, situation, or context – then without a giant breakthrough in simulation, it's just impossible today, even though the demo version of it is probably very doable.
I'd personally be much more excited about a verticalized solution where you're solving a well-framed problem. By the way, these verticals can spawn $100 billion companies very easily. You see that with Waymo; you see that with others. But there, you're focused on the vertical integration of your particular choices of hardware, sensing, and data – finding a solution, a product, the constraints, what you're solving and what you're not solving – and you can meet a need and actually ship it.
I worry that the generalized solution is a trap where you can show some really good demos, but the data you need to solve that is so intractable. On top of that, the cost structure with the hardware itself isn't even ready yet, where things like manipulators just don't have the nuance to solve the diversity of problems out there.
Audience: Do you think the rate of progress in physical AI mimics the rate of progress of something like a ChatGPT, given that ChatGPT doesn't operate in physical domains?
Boris Sofman: It'll be slower because ChatGPT benefited from this astronomical amount and availability of data and could bypass the complexities of deeply integrating into the industries that it has to operate in. Considering the hardware solutions and physical constraints of testing and iteration with something like Waymo: even if you were to create the optimal path at the right point in time, that's still not something that could, in a couple of years, come out with an incredible solution. Grok is a great example – xAI is only a few years old or so. They were able to jumpstart this because they immediately could get the data and architectures set up very fast.
We took about as fast of a path as we could because we came out with every learning we could think of from Waymo, which was on Alphabet's dime. Even then, it takes time to build these things up. There's just a physics to working with the physical world, but once it starts working, it can snowball very quickly: Even at Waymo, despite astronomical headwinds with capital and infrastructure and so forth, it's now scaling exponentially, doing millions of miles a week at this point. And so when it works, it can work incredibly quickly and you can actually capture a pretty giant amount of value in a short amount of time. Because at the end of the day, 75 to 80% of the world's GDP is from physical industries like building things, moving things, packaging things.
So, it's slower but very meaningful when it works and probably creates a bit of a moat when it works on the other side, because now you have an advantage from a data and competitive standpoint. I don't imagine there being an xAI equivalent that can shortcut that data and just jump to the front of the line.
Malhar Patel: Another way to think about it is distribution. Currently, there are a few thousand autonomous vehicles on the road. By contrast, 100 million cars are produced globally every year. So even that ratio itself shows you that if one company can just get distribution like that, then you can have a ChatGPT distribution moment.
Audience: I'd like to know your thoughts about the camera-only versus camera-plus-LIDAR systems. If Waymo was started today, would you choose LIDAR?
Boris Sofman: There's absolutely no doubt that the combination of sensors is valuable and helpful; we've seen cases where if you didn't have a LIDAR signal, you would have had zero signal in the camera domain that would have prevented an accident. Does it mean you can't get above human levels of safety? The Waymos are at 5x human safety. A decent part of that is probably the fusion of sensors because they complement each other. Radar passes through rain and dust and fog really easily; LIDAR sees perfectly, day or night. There are trade-offs and advantages, but cameras are very versatile and cheap.
Technology is better than ever at letting you lean into cameras in a way that you couldn't previously. If Waymo were starting today – and a lot of work went into balancing the camera-radar stack – it wouldn't be as dependent on LIDAR. A lot of Waymo's focus as they get costs down is actually removing some of the secondary LIDARs and replacing them with the functionality of cameras. This is totally doable now.
Just for context, we started Bedrock a year and a half ago. We're approaching this to create generalizable solutions that will be able to be used and snowballed quickly across applications. We're camera-primary, but we have LIDAR as a secondary sensor because it will have huge value – not just for safety and productivity, but for all the secondary benefits of being able to survey and understand the environment.
10 years ago, that was a harder case to make because technology wasn't as mature.
Malhar Patel: And purely from a cost perspective as well: five or ten years ago, LIDAR cost tens of thousands of dollars to put on a car. So it was prohibitive for a consumer vehicle company like Tesla to put a LIDAR on a car and sell it to an individual. We were talking about this right before, actually, but when the cost of LIDAR gets low enough – when the BOM is in the low hundreds of dollars and it gives you a meaningful improvement from a safety perspective – it becomes a bit of a no-brainer. Even if it's not strictly necessary, it is still very good.
Boris Sofman: Yeah, just to confirm: a Waymo car can drive 150,000 miles a year. That can justify an extra $5-10K investment in sensing. A consumer car is a lot different.

