Waymo (ChauffeurNet) versus Tesla (HydraNet)


Following are some easy-to-understand summaries of how Waymo and Tesla are approaching the design, architectures, and modeling of their self-driving cars. Note how much they differ.

Waymo – ChauffeurNet:

  • Product offerings: Waymo One (self-driving ride-hailing service), Waymo Via (self-driving trucking)
  • Physical inputs: Lidar, camera, and radar sensor data
  • Building-blocks:
    • Data sources: Uses a large dataset of human-labeled images, generated both by Waymo and by external vendors
    • Perception: Finds road paths, traffic lights, and obstacles. Leverages neural architecture search (NAS) to quickly find the best architectures for its models
    • Behavior Prediction: Leverages Google Maps; trains agents through ChauffeurNet (recurrent neural networks trained with imitation learning) to estimate trajectories in a simulated environment
    • Planning: Generates trajectories through ChauffeurNet based on feasibility, staying on the road, and avoiding collisions (a simplified sketch follows this list)
    • Controls optimizer: Throttle and steering
  • Source: ChauffeurNet presentation at Google I/O 2019
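
ChauffeurNet renders the scene as top-down images and uses a recurrent network to roll a trajectory out one waypoint at a time, trained by imitating recorded human driving (with synthesized perturbations to cover rare situations). Below is a minimal, hypothetical PyTorch sketch of that recurrent roll-out idea; the encoder, layer sizes, and the name TrajectoryNet are illustrative assumptions, not Waymo's implementation.

import torch
import torch.nn as nn

class TrajectoryNet(nn.Module):
    def __init__(self, hidden_dim=128, horizon=10):
        super().__init__()
        self.horizon = horizon  # number of future waypoints to predict
        # CNN encoder for a rendered top-down raster of the scene (assumed input)
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 5, stride=2, padding=2), nn.ReLU(),
            nn.Conv2d(32, 64, 5, stride=2, padding=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(64, hidden_dim),
        )
        # Recurrent cell that unrolls the trajectory waypoint by waypoint
        self.rnn = nn.GRUCell(input_size=2, hidden_size=hidden_dim)
        self.to_waypoint = nn.Linear(hidden_dim, 2)  # (x, y) offset per step

    def forward(self, raster):
        h = self.encoder(raster)                   # scene features seed the hidden state
        waypoint = torch.zeros(raster.size(0), 2)  # start at the agent's own position
        trajectory = []
        for _ in range(self.horizon):
            h = self.rnn(waypoint, h)              # advance the recurrent state
            waypoint = self.to_waypoint(h)         # predict the next waypoint
            trajectory.append(waypoint)
        return torch.stack(trajectory, dim=1)      # (batch, horizon, 2)

model = TrajectoryNet()
raster = torch.randn(4, 3, 128, 128)  # a batch of rendered top-down scene views
print(model(raster).shape)            # torch.Size([4, 10, 2])

Trained against human trajectories, a model along these lines can then be sampled in simulation and scored for feasibility, staying on the road, and collision avoidance, which is the planning role described above.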

Tesla – HydraNet:

  • Product offerings: Autopilot (keeps the car in its lane), Smart Summon (the car drives itself to its owner through a mobile app), Full Self-Driving
  • Physical inputs: 8 cameras plus other sensor data (no lidar and no high-definition maps)
  • Building-blocks:
    • Data sources: Collects data from the fleet of Tesla cars and labels it
    • First round of processing – HydraNet:
      • Each camera feed is processed by the same HydraNet architecture:
        • Shared backbone for shared tasks – a modified ResNet-50 architecture – input images are 3 × 960 × 1280 (channels × height × width)
        • Heads for specific tasks – FPN/DeepLab/UNet-style architectures (a simplified sketch follows this list)
      • Images from the 8 cameras can be processed independently, for instance to lay out the scene across space and time
      • Images from those cameras can also be processed jointly at the same time step, for instance to estimate quantities such as depth
    • Second round of processing: Features from the 8 HydraNets can feed a second, recurrent model for road-layout predictions (see the fusion sketch below)
    • Training and inference: Implements multi-task distributed training on custom hardware – a GPU cluster for training, with a proprietary training computer (code-named Dojo) under development – while inference runs on Tesla's custom in-car computer. Features are improved continuously through telemetry from the deployed fleet
  • Source: Andrej Karpathy’s presentation at the PyTorch Developer Conference 2019
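
To make the shared-backbone-plus-heads layout concrete, here is a minimal, hypothetical PyTorch sketch. It stands in torchvision's stock resnet50 for Tesla's modified backbone and collapses the FPN/DeepLab/UNet heads into simple linear heads for brevity; the task names and output sizes are invented for illustration.

import torch
import torch.nn as nn
from torchvision.models import resnet50

class HydraNetSketch(nn.Module):
    def __init__(self, num_lane_classes=4, num_object_classes=10):
        super().__init__()
        # Shared backbone: a stock ResNet-50 with its classifier head removed
        # (Tesla uses a modified ResNet-50; this is an illustrative stand-in)
        backbone = resnet50(weights=None)
        self.backbone = nn.Sequential(*list(backbone.children())[:-2])
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Task-specific heads branch off the shared features; in practice these
        # would be dense-prediction heads (FPN/DeepLab/UNet), not linear layers
        self.lane_head = nn.Linear(2048, num_lane_classes)
        self.object_head = nn.Linear(2048, num_object_classes)

    def forward(self, image):
        features = self.backbone(image)          # (batch, 2048, H/32, W/32)
        pooled = self.pool(features).flatten(1)  # (batch, 2048)
        return {
            "features": features,  # can feed a downstream fusion model
            "lanes": self.lane_head(pooled),
            "objects": self.object_head(pooled),
        }

model = HydraNetSketch()
image = torch.randn(1, 3, 960, 1280)  # the 3 × 960 × 1280 input size cited above
out = model(image)
print(out["lanes"].shape, out["objects"].shape)  # torch.Size([1, 4]) torch.Size([1, 10])

For the multi-task training mentioned above, one common approach is to backpropagate a weighted sum of the per-head losses so that the shared backbone learns features useful to every task.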
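
The second round of processing can likewise be pictured as a small fusion model: pool each camera's HydraNet features, concatenate them across the 8 cameras, and let a recurrent network integrate them over time into road-layout predictions. This is a purely illustrative sketch; the pooling scheme, dimensions, and the name RoadLayoutFusion are assumptions.

import torch
import torch.nn as nn

class RoadLayoutFusion(nn.Module):
    def __init__(self, feat_dim=2048, num_cameras=8, hidden_dim=256, layout_dim=64):
        super().__init__()
        # Fuse the pooled per-camera features into a single per-frame vector
        self.reduce = nn.Linear(feat_dim * num_cameras, hidden_dim)
        # GRU integrates the fused features across time steps
        self.rnn = nn.GRU(hidden_dim, hidden_dim, batch_first=True)
        self.layout_head = nn.Linear(hidden_dim, layout_dim)

    def forward(self, cam_feats):
        # cam_feats: (batch, time, num_cameras, feat_dim), pooled per camera
        b, t, c, d = cam_feats.shape
        fused = torch.relu(self.reduce(cam_feats.reshape(b, t, c * d)))
        out, _ = self.rnn(fused)      # (batch, time, hidden_dim)
        return self.layout_head(out)  # a road-layout prediction per time step

fusion = RoadLayoutFusion()
cam_feats = torch.randn(2, 5, 8, 2048)  # 5 time steps of pooled features from 8 cameras
print(fusion(cam_feats).shape)          # torch.Size([2, 5, 64])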

[Figure: Tesla’s HydraNet model architecture]

Copyright © 2005-2020 by Serge-Paul Carrasco. All rights reserved.
Contact Us: asvinsider at gmail dot com.