How can a robot learn from its own interactions?
What abstractions are necessary to describe a task?
How does a robot even know when a task is completed?
In a quest to find answers to the above questions, I am currently a PhD student at ETH Zurich advised by Marco Hutter, and a Research Scientist at NVIDIA Research. I also closely collaborate with Animesh Garg at the University of Toronto.
Over the past few years, I have had the opportunity to work with some amazing robotics groups.
I have been a visiting student researcher at Vector Institute,
a research intern at NNAISENSE, and a part-time research engineer
at ETH Zurich. During my undergrad at IIT Kanpur, I was a visiting student at
the University of Freiburg, Germany, working closely with Abhinav Valada and Wolfram Burgard.
I am incredibly thankful to my collaborators and mentors, and enjoy exploring new domains through collaborations. If you have questions or would like to work together, feel free to reach out through
email!
news
Jul 1, 2022: Our papers on articulated object and in-hand manipulation are accepted to IROS 2022.
Jan 31, 2022: Our paper on ‘A Collision-Free MPC for Whole-Body Dynamic Locomotion and Manipulation’ is accepted to ICRA 2022.
Oct 7, 2021: Joined Marco Hutter’s group at ETH Zurich as a PhD student.
Jun 28, 2021: Excited to start as a Deep Learning R&D Engineer at NVIDIA!
May 18, 2020: Excited to start my master’s thesis with Animesh Garg at PAIR Lab, University of Toronto!
research interests
I am primarily interested in the decision-making and control of robots in human environments.
These days, my efforts are focused on designing perception-based systems for contact-rich manipulation tasks, such as articulated object interaction with mobile manipulators and in-hand manipulation.
Other areas of interest include hierarchical reinforcement learning, optimal control, and 3D vision.
publications
-
A Collision-Free MPC for Whole-Body Dynamic Locomotion and Manipulation
Jia-Ruei Chiu, Jean-Pierre Sleiman, Mayank Mittal, Farbod Farshidian, and Marco Hutter
ICRA 2022
[Abs] [arXiv] [Video]
In this paper, we present a real-time whole-body planner for collision-free legged mobile manipulation. We enforce both self-collision and environment-collision avoidance as soft constraints within a Model Predictive Control (MPC) scheme that solves a multi-contact optimal control problem. By penalizing the signed distances among a set of representative primitive collision bodies, the robot is able to safely execute a variety of dynamic maneuvers while preventing any self-collisions. Moreover, collision-free navigation and manipulation in both static and dynamic environments are made viable through efficient queries of distances and their gradients via a Euclidean signed distance field. We demonstrate through a comparative study that our approach only slightly increases the computational complexity of the MPC planning. Finally, we validate the effectiveness of our framework through a set of hardware experiments involving dynamic mobile manipulation tasks with potential collisions, such as locomotion balancing with the swinging arm, weight throwing, and autonomous door opening.
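To give a flavor of the soft collision-avoidance costs described above, here is a minimal Python sketch of a quadratic penalty on signed distances; the function name, margin, weight, and example distances are illustrative assumptions, not the paper's actual MPC cost terms.

```python
import numpy as np

def collision_penalty(signed_distances, margin=0.05, weight=100.0):
    # Quadratic penalty on any signed distance that falls below the safety
    # margin; distances above the margin incur no cost.
    violation = np.maximum(margin - np.asarray(signed_distances), 0.0)
    return weight * np.sum(violation ** 2)

# Hypothetical usage: pairwise distances between representative primitive
# collision bodies (self-collision) plus ESDF lookups against the environment.
d_self = [0.12, 0.03, 0.20]   # sphere-to-sphere signed distances [m]
d_env = [0.50, 0.02]          # Euclidean signed distance field queries [m]
print(collision_penalty(d_self + d_env))
```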
-
Transferring Dexterous Manipulation from GPU Simulation to a Remote Real-World TriFinger
Arthur Allshire, Mayank Mittal, Varun Lodaya, Viktor Makoviychuk, Denys Makoviichuk, Felix Widmaier, Manuel Wüthrich, Stefan Bauer, Ankur Handa, and Animesh Garg
IROS 2022
[Abs] [arXiv] [Website] [Code]
We present a system for learning a challenging dexterous manipulation task: moving a cube to an arbitrary 6-DoF pose with only three fingers, trained with NVIDIA’s IsaacGym simulator. We show empirical benefits, both in simulation and in sim-to-real transfer, of using keypoints rather than position+quaternion representations of the 6-DoF object pose in the policy observations and in the reward used to train a model-free reinforcement learning agent. By combining domain randomization strategies with the keypoint representation of the manipulated object’s pose, we achieve a high success rate of 83% on a remote TriFinger system maintained by the organizers of the Real Robot Challenge. To assist further research on learning in-hand manipulation, we publicly release the codebase of our system along with trained checkpoints that come with billions of steps of experience.
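As a rough illustration of the keypoint representation discussed above, the sketch below computes cube-corner keypoints from a 6-DoF pose and a simple keypoint-distance reward; the helper names, cube half-extent, and exact reward shaping are assumptions for illustration, not the system's actual implementation.

```python
import numpy as np
from scipy.spatial.transform import Rotation as R

def cube_keypoints(position, quat_xyzw, half_extent=0.0325):
    # Eight cube corners in the object frame, transformed into the world frame.
    corners = half_extent * np.array([[sx, sy, sz]
                                      for sx in (-1, 1)
                                      for sy in (-1, 1)
                                      for sz in (-1, 1)], dtype=float)
    return R.from_quat(quat_xyzw).apply(corners) + np.asarray(position)

def keypoint_reward(current_kp, goal_kp):
    # Negative mean corner-to-corner distance; the paper uses a shaped
    # variant of this idea, so treat the exact form as an assumption.
    return -np.mean(np.linalg.norm(current_kp - goal_kp, axis=-1))

current = cube_keypoints([0.0, 0.0, 0.05], [0.0, 0.0, 0.0, 1.0])
goal = cube_keypoints([0.02, 0.01, 0.05], [0.0, 0.0, 0.383, 0.924])
print(keypoint_reward(current, goal))
```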
-
Articulated Object Interaction in Unknown Scenes with Whole-Body Mobile Manipulation
Mayank Mittal, David Hoeller, Farbod Farshidian, Marco Hutter, and Animesh Garg
IROS 2022
[Abs] [arXiv] [Website]
A kitchen assistant needs to operate human-scale objects, such as cabinets and ovens, in unmapped environments with dynamic obstacles. Autonomous interactions in such real-world environments require integrating dexterous manipulation and fluid mobility. While mobile manipulators in different form factors provide an extended workspace, their real-world adoption has been limited. This limitation is largely due to two reasons: 1) the inability to interact with unknown human-scale objects such as cabinets and ovens, and 2) inefficient coordination between the arm and the mobile base. Executing a high-level task for general objects requires a perceptual understanding of the object as well as adaptive whole-body control among dynamic obstacles. In this paper, we propose a two-stage architecture for autonomous interaction with large articulated objects in unknown environments. The first stage uses a learned model to estimate the articulated model of a target object from an RGB-D input and predicts an action-conditional sequence of states for interaction. The second stage comprises a whole-body motion controller that manipulates the object along the generated kinematic plan. We show that our proposed pipeline can handle complicated static and dynamic kitchen settings. Moreover, we demonstrate that the proposed approach achieves better performance than commonly used control methods in mobile manipulation.
-
Neural Lyapunov Model Predictive Control
Mayank Mittal, Marco Gallieri, Alessio Quaglino, Seyed Sina Mirrazavi Salehian, and Jan Koutnik
(Under Review)
[Abs] [arXiv]
This paper presents Neural Lyapunov MPC, an algorithm to alternately train a Lyapunov neural network and a stabilising constrained Model Predictive Controller (MPC), given a neural network model of the system dynamics. This extends recent works on Lyapunov networks to be able to train solely from expert demonstrations of one-step transitions. The learned Lyapunov network is used as the value function for the MPC in order to guarantee stability and extend the stable region. Formal results are presented on the existence of a set of MPC parameters, such as discount factors, that guarantees stability with a horizon as short as one. Robustness margins are also discussed and existing performance bounds on value function MPC are extended to the case of imperfect models. The approach is tested on unstable non-linear continuous control tasks with hard constraints. Results demonstrate that, when a neural network trained on short sequences is used for predictions, a one-step horizon Neural Lyapunov MPC can successfully reproduce the expert behaviour and significantly outperform longer horizon MPCs.
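As a toy illustration of using a learned Lyapunov function as the terminal cost of a short-horizon MPC, the sketch below scores a finite set of candidate actions on a scalar system; the action enumeration, discount factor, and stand-in Lyapunov function are simplifying assumptions, whereas the paper solves a constrained optimization with a neural dynamics model.

```python
import numpy as np

def one_step_lyapunov_mpc(x, dynamics, lyapunov, stage_cost,
                          candidate_actions, gamma=0.99):
    # Roll each candidate action through the (learned) dynamics model for one
    # step and score it by stage cost plus the discounted Lyapunov value of
    # the predicted next state; return the lowest-cost action.
    best_u, best_cost = None, np.inf
    for u in candidate_actions:
        x_next = dynamics(x, u)
        cost = stage_cost(x, u) + gamma * lyapunov(x_next)
        if cost < best_cost:
            best_u, best_cost = u, cost
    return best_u

# Toy usage on a scalar unstable system x' = 1.1 x + u, with V(x) = x^2
# standing in for the learned Lyapunov network.
dyn = lambda x, u: 1.1 * x + u
V = lambda x: x ** 2
cost = lambda x, u: x ** 2 + 0.1 * u ** 2
print(one_step_lyapunov_mpc(2.0, dyn, V, cost, np.linspace(-3.0, 3.0, 61)))
```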
-
Learning Camera Miscalibration Detection
Andrei Cramariuc, Aleksandar Petrov, Rohit Suri, Mayank Mittal, Roland Siegwart, and Cesar Cadena
ICRA 2020
[Abs] [arXiv] [Code]
Self-diagnosis and self-repair are some of the key challenges in deploying robotic platforms for long-term real-world applications. One of the issues a robot can encounter is miscalibration of its sensors due to aging, environmental transients, or external disturbances. Precise calibration lies at the core of a variety of applications because of the need to accurately perceive the world. However, while a lot of work has focused on calibrating the sensors, not much has been done towards identifying when a sensor needs to be recalibrated. In this paper, we focus on a data-driven approach to learn the detection of miscalibration in vision sensors, specifically RGB cameras. Our contributions include a proposed miscalibration metric for RGB cameras and a novel semi-synthetic dataset generation pipeline based on this metric. Additionally, by training a deep convolutional neural network, we demonstrate the effectiveness of our pipeline at identifying whether a recalibration of the camera’s intrinsic parameters is required.
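For intuition about quantifying miscalibration, here is a hedged Python sketch that measures the average pixel displacement induced by swapping the true intrinsics for perturbed ones; this is only an illustrative proxy, and the paper's proposed metric accounts for the full camera model, including distortion.

```python
import numpy as np

def mean_pixel_displacement(K_true, K_mis, width=640, height=480, step=16):
    # Back-project a grid of pixels with the true intrinsics, re-project them
    # with the miscalibrated intrinsics, and average the pixel displacement.
    us, vs = np.meshgrid(np.arange(0, width, step), np.arange(0, height, step))
    pixels = np.stack([us.ravel(), vs.ravel(), np.ones(us.size)])
    rays = np.linalg.inv(K_true) @ pixels
    reproj = K_mis @ rays
    reproj = reproj[:2] / reproj[2]
    return float(np.mean(np.linalg.norm(reproj - pixels[:2], axis=0)))

K_true = np.array([[525.0, 0.0, 319.5], [0.0, 525.0, 239.5], [0.0, 0.0, 1.0]])
K_mis = np.array([[540.0, 0.0, 310.0], [0.0, 515.0, 245.0], [0.0, 0.0, 1.0]])
print(mean_pixel_displacement(K_true, K_mis))
```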
-
Vision-based Autonomous UAV Navigation and Landing for Urban Search and Rescue
Mayank Mittal, Rohit Mohan, Wolfram Burgard, and Abhinav Valada
ISRR 2019
[Abs] [arXiv] [Website]
Unmanned Aerial Vehicles (UAVs) equipped with bioradars are a life-saving technology that can enable identification of survivors under collapsed buildings in the aftermath of natural disasters such as earthquakes or gas explosions. However, these UAVs have to be able to autonomously navigate in disaster-struck environments and land on debris piles in order to accurately locate the survivors. This problem is extremely challenging as pre-existing maps cannot be leveraged for navigation due to structural changes that may have occurred, and existing landing site detection algorithms are not suitable for identifying safe landing regions on debris piles. In this work, we present a computationally efficient system for autonomous UAV navigation and landing that does not require any prior knowledge about the environment. We propose a novel landing site detection algorithm that computes costmaps based on several hazard factors, including terrain flatness, steepness, depth accuracy, and energy consumption information. We also introduce a first-of-its-kind synthetic dataset of over 1.2 million images of collapsed buildings with ground-truth depth, surface normals, semantics, and camera pose information. We demonstrate the efficacy of our system using experiments in a city-scale hyperrealistic simulation environment and in real-world scenarios with collapsed buildings.
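To illustrate how per-cell hazard factors can be fused into a single landing costmap, the sketch below normalizes and weight-sums four hazard maps; the specific weights, normalization, and random example data are assumptions, not the paper's tuned algorithm.

```python
import numpy as np

def landing_costmap(flatness, steepness, depth_uncertainty, energy,
                    weights=(0.4, 0.3, 0.2, 0.1)):
    # Normalize each per-cell hazard map to [0, 1] and take a weighted sum;
    # lower total cost marks a safer candidate landing cell.
    maps = (flatness, steepness, depth_uncertainty, energy)
    normalized = [(m - m.min()) / (np.ptp(m) + 1e-9) for m in maps]
    return sum(w * m for w, m in zip(weights, normalized))

rng = np.random.default_rng(0)
hazards = [rng.random((64, 64)) for _ in range(4)]
cost = landing_costmap(*hazards)
print(np.unravel_index(np.argmin(cost), cost.shape))  # safest cell index
```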
-
Vision-based Autonomous Landing in Catastrophe-Struck Environments
Mayank Mittal, Abhinav Valada, and Wolfram Burgard
Workshop on Vision-based Drones: What’s Next?, IROS 2018
[Abs] [arXiv] [Video] [PDF]
Unmanned Aerial Vehicles (UAVs) equipped with bioradars are a life-saving technology that can enable identification of survivors under collapsed buildings in the aftermath of natural disasters such as earthquakes or gas explosions. However, these UAVs have to be able to autonomously land on debris piles in order to accurately locate the survivors. This problem is extremely challenging as the structure of these debris piles is often unknown and no prior knowledge can be leveraged. In this work, we propose a computationally efficient system that is able to reliably identify safe landing sites and autonomously perform the landing maneuver. Specifically, our algorithm computes costmaps based on several hazard factors, including terrain flatness, steepness, depth accuracy, and energy consumption information. We first estimate dense candidate landing sites from the resulting costmap and then employ clustering to group neighboring sites into a safe landing region. Finally, a minimum-jerk trajectory is computed for landing, considering the surrounding obstacles and the UAV dynamics. We demonstrate the efficacy of our system using experiments in a city-scale hyperrealistic simulation environment and in real-world scenarios with collapsed buildings.
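For completeness, here is a small sketch of the classic fifth-order minimum-jerk profile for a rest-to-rest landing trajectory; the paper additionally accounts for surrounding obstacles and UAV dynamics, which this simplified profile ignores, and the waypoint spacing and duration below are illustrative assumptions.

```python
import numpy as np

def min_jerk_position(p_start, p_goal, t, duration):
    # Fifth-order minimum-jerk blend between two rest states:
    # p(t) = p0 + (pf - p0) * (10 s^3 - 15 s^4 + 6 s^5), with s = t / T.
    s = np.clip(t / duration, 0.0, 1.0)
    blend = 10 * s**3 - 15 * s**4 + 6 * s**5
    return np.asarray(p_start) + blend * (np.asarray(p_goal) - np.asarray(p_start))

# Descend from a hover point to a selected landing site over 5 seconds.
hover = [1.0, 2.0, 6.0]
site = [1.5, 2.5, 0.3]
waypoints = [min_jerk_position(hover, site, t, 5.0) for t in np.linspace(0.0, 5.0, 11)]
print(waypoints[-1])
```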