Get access to AUTOASSESS public results
Scientific Publications
Abstract:
In this paper, we propose a model for the thrust coefficient of propellers that can take into account cross-influence between adjacent propellers. The aerodynamic interaction between propellers in multirotor aerial vehicles reduces the thrust they can produce. The influence between propellers depends on their relative positioning and orientation, which are taken into account by the proposed model. It is validated on measurements collected by a force sensor mounted on a propeller for different configurations of the adjacent propellers in a support structure. In this work, we focus on configurations with small relative orientations. Results show that the proposed model outperforms the traditional constant model in terms of thrust prediction on the data we collected, and it performs better than other models with fewer parameters, being the only one with less than 10% maximum percentage error.
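The comparison above is reported as a maximum percentage error. As a minimal sketch, the metric could be computed as follows, assuming the "traditional constant model" means thrust equal to a fixed coefficient times the squared rotor speed. The function names and the form of the constant model are illustrative assumptions, not taken from the paper.

```python
from typing import Sequence


def constant_model_thrust(c_t: float, omega: float) -> float:
    """Thrust predicted by a constant-coefficient model (assumed form: T = c_t * omega^2)."""
    return c_t * omega ** 2


def max_percentage_error(predicted: Sequence[float], measured: Sequence[float]) -> float:
    """Largest absolute error relative to the measurement, in percent."""
    if len(predicted) != len(measured) or not measured:
        raise ValueError("predicted and measured must be non-empty and of equal length")
    if any(m == 0 for m in measured):
        raise ValueError("measured values must be non-zero")
    return max(abs(p - m) / abs(m) * 100.0 for p, m in zip(predicted, measured))
```

Because it reports the worst case rather than an average, this metric would penalize a model that fits most configurations well but fails on a single one.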
Abstract:
This study primarily aims to bridge the gap between simulated and real-life experiments of the Omnimorph platform. Through rigorous testing and analysis, it contributes to the broader goal of achieving stable flight for the Omnimorph, a milestone not yet realized. This thesis represents a significant step forward in enabling Omnimorph to operate effectively in real-world conditions for the first time.
Jiaxu Xing, Angel Romero, Leonard Bauersfeld, Davide Scaramuzza: Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight. CoRR abs/2403.12203 (2024)
Abstract:
Learning visuomotor policies for agile quadrotor flight presents significant difficulties, primarily from inefficient policy exploration caused by high-dimensional visual inputs and the need for precise and low-latency control. To address these challenges, we propose a novel approach that combines the performance of Reinforcement Learning (RL) and the sample efficiency of Imitation Learning (IL) in the task of vision-based autonomous drone racing. While RL provides a framework for learning high-performance controllers through trial and error, it faces challenges with sample efficiency and computational demands due to the high dimensionality of visual inputs. Conversely, IL efficiently learns from visual expert demonstrations, but it remains limited by the expert’s performance and state distribution. To overcome these limitations, our policy learning framework integrates the strengths of both approaches. Our framework contains three phases: training a teacher policy using RL with privileged state information, distilling it into a student policy via IL, and adaptive fine-tuning via RL. Testing in both simulated and real-world scenarios shows our approach can not only learn in scenarios where RL from scratch fails but also outperforms existing IL methods in both robustness and performance, successfully navigating a quadrotor through a race course using only visual information. Videos of the experiments are available at https://rpg.ifi.uzh.ch/bootstrap-rl-with-il/index.html.
Abstract:
Visual Odometry (VO) is crucial for autonomous robotic navigation, especially in GPS-denied environments like planetary terrains. To improve robustness, recent model-based VO systems have begun combining standard and event-based cameras. While event cameras excel in low-light and high-speed motion, standard cameras provide dense and easier-to-track features. However, the field of image- and event-based VO still predominantly relies on model-based methods and is yet to fully integrate recent image-only advancements leveraging end-to-end learning-based architectures. Seamlessly integrating the two modalities remains challenging due to their different nature, one asynchronous, the other not, limiting the potential for a more effective image- and event-based VO. We introduce RAMP-VO, the first end-to-end learned image- and event-based VO system. It leverages novel Recurrent, Asynchronous, and Massively Parallel (RAMP) encoders capable of fusing asynchronous events with image data, providing 8x faster inference and 33% more accurate predictions than existing solutions. Despite being trained only in simulation, RAMP-VO outperforms previous methods on the newly introduced Apollo and Malapert datasets, and on existing benchmarks, where it improves image- and event-based methods by 58.8% and 30.6%, paving the way for robust and asynchronous VO in space.
Abstract:
Quadrotors are among the most agile flying robots. Despite recent advances in learning-based control and computer vision, autonomous drones still rely on explicit state estimation. On the other hand, human pilots only rely on a first-person-view video stream from the drone onboard camera to push the platform to its limits and fly robustly in unseen environments. To the best of our knowledge, we present the first vision-based quadrotor system that autonomously navigates through a sequence of gates at high speeds while directly mapping pixels to control commands. Like professional drone-racing pilots, our system does not use explicit state estimation and leverages the same control commands humans use (collective thrust and body rates). We demonstrate agile flight at speeds up to 40 km/h with accelerations up to 2 g. This is achieved by training vision-based policies with reinforcement learning (RL). The training is facilitated using an asymmetric actor-critic with access to privileged information. To overcome the computational complexity during image-based RL training, we use the inner edges of the gates as a sensor abstraction. This simple yet robust, task-relevant representation can be simulated during training without rendering images. During deployment, a Swin-transformer-based gate detector is used. Our approach enables autonomous agile flight with standard, off-the-shelf hardware. Although our demonstration focuses on drone racing, we believe that our method has an impact beyond drone racing and can serve as a foundation for future research into real-world applications in structured environments.
Roxane Merat, Giovanni Cioffi, Leonard Bauersfeld, Davide Scaramuzza
Abstract:
Globally-consistent localization in urban environments is crucial for autonomous systems such as self-driving vehicles and drones, as well as assistive technologies for visually impaired people. Traditional Visual-Inertial Odometry (VIO) and Visual Simultaneous Localization and Mapping (VSLAM) methods, though adequate for local pose estimation, suffer from drift in the long term due to reliance on local sensor data. While GPS counteracts this drift, it is unavailable indoors and often unreliable in urban areas. An alternative is to localize the camera to an existing 3D map using visual-feature matching. This can provide centimeter-level accurate localization but is limited by the visual similarities between the current view and the map. This paper introduces a novel approach that achieves accurate and globally-consistent localization by aligning the sparse 3D point cloud generated by the VIO/VSLAM system to a digital twin using point-to-plane matching; no visual data association is needed. The proposed method provides a 6-DoF global measurement tightly integrated into the VIO/VSLAM system. Experiments run on a high-fidelity GPS simulator and real-world data collected from a drone demonstrate that our approach outperforms state-of-the-art VIO-GPS systems and offers superior robustness against viewpoint changes compared to the state-of-the-art Visual SLAM systems.
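The point-to-plane matching mentioned above can be illustrated with the textbook residual: the signed distance from a 3D point to a plane along the plane normal. The sketch below is an assumption-laden simplification, not the paper's formulation. It uses a single plane, hypothetical function names, and leaves out the association of points to surfaces of the digital twin.

```python
from math import sqrt
from typing import Sequence, Tuple

Vec3 = Tuple[float, float, float]


def point_to_plane_residual(point: Vec3, plane_point: Vec3, plane_normal: Vec3) -> float:
    """Signed distance from `point` to the plane through `plane_point` with normal `plane_normal`."""
    norm = sqrt(sum(c * c for c in plane_normal))
    if norm == 0.0:
        raise ValueError("plane normal must be non-zero")
    # Project the offset (point - plane_point) onto the normalized plane normal.
    return sum((p - q) * n for p, q, n in zip(point, plane_point, plane_normal)) / norm


def alignment_cost(points: Sequence[Vec3], plane_point: Vec3, plane_normal: Vec3) -> float:
    """Sum of squared point-to-plane residuals for points matched to one plane."""
    return sum(point_to_plane_residual(p, plane_point, plane_normal) ** 2 for p in points)
```

A cost of this kind only penalizes displacement along the normal, so points are free to slide within the plane, which is why no point-to-point or visual correspondence is required.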
Abstract:
In this work, we provide an experimental validation of the recent concepts of closed-loop state and input sensitivity in the context of robust flight control for a quadrotor (UAV) equipped with the popular PX4 controller (https://px4.io/). Our objective is to experimentally assess how the optimization of the reference trajectory w.r.t. these sensitivity metrics can improve the closed-loop system performance against model uncertainties commonly affecting the quadrotor systems. To accomplish this, we present a series of experiments designed to validate our optimization approach on two distinct trajectories, with the primary aim of assessing its precision in guiding the quadrotor through the center of a window at relatively high speeds. This approach provides some interesting insights for increasing the closed-loop robustness of the robot state and inputs against physical parametric uncertainties that may degrade the system’s performance.
Abstract:
This work presents techniques for scheduling the position controller gains for a class of fully-actuated morphing multi-rotor UAVs that use synchronized tilting to change their actuation capabilities. The feasible set of forces and torques that can be produced by the platform changes with the tilting angle; thus, the tracking and disturbance rejection capabilities also change. To exploit the platform limits, two methods are proposed for gain scheduling using a simplified example, then one method is tested in simulation with an omnidirectional morphing multi-rotor (OmniMorph). The simulation results show that the developed techniques achieve consistent position tracking performance across the range of tilting angles when rejecting step disturbance forces of values close to the maximum force capabilities. The proposed methods offer a trade-off between simplicity and accuracy and could potentially be applied to any multi-rotor with synchronized tilting capabilities. A video summary can be found at: https://youtu.be/kH-rrO8gWeU
Abstract:
Traditional volumetric fusion algorithms preserve the spatial structure of 3D scenes, which is beneficial for many tasks in computer vision and robotics. However, they often lack realism in terms of visualization. Emerging 3D Gaussian splatting bridges this gap, but existing Gaussian-based reconstruction methods often suffer from artifacts and inconsistencies with the underlying 3D structure, and struggle with real-time optimization, unable to provide users with immediate feedback in high quality. One of the bottlenecks arises from the massive amount of Gaussian parameters that need to be updated during optimization. Instead of using 3D Gaussian as a standalone map representation, we incorporate it into a volumetric mapping system to take advantage of geometric information and propose to use a quadtree data structure on images to drastically reduce the number of splats initialized. In this way, we simultaneously generate a compact 3D Gaussian map with fewer artifacts and a volumetric map on the fly. Our method, GSFusion, significantly enhances computational efficiency without sacrificing rendering quality, as demonstrated on both synthetic and real datasets. Code will be available at https://github.com/goldoak/GSFusion
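To illustrate how a quadtree on an image can reduce the number of initialized splats, the sketch below recursively subdivides a square grayscale image and stops wherever a region is nearly uniform, so that one leaf (and hence one splat) covers many pixels. The stopping rule (intensity range at or below a threshold), the function name, and the power-of-two image size are assumptions made for illustration. The abstract does not state the criterion GSFusion uses.

```python
from typing import List, Optional, Tuple

Leaf = Tuple[int, int, int]  # (top row, left column, side length)


def quadtree_leaves(
    image: List[List[float]],
    threshold: float,
    row: int = 0,
    col: int = 0,
    size: Optional[int] = None,
) -> List[Leaf]:
    """Return quadtree leaves; each leaf is a region whose intensity range is <= threshold."""
    if size is None:
        size = len(image)
        # Assumption: the image is square with a power-of-two side length.
        if size == 0 or size & (size - 1) or any(len(r) != size for r in image):
            raise ValueError("image must be square with a power-of-two side length")
    values = [image[r][c] for r in range(row, row + size) for c in range(col, col + size)]
    if size == 1 or max(values) - min(values) <= threshold:
        return [(row, col, size)]
    half = size // 2
    leaves: List[Leaf] = []
    for d_row in (0, half):
        for d_col in (0, half):
            leaves += quadtree_leaves(image, threshold, row + d_row, col + d_col, half)
    return leaves
```

Under this rule, a 4x4 image that is uniform except for one pixel yields 7 leaves instead of 16 per-pixel candidates, with the fine cells concentrated around the detail.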
Abstract:
This work explores the potential of using differentiable simulation for learning quadruped locomotion. Differentiable simulation promises fast convergence and stable training by computing low-variance first-order gradients using robot dynamics. However, its usage for legged robots is still limited to simulation. The main challenge lies in the complex optimization landscape of robotic tasks due to discontinuous dynamics. This work proposes a new differentiable simulation framework to overcome these challenges. Our approach combines a high-fidelity, non-differentiable simulator for forward dynamics with a simplified surrogate model for gradient backpropagation. This approach maintains simulation accuracy by aligning the robot states from the surrogate model with those of the precise, non-differentiable simulator. Our framework enables learning quadruped walking in simulation in minutes without parallelization. When augmented with GPU parallelization, our approach allows the quadruped robot to master diverse locomotion skills on challenging terrains in minutes. We demonstrate that differentiable simulation outperforms a reinforcement learning algorithm (PPO) by achieving significantly better sample efficiency while maintaining its effectiveness in handling large-scale environments. Our method represents one of the first successful applications of differentiable simulation to real-world quadruped locomotion, offering a compelling alternative to traditional RL methods.
Abstract:
This paper introduces for the first time the design, modelling, and control of a novel morphing multi-rotor Unmanned Aerial Vehicle (UAV) that we call the OmniMorph. The morphing ability allows the selection of the configuration that optimizes energy consumption while ensuring the needed maneuverability for the required task. The most energy-efficient uni-directional thrust (UDT) configuration can be used, e.g., during standard point-to-point displacements. Fully-actuated (FA) and omnidirectional (OD) configurations can instead be used for full pose tracking, e.g., constant-attitude horizontal motions and full rotations on the spot, and for full wrench 6D interaction control and 6D disturbance rejection. Morphing is obtained using a single servomotor, allowing possible minimization of weight, costs, and maintenance complexity. The actuation properties are studied, and an optimal controller that compromises between performance and control effort is proposed and validated in realistic simulations. Preliminary tests on the prototype are presented to assess the propellers’ mutual aerodynamic interference.
Abstract:
Quadrotor flight is an extremely challenging problem due to the limited control authority encountered at the limit of handling. Model Predictive Contouring Control (MPCC) has emerged as a promising model-based approach for time optimization problems such as drone racing. However, the standard MPCC formulation used in quadrotor racing introduces the notion of the gates directly in the cost function, creating a multi-objective optimization that continuously trades off between maximizing progress and tracking the path accurately. This paper introduces three key components that enhance the state-of-the-art MPCC approach for drone racing. First and foremost, we provide safety guarantees in the form of a track constraint and terminal set. The track constraint is designed as a spatial constraint which prevents gate collisions while allowing for time optimization only in the cost function. Second, we augment the existing first principles dynamics with a residual term that captures complex aerodynamic effects and thrust forces learned directly from real-world data. Third, we use Trust Region Bayesian Optimization (TuRBO), a state-of-the-art global Bayesian Optimization algorithm, to tune the hyperparameters of the MPCC controller given a sparse reward based on lap time minimization. The proposed approach achieves similar lap times to the best-performing RL policy and outperforms the best model-based controller while satisfying constraints. In both simulation and the real world, our approach consistently prevents gate crashes with a 100% success rate, while pushing the quadrotor to its physical limits, reaching speeds of more than 80 km/h.
Abstract:
This work answers positively the question of whether non-stop flights are possible while keeping the pose of cable-suspended objects constant. Such a counterintuitive answer paves the way for a paradigm shift where energetically efficient fixed-wing flying carriers can replace the inefficient multirotor carriers that have been used so far in precise cooperative cable-suspended aerial manipulation. First, we show that one or two flying carriers alone cannot perform non-stop flights while maintaining a constant pose of the suspended object. Instead, we prove that three flying carriers can achieve this task provided that the orientation of the load at the equilibrium is such that the components of the cable forces that balance the external force (typically gravity) do not belong to the plane of the cable anchoring points on the load. Numerical tests are presented in support of the analytical results.
Abstract:
In the last decade, the use of Multirotor Aerial Vehicles (MRAVs) across multiple domains has expanded significantly. Technological innovations have greatly enhanced their capabilities and rendered them well-suited for operating in perilous maritime environments such as ballast tanks. These confined and hazardous areas pose risks to human inspectors, making MRAVs a safer and more reliable alternative. However, there remains a need for optimized MRAV designs tailored to complex tasks, such as autonomous contact inspections in these areas. This thesis presents a generalized, systematic design procedure aimed at optimizing MRAVs based on their tasks and, through that, proposes a suitable design optimized for the task of autonomous contact inspection on the walls of a ballast tank, within the scope of AUTOASSESS. At the core of this procedure lies the formulation of the design problem as a constrained multivariable nonlinear optimization problem (NLP), allowing holistic consideration of task requirements through constraints or components of a unified objective function, derived from modified cost functions from the literature. The software outputs optimized MRAV designs tailored to the tasks, and with selected performance metrics one can evaluate and compare them to make informed decisions about the optimal configuration, based on task requirements and performance.
Abstract:
When considering close interaction between a human and an aerial robot, one aspect that is unique to this scenario compared with ground-based robotic platforms is the presence of wind disturbances that act on the aerial robot. For aerial robots to coexist and collaborate with humans, we must be able to guarantee the safety of humans even in the presence of wind disturbances. To ensure the safety of human collaborators in such scenarios, this thesis aims to establish a comprehensive safety metric that guides the robot’s motion during such interactions. The aerial robot considered for defining the safety metric is a hexarotor with a tilted-propeller configuration (tilt-hex). In the presence of a known wind disturbance, an aerodynamic model estimates the additional wrench this disturbance produces. The deviation due to this disturbance is then used to construct a safety metric that guides the aerial robot’s motion so that it operates safely within the environment. In the presence of wind, this safety metric ensures that the aerial robot maintains a safe distance from the human collaborator. The controller that embeds this safety metric is a Non-Linear Model Predictive Controller (NMPC). The controller generates control inputs such that the safety constraint is satisfied throughout the trajectory, thereby ensuring safe operation with the human collaborator. Simulations were performed to construct this safety metric, and the system was tested with various wind conditions. The system’s performance under these conditions was evaluated to determine whether the safety metric remained satisfied throughout the whole trajectory. Finally, a brief conclusion and potential directions for future work are given.
Zoric, Filip; Franchi, Antonio; Orsag, Matko; Kovacic, Zdenko; Gabellieri, Chiara: Towards Instance Segmentation-Based Litter Collection with Multi-Rotor Aerial Vehicle // 2024 International Conference on Unmanned Aircraft Systems / Tsourveloudis, Nikos; Theilliol, Didier (eds.). Piscataway (NJ): IEEE, 2024. pp. 631-637.
Abstract:
This paper presents a novel aerial robotics application: instance segmentation-based collection of floating litter with a multi-rotor aerial vehicle (MRAV). In the scope of the paper, we present a review of the available datasets for litter detection and segmentation. The reviewed datasets are used to train a Mask-RCNN neural network for instance segmentation. The neural network is deployed off-board on an edge computing device and used for litter position estimation. Based on the estimated litter position, we plan a pickup path as a quadratic Bezier curve. We compare different trajectory generation methods for the object pickup. The system is verified in a laboratory environment. Finally, we present practical considerations and improvements necessary to enable autonomous litter collection with an MRAV.
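A quadratic Bezier curve is defined by three control points, B(t) = (1 - t)^2 P0 + 2(1 - t)t P1 + t^2 P2 for t in [0, 1]. The sketch below samples waypoints along such a curve. How the paper chooses the control points is not stated in the abstract, so the reading of P0 as the vehicle position, P2 as a point past the litter, and P1 as a low point that shapes the swoop is a hypothetical illustration, as are the function names.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def quadratic_bezier(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate B(t) = (1-t)^2 p0 + 2(1-t)t p1 + t^2 p2 for t in [0, 1]."""
    if not 0.0 <= t <= 1.0:
        raise ValueError("t must lie in [0, 1]")
    a, b, c = (1.0 - t) ** 2, 2.0 * (1.0 - t) * t, t ** 2
    x, y, z = (a * c0 + b * c1 + c * c2 for c0, c1, c2 in zip(p0, p1, p2))
    return (x, y, z)


def sample_path(p0: Point, p1: Point, p2: Point, n: int) -> List[Point]:
    """Return n evenly spaced (in t) waypoints from p0 to p2."""
    if n < 2:
        raise ValueError("need at least two waypoints")
    return [quadratic_bezier(p0, p1, p2, i / (n - 1)) for i in range(n)]
```

The curve starts at P0 and ends at P2 but in general does not pass through P1, which only pulls the path toward it. With P0 = (0, 0, 2), P1 = (1, 0, 0), and P2 = (2, 0, 2), the midpoint of the path is (1, 0, 1), halfway down to the control point.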
Abstract:
We propose visual-inertial simultaneous localization and mapping that tightly couples sparse reprojection errors, inertial measurement unit pre-integrals, and relative pose factors with dense volumetric occupancy mapping. Here, depth predictions from a deep neural network are fused in a fully probabilistic manner. Specifically, our method is rigorously uncertainty-aware: first, we use depth and uncertainty predictions from a deep network not only from the robot’s stereo rig, but we further probabilistically fuse motion stereo that provides depth information across a range of baselines, thereby drastically increasing mapping accuracy. Next, predicted and fused depth uncertainty propagates not only into occupancy probabilities but also into alignment factors between generated dense submaps that enter the probabilistic nonlinear least squares estimator. This submap representation offers globally consistent geometry at scale. Our method is thoroughly evaluated on two benchmark datasets, resulting in localization and mapping accuracy that exceeds the state of the art, while simultaneously offering volumetric occupancy directly usable for downstream robotic planning and control in real-time.