Memorize What Matters: Emergent Scene Decomposition from Multitraverse

Humans naturally retain memories of permanent elements, while ephemeral moments often slip through the cracks of memory. This selective retention is crucial for robotic perception, localization, and mapping. To endow robots with this capability, we introduce 3D Gaussian Mapping (3DGM), a self-supervised, camera-only offline mapping framework grounded in 3D Gaussian Splatting. 3DGM converts multitraverse RGB videos from the same region into a Gaussian-based environmental map while concurrently performing 2D ephemeral object segmentation. Our key observation is that the environment remains consistent across traversals, while objects frequently change. This allows us to exploit self-supervision from repeated traversals to achieve environment-object decomposition. More specifically, 3DGM formulates multitraverse environmental mapping as a robust differentiable rendering problem, treating pixels of the environment and objects as inliers and outliers, respectively. Using robust feature distillation, feature residuals mining, and robust optimization, 3DGM jointly performs 3D mapping and 2D segmentation without human intervention. We build the Mapverse benchmark, sourced from the Ithaca365 and nuPlan datasets, to evaluate our method in unsupervised 2D segmentation, 3D reconstruction, and neural rendering. Extensive results verify the effectiveness and potential of our method for self-driving and robotics.
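The core idea of treating environment pixels as inliers and ephemeral-object pixels as outliers can be illustrated with a toy sketch. The snippet below is an illustrative simplification, not the paper's implementation: `kappa` is a hypothetical robustness threshold, and the per-pixel residual stands in for the feature residuals the paper distills and mines. Large-residual pixels are flagged as ephemeral objects and downweighted in the mapping loss, yielding a 2D object mask as a byproduct of robust 3D mapping.

```python
import numpy as np

def robust_decompose(rendered, observed, kappa=1.0):
    """Toy environment-object decomposition via robust residuals.

    Pixels whose residual between the rendered environment map and the
    observed frame is small are treated as environment (inliers); pixels
    with large residuals are flagged as ephemeral objects (outliers) and
    excluded from the mapping loss. `kappa` is an assumed threshold.
    """
    # Per-pixel residual, averaged over channels (H, W, C) -> (H, W)
    residual = np.abs(rendered - observed).mean(axis=-1)
    # 2D ephemeral-object segmentation emerges from the outlier test
    object_mask = residual > kappa
    # Truncated (robust) loss: outlier pixels get zero weight
    weights = np.where(object_mask, 0.0, 1.0)
    loss = float((weights * residual**2).mean())
    return object_mask, loss
```

In practice the same principle applies to distilled feature maps rather than raw RGB, and the threshold would be replaced by a robust estimator tuned during optimization; this sketch only shows why repeated traversals of a static environment make the decomposition self-supervised.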

https://arxiv.org/abs/2405.17187

https://arxiv.org/pdf/2405.17187.pdf
