Journal Article

Citation

Karlsson R, Asfandiyarov R, Carballo A, Fujii K, Ohtani K, Takeda K. Sensors (Basel) 2024; 24(14).

Copyright

(Copyright © 2024, MDPI: Multidisciplinary Digital Publishing Institute)

DOI

10.3390/s24144735

PMID

39066133

PMCID

PMC11281213

Abstract

Cognitive scientists believe that adaptable intelligent agents like humans perform spatial reasoning tasks by learned causal mental simulation. The problem of learning these simulations is called predictive world modeling. We present the first framework for learning an open-vocabulary predictive world model (OV-PWM) from sensor observations. The model is implemented through a hierarchical variational autoencoder (HVAE) capable of predicting diverse and accurate fully observed environments from accumulated partial observations. We show that the OV-PWM can model high-dimensional embedding maps of latent compositional embeddings representing sets of overlapping semantics, which are inferable by sufficient similarity inference. The OV-PWM simplifies the prior two-stage closed-set PWM approach to a single-stage end-to-end learning method. CARLA simulator experiments show that the OV-PWM can learn compact latent representations and generate diverse and accurate worlds with fine details like road markings, achieving 69 mIoU over six query semantics on an urban evaluation sequence. We propose the OV-PWM as a versatile continual learning paradigm for providing spatio-semantic memory and learned internal simulation capabilities to future general-purpose mobile robots.
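
The abstract's idea of querying an embedding map by "sufficient similarity" and scoring the result with mIoU can be illustrated with a minimal sketch. This is not the authors' implementation: the use of cosine similarity, the 0.7 threshold, the function names, and the toy 2x2 map with 2-D embeddings are all assumptions made for illustration only.

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors (assumed similarity measure)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0.0 else 0.0

def query_mask(embedding_map, query, threshold):
    """Mark every map cell whose embedding is sufficiently similar to the query."""
    return [[cosine(cell, query) >= threshold for cell in row]
            for row in embedding_map]

def iou(pred, truth):
    """Intersection over union of two boolean masks of equal shape."""
    pairs = [(p, t) for pr, tr in zip(pred, truth) for p, t in zip(pr, tr)]
    inter = sum(1 for p, t in pairs if p and t)
    union = sum(1 for p, t in pairs if p or t)
    return inter / union if union else 1.0

def mean_iou(embedding_map, queries, truths, threshold=0.7):
    """Average the per-query IoU over all query semantics."""
    scores = [iou(query_mask(embedding_map, queries[name], threshold), truths[name])
              for name in queries]
    return sum(scores) / len(scores)

# Hypothetical toy data: one cell ([0.7, 0.7]) carries two overlapping semantics.
embedding_map = [[[1.0, 0.0], [0.7, 0.7]],
                 [[0.0, 1.0], [1.0, 0.1]]]
queries = {"road": [1.0, 0.0], "marking": [0.0, 1.0]}
truths = {"road": [[True, True], [False, True]],
          "marking": [[False, False], [True, False]]}

road_mask = query_mask(embedding_map, queries["road"], 0.7)
marking_mask = query_mask(embedding_map, queries["marking"], 0.7)
score = mean_iou(embedding_map, queries, truths)
```

In this toy case the cell with embedding `[0.7, 0.7]` passes the threshold for both queries, which is how a single compositional embedding can represent overlapping semantics. A closed-set model would instead assign each cell exactly one class.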


Language: en

Keywords

autonomous driving; BEV generation; continual learning; generative models; mobile robots; open-vocabulary semantics; self-supervised learning; world models
