

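VLN-CE is an instruction-guided navigation task with crowdsourced instructions, realistic environments, and unconstrained agent navigation. It supports both the Room-to-Room (R2R) and Room-Across-Room (RxR) datasets.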
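
The scene files are located at data/scene_datasets/mp3d/{scene}/{scene}.glb, for a total of 90 scenes.

The R2R data comes in two versions, R2R_VLNCE_v1-3 and R2R_VLNCE_v1-3_preprocessed, located at data/datasets/R2R_VLNCE_v1-3 and data/datasets/R2R_VLNCE_v1-3_preprocessed respectively.

The RxR data has four splits (train, val_seen, val_unseen, and test_challenge), structured as follows:
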
data/datasets
├─ RxR_VLNCE_v0
| ├─ train
| | ├─ train_guide.json.gz
| | ├─ train_guide_gt.json.gz
| | ├─ train_follower.json.gz
| | ├─ train_follower_gt.json.gz
| ├─ val_seen
| | ├─ val_seen_guide.json.gz
| | ├─ val_seen_guide_gt.json.gz
| | ├─ val_seen_follower.json.gz
| | ├─ val_seen_follower_gt.json.gz
| ├─ val_unseen
| | ├─ val_unseen_guide.json.gz
| | ├─ val_unseen_guide_gt.json.gz
| | ├─ val_unseen_follower.json.gz
| | ├─ val_unseen_follower_gt.json.gz
| ├─ test_challenge
| | ├─ test_challenge_guide.json.gz
| ├─ text_features
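| | ├─ ...

Model weights are located at data/ddppo-models/{model}.pth.

The scene data is downloaded with the download_mp.py script, and the gdown command is also used for downloading.

If you use the VLN-CE dataset, please cite the following paper:
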
@inproceedings{krantz_vlnce_2020,
  title={Beyond the Nav-Graph: Vision and Language Navigation in Continuous Environments},
  author={Jacob Krantz and Erik Wijmans and Arjun Majumdar and Dhruv Batra and Stefan Lee},
  booktitle={European Conference on Computer Vision (ECCV)},
  year={2020}
}

If you use the RxR-Habitat data, please additionally cite the following paper:

@inproceedings{ku2020room,
  title={Room-Across-Room: Multilingual Vision-and-Language Navigation with Dense Spatiotemporal Grounding},
  author={Ku, Alexander and Anderson, Peter and Patel, Roma and Ie, Eugene and Baldridge, Jason},
  booktitle={Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  pages={4392--4412},
  year={2020}
}
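
The directory layout described earlier (the RxR split tree and the {scene}/{scene}.glb scene layout) can be sanity-checked with a short script. The following is a minimal sketch, not part of VLN-CE: the helper names `expected_rxr_files`, `missing_paths`, and `count_scenes` are hypothetical, the default paths are simply the ones listed in this document, and the text_features directory is skipped because its contents are elided in the tree.

```python
from pathlib import Path

# Split and file-role names as listed in the RxR directory tree.
RXR_SPLITS = ("train", "val_seen", "val_unseen", "test_challenge")
RXR_ROLES = ("guide", "guide_gt", "follower", "follower_gt")


def expected_rxr_files(root="data/datasets/RxR_VLNCE_v0"):
    """Return the RxR file paths implied by the tree (text_features omitted)."""
    root = Path(root)
    paths = []
    for split in RXR_SPLITS:
        # The tree lists only the guide file for test_challenge.
        roles = ("guide",) if split == "test_challenge" else RXR_ROLES
        for role in roles:
            paths.append(root / split / f"{split}_{role}.json.gz")
    return paths


def missing_paths(paths):
    """Return the subset of paths that do not exist on disk."""
    return [p for p in paths if not Path(p).exists()]


def count_scenes(scene_root="data/scene_datasets/mp3d"):
    """Count scene folders that follow the {scene}/{scene}.glb layout."""
    root = Path(scene_root)
    if not root.is_dir():
        return 0
    return sum(
        1
        for d in root.iterdir()
        if d.is_dir() and (d / f"{d.name}.glb").is_file()
    )


if __name__ == "__main__":
    for p in missing_paths(expected_rxr_files()):
        print("missing:", p)
    print("scenes found:", count_scenes(), "(expected 90)")
```

Run from the project root, the script prints any RxR files absent from the expected locations and the number of scenes found, which can be compared against the 90 scenes stated above.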