π0: A Vision-Language-Action Flow Model for General Robot Control

π0: A Vision-Language-Action Flow Model for General Robot Control (2024) has been cited 1,897 times according to Google Scholar. CitationMap has resolved 71 citing papers from institutions across 7 countries.

See Brian Ichter's full citation map →

Where this paper is cited

  • United States (2)
  • Germany (2)
  • Japan (1)
  • China (1)
  • South Korea (1)
  • Hong Kong (1)
  • France (1)

Top citing institutions

  • Nvidia (1)
  • Institute of Science Tokyo (1)
  • Mitsubishi Electric (1)
  • Peking University (1)
  • LMU Munich (1)
  • Disney Research; ETH Zürich (1)
  • Honda Research Institute Europe (1)
  • Pohang University of Science and Technology (1)
  • Konkuk University (1)
  • Intelligent Autonomous Systems lab, TU Darmstadt (1)
  • Emory University (1)
  • Peking University, Stanford University (1)

Papers citing this work (71 resolved)

  1. PD-VLA: Accelerating Vision-Language-Action Model Integrated with Action Chunking via Parallel Decoding

    · Wenxuan Song, Jiayi Chen, Pengxiang Ding, Han Zhao +5 more

  2. Disambiguate Gripper State in Grasp-Based Tasks: Pseudo-Tactile as Feedback Enables Pure Simulation Learning

    · Yifei Yang, Lu Chen, Zherui Song, Yenan Chen +4 more

  3. Stable Offline Hand-Eye Calibration for any Robot with Just One Mark

    · Sicheng Xie, Lingchen Meng, Zhiying Du, Shuyuan Tu +4 more

  4. GWM: Towards Scalable Gaussian World Models for Robotic Manipulation

    · Guanxing Lu, Baoxiong Jia, Puhao Li, Yixin Chen +3 more

  5. Query-Centric Diffusion Policy for Generalizable Robotic Assembly

    · Ziyi Xu, Hao-ming Lin, Shiqi Liu, Ding Zhao

  6. VENTURA: Adapting Image Diffusion Models for Unified Task Conditioned Navigation

    · Arthur Zhang, Xiangyun Meng, L. Calliari, Dong-Ki Kim +4 more

  7. Robust Instant Policy: Leveraging Student’s t-Regression Model for Robust In-context Imitation Learning of Robot Manipulation

    · Hanbit Oh, Andrea M. Salcedo-Vázquez, I. Ramirez-Alpizar, Y. Domae

  8. ManiFlow: A General Robot Manipulation Policy via Consistency Flow Training

    · Ge Yan, Jiyue Zhu, Yuquan Deng, Shiqi Yang +7 more

  9. Zero-Shot Peg Insertion: Identifying Mating Holes and Estimating SE(2) Poses with Vision-Language Models

    · M. Yajima, Keita Ota, Asako Kanezaki, Rei Kawakami +4 more

  10. Hierarchical Framework for Constrained Dual-Arm Cooperative Manipulation with Whole-Body Collision Avoidance

    · Silong Zhang, Quecheng Qiu, Yingtai Ni, Yuechen Shao +2 more

  11. CREATE: Cross-Layer Resilience Characterization and Optimization for Efficient yet Reliable Embodied AI Systems

    · Tong Xie, Yijiahao Qi, Jinqi Wen, Zishen Wan +10 more

  12. The Dual-System Hierarchical Architecture: A Future Paradigm for Vision-Language-Action Models

    · Wenlong Chen, Zhen Tian, Zhou Zhou, Youhua Xia

  13. ROSClaw: A Hierarchical Semantic-Physical Framework for Heterogeneous Multi-Agent Collaboration

    · Rongfeng Zhao, Xuanhao Zhang, Zhaochen Guo, Xi Shao +3 more

  14. An Object Placement Optimization System for Efficient and Unbiased Imitation Learning Data Collection

    · Hiromasa Yamaguchi, Yuga Yano, H. Tamukoh

  15. Robotic Task Ambiguity Resolution via Natural Language Interaction

    · Eugenio Chisari, Jan Ole von Hartz, Fabien Despinoy, A. Valada

  16. RoboEnvision: A Long-Horizon Video Generation Model for Multi-Task Robot Manipulation

    · Liudi Yang, Yang Bai, George Eskandar, Fengyi Shen +6 more

  17. Scaling World Model for Hierarchical Manipulation Policies

    · Qian Long, Yueze Wang, Jiaxin Song, Junbo Zhang +12 more

  18. Eye-In-Finger: Smart Fingers for Delicate Assembly and Disassembly of LEGO

    · Zhenran Tang, Ruixuan Liu, Changliu Liu

  19. Autonomous Human-Robot Interaction via Operator Imitation

    · S. Christen, David Muller, Agon Serifi, R. Grandia +4 more

  20. SurgiPose: Estimating Surgical Tool Kinematics from Monocular Video for Surgical Robot Learning

    · Juo-Tung Chen, Xinhao Chen, Ji Woong Kim, P. M. Scheikl +2 more

  21. Interactive Object Detection by Mitigating Uncertainty of Robot Task Plans using Large Language Model

    · Kanata Suzuki, Akane Ushizaka, Kazuki Hori, Tetsuya Ogata

  22. MEAT: Mixture of Experts in Action Transformer for Robotic Arm Control

    · N. Islam, H. Mai, Ying-Jen Chen

  23. Efficient Inference for Vision-Language-Action Models: A Comprehensive Review of Acceleration Techniques

    · Fuxiong Zhou

  24. Learning a Unified Policy for Position and Force Control in Legged Loco-Manipulation

    · Peiyuan Zhi, Peiyang Li, Jianqin Yin, Baoxiong Jia +1 more

  25. Efficient Task-Specific Conditional Diffusion Policies: Shortcut Model Acceleration and SO(3) Optimization

    · Haiyong Yu, Yanqiong Jin, Yonghao He, Wei Sui

  26. Affordance-based Robot Manipulation with Flow Matching

    · Fan Zhang, Michael Gienger

  27. DextER: Language-driven Dexterous Grasp Generation with Embodied Reasoning

    · Junha Lee, Eunha Park, Minsu Cho

  28. Adelia: A 4-nm LLM Processing Unit With Streamlined Dataflow and Dual-Mode Parallelism for Maximizing Hardware Efficiency

    · Sukbin Lim, Jung-Hoon Kim, Seungjae Moon, Junseo Cha +5 more

  29. AgiBot World Colosseo: A Large-Scale Manipulation Platform for Scalable and Intelligent Embodied Systems

    · AgiBot-World-Contributors, Qingwen Bu, Jisong Cai, Li Chen +47 more

  30. Survey of π0, π0-FAST, and π0.5: Vision-Language-Action Models in the Physical AI Framework

    · Seonghyun Kim, Samyeul Noh, Ingook Jang

  31. STEP Planner: Constructing cross-hierarchical subgoal tree as an embodied long-horizon task planner

    · Tianxing Zhou, Zhirui Wang, Haojia Ao, Guangyan Chen +4 more

  32. FLAME: A Federated Learning Benchmark for Robotic Manipulation

    · Santiago Bou Betran, A. Longhini, Miguel Vasco, Yuchong Zhang +1 more

  33. From Modular to End-to-End: Practical Exploration of Vision-Language-Action (VLA) Systems in Power Distribution Grid Inspections

    · Ankai Zhang, Guozheng Peng, Rui Song, Zheng Wang +2 more

  34. Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation

    · Xiaohuan Pei, Yuxin Chen, Siyu Xu, Yunke Wang +2 more

  35. History-Conditioned Spatio-Temporal Visual Token Pruning for Efficient Vision-Language Navigation

    · Qitong Wang, Yijun Liang, Ming Li, Tianyi Zhou +1 more

  36. MORE: Mobile Manipulation Rearrangement Through Grounded Language Reasoning

    · Mohammad Mohammadi, Daniel Honerkamp, M. Büchner, Matteo Cassinelli +4 more

  37. UniBiDex: A Unified Teleoperation Framework for Robotic Bimanual Dexterous Manipulation

    · Zhongxuan Li, Zeliang Guo, Jun Hu, D. Navarro-Alarcón +3 more

  38. Open-source vision-language-action models for robotics

    · Linfeng Wang, Deok-Jin Lee

  39. LLM-Based Decision Making Framework for Autonomous Drone Navigation

    · Mirza Aarish Baig, Brad Alvarez, Richard Lage, Jayesh Soni +1 more

  40. Sampling-Based Model Predictive Control for Dexterous Manipulation on a Biomimetic Tendon-Driven Hand

    · Adrian Hess, Alexander M. Kübler, Benedek Forrai, M. Dogar +1 more

  41. XRoboToolkit: A Cross-Platform Framework for Robot Teleoperation

    · Zhigen Zhao, Liuchuan Yu, Ke Jing, Ning Yang

  42. Collaborative Multi-Robot Non-Prehensile Manipulation via Flow-Matching Co-Generation

    · Yorai Shaoul, Zhe Chen, M. Mohamed, Federico Pecora +2 more

  43. Teaching RL Agents to Act Better: VLM as Action Advisor for Online Reinforcement Learning

    · Xiefeng Wu, Jing Zhao, Shu Zhang, Ming Hu

  44. Learning Generalizable Language-Conditioned Cloth Manipulation from Long Demonstrations

    · Han Zhao, Jinxuan Zhu, Zihao Yan, Yichen Li +2 more

  45. DRL-VLA: An Optimization Method for VLA Model Based on Deep Reinforcement Learning

    · Mengkun Zhang, Pengfei Gao, Yinuo Sheng, Ran Li +6 more

  46. RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning

    · Kun Lei, Huanyu Li, Dongjie Yu, Zhenyu Wei +5 more

  47. From Knowing to Doing Precisely: A General Self-Correction and Termination Framework for VLA models

    · Wentao Zhang, Aolan Sun, Wentao Mo, Xiaoyang Qu +2 more

  48. Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation

    · Yue Liao, Pengfei Zhou, Siyuan Huang, Donglin Yang +10 more

  49. ReFineVLA: Multimodal Reasoning-Aware Generalist Robotic Policies via Teacher-Guided Fine-Tuning

    · T. Vo, Tan Q. Nguyen, Khang Nguyen, Nhat Tran +4 more

  50. Embodied AI: From LLMs to World Models [Feature]

    · Tongtong Feng, Xin Wang, Yu-Gang Jiang, Wenwu Zhu

  51. VLC: A Human-Robot-Collaboration Framework with Vision-Language-Model

    · Zilong Chen, Lebin Liang, Hao Dong, Dehao Kong +2 more

  52. Building Explicit World Model for Zero-Shot Open-World Object Manipulation

    · Xiaotong Li, Gang Chen, Javier Alonso-Mora

  53. A Memory-Augmented Dual-Stream Framework to Achieve Long-Horizon Generalization In Robotic Manipulation

    · Yuxin Zheng, W. Tao, Wentao Mo, Naifu Zhang +3 more

  54. HMVLA: Hyperbolic Multimodal Fusion for Vision-Language-Action Models

    · Kun Wang, Xiaokun Feng, M. Qu, Tonghua Su

  55. HannesImitation: Grasping with the Hannes Prosthetic Hand via Imitation Learning

    · Carlo Alessi, F. Vasile, Federico Ceola, Giulia Pasquale +2 more

  56. Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning

    · Jiyuan Shi, Xinzhe Liu, Dewei Wang, Ouyang Lu +4 more

  57. End-to-End Seam Tracking with Flow Matching-Based Diffusion Policy

    · Zhaoqi Chu, Xiangrong Liu, Xuhui Que, Bo Yu +1 more

  58. Closed-Form Robustness Bounds for Second-Order Pruning of Neural Controller Policies

    · Maksym Shamrai

  59. Steering Diffusion Policies with Value-Guided Denoising

  60. Can Multimodal LLMs Perform Time Series Anomaly Detection?

    · Xiongxiao Xu, Haoran Wang, Yueqing Liang, Philip S. Yu +2 more

  61. DailyArt: Discovering Articulation from Single Static Images via Latent Dynamics

    · Hang Zhang, Qijian Tian, Jingyu Gong, Daoguo Dong +3 more

  62. Rethinking Video Generation Model for the Embodied World

    · Yufan Deng, Zilin Pan, Hongyu Zhang, Xiaojie Li +5 more

  63. M100: An Orchestrated Dataflow Architecture Powering General AI Computing

    · Yancheng Xie, Changkui Mao, Chan-gui Wu, Chaochao Lu +33 more

  64. ForeAct: Steering Your VLA with Efficient Visual Foresight Planning

    · Zhuoyang Zhang, Shang Yang, Qinghao Hu, Luke J. Huang +4 more

  65. Self-Refining Vision Language Model for Robotic Failure Detection and Reasoning

    · Carl Qi, Xiaojie Wang, Silong Yong, Stephen Sheng +5 more

  66. RT-Cache: Training-Free Retrieval for Real-Time Manipulation

    · O.-Kil Kwon, Abraham George, Alison Bartsch, A. Farimani

  67. HACTS: a Human-As-Copilot Teleoperation System for Robot Learning

    · Zhiyuan Xu, Yinuo Zhao, Kun Wu, Ning Liu +4 more

  68. Extremum Flow Matching for Offline Goal Conditioned Reinforcement Learning

    · Quentin Rouxel, Clemente Donoso, Fei Chen, S. Ivaldi +6 more

  69. Taking Shortcuts for Categorical VQA Using Super Neurons

    · Pierre Musacchio, Jae-Yong Jeong, Dahun Kim, Jaesik Park

  70. Enhancing Robustness in Language-Driven Robotics: A Modular Approach to Failure Reduction

    · Émiland Garrabé, Pierre Teixeira, Mahdi Khoramshahi, Stéphane Doncieux

  71. Skin-Machine Interface with Multimodal Contact Motion Classifier

    · Alberto Confente, Takanori Jin, Taisuke Kobayashi, J. R. Guadarrama-Olvera +1 more
