“π0: A Vision-Language-Action Flow Model for General Robot Control” (2024) has been cited 1,897 times according to Google Scholar. CitationMap has resolved 71 citing papers from institutions across 7 countries.
· Wenxuan Song, Jiayi Chen, Pengxiang Ding, Han Zhao +5 more
· Yifei Yang, Lu Chen, Zherui Song, Yenan Chen +4 more
Stable Offline Hand-Eye Calibration for any Robot with Just One Mark
· Sicheng Xie, Lingchen Meng, Zhiying Du, Shuyuan Tu +4 more
GWM: Towards Scalable Gaussian World Models for Robotic Manipulation
· Guanxing Lu, Baoxiong Jia, Puhao Li, Yixin Chen +3 more
Query-Centric Diffusion Policy for Generalizable Robotic Assembly
· Ziyi Xu, Hao-ming Lin, Shiqi Liu, Ding Zhao
VENTURA: Adapting Image Diffusion Models for Unified Task Conditioned Navigation
· Arthur Zhang, Xiangyun Meng, L. Calliari, Dong-Ki Kim +4 more
· Hanbit Oh, Andrea M. Salcedo-Vázquez, I. Ramirez-Alpizar, Y. Domae
ManiFlow: A General Robot Manipulation Policy via Consistency Flow Training
· Ge Yan, Jiyue Zhu, Yuquan Deng, Shiqi Yang +7 more
· M. Yajima, Keita Ota, Asako Kanezaki, Rei Kawakami +4 more
· Silong Zhang, Quecheng Qiu, Yingtai Ni, Yuechen Shao +2 more
· Tong Xie, Yijiahao Qi, Jinqi Wen, Zishen Wan +10 more
The Dual-System Hierarchical Architecture: A Future Paradigm for Vision-Language-Action Models
· Wenlong Chen, Zhen Tian, Zhou Zhou, Youhua Xia
ROSClaw: A Hierarchical Semantic-Physical Framework for Heterogeneous Multi-Agent Collaboration
· Rongfeng Zhao, Xuanhao Zhang, Zhaochen Guo, Xi Shao +3 more
· Hiromasa Yamaguchi, Yuga Yano, H. Tamukoh
Robotic Task Ambiguity Resolution via Natural Language Interaction
· Eugenio Chisari, Jan Ole von Hartz, Fabien Despinoy, A. Valada
RoboEnvision: A Long-Horizon Video Generation Model for Multi-Task Robot Manipulation
· Liudi Yang, Yang Bai, George Eskandar, Fengyi Shen +6 more
Scaling World Model for Hierarchical Manipulation Policies
· Qian Long, Yueze Wang, Jiaxin Song, Junbo Zhang +12 more
Eye-In-Finger: Smart Fingers for Delicate Assembly and Disassembly of LEGO
· Zhenran Tang, Ruixuan Liu, Changliu Liu
Autonomous Human-Robot Interaction via Operator Imitation
· S. Christen, David Muller, Agon Serifi, R. Grandia +4 more
SurgiPose: Estimating Surgical Tool Kinematics from Monocular Video for Surgical Robot Learning
· Juo-Tung Chen, Xinhao Chen, Ji Woong Kim, P. M. Scheikl +2 more
· Kanata Suzuki, Akane Ushizaka, Kazuki Hori, Tetsuya Ogata
MEAT: Mixture of Experts in Action Transformer for Robotic Arm Control
· N. Islam, H. Mai, Ying-Jen Chen
· Fuxiong Zhou
Learning a Unified Policy for Position and Force Control in Legged Loco-Manipulation
· Peiyuan Zhi, Peiyang Li, Jianqin Yin, Baoxiong Jia +1 more
· Haiyong Yu, Yanqiong Jin, Yonghao He, Wei Sui
Affordance-based Robot Manipulation with Flow Matching
· Fan Zhang, Michael Gienger
DextER: Language-driven Dexterous Grasp Generation with Embodied Reasoning
· Junha Lee, Eunha Park, Minsu Cho
· Sukbin Lim, Jung-Hoon Kim, Seungjae Moon, Junseo Cha +5 more
· AgiBot-World-Contributors, Qingwen Bu, Jisong Cai, Li Chen +47 more
Survey of π0, π0-FAST, and π0.5: Vision-Language-Action Models in the Physical AI Framework
· Seonghyun Kim, Samyeul Noh, Ingook Jang
STEP Planner: Constructing cross-hierarchical subgoal tree as an embodied long-horizon task planner
· Tianxing Zhou, Zhirui Wang, Haojia Ao, Guangyan Chen +4 more
FLAME: A Federated Learning Benchmark for Robotic Manipulation
· Santiago Bou Betran, A. Longhini, Miguel Vasco, Yuchong Zhang +1 more
· Ankai Zhang, Guozheng Peng, Rui Song, Zheng Wang +2 more
Action-aware Dynamic Pruning for Efficient Vision-Language-Action Manipulation
· Xiaohuan Pei, Yuxin Chen, Siyu Xu, Yunke Wang +2 more
History-Conditioned Spatio-Temporal Visual Token Pruning for Efficient Vision-Language Navigation
· Qitong Wang, Yijun Liang, Ming Li, Tianyi Zhou +1 more
MORE: Mobile Manipulation Rearrangement Through Grounded Language Reasoning
· Mohammad Mohammadi, Daniel Honerkamp, M. Büchner, Matteo Cassinelli +4 more
UniBiDex: A Unified Teleoperation Framework for Robotic Bimanual Dexterous Manipulation
· Zhongxuan Li, Zeliang Guo, Jun Hu, D. Navarro-Alarcón +3 more
Open-source vision-language-action models for robotics
· Linfeng Wang, Deok-Jin Lee
LLM-Based Decision Making Framework for Autonomous Drone Navigation
· Mirza Aarish Baig, Brad Alvarez, Richard Lage, Jayesh Soni +1 more
· Adrian Hess, Alexander M. Kübler, Benedek Forrai, M. Dogar +1 more
XRoboToolkit: A Cross-Platform Framework for Robot Teleoperation
· Zhigen Zhao, Liuchuan Yu, Ke Jing, Ning Yang
Collaborative Multi-Robot Non-Prehensile Manipulation via Flow-Matching Co-Generation
· Yorai Shaoul, Zhe Chen, M. Mohamed, Federico Pecora +2 more
Teaching RL Agents to Act Better: VLM as Action Advisor for Online Reinforcement Learning
· Xiefeng Wu, Jing Zhao, Shu Zhang, Ming Hu
Learning Generalizable Language-Conditioned Cloth Manipulation from Long Demonstrations
· Han Zhao, Jinxuan Zhu, Zihao Yan, Yichen Li +2 more
DRL-VLA: An Optimization Method for VLA Model Based on Deep Reinforcement Learning
· Mengkun Zhang, Pengfei Gao, Yinuo Sheng, Ran Li +6 more
RL-100: Performant Robotic Manipulation with Real-World Reinforcement Learning
· Kun Lei, Huanyu Li, Dongjie Yu, Zhenyu Wei +5 more
From Knowing to Doing Precisely: A General Self-Correction and Termination Framework for VLA models
· Wentao Zhang, Aolan Sun, Wentao Mo, Xiaoyang Qu +2 more
Genie Envisioner: A Unified World Foundation Platform for Robotic Manipulation
· Yue Liao, Pengfei Zhou, Siyuan Huang, Donglin Yang +10 more
ReFineVLA: Multimodal Reasoning-Aware Generalist Robotic Policies via Teacher-Guided Fine-Tuning
· T. Vo, Tan Q. Nguyen, Khang Nguyen, Nhat Tran +4 more
Embodied AI: From LLMs to World Models [Feature]
· Tongtong Feng, Xin Wang, Yu-Gang Jiang, Wenwu Zhu
VLC: A Human-Robot-Collaboration Framework with Vision-Language-Model
· Zilong Chen, Lebin Liang, Hao Dong, Dehao Kong +2 more
Building Explicit World Model for Zero-Shot Open-World Object Manipulation
· Xiaotong Li, Gang Chen, Javier Alonso-Mora
· Yuxin Zheng, W. Tao, Wentao Mo, Naifu Zhang +3 more
HMVLA: Hyperbolic Multimodal Fusion for Vision-Language-Action Models
· Kun Wang, Xiaokun Feng, M. Qu, Tonghua Su
HannesImitation: Grasping with the Hannes Prosthetic Hand via Imitation Learning
· Carlo Alessi, F. Vasile, Federico Ceola, Giulia Pasquale +2 more
Adversarial Locomotion and Motion Imitation for Humanoid Policy Learning
· Jiyuan Shi, Xinzhe Liu, Dewei Wang, Ouyang Lu +4 more
End-to-End Seam Tracking with Flow Matching-Based Diffusion Policy
· Zhaoqi Chu, Xiangrong Liu, Xuhui Que, Bo Yu +1 more
Closed-Form Robustness Bounds for Second-Order Pruning of Neural Controller Policies
· Maksym Shamrai
Can Multimodal LLMs Perform Time Series Anomaly Detection?
· Xiongxiao Xu, Haoran Wang, Yueqing Liang, Philip S. Yu +2 more
DailyArt: Discovering Articulation from Single Static Images via Latent Dynamics
· Hang Zhang, Qijian Tian, Jingyu Gong, Daoguo Dong +3 more
Rethinking Video Generation Model for the Embodied World
· Yufan Deng, Zilin Pan, Hongyu Zhang, Xiaojie Li +5 more
M100: An Orchestrated Dataflow Architecture Powering General AI Computing
· Yancheng Xie, Changkui Mao, Chan-gui Wu, Chaochao Lu +33 more
ForeAct: Steering Your VLA with Efficient Visual Foresight Planning
· Zhuoyang Zhang, Shang Yang, Qinghao Hu, Luke J. Huang +4 more
Self-Refining Vision Language Model for Robotic Failure Detection and Reasoning
· Carl Qi, Xiaojie Wang, Silong Yong, Stephen Sheng +5 more
RT-Cache: Training-Free Retrieval for Real-Time Manipulation
· O.-Kil Kwon, Abraham George, Alison Bartsch, A. Farimani
HACTS: a Human-As-Copilot Teleoperation System for Robot Learning
· Zhiyuan Xu, Yinuo Zhao, Kun Wu, Ning Liu +4 more
Extremum Flow Matching for Offline Goal Conditioned Reinforcement Learning
· Quentin Rouxel, Clemente Donoso, Fei Chen, S. Ivaldi +6 more
Taking Shortcuts for Categorical VQA Using Super Neurons
· Pierre Musacchio, Jae-Yong Jeong, Dahun Kim, Jaesik Park
Enhancing Robustness in Language-Driven Robotics: A Modular Approach to Failure Reduction
· Émiland Garrabé, Pierre Teixeira, Mahdi Khoramshahi, Stéphane Doncieux
Skin-Machine Interface with Multimodal Contact Motion Classifier
· Alberto Confente, Takanori Jin, Taisuke Kobayashi, J. R. Guadarrama-Olvera +1 more