2025-07-03 | MultiGen: Using Multimodal Generation in Simulation to Learn Multimodal Policies in Real | Renhao Wang et al. | 2507.02864v1 | null
2025-07-03 | Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory | Yuqi Wu et al. | 2507.02863v1 | null
2025-07-03 | LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans | Zhening Huang et al. | 2507.02861v1 | null
2025-07-03 | RefTok: Reference-Based Tokenization for Video Generation | Xiang Fan et al. | 2507.02862v1 | null
2025-07-03 | Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching | Xin Zhou et al. | 2507.02860v1 | null
2025-07-03 | Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation | Jiaer Xia et al. | 2507.02859v1 | null
2025-07-03 | Requirements Elicitation Follow-Up Question Generation | Yuchen Shen et al. | 2507.02858v1 | null
2025-07-03 | Answer Matching Outperforms Multiple Choice for Language Model Evaluation | Nikhil Chandak et al. | 2507.02856v1 | null
2025-07-03 | AnyI2V: Animating Any Conditional Image with Motion Control | Ziye Li et al. | 2507.02857v1 | null
2025-07-03 | Subtyping in DHOL -- Extended preprint | Colin Rothgang et al. | 2507.02855v1 | null
2025-07-03 | MOTIF: Modular Thinking via Reinforcement Fine-tuning in LLMs | Purbesh Mitra et al. | 2507.02851v1 | null
2025-07-03 | LLM Hypnosis: Exploiting User Feedback for Unauthorized Knowledge Injection to All Users | Almog Hilel et al. | 2507.02850v1 | null
2025-07-03 | MvHo-IB: Multi-View Higher-Order Information Bottleneck for Brain Disorder Diagnosis | Kunyu Zhang et al. | 2507.02847v1 | null
2025-07-03 | Legal Requirements Translation from Law | Anmol Singhal et al. | 2507.02846v1 | null
2025-07-03 | Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection | Ziqi Miao et al. | 2507.02844v1 | null
2025-07-03 | On the Structure of Replicable Hypothesis Testers | Anders Aamand et al. | 2507.02842v1 | null
2025-07-03 | StepHint: Multi-level Stepwise Hints Enhance Reinforcement Learning to Reason | Kaiyi Zhang et al. | 2507.02841v1 | null
2025-07-03 | Neutrino mixing parameters and masses from $\Delta(96)\rtimes H_{CP}$ in the tri-direct CP approach | Li-Na Yan et al. | 2507.02840v1 | null
2025-07-03 | Stiefel optimization is NP-hard | Zehua Lai et al. | 2507.02839v1 | null
2025-07-03 | ExPO: Unlocking Hard Reasoning with Self-Explanation-Guided Reinforcement Learning | Ruiyang Zhou et al. | 2507.02834v1 | null
2025-07-03 | Generalizing Verifiable Instruction Following | Valentina Pyatkin et al. | 2507.02833v1 | null
2025-07-03 | LCQNN: Linear Combination of Quantum Neural Networks | Hongshun Yao et al. | 2507.02832v1 | null
2025-07-03 | Enhancing Noisy Quantum Sensing by GHZ State Partitioning | Allen Zang et al. | 2507.02829v1 | null
2025-07-03 | USAD: An Unsupervised Data Augmentation Spatio-Temporal Attention Diffusion Network | Ying Yu et al. | 2507.02827v1 | null
2025-07-03 | Confidence-driven Gradient Modulation for Multimodal Human Activity Recognition: A Dynamic Contrastive Dual-Path Learning Approach | Panpan Ji et al. | 2507.02826v1 | null
2025-07-03 | Establishing Best Practices for Building Rigorous Agentic Benchmarks | Yuxuan Zhu et al. | 2507.02825v1 | null
2025-07-03 | DNN-Based Precoding in RIS-Aided mmWave MIMO Systems With Practical Phase Shift | Po-Heng Chou et al. | 2507.02824v1 | null
2025-07-03 | Osculating Geometry and Higher-Order Distance Loci | Sandra Di Rocco et al. | 2507.02823v1 | null
2025-07-03 | SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model | Wencheng Zhang et al. | 2507.02822v1 | null
2025-07-03 | Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks | Luke Guerdan et al. | 2507.02819v1 | null