| 2025-07-03 | Point3R: Streaming 3D Reconstruction with Explicit Spatial Pointer Memory | Yuqi Wu et al. | 2507.02863v1 | null |
| 2025-07-03 | LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans | Zhening Huang et al. | 2507.02861v1 | null |
| 2025-07-03 | Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching | Xin Zhou et al. | 2507.02860v1 | null |
| 2025-07-03 | Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation | Jiaer Xia et al. | 2507.02859v1 | null |
| 2025-07-03 | Requirements Elicitation Follow-Up Question Generation | Yuchen Shen et al. | 2507.02858v1 | null |
| 2025-07-03 | AnyI2V: Animating Any Conditional Image with Motion Control | Ziye Li et al. | 2507.02857v1 | null |
| 2025-07-03 | MvHo-IB: Multi-View Higher-Order Information Bottleneck for Brain Disorder Diagnosis | Kunyu Zhang et al. | 2507.02847v1 | null |
| 2025-07-03 | Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection | Ziqi Miao et al. | 2507.02844v1 | null |
| 2025-07-03 | LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding | Yuchen Ma et al. | 2507.02843v1 | null |
| 2025-07-03 | StepHint: Multi-level Stepwise Hints Enhance Reinforcement Learning to Reason | Kaiyi Zhang et al. | 2507.02841v1 | null |
| 2025-07-03 | Confidence-driven Gradient Modulation for Multimodal Human Activity Recognition: A Dynamic Contrastive Dual-Path Learning Approach | Panpan Ji et al. | 2507.02826v1 | null |
| 2025-07-03 | DNN-Based Precoding in RIS-Aided mmWave MIMO Systems With Practical Phase Shift | Po-Heng Chou et al. | 2507.02824v1 | null |
| 2025-07-03 | SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model | Wencheng Zhang et al. | 2507.02822v1 | null |
| 2025-07-03 | Relativistic accretion and burdened primordial black holes | Suvashis Maity et al. | 2507.02821v1 | null |
| 2025-07-03 | Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks | Luke Guerdan et al. | 2507.02819v1 | null |
| 2025-07-03 | Towards Perception-Informed Latent HRTF Representations | You Zhang et al. | 2507.02815v1 | null |
| 2025-07-03 | LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion | Fangfu Liu et al. | 2507.02813v1 | null |
| 2025-07-03 | Advancements in Computing and Simulation Techniques for the HIBEAM-NNBAR Experiment | Bernhard Meirose et al. | 2507.02810v1 | null |
| 2025-07-03 | Prediction of synthesis parameters for N, Si, Ge and Sn diamond vacancy centers using machine learning | Zhi Jiang et al. | 2507.02808v1 | null |
| 2025-07-03 | Multimodal Mathematical Reasoning with Diverse Solving Perspective | Wenhao Shi et al. | 2507.02804v1 | null |
| 2025-07-03 | AREE-Based Decoupled Design of Hybrid Beamformers in mmWave XL-MIMO Systems | Jiazhe Li et al. | 2507.02802v1 | null |
| 2025-07-03 | No time to train! Training-Free Reference-Based Instance Segmentation | Miguel Espinosa et al. | 2507.02798v1 | null |
| 2025-07-03 | Random Flights and Anomalous Diffusion: A Non-Markovian Take on Lorentz Processes | Lorenzo Facciaroni et al. | 2507.02796v1 | null |
| 2025-07-03 | Boosting the NOx production in microwave air plasma: A synergy of chemistry and vibrational kinetics | Qinghao Shen et al. | 2507.02795v1 | null |
| 2025-07-03 | RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation | Liheng Zhang et al. | 2507.02792v1 | null |
| 2025-07-03 | Self-Steering Deep Non-Linear Spatially Selective Filters for Efficient Extraction of Moving Speakers under Weak Guidance | Jakob Kienegger et al. | 2507.02791v1 | null |
| 2025-07-03 | From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding | Xiangfeng Wang et al. | 2507.02790v1 | null |
| 2025-07-03 | Understanding and Improving Length Generalization in Recurrent Models | Ricardo Buitrago Ruiz et al. | 2507.02782v1 | null |
| 2025-07-03 | From Pixels to Damage Severity: Estimating Earthquake Impacts Using Semantic Segmentation of Social Media Images | Danrong Zhang et al. | 2507.02781v1 | null |
| 2025-07-03 | KERAP: A Knowledge-Enhanced Reasoning Approach for Accurate Zero-shot Diagnosis Prediction Using Multi-agent LLMs | Yuzhang Xie et al. | 2507.02773v1 | null |