| 2025-07-03 | LiteReality: Graphics-Ready 3D Scene Reconstruction from RGB-D Scans | Zhening Huang et al. | 2507.02861v1 | null |
| 2025-07-03 | RefTok: Reference-Based Tokenization for Video Generation | Xiang Fan et al. | 2507.02862v1 | null |
| 2025-07-03 | Less is Enough: Training-Free Video Diffusion Acceleration via Runtime-Adaptive Caching | Xin Zhou et al. | 2507.02860v1 | null |
| 2025-07-03 | Bootstrapping Grounded Chain-of-Thought in Multimodal LLMs for Data-Efficient Model Adaptation | Jiaer Xia et al. | 2507.02859v1 | null |
| 2025-07-03 | Answer Matching Outperforms Multiple Choice for Language Model Evaluation | Nikhil Chandak et al. | 2507.02856v1 | null |
| 2025-07-03 | MvHo-IB: Multi-View Higher-Order Information Bottleneck for Brain Disorder Diagnosis | Kunyu Zhang et al. | 2507.02847v1 | null |
| 2025-07-03 | Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection | Ziqi Miao et al. | 2507.02844v1 | null |
| 2025-07-03 | USAD: An Unsupervised Data Augmentation Spatio-Temporal Attention Diffusion Network | Ying Yu et al. | 2507.02827v1 | null |
| 2025-07-03 | Establishing Best Practices for Building Rigorous Agentic Benchmarks | Yuxuan Zhu et al. | 2507.02825v1 | null |
| 2025-07-03 | Measurement as Bricolage: Examining How Data Scientists Construct Target Variables for Predictive Modeling Tasks | Luke Guerdan et al. | 2507.02819v1 | null |
| 2025-07-03 | Towards Perception-Informed Latent HRTF Representations | You Zhang et al. | 2507.02815v1 | null |
| 2025-07-03 | HyperGaussians: High-Dimensional Gaussian Splatting for High-Fidelity Animatable Face Avatars | Gent Serifi et al. | 2507.02803v1 | null |
| 2025-07-03 | No time to train! Training-Free Reference-Based Instance Segmentation | Miguel Espinosa et al. | 2507.02798v1 | null |
| 2025-07-03 | RichControl: Structure- and Appearance-Rich Training-Free Spatial Control for Text-to-Image Generation | Liheng Zhang et al. | 2507.02792v1 | null |
| 2025-07-03 | Self-Steering Deep Non-Linear Spatially Selective Filters for Efficient Extraction of Moving Speakers under Weak Guidance | Jakob Kienegger et al. | 2507.02791v1 | null |
| 2025-07-03 | From Long Videos to Engaging Clips: A Human-Inspired Video Editing Framework with Multimodal Narrative Understanding | Xiangfeng Wang et al. | 2507.02790v1 | null |
| 2025-07-03 | From Pixels to Damage Severity: Estimating Earthquake Impacts Using Semantic Segmentation of Social Media Images | Danrong Zhang et al. | 2507.02781v1 | null |
| 2025-07-03 | Discovery and Preliminary Characterization of a Third Interstellar Object: 3I/ATLAS | Darryl Z. Seligman et al. | 2507.02757v1 | null |
| 2025-07-03 | Partial Weakly-Supervised Oriented Object Detection | Mingxin Liu et al. | 2507.02751v1 | null |
| 2025-07-03 | DexVLG: Dexterous Vision-Language-Grasp Model at Scale | Jiawei He et al. | 2507.02747v1 | null |
| 2025-07-03 | Early Signs of Steganographic Capabilities in Frontier LLMs | Artur Zolkowski et al. | 2507.02737v1 | null |
| 2025-07-03 | RIS-Aided Cooperative ISAC Networks for Structural Health Monitoring | Jie Yang et al. | 2507.02731v1 | null |
| 2025-07-03 | A Systematic Search for Spectral Hardening in Blazar Flares with the Fermi-Large Area Telescope | Adithiya Dinesh et al. | 2507.02718v1 | null |
| 2025-07-03 | FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models | Yuxuan Wang et al. | 2507.02714v1 | null |
| 2025-07-03 | UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation | Qin Guo et al. | 2507.02713v1 | null |
| 2025-07-03 | XPPLORE: Import, visualize, and analyze XPPAUT data in MATLAB | Matteo Martin et al. | 2507.02709v1 | null |
| 2025-07-03 | The ESO SupJup Survey VIII. Chemical fingerprints of young L dwarf twins | N. Grasser et al. | 2507.02706v1 | null |
| 2025-07-03 | CanonSwap: High-Fidelity and Consistent Video Face Swapping via Canonical Space Modulation | Xiangyang Luo et al. | 2507.02691v1 | null |
| 2025-07-03 | On the Convergence of Large Language Model Optimizer for Black-Box Network Management | Hoon Lee et al. | 2507.02689v1 | null |
| 2025-07-03 | A wireless, inexpensive optical tracker for the CAVE | Ehud Sharlin et al. | 2507.02682v1 | null |