2025-07-03 |
MultiGen: Using Multimodal Generation in Simulation to Learn Multimodal Policies in Real |
Renhao Wang et al. |
2507.02864v1 |
null |
2025-07-03 |
AnyI2V: Animating Any Conditional Image with Motion Control |
Ziye Li et al. |
2507.02857v1 |
null |
2025-07-03 |
MvHo-IB: Multi-View Higher-Order Information Bottleneck for Brain Disorder Diagnosis |
Kunyu Zhang et al. |
2507.02847v1 |
null |
2025-07-03 |
Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection |
Ziqi Miao et al. |
2507.02844v1 |
null |
2025-07-03 |
LLM-Driven Treatment Effect Estimation Under Inference Time Text Confounding |
Yuchen Ma et al. |
2507.02843v1 |
null |
2025-07-03 |
StepHint: Multi-level Stepwise Hints Enhance Reinforcement Learning to Reason |
Kaiyi Zhang et al. |
2507.02841v1 |
null |
2025-07-03 |
LCQNN: Linear Combination of Quantum Neural Networks |
Hongshun Yao et al. |
2507.02832v1 |
null |
2025-07-03 |
USAD: An Unsupervised Data Augmentation Spatio-Temporal Attention Diffusion Network |
Ying Yu et al. |
2507.02827v1 |
null |
2025-07-03 |
Confidence-driven Gradient Modulation for Multimodal Human Activity Recognition: A Dynamic Contrastive Dual-Path Learning Approach |
Panpan Ji et al. |
2507.02826v1 |
null |
2025-07-03 |
SynapseRoute: An Auto-Route Switching Framework on Dual-State Large Language Model |
Wencheng Zhang et al. |
2507.02822v1 |
null |
2025-07-03 |
LangScene-X: Reconstruct Generalizable 3D Language-Embedded Scenes with TriMap Video Diffusion |
Fangfu Liu et al. |
2507.02813v1 |
null |
2025-07-03 |
Block triangular preconditioning for inverse source problems in time-space fractional diffusion equations |
Monoswini Majumdar et al. |
2507.02809v1 |
null |
2025-07-03 |
GRB 240825A: Early Reverse Shock and Its Physical Implications |
Chao Wu et al. |
2507.02806v1 |
null |
2025-07-03 |
Is Reasoning All You Need? Probing Bias in the Age of Reasoning Language Models |
Riccardo Cantini et al. |
2507.02799v1 |
null |
2025-07-03 |
No time to train! Training-Free Reference-Based Instance Segmentation |
Miguel Espinosa et al. |
2507.02798v1 |
null |
2025-07-03 |
Boosting the NOx production in microwave air plasma: A synergy of chemistry and vibrational kinetics |
Qinghao Shen et al. |
2507.02795v1 |
null |
2025-07-03 |
Quasinormal modes of Floquet media slabs |
Benjamin Vial et al. |
2507.02784v1 |
null |
2025-07-03 |
KERAP: A Knowledge-Enhanced Reasoning Approach for Accurate Zero-shot Diagnosis Prediction Using Multi-agent LLMs |
Yuzhang Xie et al. |
2507.02773v1 |
null |
2025-07-03 |
Grounding Intelligence in Movement |
Melanie Segado et al. |
2507.02771v1 |
null |
2025-07-03 |
DeSTA2.5-Audio: Toward General-Purpose Large Audio Language Model with Self-Generated Cross-Modal Alignment |
Ke-Han Lu et al. |
2507.02768v1 |
null |
2025-07-03 |
A Proof-Theoretic View of Basic Intuitionistic Conditional Logic (Extended Version) |
Tiziano Dalmonte et al. |
2507.02767v1 |
null |
2025-07-03 |
Knowledge Protocol Engineering: A New Paradigm for AI in Domain-Specific Knowledge Work |
Guangwei Zhang et al. |
2507.02760v1 |
null |
2025-07-03 |
Multi-agent Auditory Scene Analysis |
Caleb Rascon et al. |
2507.02755v1 |
null |
2025-07-03 |
Prompt learning with bounding box constraints for medical image segmentation |
Mélanie Gaillochet et al. |
2507.02743v1 |
null |
2025-07-03 |
Leveraging Transformer Models to Capture Multi-Scale Dynamics in Biomolecules by nano-GPT |
Wenqi Zeng et al. |
2507.02734v1 |
null |
2025-07-03 |
Bourbaki: Self-Generated and Goal-Conditioned MDPs for Theorem Proving |
Matthieu Zimmer et al. |
2507.02726v1 |
null |
2025-07-03 |
Hierarchical Multi-Label Contrastive Learning for Protein-Protein Interaction Prediction Across Organisms |
Shiyi Liu et al. |
2507.02724v1 |
null |
2025-07-03 |
A Systematic Search for Spectral Hardening in Blazar Flares with the Fermi-Large Area Telescope |
Adithiya Dinesh et al. |
2507.02718v1 |
null |
2025-07-03 |
FairHuman: Boosting Hand and Face Quality in Human Image Generation with Minimum Potential Delay Fairness in Diffusion Models |
Yuxuan Wang et al. |
2507.02714v1 |
null |
2025-07-03 |
UniMC: Taming Diffusion Transformer for Unified Keypoint-Guided Multi-Class Image Generation |
Qin Guo et al. |
2507.02713v1 |
null |