Sumit
@_reachsumit
Senior ML Engineer @Meta | prev: @TikTok_us, @Amazon, @Samsung | UChicago Alum https://blog.reachsumit.com/ 🇮🇳→🇰🇷→🇦🇺→🇨🇦→🇺🇲
Adaptive Repetition for Mitigating Position Bias in LLM-Based Ranking Spotify introduces a dynamic early-stopping method that adaptively determines repetitions needed for each ranking instance, reducing LLM calls by 81% while preserving accuracy. 📝arxiv.org/abs/2507.17788
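One way to picture the early-stopping idea in the post above is a majority vote that stops as soon as extra LLM calls can no longer change the winner. This is a minimal sketch under assumptions, not the paper's actual stopping rule: `judge` is a hypothetical callable wrapping one LLM ranking call, `swapped` is a hypothetical flag that flips candidate order, and `max_reps` and `min_reps` are illustrative parameters.

```python
from collections import Counter
from typing import Callable, Tuple


def adaptive_majority(
    judge: Callable[[bool], str],  # hypothetical: one LLM call, returns a label
    max_reps: int = 10,            # assumed upper bound on repetitions
    min_reps: int = 2,             # assumed minimum, so both orders are tried
) -> Tuple[str, int]:
    """Repeat a judgment with alternating candidate order and stop early
    once the leading answer cannot be overtaken by the remaining calls."""
    votes: Counter = Counter()
    for i in range(max_reps):
        swapped = i % 2 == 1  # alternate the order to average out position bias
        votes[judge(swapped)] += 1
        calls = i + 1
        ranked = votes.most_common(2)
        runner_up = ranked[1][1] if len(ranked) > 1 else 0
        lead = ranked[0][1] - runner_up
        remaining = max_reps - calls
        if calls >= min_reps and lead > remaining:
            return ranked[0][0], calls  # outcome is already decided
    return votes.most_common(1)[0][0], max_reps
```

With this rule, an instance where the judge is consistent finishes before the full budget is spent, while an instance where the answer flips with the candidate order uses every repetition.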
Sticking to the Mean: Detecting Sticky Tokens in Text Embedding Models Introduces a systematic method to detect "sticky tokens" that disrupt embedding similarities by pulling sentence pairs toward specific values. 📝arxiv.org/abs/2507.18171 👨🏽💻github.com/March-7/Sticky…
TDR: Task-Decoupled Retrieval with Fine-Grained LLM Feedback for In-Context Learning Meituan decouples ICL examples from different tasks and models fine-grained feedback from LLMs to improve example retrieval quality for ICL. 📝arxiv.org/abs/2507.18340 👨🏽💻github.com/Nnn-s/TDR
A Deep Dive into Retrieval-Augmented Generation for Code Completion: Experience on WeChat Tencent presents a study of RAG-based code completion on WeChat's proprietary codebase, finding that similarity-based RAG outperforms identifier-based approaches. 📝arxiv.org/abs/2507.18515
Transform Before You Query: A Privacy-Preserving Approach for Vector Retrieval with Embedding Space Alignment Leverages alignment between semantic spaces of different embedding models to protect user query text. 📝arxiv.org/abs/2507.18518 👨🏽💻anonymous.4open.science/r/STEER/README…
MBASR: A Generic Framework for Multi-Behavior Data Augmentation in Sequential Recommendation Proposes five behavior-aware data augmentation operations to address data sparsity in multi-behavior sequential recommendation. 📝dl.acm.org/doi/10.1145/37… 👨🏽💻github.com/XiaoQi-C/MBASR
PathWeaver: A High-Throughput Multi-GPU System for Graph-Based Approximate Nearest Neighbor Search Introduces a multi-GPU framework that achieves 3.24× speedup for graph-based nearest neighbor search. 📝arxiv.org/abs/2507.17094 👨🏽💻github.com/AIS-SNU/PathWe…
DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning Alibaba introduces a search agent that combines dynamic knowledge graphs with multi-reward RL to improve multi-step IR and reduce reasoning deviations. 📝arxiv.org/abs/2507.17365
Each to Their Own: Exploring the Optimal Embedding in RAG Introduces the Confident RAG method, which generates multiple responses using different embedding models and selects the highest-confidence answer, achieving a 10% improvement over vanilla RAG. 📝arxiv.org/abs/2507.17442
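The selection step described in the post above can be sketched as follows. This is an illustration under assumptions, not the paper's implementation: mean token log-probability stands in for whatever confidence metric the authors use, and the function name and the `candidates` layout (embedding model name mapped to an answer plus its token log-probabilities) are hypothetical.

```python
from typing import Dict, List, Tuple


def select_most_confident(
    candidates: Dict[str, Tuple[str, List[float]]],
) -> Tuple[str, str, float]:
    """Pick the answer whose generation had the highest mean token log-prob.

    candidates maps an embedding-model name (hypothetical) to the answer
    generated with that model's retrieved context and its token log-probs.
    """
    best = None
    for name, (answer, logprobs) in candidates.items():
        if not logprobs:
            continue  # no confidence signal for this candidate
        score = sum(logprobs) / len(logprobs)  # assumed confidence proxy
        if best is None or score > best[2]:
            best = (name, answer, score)
    if best is None:
        raise ValueError("no candidate carried token log-probabilities")
    return best
```

Each candidate would come from running the same generator once per embedding model, so the cost grows linearly with the number of embedding models compared.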
R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems Kuaishou introduces a System-2 thinking framework for recommendation systems using actor and reflection models to iteratively refine user preference and item knowledge. 📝arxiv.org/abs/2507.17249
Millions of GeARs: Extending GraphRAG to Millions of Documents Huawei presents a method to scale GraphRAG to millions of documents by adapting GeAR with online alignment between passages and Wikidata triples, avoiding costly LLM-based triple extraction. 📝arxiv.org/abs/2507.17399
Scaling Recommender Transformers to One Billion Parameters Yandex presents a scalable framework for training billion-parameter recommender transformers using a dual-objective pre-training approach that combines next-item and feedback prediction. 📝arxiv.org/abs/2507.15994
Reinforce Lifelong Interaction Value of User-Author Pairs for Large-Scale Recommendation Systems Kuaishou proposes a model using reinforcement learning to optimize lifelong interaction value of user-author pairs in short-video platforms. 📝arxiv.org/abs/2507.16253
Time to Split: Exploring Data Splitting Strategies for Offline Evaluation of Sequential Recommenders Compares data splitting strategies for sequential recommendation systems, showing leave-one-out splits may be insufficient. 📝arxiv.org/abs/2507.16289 👨🏽💻github.com/monkey0head/ti…
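To make the contrast in the post above concrete, here is a small sketch of a leave-one-out split next to a global temporal split. The `(user, item, timestamp)` layout, the function names, and the `cutoff` parameter are assumptions for illustration and are not taken from the paper's code.

```python
from collections import defaultdict
from typing import List, Tuple

Interaction = Tuple[str, str, int]  # (user, item, timestamp), assumed layout


def leave_one_out(log: List[Interaction]):
    """Hold out each user's last interaction; everything else is training."""
    by_user = defaultdict(list)
    for row in sorted(log, key=lambda r: r[2]):
        by_user[row[0]].append(row)
    train, test = [], []
    for rows in by_user.values():
        train.extend(rows[:-1])
        test.append(rows[-1])
    return train, test


def global_temporal(log: List[Interaction], cutoff: int):
    """Train on everything before one shared cutoff, test on the rest."""
    train = [r for r in log if r[2] < cutoff]
    test = [r for r in log if r[2] >= cutoff]
    return train, test
```

In the leave-one-out variant, one user's training interactions can be newer than another user's held-out interaction, so the model may see data from the "future" of some test cases. A single global cutoff rules that out, at the price of leaving some users without any test interactions.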
RAVine: Reality-Aligned Evaluation for Agentic Search Introduces an evaluation framework for agentic search systems that addresses current challenges through reality-aligned benchmarks and process-oriented assessment. 📝arxiv.org/abs/2507.16725 👨🏽💻github.com/SwordFaith/RAV…
Measuring the Fairness Gap Between Retrieval and Generation in RAG Systems using a Cognitive Complexity Framework Amazon proposes an evaluation framework extending IR fairness metrics by incorporating centrality-based measures. 📝amazon.science/publications/m…
CatalogRAG: Retrieval-Guided LLM Prediction for Multilingual E-commerce Product Attributes Amazon introduces a retrieval-augmented system that leverages existing product catalog entries to guide LLM predictions for missing structured attributes. 📝amazon.science/publications/c…
Understanding Matching Mechanisms in Cross-Encoders @MathiasVast1 et al. investigate how neural ranking models construct relevance signals by analyzing attention processes and extracting causal insights in cross-encoders. 📝 arxiv.org/abs/2507.14604 👨🏽💻git.isir.upmc.fr/mat_vast/sigir…
GRACE: Generative Recommendation via Journey-Aware Sparse Attention on Chain-of-Thought Tokenization Walmart introduces a generative framework for multi-behavior sequential recommendation combining CoT tokenization with Journey-Aware Sparse Attention. 📝arxiv.org/abs/2507.14758
U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs Introduces a unified framework for universal multimodal retrieval that systematically explores design principles. 📝arxiv.org/abs/2507.14902 👨🏽💻github.com/chaxjli/U-MARV…