Sumit

@_reachsumit

Senior ML Engineer @Meta | prev: @TikTok_us, @Amazon, @Samsung | UChicago Alum https://blog.reachsumit.com/ 🇮🇳→🇰🇷→🇦🇺→🇨🇦→🇺🇲

Seattle, WA

Joined April 2010

458 Following

3K Followers

Pinned

Sumit @_reachsumit · Jul 25

Adaptive Repetition for Mitigating Position Bias in LLM-Based Ranking

Spotify introduces a dynamic early-stopping method that adaptively determines repetitions needed for each ranking instance, reducing LLM calls by 81% while preserving accuracy.

📝 arxiv.org/abs/2507.17788

Sumit @_reachsumit · Jul 25

Sticking to the Mean: Detecting Sticky Tokens in Text Embedding Models

Introduces a systematic method to detect "sticky tokens" that disrupt embedding similarities by pulling sentence pairs toward specific values.

📝 arxiv.org/abs/2507.18171
👨🏽‍💻 github.com/March-7/Sticky… (March-7/StickyToken; ACL 2025 Main)

Sumit @_reachsumit · Jul 25

TDR: Task-Decoupled Retrieval with Fine-Grained LLM Feedback for In-Context Learning

Meituan decouples ICL examples from different tasks and models fine-grained feedback from LLMs to improve example retrieval quality for ICL.

📝 arxiv.org/abs/2507.18340
👨🏽‍💻 github.com/Nnn-s/TDR

Sumit @_reachsumit · Jul 25

A Deep Dive into Retrieval-Augmented Generation for Code Completion: Experience on WeChat

Tencent presents a study of RAG-based code completion on WeChat's proprietary codebase, finding that similarity-based RAG outperforms identifier-based approaches.

📝 arxiv.org/abs/2507.18515

Sumit @_reachsumit · Jul 25

Transform Before You Query: A Privacy-Preserving Approach for Vector Retrieval with Embedding Space Alignment

Leverages alignment between semantic spaces of different embedding models to protect user query text.

📝 arxiv.org/abs/2507.18518
👨🏽‍💻 anonymous.4open.science/r/STEER/README…

Sumit @_reachsumit · Jul 24

MBASR: A Generic Framework for Multi-Behavior Data Augmentation in Sequential Recommendation

Proposes five behavior-aware data augmentation operations to address data sparsity in multi-behavior sequential recommendation.

📝 dl.acm.org/doi/10.1145/37…
👨🏽‍💻 github.com/XiaoQi-C/MBASR

Sumit @_reachsumit · Jul 24

PathWeaver: A High-Throughput Multi-GPU System for Graph-Based Approximate Nearest Neighbor Search

Introduces a multi-GPU framework that achieves a 3.24× speedup for graph-based nearest neighbor search.

📝 arxiv.org/abs/2507.17094
👨🏽‍💻 github.com/AIS-SNU/PathWe… (AIS-SNU/PathWeaver)

Sumit @_reachsumit · Jul 24

DynaSearcher: Dynamic Knowledge Graph Augmented Search Agent via Multi-Reward Reinforcement Learning

Alibaba introduces a search agent that combines dynamic knowledge graphs with multi-reward RL to improve multi-step IR and reduce reasoning deviations.

📝 arxiv.org/abs/2507.17365

Sumit @_reachsumit · Jul 24

Each to Their Own: Exploring the Optimal Embedding in RAG

Introduces Confident RAG, a method that generates multiple responses using different embedding models and selects the highest-confidence answer, achieving a 10% improvement over vanilla RAG.

📝 arxiv.org/abs/2507.17442

Sumit @_reachsumit · Jul 24

R4ec: A Reasoning, Reflection, and Refinement Framework for Recommendation Systems

Kuaishou introduces a System-2 thinking framework for recommendation systems using actor and reflection models to iteratively refine user preference and item knowledge.

📝 arxiv.org/abs/2507.17249

Sumit @_reachsumit · Jul 24

Millions of GeARs: Extending GraphRAG to Millions of Documents

Huawei presents a method to scale GraphRAG to millions of documents by adapting GeAR with online alignment between passages and Wikidata triples, avoiding costly LLM-based triple extraction.

📝 arxiv.org/abs/2507.17399

Sumit @_reachsumit · Jul 23

Scaling Recommender Transformers to One Billion Parameters

Yandex presents a scalable framework for training billion-parameter recommender transformers using a dual-objective pre-training approach that combines next-item and feedback prediction.

📝 arxiv.org/abs/2507.15994

Sumit @_reachsumit · Jul 23

Reinforce Lifelong Interaction Value of User-Author Pairs for Large-Scale Recommendation Systems

Kuaishou proposes a model that uses reinforcement learning to optimize the lifelong interaction value of user-author pairs on short-video platforms.

📝 arxiv.org/abs/2507.16253

Sumit @_reachsumit · Jul 23

Time to Split: Exploring Data Splitting Strategies for Offline Evaluation of Sequential Recommenders

Compares data splitting strategies for sequential recommendation systems, showing that leave-one-out splits may be insufficient.

📝 arxiv.org/abs/2507.16289
👨🏽‍💻 github.com/monkey0head/ti… (monkey0head/time-to-split)

Sumit @_reachsumit · Jul 23

RAVine: Reality-Aligned Evaluation for Agentic Search

Introduces an evaluation framework for agentic search systems that addresses current challenges through reality-aligned benchmarks and process-oriented assessment.

📝 arxiv.org/abs/2507.16725
👨🏽‍💻 github.com/SwordFaith/RAV… (SwordFaith/RAVine)

Sumit @_reachsumit · Jul 22

Measuring the Fairness Gap Between Retrieval and Generation in RAG Systems using a Cognitive Complexity Framework

Amazon proposes an evaluation framework extending IR fairness metrics by incorporating centrality-based measures.

📝 amazon.science/publications/m…

In this paper, we investigate the problem of quantifying fairness in Retrieval-Augmented Generation (RAG) systems, particularly for complex cognitive tasks that go beyond factual question-answering....

Sumit @_reachsumit · Jul 22

CatalogRAG: Retrieval-Guided LLM Prediction for Multilingual E-commerce Product Attributes

Amazon introduces a retrieval-augmented system that leverages existing product catalog entries to guide LLM predictions for missing structured attributes.

📝 amazon.science/publications/c…

E-commerce stores increasingly use Large Language Models (LLMs) to enhance catalog data quality through automated regeneration. A critical challenge is accurately predicting missing structured...

Sumit @_reachsumit · Jul 22

Understanding Matching Mechanisms in Cross-Encoders

@MathiasVast1 et al. investigate how neural ranking models construct relevance signals by analyzing attention processes and extracting causal insights in cross-encoders.

📝 arxiv.org/abs/2507.14604
👨🏽‍💻 git.isir.upmc.fr/mat_vast/sigir…

Sumit @_reachsumit · Jul 22

GRACE: Generative Recommendation via Journey-Aware Sparse Attention on Chain-of-Thought Tokenization

Walmart introduces a generative framework for multi-behavior sequential recommendation combining CoT tokenization with Journey-Aware Sparse Attention.

📝 arxiv.org/abs/2507.14758

Sumit @_reachsumit · Jul 22

U-MARVEL: Unveiling Key Factors for Universal Multimodal Retrieval via Embedding Learning with MLLMs

Introduces a unified framework for universal multimodal retrieval that systematically explores design principles.

📝 arxiv.org/abs/2507.14902
👨🏽‍💻 github.com/chaxjli/U-MARV… (chaxjli/U-MARVEL)