Samin Yeasar Arnob
I am working at Cohere on improving reasoning in large (300B+) language models.
I am a (soon-to-graduate) PhD student at McGill University and
Mila, advised by Professor
Doina Precup.
My PhD focuses on efficient and adaptive learning in large neural networks by discovering and exploiting
sparse subspaces inspired by brain-like neural pathways.
I study how overparameterized models in reinforcement learning and large language models can learn effectively using only a small subset of parameters,
improving energy efficiency, scalability, and continual learning.
My work introduces Neural Pathways for multitask RL, Sparse-Reg to improve sample efficiency and robustness in offline RL,
and Sparse Adapters for modular and compositional LLM fine-tuning and model merging.
Email
/
CV
/
Google Scholar
/
X
/
Github
Research Affiliations
Cohere
Intern of Technical Staff
09/2025 – Present
McGill University
PhD student
01/2020 – Present
Mila
Student Researcher
01/2020 – Present
Microsoft Research, MTL
Student Researcher
03/2024 – 04/2025
Microsoft Research, NYC
Research Intern
04/2023 – 08/2023
Ubisoft
Research Intern
09/2021 – 08/2022
Research and Work Experience
Cohere
Fall 2025, Winter 2026
Intern of Technical Staff
Building controlled synthetic-data pipelines and tooling for large-scale LLM reasoning and structured training recipes.
LLMs
Reasoning
Data pipeline
RLVR / post-training
Microsoft Research, MontrΓ©al
Summer 2024, Fall 2024, Winter 2025
Student Researcher
Developed Sparse Adapters for modular, parameter-efficient fine-tuning of LLMs and scalable model merging.
LLMs
Sparse adapters
Model merging
PEFT
Microsoft Research, NYC
Summer 2023
Applied Research Intern
Built agentic exploration systems using hierarchical planning and latent world models for efficient large-scale system interaction.
Agentic system
World models
Hierarchical planning
Exploration
Ubisoft
Fall 2021, Winter 2022, Summer 2022
Research Intern
Conducted research on large-map navigation for bots using imitation learning and offline RL; experimented with GPT-2 to improve performance and generalization in automating bot navigation for future games.
Reinforcement Learning (RL)
GPT-2
Imitation learning
Offline RL
Mila – Quebec AI Institute
2020 – Present
Student Researcher
Research with Doina Precup on sparse subspace optimization across reinforcement learning and large language models.
Reinforcement Learning (RL)
LLMs
Subspace learning
Sparse learning
Research Interest
Reinforcement Learning: Imitation Learning, Offline Reinforcement Learning, Multitask Learning, Representation Learning, Hierarchical World Model
Large Language Model: Reasoning, RLVR, RLHF, Mixture of Experts (MoE), Improving mergeability of experts, Parameter-efficient finetuning (PEFT)
Exploring Sparse Adapters for Scalable Merging of Parameter Efficient Experts
Samin Yeasar Arnob, Zhan Su, Minseon Kim, Oleksiy Ostapenko, Esra Saleh, Riyasat Ohib, Doina Precup, Lucas Page-Caccia, Alessandro Sordoni
COLM 2025,
Project,
Code
Keywords: Sparse adapter, Parameter-efficient finetuning, Model merging, LLM
Summary: We propose a simple yet efficient sparse-adapter training method as a building block for modular, parameter-efficient LLM fine-tuning, improving merging performance at scale.
Efficient Reinforcement Learning by Discovering Neural Pathways
Samin Yeasar Arnob, Riyasat Ohib, Sergey Plis, Amy Zhang, Alessandro Sordoni, Doina Precup
NeurIPS 2024,
Project,
Code
Keywords: Neural Pathways, Parameter-efficient training, (Online/Offline) RL, Multitask RL
Summary: Learn specialized sparse subspaces for each RL agent that can co-exist, using ~5% of total parameters.
Sparse-Reg: Improving Sample Complexity of Offline Reinforcement Learning using Sparse Regularization
Samin Yeasar Arnob, Scott Fujimoto, Doina Precup
RLDM 2025,
Code
Keywords: Offline RL, Sparsity, Regularization, Sample Complexity, Continuous Control
Summary: Sparse regularization to mitigate overfitting in small offline datasets and improve continuous control performance.
Importance of Empirical Sample Complexity Analysis for Offline Reinforcement Learning
Samin Yeasar Arnob, Riashat Islam, Doina Precup
NeurIPS 2021 (Offline RL Workshop),
Code
Keywords: Offline RL, Sample Complexity
Summary: Study sample complexity as a robustness signal; highlight that many offline RL methods fail in low-data regimes.
Single-Shot Pruning for Offline Reinforcement Learning
Samin Yeasar Arnob, Riyasat Ohib, Sergey Plis, Doina Precup
NeurIPS 2021 (Offline RL Workshop)
Keywords: Offline RL, Sparse Networks, Pruning, Single-shot pruning
Summary: First work to demonstrate that single-shot pruning can be effective for offline reinforcement learning.
OAIRL: Off-policy Adversarial Inverse Reinforcement Learning
Samin Yeasar Arnob
ICML 2020 (Lifelong Learning Workshop),
Code
Keywords: RL, Transfer Learning, Inverse Reinforcement Learning
Summary: Improve imitation performance and transfer knowledge under dynamic task changes.