Orchid Majumder

I am a principal research engineer at Amazon. I work on our NeMo-Megatron-based distributed training stack, which is used to train the Amazon Nova models on thousands of AI accelerators on AWS. My current areas of interest are distributed training, ML systems, architecture exploration and scaling laws for LLM pre-training.

In the past, I was part of AWS AI Labs, where I worked on large-scale diffusion models. Before that, I worked on few-shot learning, meta-learning and hyperparameter optimization methods for visual recognition tasks. Please see my publications for more details. I obtained my MS from Columbia University and my BS from Jadavpur University, both in CS.

Selected Publications

  • The Amazon Nova Family of Models: Technical Report and Model Card (arXiv) — Amazon AGI. [paper]
  • On the Scalability of Diffusion-Based Text-to-Image Generation (CVPR 2024) — Hao Li, Yang Zou, Ying Wang, Orchid Majumder, Yusheng Xie, R. Manmatha, Ashwin Swaminathan, Zhuowen Tu, Stefano Ermon, Stefano Soatto. [paper]
  • Revisiting Contrastive Learning for Few-Shot Classification (arXiv) — Orchid Majumder, Avinash Ravichandran, Subhransu Maji, Marzia Polito, Rahul Bhotika, Stefano Soatto. [paper]
  • Incremental Meta-Learning via Indirect Discriminant Alignment (ECCV 2020) — Qing Liu, Orchid Majumder, Avinash Ravichandran, Rahul Bhotika, Stefano Soatto. [paper]
  • MARTHE: Scheduling the Learning Rate Via Online Hypergradients (IJCAI 2020) — Michele Donini, Luca Franceschi, Orchid Majumder, Massimiliano Pontil, Paolo Frasconi. [paper]
  • d-SNE: Domain Adaptation Using Stochastic Neighborhood Embedding (CVPR 2019 Oral) — Xiang Xu, Xiong Zhou, Ragav Venkatesan, Gurumurthy Swaminathan, Orchid Majumder. [paper]

GitHub

  • AdaTune — Adaptive learning rate tuning for PyTorch