ASTROPHYSICIST · ML ENGINEER · AI RESEARCHER

Rohan Pattnaik

I build machine learning systems for data that doesn't fit the textbook — and have been doing it at the intersection of astrophysics and AI for 8 years.

About

I started my career in computer science, drawn to hard problems. Then I discovered astrophysics — a field with extraordinary data and almost no off-the-shelf solutions.

For 8 years, I've built ML systems for data that most engineers never encounter: galaxy spectra, X-ray binaries, 21cm cosmological signals, transient astronomical events. Problems where you can't Google the solution and the training data might be labeled by 5,000 volunteers.

That unusual trajectory gave me something rare: the ability to drop into a new domain, understand it deeply, and build AI that actually works — not just prototypes, but systems that get published, deployed, and adopted by other researchers.

I am currently an Assistant Research Scientist at Johns Hopkins University, where I scale foundation models for spectroscopy and explore cross-domain transfer to mass spectrometry for planetary biosignature detection.

I'm actively looking for industry roles in ML/AI where rigorous thinking and unconventional data are features, not bugs.

BACKGROUND

PhD, Astrophysical Sciences & Technology
Rochester Institute of Technology

B.Tech, Computer Science & Engineering
IIIT Bhubaneswar

CURRENTLY

Assistant Research Scientist
Johns Hopkins University
Baltimore, MD

OPEN TO

ML Engineer · AI Researcher
Senior Data Scientist · Applied Scientist

What I've Built

01

SpecPT — Transformer Foundation Model for Spectroscopy

Designed a transformer autoencoder for self-supervised spectral representation learning, trained on 13 million galaxy spectra. The model denoises high-dimensional sequential inputs and predicts continuous targets with R² = 0.99 — compressing a months-long manual analysis pipeline to seconds per sample. Enables zero-shot transfer to new instruments.

Most remarkably: a University of Maryland team adopted SpecPT as the backbone for a mass spectrometry classifier targeting biosignature detection on planetary rovers — with minimal fine-tuning. The representations learned from galaxies transferred to biology.

Transformer · Self-Supervised · Foundation Model · Transfer Learning · PyTorch
Published · The Astrophysical Journal

02

Redshift Wrangler — Human-in-the-Loop ML at Scale

Built a complete citizen science data pipeline: converted raw FITS astronomical spectral files into visual formats suitable for non-expert annotation, launched the platform on Zooniverse, and recruited 5,000+ active volunteers who contributed 190,000 classifications. The resulting labeled dataset gives downstream ML pipelines trustworthy training data at a fraction of the cost of expert labeling.

Data Pipeline · Human-in-the-Loop · Crowdsourcing · FITS · Python
5,000+ volunteers · 190K classifications

03

Cross-Validation Audit Framework for Regulatory Deposit Forecasting

At JPMorgan Chase, designed an audit framework for X-13 ARIMA time-series models used in regulatory deposit forecasting. Identified systematic overfitting in complex model configurations that masked true accuracy. Recommended and validated simpler, interpretable alternatives — improving out-of-sample forecast accuracy by 15% while reducing model complexity and regulatory compliance overhead.

Time-Series · Model Risk · Statistical Validation · X-13 ARIMA · R
+15% out-of-sample accuracy

04

Expert-Labeling Web App + Fine-tuned ResNet (Schlumberger-Doll Research)

Built a containerized expert-labeling web application (Docker + Flask) to replace heuristic-generated labels for rock particle classification. Fine-tuned a ResNet on the new expert labels, achieving 85%+ classification accuracy, up from the 60% average achieved by heuristics. Replaced a manual, inconsistent process with a reproducible MLOps pipeline.

Computer Vision · ResNet · Docker · Flask · MLOps · Fine-tuning
85%+ accuracy vs 60% baseline

05

Real-Time Astronomical Transient Classifier

Built a CNN classifier for real-time detection of transient astronomical events — gamma-ray bursts, supernovae, flare stars — in the Deeper, Wider, Faster survey. Reduced manual inspection by 95%, achieving a 21× speedup in event detection pipeline throughput.

CNN · Anomaly Detection · Real-Time · Python · Astronomy
95% reduction in manual inspection

The Toolkit

ML & Deep Learning

  • Transformers · CNNs · RNNs
  • Autoencoders · VAEs
  • Bayesian Inference
  • Transfer Learning
  • Distributed Training
  • Multi-task Learning

Infrastructure & Tools

  • Docker · Flask · Git
  • Linux/Unix · Jupyter
  • HuggingFace Accelerate
  • SLURM / HPC clusters
  • LaTeX

Languages

  • Python (primary)
  • R · SQL · C++ · Julia
  • Bash · HTML/CSS

Data & Analysis

  • PyTorch · scikit-learn
  • pandas · NumPy · SciPy
  • OpenCV · Keras
  • Astropy · FITS data

Selected Publications

SpecPT: A Universal Spectroscopic Analysis and Redshift Measurement Framework

Pattnaik, R., et al. · The Astrophysical Journal, 988(1), 139

Transformer · Self-Supervised

A machine-learning approach for classifying low-mass X-ray binaries based on their compact object nature

Pattnaik, R., et al. · MNRAS, 501(3), 3457–3471 (2021)

Random Forest · Multi-class

Machine learning in astronomy

Kembhavi, A. & Pattnaik, R. · J. Astrophys. Astron. 43, 76 (2022)

Review

Neuro-Parametric Spectral Classification of Black Hole and Neutron Star X-ray Binary Systems

Garg, A., et al. (incl. Pattnaik, R.) · Submitted to ApJ

Neural Network · Physics-informed ML

Awards & Grants

🏛

NSF AAG Grant (Co-Investigator) · $394,869 · Award #2511507

"A Foundational Model for Extragalactic Spectroscopy: Transformer-Based Deep Learning and Citizen Science"

🏆

Chambliss Astronomy Achievement Award · Honorable Mention · AAS 241

🎓

Steven M. Wear Endowed Graduate Fellowship · 2021 · RIT

🌏

Centre for Astrophysics & Supercomputing Vacation Scholarship · Swinburne University

💡

Winner, Smart India Hackathon 2017

Selected Talks

Type | Talk / Venue | Date
Invited | Infrared Spectroscopy from Space · IPAC Caltech | Oct 2025
Invited | Modern Statistics of Galaxies Seminar · LMU Munich | Jul 2025
Invited | Zwicky Transient Facility ML Meeting | Feb 2025
Invited | Johns Hopkins University | Feb 2025
Contributed | AI/ML Applications in Astronomy & Astrophysics | Jan 2025
Invited | Physics Colloquium · SUNY Geneseo | Jan 2024
Invited | AI in Astronomy Meeting · Universidade de São Paulo, Brazil | Oct 2022
Talk | ML in X-Ray Astronomy: Classifying Black Holes & Neutron Stars · PyData London | Apr 2018

Let's Talk

I'm open to senior ML/AI roles, applied research positions, and interesting consulting projects.