About
I am an undergraduate studying Computer Science and Mathematics at St. Edward's University in Austin, Texas. My work sits at the boundary of intelligent systems, reliable software engineering, and the mathematical structures that underpin modern machine learning.
My interest in AI research grew from a recurring tension I noticed while building production software: the gap between how systems are designed to behave and how they actually behave under real-world conditions. That tension motivates much of how I think about machine learning — not as a toolbox of techniques, but as a collection of open problems about reliability, generalization, and the relationship between scale and understanding.
I founded a student-led AI Paper Reading Circle at St. Edward's to build a genuine research culture among undergraduates, and I have independently constructed an annotated NLP dataset for Banglish (Bangla-English code-mixed) sentiment analysis — a domain that remains severely underrepresented in the research literature. I view engineering experience not as separate from research but as foundational to it: the ability to implement and stress-test ideas is how theoretical intuitions become grounded.
I am preparing to apply for PhD programs in AI/ML for Fall 2026, seeking research environments where rigorous theory and practical systems inform one another.
Research Interests
- Natural Language Processing & Language Models
Knowledge representation in large language models, retrieval-augmented generation, and the reliability of LLM outputs under distribution shift. I am particularly interested in how models trained on imbalanced or code-mixed corpora can be made more robust — motivated in part by my work on low-resource Banglish NLP.
- Machine Learning Theory & Generalization
Theoretical foundations of deep learning, including generalization bounds, optimization landscapes, and the implicit biases of gradient-based training. I am drawn to questions that connect formal statistical theory to empirically observed phenomena in modern networks.
- Reliable & Scalable AI Systems
The infrastructure challenges of deploying intelligent systems at scale: efficient serving, pipeline reliability, and the systems-level design decisions that determine whether ML research translates into robust production applications. My backend engineering background makes this intersection particularly natural for me.
- Mathematical Foundations of Computing
Probability theory, linear algebra, and discrete mathematics as formal tools for reasoning about algorithms and learning systems. I believe that mathematical fluency — not just mathematical familiarity — is what distinguishes research contributions that last from those that do not.
Research Work
Constructed and manually annotated a corpus of Bangla-English code-mixed product reviews for sentiment analysis, addressing a documented scarcity of labeled data for South Asian code-mixed language varieties. The annotation protocol follows established NLP labeling conventions, and ongoing work includes training and evaluating baseline models on the dataset. The project grew from an observation that the majority of NLP benchmark datasets reflect high-resource, monolingual settings, a gap that, if unaddressed, leaves real-world language users in multilingual communities systematically underserved by deployed NLP systems.
Founded and facilitate a weekly seminar in which students engage directly with primary AI/ML literature. Papers covered include foundational transformer architectures (Attention is All You Need, BERT, GPT), alignment-oriented work, and recent empirical investigations of LLM capabilities and failures. The group has covered papers from NeurIPS, ICML, ACL, and ICLR. Designed to build the close-reading skills that distinguish researchers from consumers of AI tools.
Systematic exploration of retrieval-augmented generation architectures, LLM-based workflow automation, and structured output generation. This work, conducted in parallel with production engineering, is aimed at developing a practitioner-level understanding of where current LLM systems succeed and where they fail — the kind of grounded understanding that informs good research questions.
Education
2026
B.S. Computer Science & Mathematics · St. Edward's University · Austin, Texas
Coursework spanning theory of computation, statistical methods, mathematical proof, and applied AI systems.
AI Engineering Specialization & Deep Learning Courses
Completed coursework in neural network architectures, NLP pipelines, computer vision, and model evaluation. Public lab notebooks available on GitHub.
Experience
Present
Designed and shipped production backend systems for multiple clients across tax services, automotive, lead generation, and community web applications. Built RESTful APIs using Node.js, TypeScript, and NestJS; managed PostgreSQL, MySQL, MongoDB, and Redis databases; deployed with Docker and CI/CD pipelines. Clients include MH Tax Solutions, Austin Ummah Soccer, and Ferraws Towing. Core focus: scalable architecture, clean code, and system reliability under real production conditions.
Present
Founded a student-led research seminar focused on close reading of AI/ML primary literature. Facilitate weekly discussion sessions, select papers from leading venues (NeurIPS, ICML, ACL, ICLR), and mentor peers in developing critical research evaluation skills.
Designed annotation schema and manually labeled a corpus of Bangla-English code-mixed e-commerce reviews. Established baseline NLP pipelines for sentiment classification. Work is ongoing; dataset is publicly released for community use.
Present
Building applied systems involving large language models: RAG pipeline construction, structured output generation, and automation of multi-step workflows. Pursued in parallel with academic coursework, this work grounds theoretical understanding in concrete engineering constraints.
Selected Projects
Manually labeled corpus of code-mixed Bangla-English product reviews for sentiment analysis. Designed to fill a gap in annotated resources for South Asian multilingual NLP. Ongoing research on baseline model construction and evaluation. GitHub →
Production-grade Python system integrating LLM workflows, web data collection, and structured output pipelines for automated lead generation. Emphasis on reliability, configurable targeting, and clean output schemas. Private repository.
Production web platform for an Austin tax services business. Built for security, reliability, and ease of client interaction. Live →
Laboratory notebooks from the DeepLearning.AI specialization and IBM AI Engineering program — covering neural network implementation, CNNs, NLP pipelines, and model evaluation from first principles. GitHub →
Python-based predictive model for food insecurity, applying supervised learning to socioeconomic datasets. Demonstrates applied ML in a domain with direct social relevance. Private repository.
Python implementation of face detection and matching using computer vision techniques. One of my earlier projects applying ML to a concrete visual recognition task. GitHub →
Technical Skills
Contact
I am actively seeking PhD programs in AI, Machine Learning, and related areas for Fall 2026 enrollment. I welcome correspondence from faculty working on language models, ML theory, and AI systems — and from researchers who share an interest in grounding theoretical AI work in real-world system constraints.
The best way to reach me is by email. I am also available on LinkedIn and GitHub for informal professional contact.