PORTFOLIO
Technical Skills
Programming Languages
MATLAB
Web Technologies
Libraries & Frameworks
NLTK
Visualizations & Tools
Tableau
Featured Researches
Decoding Human Dialogue: Sentiment to Deception
It all started with a pair of Nike shoes. Moments after casually mentioning them in a private chat to a friend, an ad for those exact sneakers appeared on my social media feed. That uncanny moment sparked a deep curiosity: How do machines read and understand human conversation?
Decoding the "Why" Behind the Reviews
Determined to understand how algorithms process text, I began in R with a dataset of over 14,000 Amazon earphone reviews. I wanted to move beyond basic binary classification (labeling a review as simply positive or negative) and uncover the specific reasons consumers felt the way they did. After training the system to recognize conversational nuances, such as the fact that "not good" conveys negative rather than positive sentiment, I applied Latent Semantic Analysis (LSA) to automatically cluster thousands of reviews into hidden thematic categories.
The Finding: The algorithm revealed that while customers highly praised the sound quality and battery life, their primary sources of frustration were poor physical fit and unreliable Bluetooth connectivity.
sentimentr
quanteda
Topic Modeling (LSA)
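The portfolio project itself was built in R with sentimentr and quanteda, but the LSA step can be sketched in a few lines of Python with NumPy. This is a minimal illustration under stated assumptions: the toy reviews, vocabulary handling, and the choice of two latent topics are all hypothetical stand-ins for the real 14,000-review dataset.

```python
import numpy as np

# Toy review snippets standing in for the Amazon earphone dataset (hypothetical).
reviews = [
    "great sound quality and battery life",
    "sound quality is great battery lasts long",
    "poor fit keeps falling out of my ear",
    "bluetooth keeps dropping poor connectivity",
]

# Build a simple term-document count matrix.
vocab = sorted({w for r in reviews for w in r.split()})
index = {w: i for i, w in enumerate(vocab)}
X = np.zeros((len(vocab), len(reviews)))
for j, r in enumerate(reviews):
    for w in r.split():
        X[index[w], j] += 1

# LSA: a truncated SVD of the term-document matrix.
# Columns of U are latent "topics"; keep the top k of them.
U, s, Vt = np.linalg.svd(X, full_matrices=False)
k = 2
topics = U[:, :k]

# Inspect the highest-loading terms per latent topic.
for t in range(k):
    top = np.argsort(-np.abs(topics[:, t]))[:3]
    print("topic", t, [vocab[i] for i in top])
```

In the real pipeline the documents would be TF-IDF weighted before the decomposition, and the topic loadings would be read off for thousands of reviews rather than four.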
Teaching Machines Context
With time, it became clear that human language is far too complex for simple keyword dictionaries. I transitioned to Python to build a more advanced predictive pipeline capable of understanding nuance. I developed a "hybrid" feature engineering approach that taught the machine to read multi-word phrases rather than isolated words. This allowed the algorithm to mathematically weigh the true meaning of a phrase based on its surrounding context, completely changing how it processed customer feedback.
The Finding: By equipping the algorithm to recognize contextual cues and negations (such as realizing "not terrible" is a compliment), the Support Vector Machine (SVM) model successfully learned to interpret nuanced sentiment, achieving a robust 91% predictive accuracy.
Hybrid Feature Engineering
NLTK
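The core of the hybrid approach, reading multi-word phrases rather than isolated words, can be shown with a tiny stdlib-only sketch. The function name and tokenization are hypothetical simplifications; the actual pipeline used NLTK tokenization and fed these features to an SVM.

```python
def hybrid_features(text):
    """Extract unigram + bigram features so a negation like
    'not good' survives as a single signal instead of being
    split into the unrelated tokens 'not' and 'good'."""
    tokens = text.lower().split()
    unigrams = tokens
    bigrams = [f"{a}_{b}" for a, b in zip(tokens, tokens[1:])]
    return unigrams + bigrams

feats = hybrid_features("battery is not good")
print(feats)
# → ['battery', 'is', 'not', 'good', 'battery_is', 'is_not', 'not_good']
```

With unigrams alone, "good" would pull the review toward a positive score; the bigram feature "not_good" gives the classifier a distinct signal it can learn to weight negatively.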
Memory and Massive Scale
While processing a massive, noisy dataset of over 1.5 million social media posts, I hit the computational limits of traditional Machine Learning. I transitioned to Deep Learning, specifically sequence models such as Long Short-Term Memory (LSTM) networks. Instead of reading isolated phrases, this neural network works like human memory: it remembers how a sentence starts so it can accurately interpret how it ends. This allowed the system to capture semantic nuance across millions of unstructured records.
The Finding: By utilizing deep sequence memory to process complex language patterns and word embeddings, the LSTM architecture successfully captured context at an unprecedented scale, achieving an exceptional 98% predictive accuracy.
LSTM Architecture
Word Embeddings
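The "memory" the paragraph describes lives in the LSTM's cell state. A minimal NumPy sketch of a single LSTM step makes the mechanism concrete; the dimensions, weight initialization, and random inputs here are hypothetical, and the real model was of course trained in a deep learning framework rather than written by hand.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h, c, W, U, b):
    """One LSTM step: gates decide what the cell state keeps,
    forgets, and emits, so early words can influence later ones."""
    H = h.shape[0]
    z = W @ x + U @ h + b            # pre-activations for all four gates
    i = sigmoid(z[:H])               # input gate: admit new information
    f = sigmoid(z[H:2*H])            # forget gate: discard stale memory
    o = sigmoid(z[2*H:3*H])          # output gate: expose part of the state
    g = np.tanh(z[3*H:])             # candidate cell update
    c = f * c + i * g                # cell state carries long-range memory
    h = o * np.tanh(c)               # hidden state passed to the next step
    return h, c

# Run a toy 5-step sequence of 8-dim embeddings through a 16-unit cell.
rng = np.random.default_rng(0)
D, H, T = 8, 16, 5
W = rng.normal(0, 0.1, (4 * H, D))
U = rng.normal(0, 0.1, (4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(T):
    h, c = lstm_step(rng.normal(size=D), h, c, W, U, b)
print(h.shape)  # → (16,)
```

The final hidden state h summarizes the whole sequence; in the sentiment pipeline a state like this feeds a classification layer.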
Detecting Deception in Low-Resource Languages
Equipped with a strong foundation in deep learning, I applied these advanced neural networks to a much harder, real-world problem: detecting deceptive fake news in Bengali. Working with a "low-resource" language means operating without the vast, pre-built dictionaries available for English. It required building a system that could inherently understand the structural flow of the language to spot manipulative and misleading patterns without relying on existing translational tools.
The Finding: By designing a custom Bidirectional Gated Recurrent Unit (BiGRU) architecture that mathematically processes text both forwards and backwards, the system successfully analyzed 50,000 news articles to separate truth from deception with an astounding 99.16% accuracy.
BiGRU Architecture
Low-Resource NLP
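The forwards-and-backwards reading that the BiGRU performs can be sketched with a hand-rolled GRU cell applied in both directions. Everything here is a simplified, hypothetical stand-in (NumPy instead of a training framework, random weights, no bias terms); it shows only the bidirectional mechanic, not the trained Bengali model.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x, h, Wz, Wr, Wh, Uz, Ur, Uh):
    """One GRU step: the update gate z blends the old state
    with a freshly computed candidate state."""
    z = sigmoid(Wz @ x + Uz @ h)               # update gate
    r = sigmoid(Wr @ x + Ur @ h)               # reset gate
    h_tilde = np.tanh(Wh @ x + Uh @ (r * h))   # candidate state
    return (1 - z) * h + z * h_tilde

def bigru_encode(xs, params_f, params_b):
    """Read the sequence left-to-right and right-to-left with two
    independent GRUs, then concatenate the two final states."""
    H = params_f[3].shape[0]
    hf = np.zeros(H)
    for x in xs:                 # forward pass over the words
        hf = gru_step(x, hf, *params_f)
    hb = np.zeros(H)
    for x in reversed(xs):       # backward pass over the words
        hb = gru_step(x, hb, *params_b)
    return np.concatenate([hf, hb])

# Encode a toy 6-word sequence of 8-dim embeddings with 12 units per direction.
rng = np.random.default_rng(1)
D, H, T = 8, 12, 6
def make_params():
    return tuple(rng.normal(0, 0.1, (H, D)) for _ in range(3)) + \
           tuple(rng.normal(0, 0.1, (H, H)) for _ in range(3))
enc = bigru_encode([rng.normal(size=D) for _ in range(T)],
                   make_params(), make_params())
print(enc.shape)  # → (24,)
```

Because the backward pass sees the end of the sentence first, each half of the concatenated encoding carries context the other half cannot, which is what lets a BiGRU judge a word by what comes both before and after it.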