Driving insights through data science, with a passion for machine learning and AI-driven innovation. I craft solutions that transform raw data into decisions that matter.
Python
R
SQL
SAS
MATLAB
scikit-learn
TensorFlow
Keras
XGBoost
NLTK / spaCy
MySQL
PostgreSQL
Amazon Redshift
Snowflake
MongoDB
Hadoop (HDFS)
HiveQL
Apache Spark
PySpark
MS SQL Server
AWS (S3, EC2, Glue, Athena)
Azure (Basics)
Docker
Apache Airflow
Git
GitHub
Tableau
Power BI
VS Code
Jupyter Notebook
Developed and deployed ML models that predict patient outcomes, reducing readmissions by 18%. Built NLP pipelines with spaCy and Transformers for medical text mining. Automated data workflows with Spark and Airflow, and created real-time dashboards in Tableau and Power BI.
Tackled imbalanced fraud data using SMOTE and ensemble models. Engineered features and built models to predict loan defaults. Worked in Spark and AWS environments to scale ML workflows and improve fraud detection accuracy by 22%.
Analyzed large health datasets to optimize claim processing and customer segmentation. Built ETL pipelines and interactive dashboards to support business decisions. Conducted A/B testing and modeled financial data to guide pricing strategies.
Built and maintained SSIS/SSRS reports and SQL pipelines. Automated financial reports including Profit & Loss, EBIT, and ROIC. Used SAS and SQL for statistical modeling and regression analysis to support business and mortgage analytics.
Related coursework: Artificial Intelligence, Visual Analytics, Knowledge Discovery in Databases, Database Systems.
GPA: 3.88 out of 4.0
Project: Designed and implemented a Synthetic Text Image Generator using SynthTIGER to automate the creation of labeled datasets for training OCR models. Enabled large-scale generation of annotated text images with support for masks, bounding boxes, and multilingual corpora, accelerating text detection research workflows.
GPA: 3.62 out of 4.0
Built a cloud-based ETL pipeline that extracts Reddit posts, stores them in S3, transforms them with AWS Glue, and queries them using Athena and Redshift. Orchestrated with Airflow and Celery and containerized with Docker.
A React-based streaming app offering personalized movie recommendations using OpenAI’s GPT. Features secure Firebase login, TMDB integration, and a clean, intuitive UI powered by Redux.
Built a high-performance dataset generator using SynthTIGER to create annotated text images for OCR training. Enabled scalable generation with masks, bounding boxes, and multilingual support, boosting model accuracy and speed.