Welcome to My Digital Space!

Hey folks, I'm Sri Hari, a Data Scientist exploring the power of AI and data.


I'm passionate about machine learning and AI-driven innovation, and I build data science solutions that turn raw data into decisions that matter.

Skills honed over time, reflecting versatility and proficiency.

LANGUAGES & LIBRARIES

Python

R

SQL

SAS

MATLAB

scikit-learn

TensorFlow

Keras

XGBoost

NLTK / spaCy


DATABASES & BIG DATA

MySQL

PostgreSQL

Amazon Redshift

Snowflake

MongoDB

Hadoop (HDFS)

HiveQL

Apache Spark

PySpark

MS SQL Server


CLOUD & TOOLS

AWS (S3, EC2, Glue, Athena)

Azure (Basics)

Docker

Apache Airflow

Git

GitHub

Tableau

Power BI

VS Code

Jupyter Notebook


Experience

2024–present

Providence Health & Services, Data Scientist

Developed and deployed ML models to predict patient outcomes and reduce readmissions by 18%. Built NLP pipelines with spaCy and Transformers for medical text mining. Automated data workflows with Spark and Airflow, and created real-time dashboards using Tableau and Power BI.

2023–2024

BNY Mellon, Data Scientist

Tackled imbalanced fraud data using SMOTE and ensemble models. Engineered features and built models to predict loan defaults. Worked in a Spark and AWS environment to scale ML workflows and improve fraud detection accuracy by 22%.
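For readers unfamiliar with SMOTE, here is a minimal, standard-library-only sketch of the core idea: synthetic minority samples are created by interpolating between a minority point and one of its nearest minority neighbours. This is an illustration, not the code used on the project; the function name `smote_oversample`, the toy feature values, and the class counts are all hypothetical.

```python
import random
from math import dist


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points interpolated between minority samples.

    For each synthetic point: pick a random minority sample, pick one of its
    k nearest minority neighbours, and take a random point on the segment
    between the two.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Neighbours are the k closest *other* minority samples (Euclidean).
        others = [p for j, p in enumerate(minority) if j != i]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: two numeric features per transaction,
# with fraud as the rare class.
fraud = [(1.0, 2.0), (1.2, 1.9), (0.8, 2.2), (1.1, 2.4)]
legit_count = 10  # assumed size of the majority class
new_points = smote_oversample(fraud, n_new=legit_count - len(fraud))
balanced_fraud = fraud + new_points
```

In practice a maintained implementation such as the one in the imbalanced-learn package would normally be used, and oversampling would be applied to the training split only, so that synthetic points never leak into evaluation data.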

2022–2023

ManipalCigna Health Insurance, Data Analyst / Data Scientist

Analyzed large health datasets to optimize claim processing and customer segmentation. Built ETL pipelines and interactive dashboards to support business decisions. Conducted A/B testing and modeled financial data to guide pricing strategies.

2021–2022

Johnson & Johnson, Data Analyst

Built and maintained SSIS/SSRS reports and SQL pipelines. Automated financial reports including Profit & Loss, EBIT, and ROIC. Used SAS and SQL for statistical modeling and regression analysis to support business and mortgage analytics.

Education

2023–2024

University of North Carolina at Charlotte, Master of Science

Related coursework: Artificial Intelligence, Visual Analytics, Knowledge Discovery in Databases, Database Systems.

GPA: 3.88 out of 4.0

2018–2022

VIT University, Bachelor of Science

Project: Designed and implemented a Synthetic Text Image Generator using SynthTIGER to automate the creation of labeled datasets for training OCR models. Enabled large-scale generation of annotated text images with support for masks, bounding boxes, and multilingual corpora, accelerating text detection research workflows.

GPA: 3.62 out of 4.0

Projects I've Built Using My Skills

Reddit Data Pipeline

Built a cloud-based ETL pipeline to extract Reddit posts, store them in S3, transform with AWS Glue, and query using Athena and Redshift. Orchestrated with Airflow & Celery, containerized using Docker.
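The stage ordering described above can be sketched as a small dependency graph. This is a simplified illustration using only the Python standard library, not the project's actual Airflow DAG; the task names and the `run_pipeline` helper are hypothetical, and each task is a stand-in that only records its own name.

```python
from graphlib import TopologicalSorter

# Hypothetical task names. Each task maps to the set of tasks that must
# finish before it can start.
dependencies = {
    "extract_reddit_posts": set(),
    "store_raw_in_s3": {"extract_reddit_posts"},
    "transform_with_glue": {"store_raw_in_s3"},
    "query_with_athena": {"transform_with_glue"},
    "query_with_redshift": {"transform_with_glue"},
}


def run_pipeline(deps, run_task):
    """Run every task once, never before the tasks it depends on."""
    order = list(TopologicalSorter(deps).static_order())
    for task in order:
        run_task(task)
    return order


executed = []
order = run_pipeline(dependencies, executed.append)
print(" -> ".join(order))
```

An orchestrator such as Airflow adds scheduling, retries, and distributed execution on top of this ordering; the sketch only shows the dependency resolution step.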

Netflix-GPT

A React-based streaming app offering personalized movie recommendations using OpenAI’s GPT. Features secure Firebase login, TMDB integration, and a clean, intuitive UI powered by Redux.

Text-Image-Generator

Built a high-performance dataset generator using SynthTIGER to create annotated text images for OCR training. Enabled scalable generation with masks, bounding boxes, and multilingual support, boosting model accuracy and speed.