Skills
• Programming & Scripting: Python (Pandas, NumPy, SQLAlchemy, scikit-learn), SQL, PySpark
• Databases & Data Warehousing: PostgreSQL, SQL Server, Oracle, Snowflake, Redshift
• Cloud Platforms & Services: AWS (S3, Glue, EMR, RDS, Redshift, Kafka), Azure (Database for PostgreSQL, Data Factory/Pipelines)
• Data Engineering & Orchestration: ETL pipeline development, Data Quality & Governance, Data Modeling (ER, Dimensional), Workflow Automation (Airflow, ADF), CI/CD and version control (Git/GitHub)
• Analytics & Visualization: Tableau, Power BI, Exploratory Data Analysis (EDA), Statistical Analysis
• Big Data & Streaming: PySpark, Kafka, Real-time Data Processing
• Machine Learning & AI: Regression, Classification, Risk Modeling, Forecasting Models
• Performance & Testing: Query Optimization, Pipeline Performance Tuning, Data Validation & Unit Testing within ETL frameworks
• Compliance & Standards: HIPAA, ICD-10, OCC/Fed Regulatory Reporting, Audit & Data Governance
About
I am a highly motivated Senior Data Analyst with over 6 years of experience translating complex data into clear, actionable strategies across banking, healthcare, and research.
I specialize in the full data lifecycle, from designing and automating ETL pipelines with Python, SQL, and PySpark to building robust, scalable solutions on AWS and Azure. I enjoy making data easy to understand through Tableau and Power BI dashboards that support executive decision-making and improve operational efficiency.
I have a strong background in advanced analytics, including machine learning models for risk management and forecasting, and in meeting compliance requirements such as HIPAA, ICD-10 coding standards, and OCC/Fed regulatory reporting. I am committed to data governance and audit readiness, and I use tools like Airflow to keep processes reliable and automated. I thrive on combining technical expertise with business sense to reduce operational risk and enable data-driven outcomes.