Skills
• Programming & Scripting: Python (Pandas, NumPy, SciPy, Scikit-learn), Java, Scala, SQL (T-SQL, PL/SQL), R, Shell Scripting
• Big Data & Streaming: Spark (Batch & Structured Streaming), Hadoop (HDFS, Hive, Pig, HBase, Sqoop, Oozie), Kafka, MapReduce
• Cloud & Data Engineering: AWS (S3, EC2, EMR, Redshift), Azure (Data Factory, Databricks), GCP (BigQuery, Dataflow, Cloud Functions)
• ETL & Workflow Orchestration: Apache Airflow, Apache Beam, StreamSets, Azure Data Factory
• Databases & Warehousing: Oracle, SQL Server, PostgreSQL, MySQL, MongoDB, Cassandra; Star & Snowflake Schemas; OLTP/OLAP
• Machine Learning & Analytics: Regression, Decision Trees, Random Forest, KNN, K-Means, NLP, TensorFlow
• Data Visualization & BI: Tableau, Power BI, Matplotlib, Seaborn, ggplot
• DevOps & Deployment: Git, Jenkins, Docker, CI/CD pipelines
• APIs & Integration: REST APIs, FastAPI, Flask, JSON, XML, Kafka Connect
• Methodologies: Agile/Scrum