Data Engineer
About Candidate
- Over 7 years of IT experience specializing in Big Data, Spark, Scala, Java, Python, and ETL/ELT solutions for both cloud and on-premises environments.
- Expertise in implementing and optimizing data engineering workflows for efficient data movement, transformation, and integration.
- Hands-on experience in deploying and managing big data technologies such as Apache Spark, Databricks, Hadoop, and Hive across both cloud and hybrid infrastructures.
- Skilled in using cloud platforms such as AWS, Azure, and Databricks to build and deploy data pipelines, with a focus on performance and cost-efficiency.
- Experience in building data pipelines that leverage tools like AWS Glue, AWS Lambda, AWS Kinesis, Azure Data Factory, and Azure Synapse Analytics.
- Proficient in building, optimizing, and automating ETL/ELT processes using Python, Scala, and Spark SQL for large-scale data ingestion and transformation.
- Knowledgeable in using various data storage solutions including AWS S3, Azure Data Lake, Snowflake, and Databricks Delta Lake for big data storage and management.
- Strong skills in database design, including creating tables, views, and stored procedures, and in ensuring data consistency and integrity across platforms.
- Experienced in deploying microservices using Docker, Kubernetes, and AWS EKS for containerization and orchestration in cloud-native environments.
- Experienced in orchestrating and automating workflows with tools like Apache Airflow, Apache Oozie, and Apache NiFi.
- Proficient in applying Change Data Capture (CDC) techniques with tools like Attunity, Kafka, and Spark Structured Streaming for real-time data processing.
- Strong problem-solving and troubleshooting skills with a focus on efficient system performance and scalable solutions.
- Experience with DevOps practices, including CI/CD pipelines, code automation, and deployment management using tools like Azure DevOps and Jenkins.
- Familiar with monitoring and logging tools such as New Relic, Rollbar, and AWS CloudWatch to ensure data pipeline health and rapid issue resolution.
- Ability to work in Agile environments, utilizing Scrum/Kanban methodologies for project management and collaboration.
- Solid background in automating and streamlining data integration processes to deliver business insights through data science and analytics pipelines.