Data Engineer
About Candidate
The candidate is an accomplished Lead Data Engineer with extensive experience in designing, building, and optimizing large-scale data infrastructures and pipelines. They have a proven track record of working with leading organizations across diverse domains, leveraging their expertise in Big Data technologies, AWS services, and ETL pipelines to deliver scalable and efficient solutions.
Key achievements include developing end-to-end ETL pipelines, implementing Data Mesh architecture for distributed data processing, and migrating large datasets securely to cloud platforms like AWS S3. Their proficiency spans tools and frameworks such as PySpark, Apache Airflow, Kafka, Hive, and Delta Lake, enabling robust data governance and advanced analytics capabilities.
They have successfully transitioned legacy systems to modular, service-oriented architectures, designed star schema models for data warehouses, and automated ETL pipelines to ensure data accuracy and consistency. Their technical expertise extends to cloud platforms (AWS EC2, EMR, Glue, S3), orchestration tools (Airflow, Oozie), and DevOps practices (Docker, Jenkins, GitHub Actions).
As a leader, they have demonstrated excellence in mentoring junior engineers, guiding architectural planning, and managing cross-functional teams to meet delivery milestones. Their hands-on approach to solving complex data challenges, coupled with their ability to design scalable data solutions, has significantly contributed to the success of high-impact projects.
With a solid foundation in Big Data technologies and a strong focus on optimizing query performance and data transformation, the candidate brings a unique blend of technical expertise, leadership, and a passion for data-driven innovation.