Data Engineer

SAN0445

About Candidate

  • Expertise in data engineering, building and managing large-scale ETL pipelines using Apache Spark, Hive, and AWS tools.
  • Strong experience in managing big data infrastructure, including Cloudera and AWS S3, with a focus on scalability and performance optimization.
  • Proficient in building data lakes, setting up raw and transformed data layers, and developing data aggregation strategies.
  • Hands-on experience with data orchestration using Apache Airflow, and with Redshift as a data warehouse solution.
  • Skilled in implementing and maintaining data quality frameworks, leveraging tools like “Great Expectations” and “Deequ” for data validation.
  • Experience in designing and deploying complex, enterprise-level data pipelines for the telecom and healthcare industries.
  • Adept at data migration and system upgrades, such as moving from HDP 2.7 to HDP 3.0.
  • Knowledgeable in integrating data from multiple sources and vendors in various formats (CSV, Parquet, XLSX).
  • Proficient in utilizing Apache Kafka, Ignite, and Cassandra for data streaming and marketing campaign triggers.
  • Experience in building and maintaining documentation for data processes and deployment workflows.
  • Skilled in handling massive datasets, with expertise in loading and managing operational workloads in Hadoop and Teradata environments.
  • Competent in user management and access control using tools like Kerberos and Apache Ranger.
  • Expertise in SQL and query optimization for large-scale data analysis and reporting needs.
  • Proficient in leveraging cloud-based tools (AWS Lambda, S3, Redshift) to automate data workflows and reporting.
  • Solid understanding of data security and privacy practices, including data encryption and compliance with industry standards.
  • Experience in creating actionable insights from network and performance data, using automation to streamline reporting and analysis.
  • Hands-on experience with BI tools like Apache Superset, Looker, and IBM Cognos for internal and customer-facing analytics.
  • Proven track record in leading data engineering projects from scratch and successfully implementing data solutions in real-time environments.

Skills

Data Engineering, Big Data (Apache Spark, Hive, Kafka, Cassandra), ETL Pipelines, Apache Airflow, AWS (S3, Lambda, Redshift), Data Quality Frameworks (Great Expectations, Deequ), Data Lakes, Data Warehousing, Data Migration, Cloudera, SQL, BI Tools (Apache Superset, Looker, IBM Cognos, Tableau), Cloud Solutions, Data Security & Privacy, Reporting & Automation, System Optimization, User Management (Kerberos, Apache Ranger)
