
Lead Data Engineer

Go Digital Technology Consulting · MH, India · April 20, 2026

Job Description

Technologies / Skills: Advanced SQL, Python and associated libraries such as Pandas and NumPy, PySpark, shell scripting, data modelling, big data (Hadoop, Hive), ETL pipelines, and IaC tools such as Terraform.

Responsibilities:

  • Proven success in communicating with users, other technical teams, and senior management to collect requirements, describe data modeling decisions, and develop data engineering strategy.
  • Ability to work with business owners to define key business requirements and convert them into user stories with the required technical specifications.
  • Communicate results and business impacts of insight initiatives to key stakeholders to collaboratively solve business problems.
  • Work closely with the overall Enterprise Data & Analytics Architect and Engineering practice leads to ensure adherence to best practices and design principles.
  • Assure that quality, security, and compliance requirements are met for the supported area.
  • Design and create fault-tolerant data pipelines running on clusters.
  • Excellent communication skills with the ability to influence client business and IT teams
  • Should have designed data engineering solutions end to end, with the ability to come up with scalable and modular solutions.

Required Qualifications:

  • 7 years of relevant IT experience

  • 4+ years of hands-on experience designing and developing data pipelines for data ingestion or transformation using Python (PySpark)/Spark SQL in the AWS cloud
  • Experience in the design and development of data pipelines and processing of data at scale using technologies like EMR, Kinesis, Kafka, Lambda, Glue, Athena, Redshift, and Step Functions
  • Advanced experience in writing and optimizing efficient SQL queries with Python and Hive, handling large data sets in big-data environments
  • Experience in debugging, tuning, and optimizing PySpark data pipelines
  • Should have implemented these concepts and have good knowledge of PySpark DataFrames, joins, caching, memory management, partitioning, parallelism, etc.
  • Understanding of the Spark UI, event timelines, DAGs, and Spark config parameters in order to tune long-running data pipelines
  • Experience working in Agile implementations
  • Experience with building data pipelines in streaming and batch mode.
  • Experience with Git and CI/CD pipelines to deploy cloud applications
  • Good knowledge of designing Hive tables with partitioning for performance

Desired Qualifications:

  • Experience in data modelling

  • Hands-on experience creating workflows in a scheduling tool such as Autosys or CA Workload Automation
  • Proficiency in using SDKs for interacting with native AWS services
  • Strong understanding of concepts of ETL, ELT and data modeling
