Data Engineer

Ford Motor Private Limited · Chennai, Tamil Nadu

// classified as

Other (Adjacent or hard to classify.)

> stack

java, python, scala, sql, aws, bigquery, docker, hadoop, kafka, kubernetes, spark, tableau, terraform

> description

We are looking for a highly technical Senior Data Engineer to design, build, and optimize our data infrastructure. This role calls for a 'Developer-first' mindset that combines traditional data engineering (ETL/ELT) with modern DevOps practices and cloud-native architecture. You will be responsible for ensuring that our data pipelines are not only high-performing but also automated, scalable, and resilient within the Google Cloud Platform (GCP) ecosystem.

* Design, build, and maintain scalable, efficient, and reliable data pipelines and architectures on Google Cloud Platform (GCP).

* Utilize GCP services such as BigQuery, Cloud Dataflow, Cloud Storage, Cloud Pub/Sub, Dataform, Cloud Run, and Artifact Registry for data ingestion, transformation, and storage.

* Develop and optimize Extract, Transform, Load (ETL/ELT) processes to integrate data from various internal and external sources, both structured and unstructured.

* Apply data warehousing concepts, data modeling (relational and dimensional), and database design.

* Ensure data quality, integrity, and governance throughout the data lifecycle.

* Monitor, troubleshoot, and resolve issues in data systems and pipelines to ensure continuous operation and performance.

* Implement and manage CI/CD pipelines for automated testing, deployment, and release of data engineering and BI solutions on GCP.

* Utilize Infrastructure as Code (IaC) tools such as Terraform to provision and manage GCP resources.

* Employ containerization technologies like Docker and orchestration tools like Kubernetes for deploying data applications and services.

* Automate repetitive tasks and operational processes using scripting languages (e.g., Python, Bash).

* Monitor system performance and resource utilization, and implement logging and alerting mechanisms for data infrastructure.

* Collaborate with development and operations teams to streamline workflows and ensure high availability, reliability, and scalability of cloud-based systems.

Must-Have Qualifications:

* Bachelor's degree in Computer Science, Information Systems, Engineering, or a related quantitative field.

* 3-5+ years of strong experience in Data Engineering, with significant hands-on experience on Google Cloud Platform (GCP) or any leading cloud platform.

* Expertise with core GCP data services (e.g., BigQuery, Cloud Dataflow, Cloud Storage, Cloud SQL, Cloud Pub/Sub) or equivalent services in any leading cloud platform.

* Strong proficiency in SQL and at least one programming language like Python, Java, or Scala for data manipulation and pipeline development.

* Extensive experience designing and implementing ETL/ELT processes and managing data warehousing solutions.

* Solid understanding of and practical experience with DevOps principles and practices.

* Experience with CI/CD tools (e.g., Tekton, Jenkins, GitLab CI, Cloud Build) and version control systems (e.g., Git).

* Familiarity with Infrastructure as Code (IaC) tools (e.g., Terraform, Ansible).

* Experience with containerization (Docker) and orchestration (Kubernetes, Astronomer, etc.).

* Strong analytical, problem-solving, and critical thinking skills with meticulous attention to detail.

* Excellent communication and interpersonal skills, with the ability to articulate complex technical concepts to both technical and non-technical stakeholders.

* Experience working in an Agile development environment.

Nice-to-Have Qualifications:

* Hands-on experience building applications powered by Large Language Models (LLMs), including expertise in prompt engineering, fine-tuning, and implementing RAG (Retrieval-Augmented Generation) architectures.

* Experience implementing vector databases (such as Vertex AI Vector Search or FAISS) for semantic search capabilities.

* GCP Data Engineer or DevOps certification.

* Experience with other cloud platforms (AWS, Azure).

* Familiarity with big data technologies like Apache Spark or Hadoop ecosystems.

* Experience with data governance, security, and compliance principles (e.g., GDPR, HIPAA).

* Knowledge of streaming data processing (e.g., Apache Kafka, GCP Pub/Sub).

* Experience with BI tools (e.g., Qlik Sense, Power BI, Tableau, Looker).

* Prior experience mentoring junior team members.