> job detail

Data Engineer

Noodle · Bengaluru, Karnataka, India
// classified as
Data Engineer (Pipelines, infra, ingestion, ETL.)
posted
1d ago
location
Bengaluru, Karnataka, India
languages
python, sql
tools
aws, docker, hive
> stack
python, sql, aws, docker, hive, s3, spark, airflow
> description

About the opportunity:

Daybreak’s Data Engineers have a strong understanding of database structures, data modeling, and data warehousing techniques. They know how to write SQL queries, stored procedures, and views, and how to define best practices for engineering scalable, secure data pipelines. Members of our Data Engineering team are passionate about the challenge of working with petabytes of data on a daily basis.

As a member of the technical (data science and engineering) team in Customer Success at Daybreak.ai, you will be part of a fast-paced team tackling the technical challenges of deploying our products. You will be the tip of the spear, applying Daybreak’s products to real-world data to make an impact. Because you see how the technology behaves on real-world data, your insights and feedback will heavily influence the design of the product. Given the growth stage and complexity of our products, you will not only deploy products but also innovate on the process of automating the deployment of AI software at scale.



What you’ll do:

  • Contribute to the data engineering development work
  • Collaborate with multiple stakeholders, including but not limited to the Infrastructure, DevOps, and Data Science teams
  • Interface with customer-facing teams
  • Support and monitor multiple data pipelines across different customers
  • Work closely with the development teams to understand the changes in each release

About you:

Must have:

  • 2+ years of experience with data pipelining and development
  • Experience implementing or connecting to ERPs and/or advanced planning systems
  • Bachelor's or advanced degree in Computer Science, Engineering, Technology, or a relevant field
  • Hands-on experience working with distributed data systems such as Spark, Hive, or HDFS
  • Good knowledge of the Python programming language
  • Good knowledge of SQL (not limited to PostgreSQL)
  • Experience with data pipeline orchestration tools such as Airflow or Dagster
  • Basic understanding of containers and familiarity with Docker commands
  • Very good debugging skills
  • Willingness to learn new technologies and adapt to a dynamic environment

Helpful to have:

  • Exposure to cloud platforms, preferably AWS (for example S3 or EMR)
  • Experience processing large-scale time-series data
  • Exposure to Kubernetes
  • Understanding of ML model lifecycle and pipelines
  • Experience with, and enthusiasm for, interdisciplinary collaboration