Wait, What Do You Do?

About the opportunity:

Daybreak’s Data Engineers have a strong understanding of database structures, modeling, and data warehousing techniques. They know how to write SQL queries, stored procedures, and views, and how to define best practices for engineering scalable, secure data pipelines. Members of our Data Engineering team are passionate about embracing the challenge of dealing with petabytes of data on a daily basis.

As a member of the technical (data science and engineering) team in Customer Success at Daybreak.ai, you will be part of a fast-paced team tackling the technical challenges of deploying our products. You will be the tip of the spear, applying Daybreak’s products to make an impact using real-world data. Because you see how the technology actually behaves on real-world data, your insights and feedback will heavily influence the design of the product. Given the growth stage and complexity of our products, you will not only deploy products but also innovate on the process of automating the deployment of AI software at scale.

What you’ll do:

Contribute to data engineering development work
Collaborate with multiple stakeholders, including but not limited to the Infrastructure, DevOps, and Data Science teams
Interface with customer-facing teams
Support and monitor multiple data pipelines across different customers
Work closely with the development teams to understand the changes in every release

About you:

Must have:

2+ years of experience with data pipelining and development
Experience implementing or connecting to ERPs and/or advanced planning systems
Bachelor's or advanced degree in Computer Science, Engineering, Technology, or a relevant field
Hands-on experience working with distributed data systems such as Spark, Hive, or HDFS
Good knowledge of the Python programming language
Good knowledge of SQL (not limited to PostgreSQL)
Experience with data pipeline orchestration tools such as Airflow or Dagster
Basic understanding of containers and familiarity with Docker commands
Strong debugging skills
Willingness to learn new technologies and adapt to a dynamic environment

Helpful to have:

Exposure to cloud platforms, preferably AWS (for example, S3 or EMR)
Experience processing large-scale time series data
Exposure to Kubernetes
Understanding of ML model lifecycle and pipelines
Experience with (and excitement about) interdisciplinary collaboration