> job detail
⚙️Data Engineer
Data Operations Lead
Bioptimus · Paris - Berlin - London - EU Remote
// classified as
Data Engineer (Pipelines, infra, ingestion, ETL.)
posted
1d ago
languages
python, r, sql
tools
aws, s3
> education
bachelors, masters
> description
<div class="content-intro"><p>Bioptimus is building the first universal AI foundation model for biology to fuel breakthrough discoveries and accelerate innovation in biomedicine. With more than $75M in funding, Bioptimus is a fast-growing start-up headquartered in Paris, incorporated in October 2023. Backed by leading international venture capitalists, our world-class team of scientists and engineers is redefining the frontiers of AI and life sciences. </p></div><h2 style="line-height: 1.5;"><strong>About the role</strong></h2>
<hr>
<p data-start="18" data-end="374">We are looking for a highly organized and technically proficient Data Operations Lead to own and scale the operational lifecycle of biomedical data partnerships. In this critical role, you will serve as the bridge between external clinical and research partners, our internal Data team, and the engineering environment that powers our AI foundation models.</p>
<h2 style="line-height: 1.5;"><strong>What you'll be doing</strong></h2>
<hr>
<p data-start="1045" data-end="1105">As a Data Operations Lead, you will own the following tasks:</p>
<ul>
<li><strong>Data Partnership Operations & Lifecycle Management</strong>: Own the operational lifecycle of external data partnerships following contract signature. Act as the primary operational and technical point of contact for hospitals, biobanks, CROs, and research laboratories. Coordinate onboarding, data delivery timelines, and stakeholder communication to ensure successful execution of partnership milestones.</li>
<li><strong>Data Transfer & Infrastructure Coordination</strong>: Manage secure biomedical data transfers using cloud infrastructure and standardized transfer protocols. Coordinate access management, encryption, and ingestion workflows across cloud storage and transfer channels (AWS S3, SFTP, APIs, direct upload pipelines). Ensure incoming datasets are delivered, validated, and tracked according to internal governance standards.</li>
<li><strong>Clinical & Multi-Omics Data Harmonization</strong>: Collaborate with internal technical and product teams to define and maintain harmonized data models and metadata standards across complex clinical and multi-modal datasets. Organize and maintain relationships between clinical metadata and associated omics or imaging assets, including genomics, transcriptomics, spatial biology, and pathology data.</li>
<li><strong>Pipeline Operations & Automation</strong>: Work closely with engineering and data teams to configure and maintain lightweight ingestion and QC pipelines. Identify operational bottlenecks and repetitive workflows and convert them into scalable systems, scripts, templates, dashboards, or automation tools that improve operational efficiency and visibility.</li>
<li><strong>Data Quality Oversight</strong>: Coordinate automated and manual quality control checks across incoming datasets. Identify missing data, inconsistencies, corruption, or metadata mismatches and work directly with external partners to resolve issues. Ensure data integrity, traceability, and version control throughout the ingestion process.</li>
<li><strong>Operational Tracking & Reporting</strong>: Maintain a centralized “single source of truth” for all incoming datasets, including ingestion status, completeness, QC status, and milestone tracking. Build and maintain reporting dashboards and operational tools to provide visibility into project progress, ingestion velocity, and operational risks.</li>
<li><strong>Cross-Functional Collaboration & Communication</strong>: Partner closely with Data Science, Engineering, Legal, and Partnership teams to align operational execution with business and scientific priorities. Communicate technical issues clearly to both scientific collaborators and non-technical stakeholders. Provide regular updates on operational risks, blockers, and delivery progress.</li>
<li><strong>Site Visits & External Partner Engagement</strong>: Conduct periodic visits to partner hospitals, biobanks, and laboratories to support onboarding, troubleshoot technical or operational bottlenecks, and strengthen long-term collaborations.</li>
</ul>
<h2 style="line-height: 1.5;"><strong>What you'll bring</strong></h2>
<hr>
<p>The successful candidate will have a ‘team-first’ attitude; be highly organized, proactive, and detail-oriented; thrive in a fast-paced and evolving environment; and enjoy solving operational and technical challenges at scale. We value individuals who combine strong project management capabilities with hands-on technical fluency and an understanding of biomedical data ecosystems.</p>
<ul>
<li><strong>Biomedical Data Expertise</strong>: Strong understanding of clinical and biomedical data structures, including real-world data, clinical trial datasets, and multi-omics data modalities. Familiarity with oncology, immunology, or related therapeutic areas is highly desirable.</li>
<li><strong>Cloud & Data Infrastructure</strong>: Proven experience managing data lifecycles in cloud environments, particularly AWS (S3, CLI, access management). Familiarity with secure data transfer protocols and large-scale biomedical data handling workflows.</li>
<li><strong>Data Wrangling & Technical Skills</strong>: Proficiency in Python or R, along with SQL for querying and transforming datasets. Ability to write lightweight scripts, automate workflows, and interact with APIs or cloud-based systems.</li>
<li><strong>Project & Stakeholder Management</strong>: Demonstrated ability to manage multiple external collaborations and operational workstreams simultaneously. Excellent communication skills, with the ability to translate technical issues into clear guidance for both scientific and non-technical stakeholders.</li>
<li><strong>Operational Problem Solving</strong>: Comfortable working independently in ambiguous environments. Strong analytical and organizational skills with the ability to identify bottlenecks, improve processes, and drive operational efficiency.</li>
<li><strong>Educational Background</strong>: Bachelor’s or Master’s degree in Life Sciences, Bioinformatics, Health Informatics, Computer Science, or a related quantitative field.</li>
</ul>
<h3><strong>How to stand out:</strong></h3>
<ul>
<li data-section-id="19pttxb" data-start="266" data-end="371">Experience working directly with hospitals, biobanks, laboratories, or clinical research organizations.</li>
<li data-section-id="14kgzde" data-start="372" data-end="473">Familiarity with biomedical data standards, anonymization, and compliance frameworks (GDPR, HIPAA).</li>
<li data-section-id="i328v5" data-start="474" data-end="568">Experience managing large-scale biomedical datasets in cloud environments, particularly AWS.</li>
<li data-section-id="zfke9a" data-start="569" data-end="636">Knowledge of digital pathology and/or multi-omics data workflows.</li>
<li data-section-id="p4b8e6" data-start="637" data-end="730">Experience handling genomics, transcriptomics, and imaging file formats (e.g. FASTQ, BAM, VCF, TIFF).</li>
<li data-section-id="17le5r4" data-start="731" data-end="814">Experience building operational tracking tools, dashboards, or reporting systems.</li>
<li data-section-id="8ofo2c" data-start="815" data-end="907">Experience automating operational workflows using scripts, APIs, or lightweight pipelines.</li>
<li data-section-id="1jdacr6" data-start="908" data-end="1016">Proven ability to manage cross-functional and external stakeholder relationships in complex data projects.</li>
</ul>
<h2 style="line-height: 1.5;"><strong>The candidate journey</strong></h2>
<hr>
<p>To be considered, <strong>please submit your CV in English</strong>. We believe in a transparent and collaborative interview process. Here is what you can expect after submitting your application:</p>
<ol>
<li><strong>Screening</strong>: Once you have applied, the hiring team will review your application to determine whether your work experience and skills match the requirements of this position.</li>
<li><strong>Hiring Manager (30 min)</strong>: A discussion with the Hiring Manager to review your background, operational experience, technical fluency, and motivation for joining Bioptimus. This conversation will also explore your experience working with external partners, managing data workflows, and operating in fast-paced environments.</li>
<li><strong>Technical Assessment</strong>: Given the technical nature of the role, you will be invited to complete a technical assessment designed to evaluate your practical skills in data operations, data handling, and workflow problem-solving. This assessment may include exercises related to data organization, scripting, SQL, cloud infrastructure, or operational reasoning.</li>
<li><strong>Case Study</strong>: You will work through a real-world operational case study related to biomedical data onboarding, harmonization, or partner management. You will present your approach and recommendations to members of the Data, Engineering, and Partnerships teams, followed by a discussion and Q&A session.</li>
<li><strong>Executive Interview</strong>: A comprehensive discussion with our Senior Leadership team focused on long-term vision, collaboration style, values, and mutual fit.</li>
<li><strong>Offer</strong>: Once the interviews are complete, our hiring team will make a final decision and contact you with the outcome. If the team would like to move forward, the recruiter will discuss the details of our proposed offer with you.</li>
<li><strong>Onboarding</strong>: We are happy to have you join the team. Once you have accepted and signed your offer, we will be in touch to begin onboarding you to Bioptimus.</li>
</ol>
<h2 style="line-height: 1.5;"><strong>Why this is a unique opportunity</strong></h2>
<hr>
<p>You will:</p>
<ul data-start="8132" data-end="8535">
<li data-section-id="b6525y" data-start="8132" data-end="8233">Be part of a trailblazing team working at the intersection of AI, biotech, and biomedical research.</li>
<li data-section-id="1ma5a0i" data-start="8234" data-end="8334">Play a key operational role in enabling the development of frontier foundation models for biology.</li>
<li data-section-id="6pcvzr" data-start="8335" data-end="8442">Build and scale the data infrastructure and operational processes powering next-generation biomedical AI.</li>
<li data-section-id="v5jxm1" data-start="8443" data-end="8535">Collaborate with leading hospitals, biobanks, researchers, and engineers across the globe.</li>
</ul>
<p data-start="8537" data-end="8554">And benefit from:</p>
<ul>
<li>A collaborative and mission-driven work environment.</li>
<li>Competitive salary and equity package.</li>
<li>Flexible work arrangements, including remote options.</li>
<li>Opportunities for professional growth and leadership development.</li>
<li>The opportunity to shape the future of biology and AI through groundbreaking work.</li>
</ul><div class="content-conclusion"><p>We believe that the unique contributions of all Bioptimists create our success. To ensure that our culture continues to incorporate everyone’s perspectives and experience, we never discriminate based on race, religion, national origin, gender identity or expression, sexual orientation, age, marital status, or disability status. Decisions related to hiring are made fairly, and we provide equal employment opportunities to all qualified candidates. We take responsibility for always striving to create an inclusive environment that makes every employee and candidate feel welcome.</p></div>