Middle Python Data Engineer

for Biotech Company

Location

Bucharest, Romania

Area

Backend

Tech Level

Middle

Tech Stack

Kafka, Scala, Hadoop, SQL, Spark, Python, Apache, AWS, DBT (Data Build Tool), Azure

About the Client

Our client is a platform-driven, clinical-stage biotechnology company that is mapping human longevity to change the nature of aging and extend healthy lifespan. Its growing portfolio of therapeutics for immune, muscle, and brain aging includes four drug programs, two first-in-class and two first-in-indication. Its vision is “growing older without aging”: a future in which aging allows us to pursue our goals, accumulate new experiences and accomplishments, and actively contribute to society without disease, physical disability, or loss of independence and connection. A leading company in the emerging longevity biotech sector, the client has raised $127M from Andreessen Horowitz, Kaiser Foundation Hospitals, and others.

Project details

This role is pivotal in setting the standards, protocols, and best practices for our data architecture and infrastructure, ensuring that all systems are efficient, traceable, scalable, and maintainable. Embedded within the Data Science team, you will collaborate with engineers, data scientists, and domain experts to establish and maintain cutting-edge systems that drive the company's mission of leveraging omics, drug, target, and pathway data for biological discovery and therapeutic innovation. This is a high-impact role where you will not only build and optimize technical solutions but also set the foundation for long-term success by defining and implementing best practices for data infrastructure across the organization.

Your Team

We are looking for two specialists: a Data Governance Specialist and a Python Data Engineer.

What's in it for you

Interview process that respects people and their time
Professional and open IT community
Internal meet-ups and resources for knowledge sharing
Time for recovery and relaxation
Bright online and offline events
Opportunity to become part of our internal volunteer community

Responsibilities

Infrastructure Development & Standards

Lead the design, development, and implementation of scalable, high-performance data infrastructure to support diverse biological data sets and downstream applications.
Establish and enforce best practices, protocols, and policies for data management, processing, and infrastructure maintenance.
Create standards for system architecture to ensure traceability, usability, integrity, and scalability of all data systems.
Proactively identify opportunities to optimize and future-proof infrastructure for evolving data needs.

Data Architecture & Pipelines

Design, implement, and maintain efficient ETL pipelines to process and harmonize internal and external biological data sources.
Integrate large-scale datasets, such as OpenTargets, STRING DB, and the Human Protein Atlas, into unified, accessible formats.
Build infrastructure to support advanced applications, such as knowledge graphs and AI-driven tools, ensuring seamless interoperability with analytical workflows.

Collaboration & Leadership

Partner with data scientists, engineers, and bioinformaticians to ensure infrastructure aligns with analytical and research objectives.
Mentor and guide other engineers, instilling best practices in infrastructure design, data management, and software development.
Foster a culture of collaboration, continuous learning, and high standards within the team.

Skills

Education

BS in Computer Science, Bioinformatics, or a related field; MS or PhD preferred.

Experience

5+ years of experience in bioinformatics or software development in biotech, pharmaceuticals, or a related field.
Demonstrated success in designing and deploying scalable ETL pipelines and infrastructure for large-scale biological datasets.
Proven track record of establishing standards and best practices for data architecture and processing systems.
Experience working in an agile development environment with test-driven development methodologies.

Technical Expertise

Proficiency in data engineering languages such as Python and R, in SQL and NoSQL databases, and in languages for big data and cloud platforms.
Proficiency with Google Cloud Platform (GCP) and experience in managing cloud-based systems.
Familiarity with public domain data sets and tools, such as OpenTargets, STRING DB, and the Human Protein Atlas.

Skills

Strong leadership and mentoring capabilities, with the ability to guide teams in adopting best practices.
Excellent organizational skills, with a focus on creating maintainable and scalable systems.
Exceptional communication skills, enabling effective collaboration across cross-functional teams.

Preferred

Experience with knowledge graphs or similar advanced data structures is a plus.

Your personal recruiter

Vasylii Demianets
