Job Description
The Senior Data Engineer will design, build, and deliver a new enterprise data product supporting the client's generative drug design and computational chemistry platforms. This role focuses on creating scalable, well-structured data architecture from the ground up, with long-term expansion and downstream AI/ML integration in mind. The ideal candidate combines strong data engineering expertise with an understanding of drug design, chemistry, and scientific data workflows.
-Design and implement a new enterprise data product, initially scoped as a standalone deliverable with future integration into broader AI-driven drug discovery platforms.
-Build scalable data pipelines, schemas, and storage models capable of supporting large, complex scientific and chemistry-derived datasets.
-Develop data solutions primarily on GCP / BigQuery, adhering to enterprise data engineering templates and standards.
-Implement data transformations and pipelines using Python, with a focus on data quality, traceability, and performance.
-Ensure the data architecture supports future expansion, additional datasets, and evolving analytical and computational needs.
-Collaborate closely with computational chemists, data scientists, and ML engineers to ensure data models align with generative design, molecular representations, and ML outputs.
-Apply an understanding of drug design and chemistry concepts (e.g., molecular properties, structure-activity data, experimental outputs) to inform data modeling and integration decisions.
-Provide technical guidance on data structure, scalability, and long-term maintainability in an enterprise environment.
We are a company committed to creating diverse and inclusive environments where people can bring their full, authentic selves to work every day. We are an equal opportunity/affirmative action employer that believes everyone matters. Qualified candidates will receive consideration for employment regardless of their race, color, ethnicity, religion, sex (including pregnancy), sexual orientation, gender identity and expression, marital status, national origin, ancestry, genetic factors, age, disability, protected veteran status, military or uniformed service member status, or any other status or characteristic protected by applicable laws, regulations, and ordinances. If you need assistance and/or a reasonable accommodation due to a disability during the application or recruiting process, please send a request to HR@insightglobal.com. To learn more about how we collect, keep, and process your private information, please review Insight Global's Workforce Privacy Policy: https://insightglobal.com/workforce-privacy-policy/.
Skills and Requirements
-Strong experience in data engineering, including database, schema, and data product design.
-Hands-on experience with GCP and BigQuery (Postgres familiarity a plus).
-Proficiency in Python for building and maintaining data pipelines.
-Onyx background
-Experience working with large, complex datasets at scale, ideally in scientific or R&D contexts.
-Experience supporting downstream analytics, ML pipelines, or AI-driven platforms, particularly in R&D or discovery environments.
-Background in life sciences, pharma, or scientific data platforms.
-Working knowledge or hands-on exposure to drug design, chemistry, or computational chemistry data.