At Johnson & Johnson, we believe health is everything. Our strength in healthcare innovation empowers us to build a world where complex diseases are prevented, treated, and cured, where treatments are smarter and less invasive, and solutions are personal. Through our expertise in Innovative Medicine and MedTech, we are uniquely positioned to innovate across the full spectrum of healthcare solutions today to deliver the breakthroughs of tomorrow, and profoundly impact health for humanity. Learn more at https://www.jnj.com
Job Function:
Data Analytics & Computational Sciences
Job Sub Function:
Data Science
Job Category:
Scientific/Technology
All Job Posting Locations:
Titusville, New Jersey, United States of America
Job Description:
Johnson & Johnson Innovative Medicine (J&J IM), a pharmaceutical company of Johnson & Johnson, is recruiting for a Cataloging Data Scientist.
This position has a primary location of Titusville, NJ, but is also open to candidates from Cambridge, Boston, and Madrid, Spain.
About Innovative Medicine
Our expertise in Innovative Medicine is informed and inspired by patients, whose insights fuel our science-based advancements. Visionaries like you work on teams that save lives by developing the medicines of tomorrow.
Join us in developing treatments, finding cures, and pioneering the path from lab to life while championing patients every step of the way.
Learn more at https://www.jnj.com/innovative-medicine
We are searching for the best talent for Senior Principal Data Scientist - Cataloging & Metadata
Purpose
We are seeking a Sr. Principal Data Scientist for the Cataloging, Metadata and Governance team to design, develop, and implement automated AI solutions that address complex enterprise business challenges. In addition to building robust AI models, you will play a key role in enhancing data quality by curating, validating, and enriching metadata from multiple sources, and conducting quality checks on all relevant data fields. You will collaborate closely with Data Management, Platform Teams, Product Owners, and Business stakeholders to support catalog automation setup, improving data catalog usability and ensuring seamless data access for analytics and decision-making.
The Senior Principal Data Scientist - Cataloging & Metadata collaborates with cross-functional teams, including Data Management, Platform Teams, Product Owners, and Business stakeholders, to ensure metadata from multiple sources is curated, validated, and enriched, and that quality checks are rigorously performed across all relevant data fields. This role aligns with catalog automation initiatives, enhances data catalog usability, and supports seamless data access to empower analytics and informed decision-making. In addition to these core responsibilities, the Cataloging Data Scientist designs, develops, and implements generative AI solutions that are closely integrated with cataloging processes, further enhancing the discoverability and overall usability of the data catalog and driving continuous improvement in data management and discovery.
You will be responsible for:
Own the design, development, and implementation of solutions for the data cataloging, metadata, and governance team.
Lead the curation and ongoing management of the enterprise data catalog by capturing, validating, and enriching metadata from diverse sources, ensuring that business terms, data elements, and approved definitions are documented in collaboration with Data Owners and SMEs.
Monitor catalog adoption and usage, continuously enhancing catalog usability and searchability so that all critical datasets, data products, and master data entities are indexed, discoverable, and accurately described.
Implement rigorous data quality assessments, applying validation and enrichment techniques to maintain the reliability, accuracy, and contextualization of metadata throughout the data lifecycle.
Develop and monitor KPIs for metadata quality, completeness, and compliance across domains.
Work closely with cross-functional teams, including Knowledge Management, Data Products, and other groups, to integrate catalog automation and metadata capabilities into broader enterprise workflows, supporting seamless data accessibility and governance.
Contribute to proof-of-concept/pilot/launch projects that assess data governance and metadata improvements and quantify business value achieved through enhanced data governance.
Partner with the DSDH teams to implement automated data governance solutions and monitoring and reporting processes.
Participate in ontology and knowledge graph initiatives to ensure metadata is harmonized with enterprise semantic frameworks.
Collaborate with JJ Technology, Legal, Compliance, external vendors, and other DSDH teams to ensure alignment and traceability between business definitions, technical metadata, and lineage.
Design, develop, and deploy generative AI solutions that are integrated with cataloging workflows, further improving the discoverability, accessibility, and overall effectiveness of the data catalog.
Establish and configure integrated connections between multiple cataloging platforms to enable seamless data synchronization and automated metadata updates, reinforcing traceability and discoverability across systems.
Qualifications / Requirements:
Required:
Master's/PhD in Life Sciences with a master's in Computer Science, Data Science, or Information Systems (or equivalent degree)
7+ years of experience in computational biology, automation, data cataloging (platforms such as TileDB, Collibra, Alation, etc.), business analysis, data science, or related fields, preferably within Life Sciences or a regulated industry
Familiarity with data engineering, automation, data management, data compliance, quality, governance, and AI solutions
6+ years of hands-on experience in Python, SQL, and other AI automation tools
Strong Python skills with API integration and backend development using FastAPI or Flask
Experience with data cataloging platforms and metadata extraction via APIs
Experience with databases (Snowflake, Postgres) and version control (Git)
Hands-on experience building and deploying generative AI solutions
Strong troubleshooting skills across pipelines, APIs and dataflows
Strong stakeholder management skills with the ability to successfully drive solutions independently
Strong people management skills with the ability to mentor and guide resources.
Strong communication skills with ability to seamlessly work across technical and business teams
Strong sense of ownership and accountability in managing critical tasks and responsibilities to ensure successful project outcomes.
Preferred:
Experience in setting up automations and building intelligent solutions (machine-readable metadata, profiling, validation rules, anomaly detection, etc.)
Excellent attention to detail, data organization, and documentation skills
Familiarity with automated metadata ingestion & catalog curation workflows.
Ability to translate complex data concepts into clear, accessible documentation.
Johnson & Johnson is an Equal Opportunity Employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, gender identity, age, national origin, disability, protected veteran status or other characteristics protected by federal, state or local law. We actively seek qualified candidates who are protected veterans and individuals with disabilities as defined under VEVRAA and Section 503 of the Rehabilitation Act.
Johnson & Johnson is committed to providing an interview process that is inclusive of our applicants' needs. If you are an individual with a disability and would like to request an accommodation: external applicants, please contact us via https://www.jnj.com/contact-us/careers; internal employees, contact AskGS to be directed to your accommodation resource.
#JNJDataScience #JNJIMRND-DS #LI-Hybrid
Required Skills:
Preferred Skills:
Advanced Analytics, Consulting, Critical Thinking, Data Analysis, Data Privacy Standards, Data Quality, Data Reporting, Data Savvy, Data Science, Data Visualization, Digital Fluency, Econometric Models, Mentorship, Strategic Thinking, Tactical Planning, Technical Credibility
The anticipated base pay range for this position is $137,000 to $235,750 USD
Additional Description for Pay Transparency:
Subject to the terms of their respective plans, employees and/or eligible dependents are eligible to participate in the following Company sponsored employee benefit programs: medical, dental, vision, life insurance, short- and long-term disability, business accident insurance, and group legal insurance. Subject to the terms of their respective plans, employees are eligible to participate in the Company's consolidated retirement plan (pension) and savings plan (401(k)). This position is eligible to participate in the Company's long-term incentive program.
Subject to the terms of their respective policies and date of hire, employees are eligible for the following time off benefits:
Vacation - 120 hours per calendar year
Sick time - 40 hours per calendar year; for employees who reside in the State of Washington - 56 hours per calendar year
Holiday pay, including Floating Holidays - 13 days per calendar year
Work, Personal and Family Time - up to 40 hours per calendar year
Parental Leave - 480 hours within one year of the birth/adoption/foster care of a child
Condolence Leave - 30 days for an immediate family member; 5 days for an extended family member
Caregiver Leave - 10 days
Volunteer Leave - 4 days
Military Spouse Time-Off - 80 hours
Additional information can be found through the link below.
https://www.careers.jnj.com/employee-benefits