The Data Engineer role is a pivotal position within the Enterprise Data Engineering & Analytics Department, supporting the design, build, and operationalization of integrated data pipelines and analytics solutions that enable MD Anderson's digital business initiatives. The Data Engineer works across the Context Engine framework to deliver end-to-end data engineering solutions while partnering closely with Enterprise Data Engineering & Analytics teams and other institutional stakeholders.
The Data Engineer contributes to the mission of MD Anderson Cancer Center, a leading institution focused on cancer care, research, education, and prevention. In this role, the Data Engineer helps advance enterprise analytics capabilities by ensuring secure, governed, and reusable data assets that accelerate insights and improve time-to-solution across MD Anderson.
Ideal Candidate Statement
The ideal candidate for the Data Engineer role holds a bachelor's degree in computer science, preferably with advanced education in analytics or computer science, and brings hands-on experience building data pipelines in healthcare or research environments. The candidate is familiar with modern cloud-based data platforms, Python or Spark development, and analytics delivery, and has hands-on experience using Large Language Models (LLMs) in real-world projects. Epic data model exposure or certification and the ability to collaborate across technical and clinical teams are strongly preferred.
Position Information
Salary range based on a 40-hour work week: Minimum $106,500 - Midpoint $133,000 - Maximum $159,500
Work location: Houston, Texas or surrounding area preferred
This Data Engineer role offers the opportunity to contribute directly to MD Anderson's mission by enabling high-quality, governed data that supports clinical, research, and operational analytics across the institution. The position provides exposure to enterprise-scale data engineering initiatives, collaboration with experienced engineering and data science professionals, and opportunities for continued learning and career growth while supporting a balanced and sustainable work environment.
Employer-paid medical coverage starting day one for employees working 30+ hours/week, plus optional group dental, vision, life, AD&D, and disability insurance.
Accruals for PTO and Extended Illness Bank, plus paid holidays, wellness, childcare, and other leave options.
Tuition Assistance Program after six months of service and access to extensive wellness, fitness, and employee resource groups.
Defined-benefit pension through the Teachers Retirement System, voluntary retirement plans, and employer-paid life and reduced salary protection programs.
Responsibilities
Data Engineering - End-to-End Solution Delivery
Participate in end-to-end solution delivery that increases information capabilities and realizes data value across the institution
Build and test end-to-end data pipelines across ingestion, curation, transformation, modeling, and consumption within the Context Engine framework
Integrate data governance processes across data provenance, security, data quality, ontology, and metadata management
Participate in planning, architecture, analysis, design, and build of data pipelines in partnership with IS, Data Offices, and Data Governance teams
Contribute to existing data pipelines spanning acquisition, integration, and consumption for defined use cases
Data Curation, Modeling, and Governance
Build data curation pipelines including profiling, specification creation, cleansing, transforming, standardizing, mastering, harmonizing, validating, and aggregating data
Monitor and support data quality across the Context Engine
Incorporate repeatable solution designs and data models to support reuse and scalability
Promote effective data management practices and understanding of analytics across the enterprise
Standards, Testing, and System Maintenance
Adhere to IS division standard operating procedures and all MD Anderson policies
Maintain build standards and governance oversight sign-off aligned with institutional data strategy
Participate in documentation preparation for enhancements or new technology
Perform quality control, testing, and peer review of analytics builds
Support system updates, releases, change control processes, and after-hours support as required
Education, Training, and Collaboration
Train data scientists, analysts, end users, and data consumers on data pipelining and preparation techniques
Assist in establishing training plans and curricula for Context Engine tools
Provide institutional, department, and one-on-one training on Enterprise Data Engineering & Analytics (EDEA) deliverables
Support liaison relationships with customers and OneIS partners to deliver effective technical solutions
Innovation and Continuous Improvement
Explore and promote modern tools, techniques, and architectures to automate data preparation and integration tasks
Improve productivity by reducing manual and error-prone processes
Model OneIS values through integrity, partnership, quality, and continuous improvement
EDUCATION
Required: Bachelor's Degree
Preferred: Bachelor's degree in Computer Science; Master's degree in Business Analytics, Computer Science, Information Technology, Data Science, or a related field.
WORK EXPERIENCE
Required: 2 years of clinical, relevant healthcare information technology, or relevant business experience; or
Required: With preferred degree, no experience required.
May substitute required education with years of related experience on a one-to-one basis.
Preferred:
3-5 years of experience creating data pipelines in a healthcare research environment
Experience building and maintaining analytical reports and dashboards
Problem-solving skills and the ability to translate business and clinical requirements into reliable data models
Analytics and reporting experience with cloud data management solutions such as Foundry and Fabric
Data pipeline and ETL development, with hands-on experience designing, building, and maintaining pipelines using Python or Spark
Hands-on use of Large Language Models (LLMs) in real-world projects, such as integrating generative AI solutions into applications, workflows, or analytics platforms
Familiarity with prompt engineering, model evaluation, and responsible AI practices
Experience collaborating with cross-functional teams to deploy and scale LLM-powered features is highly desirable
Preferred certifications: Epic Cogito, Clarity, Caboodle, Clinical Data Model.
Work location: Prefer a candidate in Houston, Texas or surrounding area.
LICENSES AND CERTIFICATIONS
Required: Epic certification. Must obtain at least one Epic Data Model certification (Clinical, Access, or Revenue) issued by Epic within 180 days.
OTHER REQUIREMENTS: Must pass pre-employment skills test as required and administered by Human Resources.
The University of Texas MD Anderson Cancer Center offers excellent retirement programs (https://www.utsystem.edu/offices/employee-benefits/ut-retirement-program/voluntary-retirement-programs), tuition benefits, educational opportunities, and individual and team recognition.
This position may be responsible for maintaining the security and integrity of critical infrastructure, as defined in Section 113.001(2) of the Texas Business and Commerce Code, and therefore may require routine reviews and screening. The ability to satisfy and maintain all requirements necessary to ensure the continued security and integrity of such infrastructure is a condition of hire and continued employment.
It is the policy of The University of Texas MD Anderson Cancer Center to provide equal employment opportunity without regard to race, color, religion, age, national origin, sex, gender, sexual orientation, gender identity/expression, disability, protected veteran status, genetic information, or any other basis protected by institutional policy or by federal, state, or local laws unless such distinction is required by law. http://www.mdanderson.org/about-us/legal-and-policy/legal-statements/eeo-affirmative-action.html
Additional Information
Requisition ID: 179587
Employment Status: Full-Time
Employee Status: Regular
Work Week: Days
Minimum Salary: US Dollar (USD) 106,500
Midpoint Salary: US Dollar (USD) 133,000
Maximum Salary: US Dollar (USD) 159,500
FLSA: exempt and not eligible for overtime pay
Fund Type: Hard
Work Location: Remote (within Texas only)
Pivotal Position: Yes
Referral Bonus Available?: No
Relocation Assistance Available?: No