The Senior Site Reliability and Security Engineer is responsible for ensuring the reliability, observability, and security posture of the QTS OS and SDP platforms deployed on AWS.
This role combines deep technical expertise in cloud operations and application-level security with leadership in incident response, production monitoring, and proactive threat detection.
The engineer will work closely with DevOps, backend, and application development teams to identify risks, resolve incidents, and drive continuous improvement in availability and security.
RESPONSIBILITIES
Monitor and analyze production environments for reliability, performance, and security risks across QTS OS and SDP platforms.
Lead troubleshooting of production incidents, guiding development teams to identify root causes and implement permanent fixes.
Collaborate with engineering teams to design and maintain robust observability practices (metrics, logs, traces, and alerts) using AWS CloudWatch and related tools.
Identify and mitigate security threats across cloud infrastructure (IAM, Security Groups, VPC/PrivateLink, WAF) and application layers (API Gateway, Lambda, ECS).
Perform reviews of AWS IAM policies, roles, and network security configurations to detect privilege escalation or exposure risks.
Participate in and facilitate threat-modeling sessions with development teams.
Analyze code changes and third-party dependencies to ensure alignment with internal security and compliance policies.
Partner with developers to design secure patterns for integrating new AWS services and frameworks.
Support incident response, participate in the on-call rotation, and contribute to post-incident root cause analysis (RCA) documentation and follow-up actions.
Review service metrics, dashboards, and alarms to ensure coverage for critical user paths and backend systems.
Recommend and implement process improvements in monitoring, alerting, and escalation workflows.
Work with DevOps and architecture teams to assess the impact of new deployments on reliability and security posture.
Contribute to internal standards and best practices for secure and resilient system design.
Document technical findings, detection strategies, and mitigations for recurring risks.
TECHNICAL EXPERTISE
AWS Services: ECS/Fargate, IAM, Security Groups, VPC/PrivateLink, WAF, Lambda, CloudWatch, S3, SQS/SNS, API Gateway, CloudFront.
Languages: Python, Java, TypeScript, with a focus on scripting, automation, and code reviews for secure patterns.
Infrastructure & Tools: Terraform (IaC reviews and security validation), GitHub Actions (CI/CD observability).
Monitoring & Observability: AWS CloudWatch, CloudTrail, metrics/alarms configuration, log correlation, anomaly detection.
Security Practices: Threat modeling, SCA (Software Composition Analysis), IAM least-privilege design, network isolation, runtime behavior analysis.
Incident Management: Root cause analysis, mitigation design, documentation, and coordination during live incidents.
Preferred familiarity: Snyk, CodeQL, AWS Config, or similar tools for vulnerability management.
BASIC QUALIFICATIONS
Bachelor's degree in Computer Science, Engineering, or a related field.
Due to the nature of systems and data supported, U.S. citizenship is required for this position.
5+ years of experience supporting production systems in AWS, focusing on reliability or cloud security.
Strong understanding of AWS networking, IAM, and monitoring services.
Proven ability to guide teams during incident response and troubleshooting.
Demonstrated experience detecting and resolving infrastructure or application security risks.
Excellent diagnostic and analytical skills for complex distributed systems.
PREFERRED QUALIFICATIONS
Hands-on experience securing and monitoring large-scale AWS microservice deployments (ECS/Fargate).
Familiarity with automated dependency scanning, SCA tools, and security compliance monitoring.
Experience facilitating threat modeling sessions and implementing mitigation strategies.
Working knowledge of software development in Python, Java, or TypeScript.
Experience collaborating with DevOps and application teams to enforce secure SDLC practices.
Solid understanding of observability patterns, alert tuning, and SLO/SLA-driven reliability.
KNOWLEDGE, SKILLS, AND ABILITIES
Strong interpersonal skills for collaboration with engineering and operations teams at all levels.
Ability to balance security rigor with operational pragmatism and delivery timelines.
Excellent written and verbal communication for documenting incidents, standards, and remediation steps.
Capable of working independently and leading technical investigations of varying complexity.
Demonstrated ownership mindset for system health, reliability, and security posture.
Willingness to participate in a limited on-call rotation supporting production environments.
TOTAL REWARDS
This role is also eligible for a competitive benefits package that includes: medical, dental, vision, life, and disability insurance; 401(k) retirement plan; flexible spending and HSA accounts; paid holidays; paid time off; paid volunteer days; employee assistance program; tuition assistance; parental leave; military leave assistance; QTS scholarship for dependents; wellness program; and other company benefits.
This position is bonus eligible.
We conform to all laws, statutes, and regulations concerning equal employment opportunities and affirmative action. We strongly encourage women, minorities, individuals with disabilities, and veterans to apply to all of our job openings. We are an equal opportunity employer, and all qualified applicants will receive consideration for employment without regard to race, color, religion, gender, sexual orientation, gender identity, national origin, age, disability status, genetic information and testing, family and medical leave, protected veteran status, or any other characteristic protected by law. We prohibit retaliation against individuals who bring forth any complaint, orally or in writing, to the employer or the government, or against any individuals who assist or participate in the investigation of any complaint or discrimination claim.
The "Know Your Rights" Poster is included here:
Know Your Rights (English) (http://www.eeoc.gov/sites/default/files/2022-10/22-088_EEOC_KnowYourRights_10_20.pdf)
Know Your Rights (Spanish)
The pay transparency policy is available here:
Pay Transparency Nondiscrimination Poster-Formatted (https://www.dol.gov/sites/dolgov/files/OFCCP/pdf/pay-transp_%20English_formattedESQA508c.pdf)
QTS is committed to working with and providing reasonable accommodations to individuals with disabilities. If you need a reasonable accommodation because of a disability for any part of the employment process, please send an e-mail to talentacquisition@qtsdatacenters.com and let us know the nature of your request and your contact information.
It's exhilarating to find yourself at a pivotal moment in history, and even more so to be leading the way. At QTS Data Centers, we are proud to stand at the forefront of today's dynamic digital transformation. Our world-class data centers empower our customers' most strategic growth initiatives, positioning us as a global leader in digital infrastructure.
As AI and cloud technologies fuel the demand for increased speed, capacity, and innovation, QTS has emerged as the global digital infrastructure leader. We are committed to connecting the globe for good. Driven by purpose and a spirit of innovation, we design, build, and operate some of the most advanced data centers worldwide. In addition to our cutting-edge technology, we are dedicated to sustainability, incorporating renewable energy solutions to minimize our environmental footprint and drive meaningful impact. As a proud portfolio company of Blackstone, QTS is uniquely positioned to achieve ambitious growth and innovation goals.
At QTS, we are Powered by People. Our team members are the cornerstone of our culture, innovation, and growth. They are mission-driven, resourceful, and committed to making a positive impact in the communities where we live and work. Together, we're achieving remarkable things and shaping the future of digital infrastructure.
And we'd like to invite you to join us.
In addition to a variety of benefit packages, QTS goes above and beyond for our employees:
Roth and Traditional 401(k) matching contributions with immediate vesting
Every employee is bonus or commission eligible
Generous PTO, Paid Volunteer Days, and Floating Holidays
Stock Purchase Plan (SPP)
11 Paid Holidays Annually, with Holiday Compensation When Worked
Pet and Legal Insurance
Q-Rest Sabbatical Program
Q-Anniversary Service Award Program
Parental Leave for primary and secondary caregivers
Military Benefits Package
QTS Charitable Matching Gift Program
QTS Scholarship for Employee Dependents
QTS Crisis Fund
Wellness Program
Tuition Reimbursement Program