Description
Lead reverse manufacturing engineering operations for AI servers and systems based on Trainium chips across geographically distributed reverse logistics sites, ODMs and CMs. You will be part of the Manufacturing, Quality and Reliability Team in AWS Annapurna Labs, which focuses on Machine Learning products and designs cutting-edge AI platforms for the world's largest Cloud Services provider.
AWS Utility Computing (UC) provides product innovations - from foundational services such as Amazon's Simple Storage Service (S3) and Amazon Elastic Compute Cloud (EC2), to consistently released new product innovations that continue to set AWS's services and features apart in the industry. As a member of the UC organization, you'll support the development and management of Compute, Database, Storage, Internet of Things (IoT), Platform, and Productivity Apps services in AWS, including support for customers who require specialized security solutions for their cloud services.
Within AWS, the Annapurna Labs team is building the next generation of cloud server infrastructure. Our success depends on delivering world-class server infrastructure; we're handling massive scale and rapid integration of emergent technologies. Our servers include accelerators such as AWS Trainium and AWS Inferentia, which are machine learning products designed to deliver high performance at low cost.
The Trainium Manufacturing, Quality and Reliability Team is part of AWS Annapurna Labs and focuses on Machine Learning products, designing cutting-edge AI platforms for the world's largest Cloud Services provider. We are seeking a talented and motivated Manufacturing Engineer with a proven track record of implementing best-in-class test techniques and processes within a complex supply chain. As a member of the Cloud-Scale Machine Learning Acceleration team, you will be the interface between the system engineering team and the ODM and CM partners.
As a Senior Manufacturing Engineer, you will engage with an experienced cross-disciplinary staff to own and drive datacenter reverse logistics and ODM RMA operations as a unified program, with program ownership for reverse manufacturing operations equivalent to that of forward manufacturing. You will work closely with an internal interdisciplinary team and outside partners to drive key aspects of failure triage, manufacturing retest, test infrastructure and execution. A successful candidate will be responsive, flexible and able to succeed within an open, collaborative peer environment. You will:
Be responsible for the failure triage and retest of servers and components that have failed during forward manufacturing or in the datacenter.
Drive manufacturing process improvements to address reliability issues and concerns.
Lead the identification and validation of product/component risks, work with design teams to mitigate them, and define the test methodology and test coverage to improve product quality.
Establish and maintain retest capacity, infrastructure and requirements across datacenter reverse logistics and ODM sites.
Provide technical leadership and mentor engineers.
Work with multiple vendors and ODMs to standardize component manufacturing and reliability expectations.
The successful candidate will be capable of making wide-ranging business decisions on behalf of the organization and willing to "roll up sleeves and do what needs to get done" to consistently deliver results. We're changing an industry, and we want individuals who are ready for this challenge.
Key job responsibilities
Manage warehouse inventory tracking, including card locations and ownership assignments for internal tracking
Oversee spare parts inventory for testing equipment deployed at ODMs
Review test logs for customer-returned cards and conduct technical assessments for RMA acceptance
Generate comprehensive 8D reports while monitoring manufacturing and PCB revision changes to communicate the resolution of identified issues to the customer
Analyze failure analysis (FA) reports for RMA FA requests, correlating data with yield metrics to identify patterns across component vendors, manufacturing sites, and individual testers
Track RMA cases from creation through final closure
Provide disposition path for unique customer returns based on failure mode/mechanism and rework process
Coordinate with manufacturing teams on revision changes and issue resolution
Conduct and coordinate reliability validation for reworked components on the product, validating the reliability of the rework process in terms of solderability and ensuring no unintended consequences
Evaluate, investigate and introduce new manufacturing technology and methodology to enhance product quality and production efficiency at ODM and CM
Develop or adapt manufacturing processes at the ODM and CM, including defining fixture requirements, critical assembly requirements, test methodology, and signal integrity, power and heat management requirements
Work with engineering teams to clearly represent processes and reviews to enable smooth New Product Introduction and changes
Support cost reduction and sustaining activities
About the team
Annapurna Labs is a wholly owned subsidiary of AWS, focused on developing custom silicon and servers including the Nitro(K2), Graviton, Inferentia, and Trainium families of processors.
Machine Learning Annapurna functions as a vertically integrated team including software, firmware, hardware, and silicon design in a single organization.
We are the Trainium Servers and Systems organization under MLA, focused on Hardware Development, Software Development, Fleet Ops Systems, and Manufacturing, Quality, and Reliability.
This position is in the Manufacturing, Quality and Reliability team.
Basic Qualifications
Preferred Qualifications
Experience with server, storage, networking, or large-scale distributed systems
Experience communicating clearly and concisely with leadership, stakeholders, and cross-functional teams
Amazon is an equal opportunity employer and does not discriminate on the basis of protected veteran status, disability, or other legally protected status.
Our inclusive culture empowers Amazonians to deliver the best results for our customers. If you have a disability and need a workplace accommodation or adjustment during the application and hiring process, including support for the interview or onboarding process, please visit https://amazon.jobs/content/en/how-we-hire/accommodations for more information. If the country/region you're applying in isn't listed, please contact your Recruiting Partner.
Our compensation reflects the cost of labor across several US geographic markets. The base pay for this position ranges from $143,300/year in our lowest geographic market up to $257,300/year in our highest geographic market. Pay is based on a number of factors including market location and may vary depending on job-related knowledge, skills, and experience. Amazon is a total compensation company. Dependent on the position offered, equity, sign-on payments, and other forms of compensation may be provided as part of a total compensation package, in addition to a full range of medical, financial, and/or other benefits. For more information, please visit https://www.aboutamazon.com/workplace/employee-benefits . This position will remain posted until filled. Applicants should apply via our internal or external career site.