Data Center Engineering Operations Principal Engineer
At Amazon, we're working to be the most customer-centric company on Earth. To get there, we need exceptionally talented, bright, and driven people.
Amazon’s AWS Data Centers are a major component of worldwide cloud-computing infrastructure and are industry-leading examples of mission-critical facilities. The pace, scale, and growth of Amazon’s data center platform will challenge you. Our uncompromisingly high standards as we continually seek to innovate on behalf of our customers will push you to new levels of creativity.
The Infrastructure Operations organization is looking for an extraordinary individual with proven and tested technical acumen and leadership skills to lead our Data Center Engineering Operations (DCEO) in the PDX - Oregon region, one of the largest regions in AWS. The position will help ensure overall availability and reliability to meet or exceed defined service levels of data center operations.
A DCEO Principal Engineer is one of our most senior leadership roles in the data center environment. As engineers at Amazon, we are responsible for achieving world-class uptime for our customers. In this role you will lead overall infrastructure management (e.g. electrical, mechanical, and HVAC systems) and be accountable for delivering operational excellence in safety, security, and availability across an expanding portfolio of AWS data centers. The role requires a highly driven, self-managed individual who demonstrates initiative and proactively seeks innovative solutions to highly complex technical problems. This role reports directly to the Director of Infrastructure Operations.
The DCEO Principal Engineer is a leader and evangelist, working with and leading cross-functional teams which include: technical program management, mechanical engineering, electrical engineering, control engineering, supply chain management, operations, end users as well as safety and security teams.
The ideal candidate should be able to operate comfortably at both a strategic and a tactical level. The role requires the individual to see the big picture and to guide, direct, educate, and develop members of the team, while at times shifting to hands-on troubleshooting of electrical and mechanical equipment. They will maintain, operate, and troubleshoot mission-critical data center facility equipment, including electrical support equipment such as stand-by diesel generators and related fuel systems, and three-phase electrical systems that include, but are not limited to, switchgear, transfer controllers, UPS units, PDUs, battery technologies, and associated systems. Mechanical equipment includes CRAC units, centrifugal chillers, EVAP, cooling towers/water chemical system, air handlers and associated systems, pumps, and motors. Additional support equipment includes fire suppression systems, building automation systems, and general facilities equipment.
Drive a safety-first culture that is underpinned by Amazon Leadership Principles and the utmost regard for adherence to standard operating procedures, operational excellence, and continuous improvement mindset;
Be a leader within the Data Center Infrastructure engineering staff across the region, including engineering technicians, chief engineers, facility and area managers;
Partner with senior leaders and DCEO cluster managers to drive prevention of Large Scale Events and Critical Site Events. Review and advise on technical work, support event management, and help lead the strategic direction and improvement of our DCEO organization;
Work with construction and commissioning teams to properly test and validate installation, operation, and performance of mechanical, electrical and plumbing (MEP) systems.
Maintain existing IT, mechanical, and electrical infrastructure as well as pioneer new technology and processes. Work with internal teams to understand design requirements, solicit customer input, and drive innovative infrastructure solutions within business constraints;
For issues that arise in data centers, provide support and guidance to the teams involved in root-cause analysis, mitigation and remediation strategies, and sharing lessons learned with global partners, and support long-term corrective measures, including reporting and documentation.
Act as a mentor and subject matter expert to operations staff through one-on-one coaching and mentoring.
Actively oversee and plan infrastructure goals for key initiatives and metrics that support service availability and Data Center management.
Work independently and hold frequent reviews with business leadership, other Principal Engineers, and Senior Engineers across AWS.
Communicate clearly and effectively at the appropriate level of detail with people from a wide range of technical abilities and backgrounds.
Have fun while offering creative, out-of-the-box solutions.
Amazon is an Equal Opportunity-Affirmative Action Employer – Minority / Female / Disability / Veteran / Gender Identity / Sexual Orientation.
· Bachelor’s degree in mechanical or electrical engineering or a related field.
· Licensed Professional Engineer.
· 10+ years of experience in reliability, maintenance, and engineering functions with mission critical facilities.
· Must have comprehensive knowledge of typical datacenter designs/architectures, and be familiar with industry design trends.
· Leadership and problem-solving skills; a motivated, highly dependable individual.
· An ability and willingness to think outside of the box to find creative and innovative solutions to reduce costs with no impact on quality, reliability, or maintainability.
· An ability to deliver results in highly complex, technically challenging situations where collaboration across multiple teams is required.
· Exceptional verbal and written communication skills, attention to detail, and the highest regard for safety and quality standards.
· Well versed in relevant national and international safety and building compliance codes and standards, including UL, IEC, ISO, IBC, NEC/NFPA Life Safety, and OSHA.
· Master’s degree in mechanical or electrical engineering or a related field.
· Have a track record of innovative and creative thinking.
· Experience with large-scale technical operations or large-scale compute facilities.
· Ability to perform complex business case analysis to justify technical decisions and present the justification to management in a high-level review.
· Detailed understanding of mechanical and electrical system control strategies and implementation.
· Direct experience with chiller/evap systems, cooling towers and factory testing protocols.
· Direct experience with medium-voltage (MV) and high-voltage (HV) distribution system design.
· Basic understanding of enterprise hardware and network architectures.
· Experience leading teams, either directly or indirectly. Ability to influence and build collaborative teams.