Site Reliability Engineer II
Redmond, WA 
Posted 1 day ago
Job Description
Overview
Come build and maintain the world's computer as a member of the Microsoft Capacity Infrastructure Services team in Azure Core. The team ensures new servers are brought online (capacity buildout) to enable Azure customers to leverage the latest offerings, see the illusion of infinite capacity, and grow the Azure business efficiently at hyperscale.
As a Site Reliability Engineer II, you'll work with a breadth of partners across Microsoft, including developers in service teams, hardware engineers, network engineers, datacenter technicians, supply chain managers, and business leaders, to rapidly debug and resolve issues delaying this carefully orchestrated buildout sequence. You'll drive continuous improvements with these teams to prevent repeats and address common classes of issues across the Azure software stack through design reviews and problem management.
This opportunity will enable you to gain unparalleled system-wide knowledge of how the Azure cloud is built and maintained. The contacts you make with experts will enable you to deep dive on services and new technologies and partner for improvements. You'll be stretched to automate mitigations tactically and to analyze data strategically to identify problem areas for driving prioritization.
This role requires flexibility to hold virtual meetings and collaborate with partners worldwide.
Microsoft's mission is to empower every person and every organization on the planet to achieve more. As employees we come together with a growth mindset, innovate to empower others, and collaborate to realize our shared goals. Each day we build on our values of respect, integrity, and accountability to create a culture of inclusion where everyone can thrive at work and beyond.
Responsibilities
Participate in onboarding, code/design reviews, and regular meetings with the engineering teams that develop and manage products and services.
Independently develop code or scripts that automate the performance of repetitive and easily scalable operations processes.
Design, develop, and maintain telemetry pipelines and monitoring tools that detail operations metrics.
Analyze data and use data to drive improvements with engineering teams.
Respond to incidents during regular on-call rotations.
Other: Embody our culture and values.
Job Summary
Company
Start Date
As soon as possible
Employment Term and Type
Regular, Full Time
Required Experience
Open