> job detail
Site Reliability Engineer
Microsoft · Redmond, WA,US; Reston, VA,US
// classified as
Other (Adjacent or hard to classify.)
posted
2d ago
location
Redmond, WA,US; Reston, VA,US
languages
bash, c, java
tools
aws, azure, docker
> stack
bash, c, java, python, aws, azure, docker, hadoop, kubernetes, spark, terraform
> education
doctorate
> description
Write secure, high-quality code that is maintainable, scalable, and performant. Architect, implement, and optimize hybrid and cloud infrastructure using Infrastructure as Code (e.g., containers, Bicep, Terraform, AKS) to improve availability, scale, security, and operational efficiency. Design and implement data governance, storage, backup, and disaster recovery for a multi-petabyte Azure environment, ensuring integrity, security, and performance. Build and operate large-scale data pipelines and data transformations to support analytics, governance, and operational needs.
Evaluate emerging engineering tools and practices and incorporate them into the roadmap to continuously improve efficiency, reliability, and scale. Deliver automation to improve service health, manageability, reliability, telemetry, and alerting, with a focus on resiliency. Identify opportunities for automation using scripts, pipelines, policy-driven guardrails, or AI-enabled tooling to reduce manual toil and increase engineering productivity.
Create and maintain clear technical documentation and design specifications aligned with best practices. Partner with engineering, project management, and operations to evolve services and optimize infrastructure in support of organizational goals. Participate in an on-call rotation to operate live services; troubleshoot and mitigate complex issues, escalate as needed, and write post-incident reviews to share learnings.
Master's Degree in Computer Science, Information Technology, or related field AND 2+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 4+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.
These requirements include, but are not limited to, the following specialized security screenings: The successful candidate must have an active U.S. Government Top Secret Clearance with access to Sensitive Compartmented Information (SCI) based on a Single Scope Background Investigation (SSBI) with Polygraph. Failure to maintain or obtain the appropriate U.S. Government clearance and/or customer screening requirements may result in employment action up to and including termination.
Clearance Verification: This position requires successful verification of the stated security clearance to meet federal government customer requirements. You will be asked to provide clearance verification information prior to an offer of employment.
Citizenship & Citizenship Verification: This position requires verification of U.S. citizenship due to citizenship-based legal restrictions. Specifically, this position supports U.S. federal, state, and/or local government agency customers and is subject to certain citizenship-based restrictions where required or permitted by applicable law. To meet this legal requirement, citizenship will be verified via a valid passport, other approved documents, or a verified U.S. Government clearance.
Doctorate Degree in Computer Science, Information Technology, or related field AND 3+ years technical experience in software engineering, network engineering, or systems administration OR Master's Degree in Computer Science, Information Technology, or related field AND 6+ years technical experience in software engineering, network engineering, or systems administration OR Bachelor's Degree in Computer Science, Information Technology, or related field AND 8+ years technical experience in software engineering, network engineering, or systems administration OR equivalent experience.
4+ years of experience building, deploying, and operating containerized applications and infrastructure as code (e.g., Docker, Kubernetes, Azure Container Apps/AKS/ACI, Terraform, Azure Bicep, ARM templates). 4+ years of experience writing and maintaining scripts for deployment, orchestration, and automation (e.g., PowerShell, Python, Bash). Experience working with large datasets, data pipelines, and data transformation patterns (batch and/or streaming). Experience with one or more major cloud platforms (Azure, AWS, or Google Cloud). Hands-on experience with Azure services and infrastructure (e.g., ARM templates, IaaS, VMs, Key Vault, Event Hubs, Synapse, Spark/Hadoop), or equivalent services in AWS or Google Cloud. Familiarity with data pipeline and transformation tooling (e.g., Spark, Hadoop) and operating at scale. Familiarity with petabyte-scale datasets and building reliable data pipelines and transformations that support mission-critical services. Proficiency in at least one programming language (e.g., C# or Java) and scripting languages such as PowerShell, Bash, and Python.