Site Reliability Engineer II

Backblaze External Website · Remote - Bangalore
// classified as
Other (Adjacent or hard to classify.)
posted
1d ago
location
Remote - Bangalore
languages
bash, go, python
tools
aws, docker, grafana
> stack
bash, go, python, aws, docker, grafana, kubernetes, terraform
> description
<p><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>About Backblaze</strong><br><br>Backblaze is the object storage leader in the open cloud movement, fueling customer success with cloud storage built purposefully to unlock budgets, unburden administrators, and unleash innovators. Together with our partners, we’re helping customers break free from the restrictive, overpriced legacy solutions that hold them back, and blaze forward with the full power of the open cloud in their hands.</span></p> <p><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Founded in 2007, we scaled the business with less than $3 million in outside funding until 2021, when we did a traditional IPO on the Nasdaq stock exchange. Today, Backblaze generates over $100m in revenue and is the leading specialized storage cloud - managing over three billion gigabytes of data storage for 500K+ customers in 175+ countries, including businesses, developers, IT professionals, and individuals.</span></p> <h2><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>About the Role</strong></span></h2> <p><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">We are seeking a <strong>Site Reliability Engineer II (SRE II)</strong> to help ensure the stability, scalability, and reliability of our services and infrastructure. This role focuses on building automation, maintaining observability, and supporting incident response to keep customer-facing systems performing at their best. 
The SRE will collaborate with engineering, product, and operations teams to embed reliability practices into day-to-day development and operations while contributing to tools and processes that improve efficiency and reduce manual effort.</span></p> <h2><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Key Responsibilities</strong></span></h2> <h3><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Service Reliability &amp; Operations</strong></span></h3> <ul> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Support the availability and durability of critical services across production environments.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Monitor service health using SLIs, SLOs, and error budgets, and escalate issues when thresholds are at risk.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Participate in on-call rotations, incident response, and post-incident reviews to drive service improvements.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Follow established ITIL/OSS processes (incident, change, problem, and capacity management).</span></li> </ul> <h3><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Automation &amp; Tooling</strong></span></h3> <ul> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Develop automation for common operational tasks, reducing manual intervention and toil.</span></li> <li style="font-family: helvetica, arial, 
sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Contribute to monitoring, logging, and alerting frameworks (e.g., Prometheus, Grafana, Catchpoint, ELK).</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Work with CI/CD pipelines, configuration management, and infrastructure-as-code tools (Terraform, Ansible, Jenkins).</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Write scripts (Bash, Python, Go, etc.) to improve system reliability and efficiency.</span></li> </ul> <h3><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Collaboration</strong></span></h3> <ul> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Partner with engineering, product, and operations teams to support resilient system design and operations.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Assist in capacity planning and disaster recovery exercises.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Work with vendors and service providers to troubleshoot service issues and track SLA performance.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Document systems, share learnings, and help grow a reliability-minded engineering culture.</span></li> </ul> <h3><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Continuous Improvement</strong></span></h3> <ul> 
<li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Contribute to playbooks, runbooks, and operational documentation.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Identify recurring issues and propose long-term improvements.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Promote reliability-focused practices within development and operations teams.</span></li> </ul> <h2><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Qualifications</strong></span></h2> <h3><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Education &amp; Experience</strong></span></h3> <ul> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Bachelor’s degree in Computer Science, Engineering, or related field (or equivalent experience).</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">2–4 years of experience in site reliability, systems engineering, or operations.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Exposure to large-scale, production-grade systems.</span></li> </ul> <h3><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Technical Skills</strong></span></h3> <ul> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Solid Linux systems administration and troubleshooting 
skills.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Familiarity with service reliability concepts: monitoring, alerting, incident response, and root cause analysis.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Proficiency in at least one scripting language (Python, Bash, or Go).</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Understanding of containers (Kubernetes, Docker) and microservices concepts.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Knowledge of incident response and operational best practices.</span></li> </ul> <h3><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><strong>Preferred Attributes</strong></span></h3> <ul> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Experience in a SaaS, service provider, or distributed systems environment.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Familiarity with ITIL/OSS practices and SLOs/SLAs.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Strong problem-solving skills and willingness to learn new technologies.</span></li> <li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Experience with cloud platforms (AWS, GCP, or Azure).</span></li> 
<li style="font-family: helvetica, arial, sans-serif; font-size: 12pt;"><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">Ability to work independently, take ownership, and drive projects from problem discovery through resolution.</span></li> </ul> <p><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">At this point, we hope you're feeling excited about the job description you're reading. Even if you don't meet every requirement, we still encourage you to apply. Learning, developing, and growing are key parts of our culture. We're eager to meet people who believe in our mission and can contribute to our team in various ways. We want people to feel comfortable expressing their true selves and to come, stay, and do their best work here.</span><br><br><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">At Backblaze, we value being fair and good to our customers, partners, and employees. That’s why diversity, equity, and inclusion are at the core of our values. We are committed to fostering a workforce where all employees feel a sense of belonging regardless of race, ethnicity, nationality, gender, sexual orientation, age, religion, socio-economic status, ability, veteran status, and education. We believe that our dedication to cultivating a diverse workspace not only allows us to better serve our customers in over 175 countries, but further reinforces our commitment to doing the right thing. <strong>We are proud to be an Equal Opportunity Employer.</strong></span></p> <p><span style="font-family: helvetica, arial, sans-serif; font-size: 12pt;">To understand more about the data we collect and process as part of your application, please view our <a href="https://cdn.prod.website-files.com/63d32de856f6323a43a277f2/64b0660cd90ac9b4953f7f1d_Backblaze_HR_Employee_Related_Privacy_Notice.pdf">Backblaze Employee Privacy Notice.</a></span></p>