Senior Platform SRE

  • Full Time Job
  • Hybrid
  • $140,000 - $160,000 NZD

This is a critical, strategic-hybrid role in which you will accelerate delivery across the engineering organization. You will co-own CI/CD and operability with product teams across our entire stack, while stewarding our AWS platform and Infrastructure-as-Code (IaC) to remove friction, improve reliability, and reduce time-to-production.

As our Senior Platform SRE, you will work hand in hand with product engineers, embedding reliability directly into application code. This role requires a hands-on, code-contributing approach to solve reliability challenges, ensuring our .NET/C# and Node.js codebases meet stringent Service Level Objectives (SLOs) and support progressive delivery.

What You’ll Be Doing (Your Impact)

Your primary focus will be on enhancing platform operability and application reliability by combining code contributions with deep infrastructure stewardship.

  • AWS Platform & IaC Stewardship: Manage the full lifecycle of our global AWS resources (VPC, EC2, RDS, S3) and master the Terraform codebase. This includes provisioning, monitoring, governance, and cost optimization (Savings Plans, right-sizing).
  • Embedded Reliability Engineering: Partner with product teams and contribute code changes directly into .NET (C#) and Node.js services. You’ll embed key capabilities for modern progressive delivery and resilience:
      • Implementing OpenTelemetry and structured logging for observability.
      • Engineering resilience patterns (retries, circuit breakers).
      • Enabling progressive delivery via feature flags and health probes.
  • CI/CD Acceleration: Manage and optimize pipelines for shipping production code to AWS using Azure DevOps. This includes integrating security and quality gates, supporting progressive delivery, and implementing health-based gates with automatic rollbacks.
  • Database Reliability (SQL Server): Steward SQL Server on RDS and EC2, focusing on performance diagnostics, tuning, backup/restore, recovery, and automating diagnostic workflows.
  • Reliability Operations: Participate in on-call rotations, lead infrastructure and database incidents, conduct blameless post-incident reviews, and report on key reliability metrics (MTTR, deployment frequency, SLO compliance).

What You’ll Bring (Your Expertise)

At Arlo, you’ll find a supportive team that trusts you to make an impact, gives you the freedom to grow, and the space to do your best work. We value clarity, grit, ownership, and curiosity, and we’re not afraid to challenge ideas to reach the best outcome. We work hard, move fast, and celebrate wins together.

  • SaaS & AWS Footprint: Required experience working in a SaaS environment with a global AWS footprint.
  • AWS expertise: Strong expertise across AWS compute, networking, storage, and databases.
  • IaC: Terraform mastery (modules, remote state + locking, drift detection, policy-as-code).
  • CI/CD & Containers: Experience with CI/CD for .NET (C#) and Node.js using Azure DevOps (or GitHub Actions), and Helm/Kustomize for containerized deployments.
  • Observability: Required experience with observability and incident tooling (Datadog, CloudWatch, OpenTelemetry, PagerDuty).
  • Databases: Experience with SQL Server on RDS/EC2 (performance tuning, backup/restore, recovery) is an advantage.

A Bit About Arlo

Arlo is a world-leading SaaS company on a mission to revolutionize professional training. With customers in over 70 countries, over 7.5M people trained, and $3B in course transactions, Arlo is loved by thousands of trainers and millions of learners worldwide. Arlo’s all-in-one training management platform handles everything from course creation and scheduling to delivery and operations.

We believe human connection is at the heart of great learning. By blending that philosophy with the latest in AI and elearning technology, Arlo helps training providers save time, grow revenue, and deliver exceptional learning at scale.

This role is a key part of Arlo’s next phase of growth: expanding globally, deepening our AI capabilities, and empowering a passionate community of training professionals who are shaping the future of learning.