Site Reliability Engineer at Harbor Compliance

Work type: Remote

Location: Remote (United States)

Salary: $110,761 – $158,230/yr

Type: Full-time

Summary

**Who this is for**

This role is for a mid-to-senior level Site Reliability Engineer who enjoys architecting resilient infrastructure and leading technical strategy in a high-growth environment. It is ideal for someone who thrives on automation and wants to take full ownership of system performance and observability.

**Key highlights**

You will be a technical leader at Harbor Compliance, responsible for managing mission-critical Linux infrastructure and containerized workloads. The role focuses on proactive design, CI/CD optimization, and maintaining high system availability through advanced observability and incident response.

**You might be a good fit if you...**

- Have 4–7 years of experience building and maintaining modern, scalable infrastructure.
- Possess expert-level skills in Kubernetes, Helm, and infrastructure-as-code tools like Terraform or Ansible.
- Are proficient in managing MySQL databases and troubleshooting Linux-based production environments.
- Have a strong background in scripting (Python) and designing secure, high-performance cloud networking strategies.

Job Description

About Harbor Compliance

Join Harbor Compliance, a leader in the compliance industry recognized by the Inc. 5000 and the Deloitte Technology Fast 500. Since merging with Labyrinth, Inc. in 2021, we have expanded our reach to more than 35,000 clients, using advanced technology to simplify business licensing and legal entity management. We’re a growing team, passionate about making compliance accessible and efficient for all businesses and nonprofits.

The Site Reliability Engineer is a senior-level technical leader responsible for the proactive design, implementation, and predictable management of our business-critical Linux infrastructure. You will collaborate cross-functionally with Software Development and technical stakeholders to execute resilient infrastructure strategies that support high-growth business goals. Success in this role means delivering scalable technical solutions and consistently maintaining exceptional system performance and reliability.

Key Responsibilities:

Design and execute a comprehensive infrastructure strategy that proactively supports evolving business requirements and operational excellence.

Own the predictable delivery of high-complexity technical solutions through deep automation using Kubernetes and sophisticated CI/CD pipelines.

Maintain superior portal availability and system health by implementing advanced observability and distributed tracing strategies.

Lead high-severity incident response efforts and drive systemic improvements through insightful, blameless postmortem analysis.

Architect failure-resilient and self-healing infrastructure systems to ensure continuous operational stability and zero data loss.

Serve as the internal subject matter expert to influence software architecture decisions toward maximum scalability and performance.

Facilitate regular knowledge-sharing and training sessions to elevate technical standards and process predictability across the entire technology department.

Direct security initiatives and design secure networking strategies to maintain a high-standard protection framework for all client data and assets.

Requirements:

4–7 years of professional experience building and managing resilient, modern infrastructure within a fast-paced environment.

Expert-level proficiency in managing and troubleshooting Linux-based servers across multiple distributions.

Advanced capability in developing modular, reusable infrastructure templates using tools such as Terraform and Ansible.

Proven success in managing containerized workloads at scale using Kubernetes and Helm.

Extensive experience configuring and optimizing high-performance database environments, specifically MySQL.

Demonstrated ability to build robust, secure CI/CD deployment pipelines that include automated rollback and quality gates.

Strong technical documentation skills, including the creation of architectural diagrams, detailed specifications, and operational playbooks.

Ability to lead cross-functional projects independently while mentoring junior engineers and driving team-wide initiatives.

Skills and Knowledge:

Deep understanding of observability platforms such as New Relic, Datadog, or Prometheus to measure and improve system reliability.

Expertise in designing secure cloud networking strategies including firewalls, VPNs, and identity management best practices.

Advanced scripting and programming proficiency in Python or similar languages to automate complex operational workflows.

Strategic insight into infrastructure ROI and the ability to align technical roadmaps with broad business priorities.

Practical knowledge of disaster recovery planning and the execution of failure-resilient system designs.
