Job Description:
• Own the Quality & Reliability Program: Define and drive the vision for quality across proactive practices (testing, deployment, observability), reactive processes (incident response, external communications), and cultural expectations (quality ownership, readiness).
• Lead Cross-Functional Programs: Drive reliability and quality initiatives across Engineering, Product, Operations, and Customer Success.
• Production Readiness: Own the Production Readiness Review (PRR) process; ensure all releases meet reliability standards before they go live.
• Define and Drive SLOs: Establish and track Service Level Objectives (SLOs). Build visibility into reliability metrics and lead efforts to meet or exceed targets.
• Improve Incident Management: Streamline incident response and postmortems. Drive structural improvements in tooling, communication, and ownership.
• Scale Tooling & Automation: Collaborate across teams to enhance observability, alerting, testing automation, and response tooling.
• Mitigate System Risk: Identify risk vectors early, build mitigation plans, and drive resolution with urgency.
• Drive Alignment: Influence across Eng, Product, Ops, and GTM teams to prioritize reliability and integrate quality into every initiative.
• Track Progress: Use tools like Atlas, Jira, and internal dashboards to maintain clarity on goals, risks, and outcomes.
• Embed Continuous Learning: Build programs that ensure we learn from every incident, test edge cases, and continuously harden our systems.
Requirements:
• 8+ years of program management experience, with at least 3 years in technical, reliability, or quality-focused domains.
• Strong understanding of system architecture, distributed systems, and reliability engineering principles.
• Familiarity with SDLC models, CI/CD pipelines, deployment automation, observability, and incident management tooling.
• Demonstrated success defining and improving SLOs, SLIs, and production readiness processes.
• Proven ability to lead large-scale, cross-functional programs across Engineering, Product, Operations, and Customer Success.
• Skilled at translating complex technical goals into clear, actionable, and measurable outcomes.
• Experienced in using Atlassian tools (e.g., Jira, Atlas) for program tracking, reporting, and executive communication.
• Adept at navigating ambiguity, building alignment, and driving decision-making without formal authority.
• Comfortable balancing technical depth with business priorities to influence outcomes.
• Bachelor’s degree in Computer Science, Engineering, or related technical field, or equivalent practical experience.
• Bonus: Experience in regulated or high-availability industries such as fintech, healthcare, or infrastructure.
Benefits:
• Base salary per year (paid semi-monthly)
• Stock options with standard startup vesting: 1-year cliff, 4 years total
• $50 monthly communication stipend toward your phone/internet bill
• $250 stipend to enhance your WFH setup
• Reimbursement for peripheral equipment: monitor (up to $400), keyboard and mouse (up to $200)
• Premium medical benefits including vision and dental (100% coverage for employees)
• Company-sponsored life and disability insurance
• Paid parental bonding leave
• Paid sick leave, jury duty leave, and bereavement leave
• 401(k) plan
• Flexible Time Off (our team members typically take ~3-4 weeks off per year)
• Volunteer Time Off
• 13 scheduled holidays
• In-person team meet-ups twice a year