Generally, the goals of our oncall rotation are to:

  1. Keep always-on services like Warp Drive reliable – outages are resolved quickly and effectively, and reliability is maintained or improved over time (we tune alerts, build in robustness, and fix operational bugs).
  2. Maintain a healthy engineering culture – oncall is a fantastic way to show engineers how their service behaves in practice and to guide reliability work. It can also burn people out, though, especially as oncall load creeps up over time.

Being oncall isn’t the only way that Warp engineers contribute to operational work. We also write and update documentation, help answer questions from users, fix bugs, and resolve product quality and reliability issues.

There are also oncall patterns, particularly at larger companies, that don’t currently make sense for Warp. In particular:

Composition

Warp has two oncall rotations: one for the client terminal, and one for our server. Each has 6-10 engineers, which is enough that nobody is oncall too often, but few enough that nobody loses context between shifts. For the most part, engineers are in the oncall rotation that lines up with the main focus of their everyday work. That’s not a hard requirement, however: engineers working in other areas are welcome to join an oncall rotation to gain experience in a new domain.

In each rotation, there’s always a primary and a secondary oncall. The secondary oncall is there for escalation and coverage (for example, if the primary oncall misses an alert or has to run out for an appointment), but they may also assist the primary oncall. Engineers spend one week as the secondary oncall, followed by one week as the primary oncall. This ordering is meant to streamline handoff, since engineers can get up to speed on current operational issues before taking over as primary.
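The secondary-then-primary pattern can be sketched as a simple round-robin. This is only an illustration, not Warp's actual scheduling tooling: the function name build_schedule, the engineer names, and the start date are all hypothetical, and it assumes shifts change weekly on a fixed day.

```python
from datetime import date, timedelta


def build_schedule(engineers, start, weeks):
    """Return a list of (week_start, primary, secondary) tuples.

    Hypothetical helper: whoever is secondary in one week becomes
    primary in the following week, cycling through the rotation.
    """
    n = len(engineers)
    if n < 2:
        raise ValueError("a rotation needs at least two engineers")
    schedule = []
    for week in range(weeks):
        primary = engineers[week % n]
        # Next person in line shadows this week, then takes over.
        secondary = engineers[(week + 1) % n]
        schedule.append((start + timedelta(weeks=week), primary, secondary))
    return schedule


# Illustrative six-person rotation (names and date are made up).
rotation = ["ana", "ben", "chen", "dee", "eli", "fay"]
for week_start, primary, secondary in build_schedule(rotation, date(2024, 1, 1), 4):
    print(week_start.isoformat(), "primary:", primary, "secondary:", secondary)
```

With a rotation of n engineers, each person is on call (as either secondary or primary) for two consecutive weeks out of every n, which is one way to see the trade-off between oncall frequency and retained context.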

On/offboarding

Given Warp’s emphasis on engineering mobility, it’s expected that engineers will move between oncall rotations fairly frequently. For example, if someone wraps up a server-side project and starts focusing more on the client, they might transition over to the client oncall rotation.

In addition, it’s important to have a routine for adding new Warpers, so that everyone gets involved.

Responsibilities

The exact responsibilities will change over time, but in general, the oncall: