We had the honor of sitting down with Elena Tatarchenko, Engineering Manager at Brex, to talk about what DevOps means to her, how she hires for those roles, and practical advice for other managers who are looking for DevOps talent. Joining Elena today are Mike Robbins, Chandra Bergmann, Mina Zivkovic, and Elliott Jin from various teams across Triplebyte.
Mina: How do you define DevOps?
Elena: DevOps is the art of deploying software changes quickly, safely, and securely.
Chandra: That makes sense. More specifically, how does a DevOps person do that (deploy software changes quickly, safely, and securely)?
Elena: So for us, the way we have it structured is there's a foundational team that is responsible for a lot of the bottom layer. That means managing the Kubernetes cluster and figuring out how code goes from GitHub all the way to running in that Kubernetes cluster. We're also the ones looking at any security vulnerabilities in libraries and patching code or images.
The speed of the deployment also really matters to engineers, because that's how they know when their code is live and working the way they want. The idea is to remove as much context switching as we can, so it's great when they're able to go from a PR to it being live quickly. That way they don't lose focus.
And so there's a centralized team since a lot of that would be reused by every single service that's out there, so we've combined it into a single place that can build tools that are relevant for every single team.
Mina: Why do you think DevOps is becoming more prevalent?
Elena: Agile methodologies have pushed us to release smaller features and iterate as we go. It’s somewhat useless for engineers to push out small features if they’re not going to be deployed. Infrastructure has to work at the same speed as development. The need to automate things that used to be manual has made DevOps into a specific role.
Chandra: Since DevOps is such a specific role, how do you think about hiring for a DevOps role versus a generalist role?
Elena: A really good DevOps engineer is a generalist. There is so much of the infrastructure tool set that one must know: navigating AWS/cloud, configuring CI/CD, packaging software efficiently, configuring the network, scaling the DB, alerting, monitoring, and actually being able to run the app and fix it when it breaks. Beyond that, a good DevOps engineer also understands product lifecycles, developer workflows, and business priorities.
Chandra: What differences do you look for between someone going for a DevOps role versus a more full-stack engineer?
Elena: So for us, a lot of the skill set is around being able to configure AWS, which is the cloud that we use, doing all of that packaging, and configuring Kubernetes. All of that. And too frequently, we're still the first line of defense for things like, "Something weird in the app is happening, like there's too many 400s." If we didn't configure the monitors correctly, that'll actually go to us instead of the team that is ultimately responsible. So, we have to be able to trace down enough of the code to at least route it to the right person, but in some cases we just end up fixing it. So you have to know enough of the application layer to be comfortable doing that.
Elliott: Do you have any advice for candidates looking for DevOps roles?
Elena: You can’t learn DevOps from a book. Practical experience is key. Frequently it’s not even something that’s really taught at schools; lots of people are self-taught. Most of the people I know in DevOps have imposter syndrome.
Mina: Earlier, you said that a good DevOps person also understands product lifecycles, developer workflows, and business priorities. How does an engineer go from being a good generalist to being a good DevOps engineer?
Elena: It probably depends on the company itself, but usually there are some things you can do to streamline your own work that can be extended out to a lot of engineers. I think that's probably the easiest way to start out. You may also know a lot more about your service in terms of how it scales and whether it should be talking to its own database, and thinking about those questions is really the right kind of mindset for a DevOps person: how is their application going to be isolated?
Chandra: You also talked some about agile methodologies making DevOps more prevalent. I was hoping you could share a little more detail about how entwined you consider agile and DevOps to be. With the way that your company is set up, is it meaningful to distinguish one from the other, or are the methodologies very intertwined?
Elena: DevOps ends up following a lot of the agile methodologies anyway, but historically, what pushed it to be such an important part of any organization is the need to ship code really quickly. You used to be able to just copy code onto a server and then scale it manually. All of those things now have declarative ways of being done, so you're really blurring the line between application code and infrastructure code. Skill sets are overlapping, and everyone’s moving faster to get there.
Mike: Shipping code faster means minutes instead of days or weeks? Instead of packaging releases, right?
Elena: Right. So we're completely continuous. Any single PR gets deployed by itself.
Mina: I would love to ask you more about hiring. Our customers ask us all the time about setting up a hiring process. They even ask, "How would you set up an on-site interview? What advice would you give?" What is your hiring process - either past or present?
Elena: At every company I've worked at, I think we've ended up with the same process: a technical phone screen as that extra layer for weeding out candidates who sound really good on paper but actually can't get the job done. It's possible that they've spent too many years either managing or being architects.
So, the technical phone screen is really key. Usually we'll do some sort of recruiter screen. That one varies a lot. I've been at companies where it's the hiring manager doing the resume screen, figuring out if the candidate is aligned in terms of where the role is going and what they're looking to get from it.
Mina: What are some common pitfalls when hiring for a DevOps role?
Elena: It’s a wide breadth of skills, [it’s] hard to be an expert in all areas. We like to think of it as spiking in one or two areas but we do not expect you to be good at everything.
It’s hard to design a process that is agnostic of knowing a particular set of services. If we ask you to debug a broken system that’s running a specific stack, I have no doubt you’ll be faster at it if you’ve run that particular stack in the past.
Some companies have extremely large DevOps teams so it’s also possible to interview specialists. We’re mostly not at the point where that can work for us, unless that person is a fast learner and willing to expand their skillset.
Some of these tools are legitimately tricky and take a while to build expertise, like AWS for example. We don’t penalize if you know a different cloud better, but you will ramp up faster if you’ve managed things on top of AWS.
Most importantly, DevOps, unlike other areas of software engineering, rarely (if ever) has a dedicated major in college or even in bootcamps. One way or another, people have picked up those skills outside of traditional institutions. Be careful that your initial screen doesn’t weed out strong candidates who may have nontraditional backgrounds.
Mina: To back up a little bit, how do you know when you’re ready to hire a DevOps person? And what do you look for in a first hire?
Elena: Start by making a list of projects that a DevOps hire would do. Is it enough for a full-time person? If not, there are some great DevOps contractors out there.
Usually I find the first hire sets a lot of the best practices and direction, so if you're lacking that knowledge of what the best practices for DevOps are, I think you have to be very careful that your first hire will know them.
Chandra: Has there been something that someone has done in a DevOps interview that kind of blew you out of the water and made you think, "Okay. We definitely want to hire this person"?
Elena: We don't generally expect people to be really good in more than a couple of areas. There have been a few that really blew us away on basically everything we asked in terms of networking, how completely they get to the right answer, fix the system. Yeah. Just different aspects.
Elliott: You talked about there being a central team for this. How did you think about the trade-off versus maybe having embedded people within each team to handle that side of things?
Elena: I think in this particular case it really wasn't up to me, but it kind of makes sense to do both, I think. There are enough reusable components that every team will have, but you also need someone on each team who understands enough of the infrastructure side to be able to use those components. And so, the way we think about it is that somebody on that team is naturally forced to learn enough to be successful, and then that person becomes the champion on that team to go to.
Elliott: Should that champion be the one who carries the pager? Or who should be carrying pagers?
Elena: Oh, everyone's carrying pagers. I think the goal is really just to have monitors that can figure out the right person to alert when something goes down.
Chandra: One of the misconceptions that we’ve heard is that, "Oh, DevOps is the person who carries the pager, is the one who's responsible if the system goes down." What do you think of that view?
Elena: Yeah, you'll have very angry DevOps that'll definitely push against features getting released that are not ready. So, I think whatever you do, try not to get yourself in that position. That's definitely something we've changed, actually before I got there.
Elliott: What was the change?
Elena: It was kind of a centralized team that would always get alerted when something went down, and they would have to route it. This was never the ideal, but we just hadn't invested enough to actually have more granularity around knowing what went down and who should be alerted.
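The granularity Elena describes, knowing what went down and who should be alerted, can be pictured as a lookup from service to owning team. The following is a minimal Python sketch, not Brex's actual setup: the service names, rotation names, and the `route` function are all hypothetical, and a real system would typically express this in its monitoring tool's configuration rather than in application code.

```python
from dataclasses import dataclass
from typing import Mapping

# Hypothetical ownership map: service name -> owning team's on-call rotation.
OWNERS = {
    "payments-api": "payments-oncall",
    "ledger": "finance-oncall",
}

# Hypothetical fallback rotation for alerts whose service has no known owner.
FALLBACK = "infrastructure-oncall"


@dataclass(frozen=True)
class Alert:
    service: str
    summary: str


def route(alert: Alert, owners: Mapping[str, str] = OWNERS) -> str:
    """Return the on-call rotation that should be paged for this alert."""
    # Page the owning team directly; only unowned services reach the fallback.
    return owners.get(alert.service, FALLBACK)


if __name__ == "__main__":
    print(route(Alert("payments-api", "too many 400s")))
    print(route(Alert("unknown-service", "health check failing")))
```

In this sketch the central team is paged only when no owner is registered, which is the opposite of the earlier arrangement where every alert went to one team first.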
Chandra: We’ve gotten the impression that having only one team that carries the pager is sort of the antithesis of DevOps. Would you say that you agree with that?
Elena: Yeah. I think that's true. I think that was the old world, and the new world is definitely that we're all responsible for the health of our services.
Note from the author: This interview has been edited and condensed for clarity.