Affirmer Spotlight: Meet Elaine

Taylor Law
Published in Affirm Tech Blog
6 min read, Apr 8, 2019

Elaine Arbaugh (fourth from the left) with other Affirm engineers

Up next in our Affirmer Spotlight series, we connected with Elaine Arbaugh from our Platform Engineering team. She gives us a rundown of her team’s responsibilities and her favorite thing about working at Affirm.

Q: Can you give us an overview of your role and what your team does here at Affirm?

A: I’m a Site Reliability Engineer (SRE) on the Platform Engineering team. At Affirm, SREs are responsible for maintaining our metrics, monitoring, and alerting infrastructure; capacity planning for scaling events; and supporting initiatives across engineering to improve performance and reliability. Some of the biggest projects we’ve worked on recently include scaling our systems for Black Friday and merchant launch events (like this recently announced partnership!), adding instrumentation for all of our SQL queries at the Python level, and moving some of our systems from using gevent to multiprocessing. We also run State of the Site, a weekly meeting where representatives from each engineering team discuss anomalies and issues that happened in the previous week as well as how we’re tracking against our SLOs for latencies and error rates.

More broadly, the Platform Engineering team is focused on building infrastructure that is performant, fault-tolerant, secure, reliable and observable. Some of our work includes setting up servers and databases using AWS, while other parts involve building frameworks in our platform for other teams to use. We’re split up into several sub-teams, which include:

  • Platform Foundations, which handles our development, testing, and deployment processes and builds tools for other engineering teams.
  • Data Engineering, which handles offline data infrastructure.
  • Security, which includes security engineering, handling audits, and building security programs.
  • Operations, which includes setting up our infrastructure and automating our processes.
  • Frameworks, which builds frameworks used in our infrastructure and in our platform code.

Our sub-teams work cross-functionally and people across teams are exposed to what other sub-teams are working on since we collaborate on design, consulting, and oncall and triage work.

Q: What led you to Affirm and joining the Infrastructure team?

A: I interviewed at Affirm in late 2015 and started in August 2016, so the company has grown a lot since then. Coming out of undergrad, I was interested in startups because I thought I could learn about a wide variety of things and work on things that could have a big impact on the company. I was really impressed with the people I met while interviewing, especially our then CTO (now President of Technology), Libor, as well as our exec team’s experience and the company’s mission. I’m still really inspired by our mission and I’m reminded every Wednesday when we send out Customer Spotlight emails with a customer interview about how Affirm improved their life. I also thought Affirm was in a good position to grow, which I’ve definitely seen during my time here, both in terms of growing our team and growing our business. I’ve been looking at loan volume numbers for our site every day for over two years and seeing the growth has been insane!

Coming out of college, I wasn’t really sure about what type of software engineering I wanted to do. I didn’t know I was on the Infrastructure team until my first day at Affirm when my manager told me I was going to be Affirm’s first SRE. My manager was impressed by my internships and had a gut feel that I would make a good SRE, and although I wasn’t really sure what I was getting myself into back then, I’m really glad I was placed on Infrastructure!

I’ve gotten the opportunity to learn about a lot of different technologies and systems, and I’ve been able to learn from some very strong engineers on the team who have really shaped how I think about software development, system design, and collaboration. Platform and Infrastructure can definitely be stressful at times, especially when things go wrong and you’re often the first point of contact for resolving the issue. However, it’s definitely fun and satisfying to work on foundational systems, and it’s really rewarding after periods like Cyber Weekend to be part of a team that built the systems that could handle so much traffic.

Q: What is your favorite thing about Affirm? Why? (e.g., the people, our mission, a specific perk, etc.)

A: My favorite thing is the yogurt pretzels, although I’m trying to cut down since they’re pretty bad for you! Other than that, I’m a big fan of a lot of our perks, especially the cool events we do (recent ones I’ve been to included a wine and paint night, Lightning Talks and a Black Leadership panel sponsored by Black@Affirm, one of our wonderful Employee Resource Groups).

However, my #1 favorite thing about Affirm is the Platform Engineering team. I think we’ve built a team of really smart, humble engineers who all have awesome ideas, are great at getting things done, and really care about making sure we build systems that are well-designed and reliable. People on the team have strengths/interests in different areas (e.g., some people are really well-organized and document everything, some people love refactoring code to make it better, some people will jump in at any problem that’s happening in production to help out, etc.), and I think our combination of different skills helps us be really effective as a team. I’m really impressed at the progress we’ve made and how our infrastructure has improved since I got here, and that’s a testament to how effectively our team has worked.

Q: What has been the most interesting project you’ve worked on since joining Affirm? What was the project’s significance for your team and/or Affirm?

A: I’ve worked on a really wide variety of cool stuff at Affirm, so it’s hard to pick one. The most exciting project has been Cyber Weekend scaling — it took a lot of hard work and planning, but watching the loan volume numbers on Black Friday and Cyber Monday and seeing how our infrastructure can handle the additional traffic without breaking is really fun and rewarding. (One of my favorite parts of my job is refreshing dashboards all day on Cyber Monday and watching the numbers go up!)

Another project that was really fun to work on and led to interesting results was improving overall observability within Affirm by adding instrumentation to all our SQL queries. I used SQLAlchemy events to gather data including how long our queries took, which databases they hit, and whether they were using read replicas, and emitted logs with that data as well as information about where the query came from in our code. From this data, we can track all the queries that are run when a user makes a request, which can help us find places to improve how we’re emitting queries from Python code and identify slow queries that we should try to optimize. I had a lot of fun looking through this data and trying to find insights. One thing we found was that each request to our endpoint for changing users’ phone numbers caused our code to make over 10,000 SQL queries. After figuring this out, we were able to change the queries to gather data more efficiently and reduced the number of queries for the endpoint to 8, which sped the endpoint up significantly. We also found that we were making a lot of unnecessary `SELECT now()` queries and were able to replace them with `datetime.now()` in Python, which speeds up our code by removing round trips to the database and reduces load on the database.
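For readers who have not used SQLAlchemy events, here is a minimal sketch of how query timing can be collected with the `before_cursor_execute` and `after_cursor_execute` hooks. It is not Affirm’s actual instrumentation: the `instrument` helper, the `query_log` list, the logger name, and the record fields are all hypothetical, and it uses an in-memory SQLite database so it can run on its own. It also leaves out the read-replica and call-site information mentioned above.

```python
import logging
import time

from sqlalchemy import create_engine, event, text

# Hypothetical logger name, chosen for this sketch only.
logger = logging.getLogger("sql_instrumentation")

# Records are also kept in memory so the example is easy to inspect.
query_log = []


def instrument(engine):
    """Attach timing listeners to a SQLAlchemy engine (illustrative helper)."""

    @event.listens_for(engine, "before_cursor_execute")
    def before_cursor_execute(conn, cursor, statement, parameters, context, executemany):
        # Use a stack of start times so nested executions do not clobber each other.
        conn.info.setdefault("query_start_time", []).append(time.perf_counter())

    @event.listens_for(engine, "after_cursor_execute")
    def after_cursor_execute(conn, cursor, statement, parameters, context, executemany):
        elapsed = time.perf_counter() - conn.info["query_start_time"].pop()
        record = {
            "statement": statement,
            "duration_seconds": elapsed,
            "database": conn.engine.url.database,
        }
        query_log.append(record)
        logger.info("sql query", extra=record)


# Usage: every statement run through the instrumented engine is recorded.
engine = create_engine("sqlite:///:memory:")
instrument(engine)

with engine.connect() as conn:
    conn.execute(text("SELECT 1"))
    conn.execute(text("SELECT 2"))

for entry in query_log:
    print(f'{entry["duration_seconds"]:.6f}s  {entry["database"]}  {entry["statement"]}')
```

Grouping records like these by request would be one way to surface an endpoint that issues thousands of queries, though how the logs were actually aggregated is not described here.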

The insights I found from this project were really fun, and I also learned a lot about our logging code and SQLAlchemy. The project also involved creating new Elasticsearch clusters and changing parts of our logging pipeline, and it was a great learning experience to touch so many different parts of our stack.
