Hello everyone, my name is Atomic Artichoke, and I’m the newest employee of the interviewing.io team, having joined a couple months ago as a Data Scientist.
Atomic Artichoke isn’t my real name, of course. That’s the pseudonym the interviewing.io platform gave me, right before I took my final interview with the company. If you’ve never used interviewing.io before (and hey, if you haven’t already, why not sign up now?), it’s a platform where you can practice technical interviewing anonymously with experienced engineers (and do real job interviews anonymously too).
When it’s time to interview, you and your partner meet in a collaborative coding environment with voice, text chat, and a whiteboard (check out recordings of real interviews to see this process in action). During interviews, your partner sees your pseudonym instead of your name.
In my opinion, “Atomic Artichoke” is a pretty cool name. It sounds like a Teenage Mutant Ninja Turtles villain, and alliterative phrases are always cool. However, I had some reservations about that handle, because I felt the pseudonym represented me in ways I didn’t identify with. I don’t know how to eat or cook an artichoke, I never really understood atoms much, and I possess no mutant superpowers.
But I wondered, how did the interviewer perceive me? Did this person think “Atomic Artichoke” was a cool name? If so, did that name influence his or her perception of me in any way? More importantly, did my pseudonym have any influence on my getting hired? If I had a different, less cool name, would I have gotten this job?
I know, it’s a silly question. I’d like to think I was hired because of my skills, but who really knows? I was curious, so I invested a few days to investigate.
What we already know about names in the hiring process
You might be asking, “Why does interviewing.io have pseudonyms, anyway?” Anonymity. We want candidates to be assessed on their actual skills, not on proxies of skill like the colleges they’ve attended, the notoriety of their social circles, or prior companies they’ve worked at. If a hiring manager knows a person’s name and knows how to use the Internet, it’s easy to find this information.
I’m not the first to wonder about names and hiring. Plenty of academic literature exists exploring the impact of name choice on various life outcomes. I’ll briefly touch on a handful of those perspectives.
- A 1948 paper concluded people with unique names tended to have lower academic performance than those with more common names.
- A 2003 study observing the relationship between “black” names and life outcomes concluded that, after controlling for other factors, name choice did not affect life outcomes.
- Finally, a 2004 paper specifically focused on the jobs market suggested that resumes containing “black” names received fewer callbacks than those with “white” names, even after controlling for resume quality.
As you can see, academic opinions differ. However, in case name-based bias actually exists, maybe we can implement a cheap-enough solution to eliminate the bias completely. Randomly generated pseudonyms fit that bill nicely.
But as I wondered before, maybe the pseudonym generator creates a different kind of bias, leaving us no better off than if we had used people’s real names. I first needed to understand how pseudonyms get generated, so I dug into some code.
The generator can come up with memorable phrases like:
- Serpentine Gyroscope
- Moldy Parallelogram
- Frumious Slide Rule
- Supersonic Llama
But the generator can also come up with less memorable, more commonplace, and more boring phrases like:
- Ice Snow
- Warm Wind
- Red Egg [1]
- Infinite Avalanche
After running through a few example pseudonyms, anecdotally I felt the first list was more attractive to me than the second. It sparked more joy in me, one could say. I just couldn’t articulate why.
That’s when I noticed that certain themes kept recurring. For example, there were multiple Alice in Wonderland references, a bunch of animals, and many types of foods listed. At first glance the chosen words seemed odd. But after getting to know my co-workers better, the list of words began to make a lot more sense.
The co-worker sitting across from me is a huge Alice in Wonderland fan. Our founders seem to love animals, since they bring their dogs to work most days. Finally, food and restaurant discussions fuel most lunchtime arguments. Just in my first month, I had heard more discussion about chicken mole and Olive Garden than I ever had in my life.
While it’s true the pseudonym generator chooses words randomly, the choice of which words get onto the list isn’t necessarily random. If anything, the choice of words reflects the interests of the people who built the application. Might it be possible that the first list appealed to me because they reference math concepts, and I happen to like math-y things?
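To make that distinction concrete, here is a minimal sketch of how such a generator might work. This is not the actual interviewing.io code: I am assuming a pseudonym is one adjective plus one noun, and the word lists and the `generate_pseudonym` function are hypothetical, seeded with a few words from the examples above.

```python
import random

# Hypothetical word lists. The real lists are curated by the team, which is
# exactly where non-random human preferences can creep in.
ADJECTIVES = ["Atomic", "Serpentine", "Moldy", "Supersonic", "Warm", "Red"]
NOUNS = ["Artichoke", "Gyroscope", "Parallelogram", "Llama", "Wind", "Egg"]


def generate_pseudonym(rng: random.Random) -> str:
    """Pick one adjective and one noun uniformly at random."""
    return f"{rng.choice(ADJECTIVES)} {rng.choice(NOUNS)}"


rng = random.Random(0)  # seeded only so the sketch is reproducible
print([generate_pseudonym(rng) for _ in range(3)])
```

The draw itself is uniform, but every possible outcome was put on the list by a person, so the randomness only operates inside choices somebody already made.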
This insight helped me craft my hypothesis more concretely: all else equal, do some candidates receive better ratings on interviews, because interviewers happen to associate positively with users whose pseudonyms reference the interviewers’ personal interests?
This hypothesis rests upon the assumption that people are drawn to stuff that’s similar to themselves. This seems intuitive: when individuals share common interests or backgrounds with others, chances are they’ll like each other. Therefore, is it possible that interviewers like certain candidates more because they find commonality with them, even though we manufactured that commonality? And did that likability translate to better interview ratings?
To test this, I sorted users into one of 6 categories based on the noun part of their pseudonym. I’ll call this the Noun Category going forward.
These broad categories aimed to differentiate among interest areas that might appeal differently to different interviewers. Among these 6 groups, I wanted to observe differences in interview performance. And knowing the pseudonym generator assigns names randomly, we would not expect to find a difference.
To proxy for interview performance, I used the “Would You Hire” response from the interviewer on the interviewee, which is the first item on the interviewer’s post-interview questionnaire.
These two pieces of data led to a clear, testable null hypothesis: there should exist no relationship between Noun Category and the Would You Hire response. If we reject this null hypothesis, we would have evidence suggesting our pseudonyms can impact hiring decisions.
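The data preparation could look roughly like the sketch below. The noun-to-category mapping, the category names other than the History and Fantasy groups mentioned in this post, the nouns “Jabberwock” and “Pharaoh”, and the sample records are all hypothetical placeholders, not the real mapping or real interview data.

```python
from collections import Counter

# Hypothetical mapping from a pseudonym's noun to its Noun Category.
NOUN_CATEGORY = {
    "Llama": "Animals",
    "Artichoke": "Food",
    "Parallelogram": "Math",
    "Jabberwock": "Fantasy",
    "Pharaoh": "History",
    "Avalanche": "Nature",
}


def noun_category(pseudonym: str) -> str:
    """Look up the category of the noun (everything after the adjective)."""
    noun = pseudonym.split(" ", 1)[1]
    return NOUN_CATEGORY[noun]


def contingency_table(records):
    """Count Yes/No responses per category from (pseudonym, hired) records."""
    counts = Counter((noun_category(name), hired) for name, hired in records)
    categories = sorted({category for category, _ in counts})
    # One row per category: [number of Yes, number of No]
    return {c: [counts[(c, True)], counts[(c, False)]] for c in categories}


# Made-up records, purely for illustration
records = [
    ("Supersonic Llama", True),
    ("Moldy Parallelogram", False),
    ("Atomic Artichoke", True),
    ("Frumious Jabberwock", True),
    ("Warm Llama", False),
]
table = contingency_table(records)
print(table)
```

Each row of the resulting table is one Noun Category and each column is one Would You Hire response, which is the shape a test of independence needs.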
Data analysis and interpretation
I pulled data on a sample of a few thousand interviewing.io candidates’ first interviews on our platform, and performed a chi-squared test of independence on the observed frequencies of the 6 “Noun Categories” and 2 “Would You Hire” interviewer responses. Each cell of the 6 x 2 matrix contained at least 40 observations.
I then looked at the percentage of candidates who received a Yes from their interviewer, broken out by Noun Category. While most of the categories seemed to clump around a similar pass rate, the History group seemed to under-perform while the Fantasy group over-performed.
The chi-squared test rejected the null hypothesis at the 5% significance level.
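For readers who want to see the mechanics, here is a small sketch of a chi-squared test of independence on a 6 x 2 table, written with plain Python so nothing is hidden in a library call. The counts are made up for illustration and are not the real interviewing.io data; the function and variable names are my own.

```python
def chi_squared_statistic(table):
    """Pearson chi-squared statistic for a table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed_count in enumerate(row):
            # Expected count if category and response were independent
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed_count - expected) ** 2 / expected
    return stat


# Made-up counts: one row per Noun Category, columns are [Yes, No].
observed = [
    [260, 240],
    [250, 250],
    [255, 245],
    [245, 255],
    [210, 290],
    [290, 210],
]

stat = chi_squared_statistic(observed)
dof = (len(observed) - 1) * (len(observed[0]) - 1)  # (6 - 1) * (2 - 1) = 5
CRITICAL_VALUE = 11.0705  # chi-squared critical value for 5 dof at the 5% level
reject_null = stat > CRITICAL_VALUE
print(f"chi2 = {stat:.2f}, dof = {dof}, reject null at 5%: {reject_null}")
```

If SciPy is available, `scipy.stats.chi2_contingency` performs the same test and also returns a p-value.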
These results suggest a relationship might exist between Noun Category and an interviewer’s Would You Hire response, which, again, should not occur because a candidate’s Noun Category was randomly assigned! [2]
While this analysis doesn’t predict outcomes for specific individuals, the result suggests it isn’t totally crazy to believe I may have gotten lucky on my interview. Maybe I don’t suffer from imposter syndrome; maybe I am an imposter. How depressing.
So what now? Fortunately (or unfortunately) for my new company, if we want to eliminate this bias, I can suggest potential next steps.
One solution might be to pander to an interviewer’s interests. We could randomly generate a new pseudonym for candidates every time they meet a different interviewer, ensuring that pseudonym creates positive associations with the interviewer. Similarly, we could generate more pseudonyms referencing Lord of the Rings and Warcraft, if we know our interviewer pool tends to be fantasy-inclined.
An alternative solution might be to give candidates pseudonyms with no meaning at all. For example, we could generate random strings, similar to what password managers generate for you. This would eliminate any real world associations, but we’d lose some whimsy and human readability that the current pseudonyms provide.
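As a rough sketch of that idea, Python’s standard `secrets` module can produce such strings. The alphabet, the default length, and the function name `meaningless_pseudonym` are my own assumptions, not anything interviewing.io has built.

```python
import secrets
import string

# Uppercase letters and digits; an arbitrary choice for this sketch
ALPHABET = string.ascii_uppercase + string.digits


def meaningless_pseudonym(length: int = 8) -> str:
    """Return a random string that carries no real-world associations."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


print(meaningless_pseudonym())
```

The output is hard to say out loud and hard to remember, which illustrates the readability cost of this option.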
Yet another alternative solution could be to do more analysis before acting. The analysis didn’t quantify the magnitude of the bias, so we could construct a new sample to test a more specific hypothesis about bias size. It’s possible the practical impact of the bias isn’t huge, and we should focus our energy elsewhere.
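One simple way to start sizing the bias would be to compare the Yes rate of one Noun Category against everyone else and put an interval around the difference. The sketch below uses a standard Wald interval for a difference of two proportions; the counts are made up, and `pass_rate_difference` is a hypothetical helper, not an analysis we have actually run.

```python
import math


def pass_rate_difference(yes_a, n_a, yes_b, n_b, z=1.96):
    """Difference in Yes rates (A minus B) with an approximate 95% Wald interval."""
    p_a, p_b = yes_a / n_a, yes_b / n_b
    diff = p_a - p_b
    standard_error = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    return diff, (diff - z * standard_error, diff + z * standard_error)


# Made-up counts: one category (A) versus all other candidates (B)
diff, (low, high) = pass_rate_difference(yes_a=290, n_a=500, yes_b=1220, n_b=2500)
print(f"difference = {diff:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

A narrow interval close to zero would support focusing our energy elsewhere, while a wide or large one would argue for acting on the pseudonyms.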
On the face of it, this pseudonym bias seems trivial, and in the universe of all biases that could exist, that’s probably true. However, it makes me wonder how many other hidden biases might exist elsewhere in life.
I think that’s why I was hired. I’m obsessed with bias. Though I’ll be doing normal business-y Data Scientist stuff, my more interesting responsibilities will be poking at all aspects of the hiring market and examining the myriad of factors, mechanisms, and individuals that make the hiring market function, and perhaps not function effectively for some people.
Going a step further than identifying hiring biases, I’d like to shift discussions toward action. It’s great that the tech industry talks about diversity more, but I think we can facilitate more discussions around which concrete actions are being taken, and whether those actions actually achieve our goals, whatever those goals may be.
I think it all starts with being introspective about ourselves, and investigating whether something as innocuous as a randomly generated phrase could ever matter.
[1] This is the shortest pseudonym possible on interviewing.io.
[2] This is not entirely true. Users can re-generate a random pseudonym as often as they want, meaning a user can effectively choose their name if they re-generate a lot. However, there’s no evidence this happens often, because we found no significant difference between the observed and theoretical randomized distributions of Noun Categories.