Why? My experience with them was pretty bad. I took their assessment for web development, I think I even did an assignment, and got put on a video call with someone from Triplebyte. He never cracked a smile. Suddenly I got asked a bunch of CS questions that really were not very relevant to web development, some of which were entirely inappropriate like sorting a binary search tree. I even told the guy that I thought I was getting those questions wrong and he just scowled and said "well you just don't know when you're going to use this stuff." "My point exactly," I thought.
Ultimately I got rejected.
The whole idea that you can boil down a candidate to some coding challenges and a video quiz is bad. I do like the idea of streamlining the hiring process for developers, but there's more to it than knowing a bunch of stuff, because that can be gamed. And quizzing me on irrelevant material was a bad move. A firm like Triplebyte won't be as good at interviewing a candidate as the employer itself, and may even keep perfectly qualified candidates out of view from all employers affiliated with them.
1) Create an incremental game (start simple, see how far you can get)
2) Live debugging (you can run tests, they fail, you need to figure out why and fix it)
3) Flash rounds (Do you know what Big O is? Can you explain linked lists?)
4) ... I forget
I thought it covered one of the widest ranges of actual skills, and I agreed with their final assessment. It stayed away from algorithms-y questions (which I hate).
- The screening had a lot of false negatives. "I got rejected by Triplebyte, but got a FAANG offer" is quite common.
- Most companies used Triplebyte not as an interview replacement, but as an additional screening process, which means that as a candidate, you don't have any real incentive to use them.
The only real use case I've heard about recently for Triplebyte is to send over candidates you normally wouldn't even screen: if they pass Triplebyte's process, you know you should consider them, and if they fail, that's fine, because you would have passed on them anyway.
yes, there are too many variables between the candidate, job, company, and work environment to determine long-term fit via a test, especially for "creative" jobs. the more regimented the job (e.g., fast food cook), the lower the variability, but it's still significant. plus, such tests only evaluate technical skills, not the more important non-technical ones (like punctuality, integrity, steadfastness, etc.--note that these are a function of the involved parties and the relationship between them, not just the candidate).
but also, the underlying problem of hiring is not one of trying to get the best fit, but of trying to avoid the pain of firing. that's the thing that needs to be reframed/solved, but that's a much harder and a much less technical problem (alternatively put, technical tests are marginal at best).
I started using them about a year ago (first passively looking, then actively looking)
I really enjoyed the ability to be assessed on something besides Leetcode style questions.
I didn't take a job through their platform (though I did get one really strong offer), but even so, I found the assessments incredibly useful, since they give you a percentile distribution of your performance for each topic-specific test.
After taking their assessments, when interviewers asked me how good I am at, say, Python, I could tell them I have a hard time assessing my own capabilities. "But hey, I took this standardized test that says I'm in the 85th percentile, not sure how good of a metric it is" (while not mentioning that I think I'm OK at best at Python).
It's the only way I've found to get a measure of your talents compared to the rest of the field (even if it might not be reliable/useful)
A lot of the companies that interview through Triplebyte also skip LC mediums because they have a different signal about your potential suitability as a candidate.
Way too much of engineering is non-quantifiable. Putting a number to someone's skills is bound to be reductive at best.
I used them many years ago, this was my impression. When I got to company "on sites" they were just full-blown interview loops. I could have just applied to the companies directly.
With the money they raised, after spending so much on marketing, I assume they downsized, lost some talent, and pivoted mostly to a sales-driven recruiting business for their top clients.
Like honestly I might think I'm a 3 at X, but if some test that thousands of other people took tells me I'm in the 90th percentile of X users, that information is still useful to me.
But it needs to be:
- In-depth. Not just a single exam or interview. You need to really know the candidate’s strengths and weaknesses.
- Detailed. You can’t just give someone pass/fail or a single score. Not only is it mean, but you end up getting misaligned candidates anyway, because some people are really bad at some aspects of software but good at others. In fact, maybe the process should ditch scores entirely and just show recruiters the actual candidate interviews, and what they have and have not accomplished.
- Changing over time. Not a short period of time. But like, if I take the assessment, 6 months later I can take a smaller assessment and it will update my scores and log my progress.
Triplebyte doesn’t do the first two. Idk, but I think it does the third, and you can retake the quiz. But then it’s only telling employers whether you’re basically competent by some arbitrary statistic, which doesn’t even tell whether you’d be basically competent at their company.
I think it would be nice if I could take one thorough interview instead of several less-thorough company-specific interviews, but that’s not Triplebyte.
And even if they had stuck to their original model, it would have been too easy for niche competitors to erode their margins. Think "Triplebyte for Android developers only," times 20 different programming areas.
Companies wouldn't trust a third party to run binding technical assessments for them, and quality devs would probably avoid places that hire without having someone from the destination team show up.
I think the opposite would be more beneficial: a light check to validate work claims and some high-level foundational questions about code, just to make sure someone has basic proficiency in what they claim to have.
Then companies would need a lot less HR pre-screening and could focus on technology and culture matching.
You dodged a bullet, some of the most "interesting" interactions I've had with founders were from those I talked to through TripleByte. There was also a pair of them that were clearly digging around for business ideas and markets to enter.
TripleByte isn’t what they used to be, though. I don’t think they do anything close to what I experienced anymore.
So it helped by introducing me to companies I wasn't previously familiar with, but other job platforms work in a similar manner.
I get the impression that Triplebyte has changed from what it used to be. I never even talked with anyone at Triplebyte. I did well enough on some skill test that I was put on a fast track and quickly approved without any interviews. I also got opportunities to take other tests to rate my skills in particular areas.
It seemed like a decent place for presenting possible candidates to employers with some pre-screening, but it wasn't anything particularly innovative. I imagine that as an employer, it helps filter out a lot of unskilled candidates with pretty resumes and reduces the number of interviews required.
Speaking as someone who doesn't like being reductive, I've had to make my peace with the fact that "flawed" can still be "way better than the status quo". And I really do think we do much, much better than the status quo of companies putting a bunch of top schools in as linkedin keywords.
If you're a bit familiar with data science, think of it as trying to project out the first couple of principal components of your skills. It won't account for the whole data set, but you can go a long way with just those first couple of components.
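To make that concrete, here's a tiny illustration (purely made-up data, and nothing to do with our actual scoring): simulate a candidate-by-skill matrix that's mostly driven by two latent factors, then check how much of the variation the first two principal components recover.

    import numpy as np

    rng = np.random.default_rng(0)
    # Hypothetical data: 500 candidates x 12 skill dimensions, where most of
    # the variation comes from 2 latent factors (say, "general engineering
    # ability" and "domain depth"), plus some noise.
    latent = rng.normal(size=(500, 2))
    loadings = rng.normal(size=(2, 12))
    skills = latent @ loadings + 0.3 * rng.normal(size=(500, 12))

    # Center the matrix and take the SVD; the squared singular values give
    # the variance captured by each principal component.
    X = skills - skills.mean(axis=0)
    _, s, _ = np.linalg.svd(X, full_matrices=False)
    explained = s**2 / np.sum(s**2)
    print(f"first two components explain {explained[:2].sum():.0%} of the variance")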
If you can hire someone skilled, but who other companies might overlook, that's a huge benefit to your team.
Triplebyte is really just the first screen, and the importance each company wants to put on the signal is up to them.
I spend weeks drumming up two or three days worth of coding work. The coding aspect is basically manual labor and pedantic arguments with other devs.
One complaint I do have is that (in addition to the percentile bucket) they give you a 1-5 rating, where 4 is "senior engineer level" and 5 is something like "exceptional performance, a leader in the field".
But the ratings seem to fall at different percentile distributions for each test.
For example, I might get 80th percentile on one test, but get a 3 rating, and for another test, 80th percentile is a 5.
In general, different quizzes have vastly different populations of people attempting them. For example, our front-end quiz gets a lot of beginners and hobbyists, and thus has a very bottom-loaded score distribution. Our devops-related quizzes, on the other hand, have a population that skews skilled and senior, and has a very top-loaded score distribution.
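As a toy illustration of why that matters (completely made-up distributions, not our real score data), the same raw performance can land at very different percentiles depending on who else takes the quiz:

    import numpy as np

    rng = np.random.default_rng(1)
    # Invented score distributions on a 0-100 scale: one quiz with lots of
    # beginners and hobbyists, one whose takers skew senior.
    frontend_takers = rng.normal(45, 20, 10_000).clip(0, 100)  # bottom-loaded
    devops_takers = rng.normal(70, 15, 10_000).clip(0, 100)    # top-loaded

    my_score = 75
    for name, population in [("front-end", frontend_takers), ("devops", devops_takers)]:
        percentile = (population < my_score).mean() * 100
        print(f"{name}: raw score {my_score} -> {percentile:.0f}th percentile")
    # The same raw score lands around the 90th percentile on the
    # beginner-heavy quiz but closer to the 60th on the senior-heavy one,
    # so a fixed absolute bar cuts the percentile scale in different places.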
Communicating this information to our users (particularly the less-quantitatively-oriented ones on the company side) has been a source of considerable UI challenges for us.
Personally speaking, I've used Python a handful of times over the years, but never as a primary language for any work I've done. I got a 4 on the Python test.
Compared to front-end, which I've been using professionally, and also dabbling in for ~20 years (still keeping up with developments in the years in which I wasn't primarily doing front-end dev professionally)
I got a 3.
I definitely know 100X as many random facts about front-end APIs, libraries, tooling, and technologies as I do about Python. So perhaps it just came down to luck (guessed unluckily for the front-end and luckily for Python). Or perhaps there's just so much more to know that falls in scope for the front-end quiz than there is for Python, to the point where you can spend 20 years learning the front-end technologies and still be "middle-of-the-road" in terms of "absolute level of skill".
But I think that makes your descriptions of the 1-5 rankings a bit disingenuous. If people who have (what most other companies would consider) senior-level knowledge are generally scored a 3 by your system, a more honest set of descriptions would involve changing "4: level expected of seniors" to "4: knows roughly ~80% or more of all there is to know about this subject".
a 10-day contract is better, since it's real work, for pay, but the relatively short duration doesn't tell you much about the candidate's intrinsic motivation or how relationships develop past the honeymoon stage.
so really, it'd be best and easiest if we all explicitly assumed that jobs had 6-12 month trial periods, for both parties, and that after that time, either can walk away without hard feelings (or negative judgment), other than in the most egregious cases (i've seen a couple cases that'd fall in this category). again, this is primarily about jobs that have the most variability. less variable jobs don't need as long of an evaluation period (but do need more than a few weeks).
my (now failed) startup in this space attempted to answer it for less variable jobs (hourly work), where we could tease out more signal from the noise, but even that had lots of unaccountable variability.
They're set by experts in the area, the same as the ones who write our question content.
To give a little more detail: the tests run in a beta state for a while before being fully released. We gather a bunch of data and calibrate parameters for our IRT model based on that. So the ordinal ordering of performance is entirely mathematical and data driven. (When we were still doing in-house human interviews, those were part of the data set as well, and still are for the subjects that overlap them.) But that produces a continuous, hard-to-interpret, and population-dependent score distribution, and SMEs draw the lines with which we bucket those scores. (For those of you familiar with IRT as a framework, they set theta thresholds.)
But yes, there is some chance involved. It's a tradeoff between the standard error in our scores and the length of the quiz, and we try to optimize for a sweet spot there (since most people don't want to take two hours of quizzes). And we are absolutely going to get it wrong sometimes. That's both for in-model reasons (the statistical standard error is enough that we'll be off by a level either way around 20-25% of the time or something like that) and for out-of-model ones (maybe some of our questions just test the wrong thing in ways that don't show up in the data). Assuming your self-assessment is correct (and I will say that many people's are not! confidence correlates with skill, but with a whooooole lot of noise.) then yeah, you probably had a bad roll of the dice on one and not on the other.
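For anyone curious what that looks like mechanically, here's a bare-bones sketch of the general idea (a standard 2PL IRT model with invented item parameters and thresholds; this is not our actual model, question bank, or cutoffs): ability (theta) is estimated from the response pattern, the standard error comes from the Fisher information and shrinks as you answer more items, and expert-set thresholds bucket the continuous theta into discrete ratings.

    import numpy as np
    from scipy.optimize import minimize_scalar

    def p_correct(theta, a, b):
        # 2PL item response model: probability of answering correctly, given
        # ability theta, item discrimination a, and item difficulty b.
        return 1.0 / (1.0 + np.exp(-a * (theta - b)))

    def estimate_theta(responses, a, b):
        # Maximum-likelihood estimate of theta from 0/1 responses.
        def neg_log_lik(theta):
            p = p_correct(theta, a, b)
            return -np.sum(responses * np.log(p) + (1 - responses) * np.log(1 - p))
        return minimize_scalar(neg_log_lik, bounds=(-4, 4), method="bounded").x

    def standard_error(theta, a, b):
        # SE from the Fisher information: more (informative) items -> smaller SE.
        p = p_correct(theta, a, b)
        return 1.0 / np.sqrt(np.sum(a**2 * p * (1 - p)))

    rng = np.random.default_rng(2)
    true_theta = 1.0
    for n_items in (15, 45):
        a = rng.uniform(0.8, 2.0, n_items)   # made-up "calibrated" parameters
        b = rng.normal(0.0, 1.0, n_items)
        responses = (rng.random(n_items) < p_correct(true_theta, a, b)).astype(float)
        est = estimate_theta(responses, a, b)
        print(n_items, "items -> theta", round(est, 2), "+/-", round(standard_error(est, a, b), 2))

    # Hypothetical expert-set theta thresholds bucket the score into 2/3/4/5.
    thresholds = [-0.5, 0.5, 1.5]
    print("rating:", 2 + sum(est > t for t in thresholds))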
As I say a lot (in this thread and elsewhere), we can't reasonably bat 1.000: our goal is to bat better than the next guy. And I think we do do that, messy though the entire space can be in practice.
---
For the record, when we talk to companies, here's what we tell them about scores:
2 = knows something in this area, but we can't say with confidence that they know enough to handle things independently. OK for entry-level roles, but lower than you'd like for others. We don't show a 2 on profiles. The only place companies see it is if they're using our screens to screen their own candidates via our Screen product. The idea being that if you have the choice of whether to take an assessment or not in the first place it shouldn't really hurt you to try.
3 ("Proficient") = professional competence in that area, can work independently in it. A score we'd expect of a mid-level engineer within their area of expertise. A recommendation for most roles, maybe not very senior ones (but not a point against even for senior roles). A score of 3 or above counts as certified, meaning it earns a shareable certificate and makes you appear in search results for a particular quiz score.
4 ("Advanced") = significant expertise, something more typical of a senior eng who really knows their way around the subject. A recc for all levels, even very senior ones.
5 ("Expert") = exceptional, above and beyond even by the standards of senior roles
I can't imagine the difficulty of accurately measuring the success or failure of long-tail HR hiring processes like phone screens. The success or failure of a candidate post-hire has so many variables that it must be very hard to attribute them to signals present in a screen. I imagine most of the data points are derived from signals found in successful candidates, which you then try to find in an assessment or screen.
It's really hard, and I hope the negative tone of my comment does not suggest I don't respect the problem set and the people willing to throw themselves at it.