Four years at Version One & some thoughts on “Moneyball for VC”
This week marks my fourth year at Version One. And to celebrate, I want to share the story of my interviews with Boris, my first endeavour to build a sourcing model, and his incredible leap of faith.
In the summer of 2013, I was introduced to Boris by a fellow Canadian, Brendan Baker. I had met Brendan that past Spring when I had just left Insight Data Science and he was at Greylock Partners. Boris had closed Version One Fund I a year earlier and was looking for an analyst, so Brendan made the introduction.
My first chat with Boris was over Skype and he asked me about my background and why I wanted to be in VC. In retrospect, I was probably a bit too honest when I said I didn’t know whether I wanted to be an investor. I told him that what mattered most was that my work aligned with my core values: continuous learning, compassion, and freedom.
My answers must have impressed him enough that Boris suggested we meet in person two weeks later when he would be visiting SF from Vancouver. In the meantime, he gave me an assignment: find three companies that would be a fit for Version One.
My immediate reaction was: why only 3 companies? Naively thinking that the problem was our own capacity to sift through the large quantity of startups (and not the fact that there is a large noise to signal ratio), I asked myself, what if I just developed a model to find more companies?
Over the next two weeks, I set out to build my own Mattermark / CBInsights by aggregating the APIs of Crunchbase, AngelList, and Twitter, as well as any other relevant datasets I could get access to. I then applied some ML algorithms in an attempt to predict startup success.
When it came time to meet Boris, he asked which three companies I had for him. I don’t remember the exact conversation, but it went something like this:
Me: I didn’t find three companies because my goal was to build a model that identifies more companies in a scalable way.
Boris: So did you build it? What did you learn?
I’ll share my answers in the paragraphs to follow. Despite my unconventional response, Boris must have thought that my out-of-the-box (read: crazy) thinking would somehow complement his own.
Four years later, we certainly continue to complement one another and I’m thrilled to be part of the Version One partnership.
Since that first attempt to build a sourcing model, we have explored data-driven sourcing. But just as I had uncovered during the interview, there are two major challenges:
1) Data is incomplete, inconsistent or unavailable
We only know that a company exists if they choose to announce or declare themselves to the world (this leads to incomplete data). The next challenge is that all the various data sources have different definitions and labels (leading to inconsistent data). And lastly, most business metrics, which are rich in information, are kept private (and this leads to unavailable data).
2) Does success = a $1B unicorn?
If we define “success” as a company with a valuation of $1B, then our sample is skewed extremely heavily toward failure. In fact, in 2015, Aileen Lee at Cowboy Ventures found that only 0.14% of venture-backed consumer and enterprise tech startups became unicorns. As a result, most, if not all, ML models will simply predict “failure” for every company, because doing so yields an accuracy of 99.86%. In other words, these algorithms aren’t good at picking outliers (and every unicorn is an outlier).
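To make the accuracy trap concrete, here is a minimal Python sketch. It is not the model I built back then. It only assumes a hypothetical sample of 10,000 startups (my number, chosen for round arithmetic) at the 0.14% unicorn rate, and scores a "model" that predicts failure for everyone.

```python
def evaluate_always_failure(n_startups: int, unicorn_rate: float) -> dict:
    """Score a baseline that labels every startup a failure (0)."""
    n_unicorns = round(n_startups * unicorn_rate)

    # Ground truth: 1 = unicorn, 0 = everything else.
    labels = [1] * n_unicorns + [0] * (n_startups - n_unicorns)
    # The "model": always predict failure.
    predictions = [0] * n_startups

    correct = sum(p == y for p, y in zip(predictions, labels))
    true_positives = sum(p == 1 and y == 1 for p, y in zip(predictions, labels))

    accuracy = correct / n_startups
    # Recall: the share of actual unicorns the model managed to find.
    recall = true_positives / n_unicorns if n_unicorns else 0.0
    return {"unicorns": n_unicorns, "accuracy": accuracy, "recall": recall}


# Hypothetical sample size; only the 0.14% rate comes from the post.
result = evaluate_always_failure(n_startups=10_000, unicorn_rate=0.0014)
print(result)
```

The accuracy looks great, but recall is zero: the baseline never finds a single unicorn. That is why a metric like recall or precision on the "success" class is probably a better yardstick than accuracy for this kind of problem.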
“Moneyball for VC” continues to fascinate me. While I am close to conceding that it is perhaps too difficult to predict success at the seed stage, I think there is great potential for funds that invest at later stages or are exploring other ways to provide “capital as a service” (like my friends at S+C). At later stages, there’s more access to private data that they can benchmark against their own successful portfolio.
So, what do we do at the seed stage? First, we may need to redefine “success”. For a micro-VC like Version One, we don’t need a unicorn to have meaningful returns. For us, a $100m or $200m exit would be significant, and as such, we can increase the sample size for success and have a less failure-skewed dataset to help with prediction.
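As a rough illustration of how the success threshold changes the class balance, here is a small Python sketch. The exit values are entirely made up for the example (they are not Version One data and are far less skewed than reality), and `positive_rate` is just a helper name I chose.

```python
def positive_rate(exit_values_musd: list[float], threshold_musd: float) -> float:
    """Share of companies whose exit value meets the success threshold ($M)."""
    successes = sum(value >= threshold_musd for value in exit_values_musd)
    return successes / len(exit_values_musd)


# Hypothetical exit values in $M for 20 companies (0 = no meaningful exit).
hypothetical_exits = [0, 0, 0, 0, 5, 0, 20, 0, 150, 0,
                      0, 1200, 0, 250, 0, 0, 40, 0, 0, 110]

unicorn_rate = positive_rate(hypothetical_exits, threshold_musd=1000)
micro_vc_rate = positive_rate(hypothetical_exits, threshold_musd=100)

print(f"success at $1B+:   {unicorn_rate:.0%}")
print(f"success at $100M+: {micro_vc_rate:.0%}")
```

In this toy sample, lowering the bar from $1B to $100M quadruples the number of "success" examples a model gets to learn from. The real ratio would depend on actual exit data.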
Another approach is to start quantifying founders and not companies. That is, can we identify the qualities and traits of a successful entrepreneur, and then evaluate founders accordingly? Boris and I have been refining our “founder checklist” which includes qualities like hunger/ambition, craftsmanship, storytelling, quick learner, etc. If any of you are interested in or are working on founder analytics, I’d love to chat.
In the meantime, as I cross into year five, here’s to another awesome year ahead. And thank you to the best partner I could ever ask for, for taking such a huge leap of faith in me.
-ange 🙂