Jay Kreps, Founder and CEO of Confluent (and co-creator of Apache Kafka)
When Jay Kreps was at LinkedIn, he had several mission-critical responsibilities. He was the technical lead on the platform’s search systems, recommendation engine, and social graph. But perhaps most impressively, he was one of the co-creators of Apache Kafka, an open-source platform now used by over 100,000 organizations globally to help companies efficiently handle real-time data feeds like LinkedIn’s own.
But back when Kafka was built in 2008, hardly anyone understood the product outside of the developer community. So when Kreps left LinkedIn in 2014 to build Confluent, which was supposed to be “a fully managed Kafka service and enterprise stream processing platform,” few people knew what that meant and almost nobody believed in the mission.
To Kreps, that was okay — from day zero, he was determined to build a successful business in the event streaming niche, one he predicted six years ago “could serve as a kind of ‘central nervous system’” to some of the world’s most complex systems and applications.
That prediction paid off. The company recently announced that it hit $100 million in annual bookings and raised a $250 million Series E, bringing its valuation up to $4.5 billion.
Recently, I sat down with Kreps to dive deeper into the Confluent story, starting with his Kafka days and transitioning to what it’s like — both the ups and downs — to build a new category in software engineering.
Building Kafka Showed The Value Of Near-Instant Data Streaming
LinkedIn’s active status feature is an example of the platform’s need for real-time …
For LinkedIn users like you and me, it can be difficult to visualize just how much data a platform needs to keep track of to make connection recommendations or even properly return the right result for something we’re searching for. Then consider that LinkedIn has over 260 million monthly active users — and now you have a real challenge on your hands.
“LinkedIn was an incredibly data-rich service, with lots of applications, databases, and analytics layers that helped it operate. The big problem we faced was how all these could connect into one holistic service in a way that really let us build interesting products and harness the data we had,” Kreps told me.
“Our approach to doing this was around event streams, that is thinking of everything happening in the business as a kind of continuous stream of data, whether that data was about the users joining the service, updates to peoples’ LinkedIn profiles, connections being formed, or posts being made to the newsfeed.”
Fundamentally, that’s what Kafka was — a technology that allowed other systems to tap into these streams and respond in real-time to everything that was happening. Despite the clear necessity for software to help manage real-time data streams, the category was still in its infancy when Kreps left LinkedIn in 2014.
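The core abstraction Kreps describes is an append-only log: producers write events to the end of a shared stream, and any number of consumers read from it independently, each tracking its own position. The toy sketch below (my own illustrative code, not Kafka's actual implementation) shows that idea in miniature:

```python
# Toy sketch of the "event stream" idea behind Kafka: an append-only
# log that producers write to and that many consumers read from
# independently, each tracking its own position (offset).

class EventLog:
    def __init__(self):
        self._events = []    # append-only record of everything that happened
        self._offsets = {}   # consumer name -> next index to read

    def publish(self, event):
        """Producers append events; nothing is ever overwritten."""
        self._events.append(event)

    def poll(self, consumer):
        """Each consumer reads from its own offset, so many systems can
        tap into the same stream without interfering with one another."""
        start = self._offsets.get(consumer, 0)
        new_events = self._events[start:]
        self._offsets[consumer] = len(self._events)
        return new_events

log = EventLog()
log.publish({"type": "profile_update", "user": "alice"})
log.publish({"type": "connection_formed", "users": ["alice", "bob"]})

search_batch = log.poll("search-indexer")   # sees both events
log.publish({"type": "post_created", "user": "bob"})
next_batch = log.poll("search-indexer")     # only the new event
analytics_batch = log.poll("analytics")     # a late consumer still sees everything
```

Because the log is never rewritten, a system added months later can replay the full history, which is what made the same streams useful to search, recommendations, and analytics alike.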
The Early Challenges Of Creating A New Category
Despite the challenges, Confluent managed to raise money from the likes of Benchmark, Sequoia, and …
Courtesy of Confluent
Despite knowing he was onto something from his experience building Kafka at LinkedIn, Kreps faced the difficult challenge of building in a category others considered unproven.
As with any new category, the concept is straightforward: show people exactly what you are building and why they should care. Executing on that, however, is a tough problem to solve.
“In theory, Silicon Valley folks love radically new ideas, but in practice, I think everyone is more comfortable with companies that are a new twist on something that already exists,” Kreps told me. And it was precisely that inclination towards comfort that made it very difficult for Kreps to raise money for Confluent in the early days.
Critiques ranged from suggesting Confluent “create a vertical solution to one problem rather than a general-purpose infrastructure platform with many use cases” to “wanting us to attack a single existing entrenched vendor versus something more cross-cutting,” Kreps adds.
Staying The Course And Ignoring Noise
Kreps largely ignored these critiques and continued building what he thought would work, quietly garnering tens of thousands of users, many of whom came through open-source adoption.
While early investors zoned out during Kreps’s explanation of the problem he was trying to solve at Confluent, his company’s numbers and traction quickly snapped them out of their trances.
Competing With Incumbents — And Winning Over Clients Like Walmart, Capital One, And Domino’s
Real-time data streaming allows Domino’s to share the status of its orders with its customers
NurPhoto via Getty Images
While Confluent wanted to build a comprehensive solution to the real-time data problem, other far larger companies had already taken a stab at solving parts of the problem.
One of those companies was IBM, which already had middleware and messaging products, but in Kreps’s words, “these were a kind of ‘duct tape’ solution to gluing together applications and data systems.”
To him, these solutions weren’t close to sufficing. “They were either slow and batch-oriented, with data arriving only at the end of the day. Or they weren’t scalable and couldn’t support modern, data-intensive use cases. Or they were too limited in the kinds of use cases they could handle. Or all of the above,” Kreps told me.
So if he could build something real-time, reliable, and able to scale up to trillions of events per day, Confluent would be able to offer a far superior service.
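Reaching that scale is typically done by splitting each stream into partitions spread across many machines, with events for the same key (say, one customer) always routed to the same partition so their order is preserved. The sketch below is my own toy illustration of that routing idea, not Confluent's code:

```python
# Toy sketch of key-based partitioning, the mechanism Kafka-style
# systems use to spread a stream across machines while keeping
# events for the same key in order.

import zlib

def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a stable partition via a deterministic hash.
    (Python's built-in hash() is salted per process, so we use crc32.)"""
    return zlib.crc32(key.encode()) % num_partitions

NUM_PARTITIONS = 4
partitions = [[] for _ in range(NUM_PARTITIONS)]

events = [
    ("alice", "profile_update"),
    ("bob", "connection_formed"),
    ("alice", "post_created"),
]

for key, event in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, event))

# All of alice's events land on one partition, in the order produced,
# so a consumer of that partition sees them in order.
```

Because each partition can live on a different machine and be consumed independently, adding partitions adds throughput roughly linearly, which is how per-day event counts can climb into the trillions.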
That it did — and companies like Walmart bought in.
In the retail giant’s case, its “customers can order groceries online or on the Walmart app, drive to the store, and have a Walmart associate put the groceries in the trunk. This is from the real-time inventory view in the app, to placing an order for pickup, to having the groceries placed in their car.”
The instances in which real-time data processing is crucial are seemingly endless: from banking to food delivery, companies like Capital One and Domino’s have both benefited in similar ways from the Confluent platform despite being entirely different businesses.
Leveraging Kafka And Its Following To Attract Talent
Events like Kafka Summit bring together hundreds of engineers to discuss real-time data streaming
Confluent / 2019 Kafka Summit
From a talent acquisition perspective, there are two big challenges for highly technical, category-creating businesses:
1) It can be difficult to find the right technical talent to keep up with development
2) There are considerably fewer people interested in “new categories” than ones that have already been proven
Fortunately for Confluent, building around Kafka made hiring considerably more streamlined. After all, there are few online communities that are more passionate than the open-source community. And Kafka, with widespread adoption among technology organizations and a massive following, is a prime example of this principle.
Beyond Kafka, as event streaming continues to grow as a category, people are starting to increasingly see its value. “A great example is our VP of Product and Engineering Ganesh Srinivasan who came from Uber. He got to see the impact of large-scale usage of real-time streaming and came to Confluent because he thought that could have a similar impact in all kinds of companies,” Kreps shared with me.
As Confluent continues to grow, doubling its annual recurring revenue in the last year according to a recent press release, the company wants everyone to see the importance of real-time event streaming — even those who aren’t developers.