Data Engineering Podcast

Mapping The Data Infrastructure Landscape As A Venture Capitalist

Apr 2 '23

Summary

The data ecosystem has been building momentum for several years now. As a venture capital investor, Matt Turck has been tracking the main trends and compiles his findings each year into the MAD (ML, AI, and Data) landscape report. In this episode he shares his experiences building those reports and the perspective he has gained from the exercise.

Announcements

Hello and welcome to the Data Engineering Podcast, the show about modern data management
Businesses that adapt well to change grow 3 times faster than the industry average. As your business adapts, so should your data. RudderStack Transformations lets you customize your event data in real-time with your own JavaScript or Python code. Join The RudderStack Transformation Challenge today for a chance to win a $1,000 cash prize just by submitting a Transformation to the open-source RudderStack Transformation library. Visit dataengineeringpodcast.com/rudderstack today to learn more
Your host is Tobias Macey and today I'm interviewing Matt Turck about his annual report on the Machine Learning, AI, & Data landscape and the insights around data infrastructure that he has gained in the process

Interview

Introduction
How did you get involved in the area of data management?
Can you describe what the MAD landscape report is and the story behind it?
- At a high level, what is your goal in the compilation and maintenance of your landscape document?
- What are your guidelines for what to include in the landscape?
As the data landscape matures, how have you seen that influence the types of projects/companies that are founded?
- What are the product categories that were only viable when capital was plentiful and easy to obtain?
- What are the product categories that you think will be swallowed by adjacent concerns, and which are likely to consolidate to remain competitive?
The rapid growth and proliferation of data tools helped establish the "Modern Data Stack" as a de-facto architectural paradigm. As we move into this phase of contraction, what are your predictions for how the "Modern Data Stack" will evolve?
- Is there a different architectural paradigm that you see as growing to take its place?
How has your presentation and the types of information that you collate in the MAD landscape evolved since you first started it?
What are the most interesting, innovative, or unexpected product and positioning approaches that you have seen while tracking data infrastructure as a VC and maintainer of the MAD landscape?
What are the most interesting, unexpected, or challenging lessons that you have learned while working on the MAD landscape over the years?
What do you have planned for future iterations of the MAD landscape?

Contact Info

Parting Question

From your perspective, what is the biggest gap in the tooling or technology for data management today?

Closing Announcements

Thank you for listening! Don't forget to check out our other shows. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used. The Machine Learning Podcast helps you go from idea to production with machine learning.
Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
If you've learned something or tried out a project from the show then tell us about it! Email hosts@dataengineeringpodcast.com with your story.
To help other people find the show, please leave a review on Apple Podcasts and tell your friends and co-workers.

Links

The intro and outro music is from The Hug by The Freak Fandango Orchestra / CC BY-SA
