Pizofreude

Notes on Data Engineering Zoomcamp 2025 - Launch Stream

Overview:

  • Course Edition: Fourth edition of the Data Engineering Zoomcamp.
  • Purpose of Stream: Introduction to the course, syllabus, logistics, team members, and Q&A session.
  • Key Topics Covered:
    • Course structure and syllabus.
    • Introduction to team members.
    • Tools and platforms for learning and communication.
    • Logistics and homework submissions.
    • Importance of community learning.

Course Team:

  1. Victoria:
    • Works at DT Hub and was part of the first Data Engineering Zoomcamp cohort.
    • Covers Analytics Engineering in the course.
  2. Alexey:
    • Founder of DataTalks Club.
    • Previously a data scientist with significant experience in data engineering tools.
    • Covers Docker and Spark modules.
  3. Michael:
    • Senior Data Analyst and teaching assistant for the past two years.
    • Content creator and runs a YouTube channel called "Data Slinger."
    • Assists with content and troubleshooting.
  4. Bruno:
    • Senior Data Engineer at Intuition Machines.
    • Extensive experience in data engineering.
    • Provides guidance and support to participants.
  5. Will and Anna:
    • Work at Kestra.
    • Cover Workflow Orchestration (Module 2).
  6. Zach:
    • Staff Data Engineer and instructor in another data engineering bootcamp.
    • Focuses on advanced topics like Flink.
    • Founder of dataexpert.io
  7. Ankush and Sejal:
    • Part of the original team that launched the Zoomcamp initiative.
    • Ankush remains active in supporting the community.

Course Syllabus:

  1. Module 1: Introduction to Docker and Google Cloud setup.
    • Google Cloud offers $300 in free credits for first-time users.
    • AFAIK, GCP offers two types of free access to its cloud services:
      • Free Trial with $300 in credit, which requires billing details (all features available)
      • Sandbox option, which requires no billing details (limited features)
    • Since the Sandbox option allows use of the services this course requires, I will start with it and switch to the Free Trial if its limited features don't cover the full course requirements.
  2. Module 2: Workflow Orchestration.
    • Covers orchestration tools such as Prefect and Airflow.
  3. Module 3: Data Warehousing.
    • Emphasis on BigQuery and PostgreSQL.
  4. Module 4: Analytics Engineering.
    • Introduction to dbt (data build tool) for SQL transformations.
  5. Module 5: Spark.
    • Focus on distributed data processing.
  6. Module 6: Stream Processing.
    • Includes tools like Kafka and Flink (TBA by Zach).
  7. Workshop: Data Ingestion with dlt (data load tool).

Logistics:

  • Content Delivery:
  • Homework:
  • Community Support:
    • Slack is the primary communication platform.
    • Participants are encouraged to use threads for organized discussions.
    • Learning in public (e.g., posting progress on LinkedIn) is recommended.

Learning in Public:

  • Benefits:
    • Helps participants build their personal brand.
    • Encourages networking and community engagement.
    • Demonstrates growth and dedication.
  • Examples:
    • Sharing project updates or lessons learned on LinkedIn.

Tools and Recommendations:

  • Google Cloud Platform (GCP):
    • Recommended for the course due to its ease of use and free credits.
    • AWS and Azure are also options, but GCP is more straightforward.
  • Additional Tools:
    • Participants are encouraged to explore other platforms and tools beyond the syllabus, such as data governance and scripting with Makefiles and Bash.

Q&A Highlights:

  1. Career Preparation:
    • Prepares participants for roles in data engineering and analytics engineering.
    • Emphasizes project-based learning.
  2. AI and Data Engineering:
    • AI is unlikely to replace data engineers but may enhance productivity.
    • The LLM Zoomcamp is essentially AI for data engineering and is highly recommended after completing the DE Zoomcamp.
  3. Key Advice for Success:
    • Consistency in learning and building projects.
    • Active participation in the community.
    • Sharing work publicly to stand out.
  4. Beginner-Friendly:
    • Suitable for those new to data engineering, even without prior software engineering experience. Chicken-and-egg problem 😉

Final Notes:

  • Contributions:
    • Participants are encouraged to contribute to open-source projects via the "Open Source Spotlight" on the YouTube channel.
  • Focus on Projects:
    • For certification, delivering projects takes priority over homework.
  • Future Learning Opportunities:
    • Check out related courses like the LLM Zoomcamp for AI-focused topics.

Motivational Message:

  • Stay consistent, actively participate, and leverage the community for support.
