Timothy Spann

AI and All Data Weekly for 09 Dec 2024

AI+Data Weekly (AI, Data, Iceberg, Polaris, Streamlit, Flink, Kafka, Python, Java, NiFi)

#167 - 09-December-2024

https://bsky.app/profile/paasdev.bsky.social

Big Announcement Coming

Happy Krampusnacht to all those who celebrate.

AWS Updates

🧊 S3 Tables for Iceberg

☃️ AWS re:Invent 2024 Announcements

🧊 AWS Trainium Chips

The Coolness this week

❄️ Apache Polaris + Iceberg Quickstart

⚡️ How to extract tables from PDFs

🚀 Microsoft 1-bit LLM BitNet

🐿️ Verifying Kafka Transactions Entry 2

🐿️ Fluss: Streaming Storage

🐿️ Fluss -> Flow for Flink Real Time Analytics

🌐 Tableflow - Iceberg / Kafka

❄️ Snowflake Cortex AI + Slack

🐿️❄️ DoorDash: Flink, Kafka, Snowflake

🧠 Prompt Stack -- all in one

🔌 spaCy Layout for PDF

📱 Responsible AI Pathways

📼 MegaParse: parsing documents in Python

🔌 Time Series LLM

❄️ Generate Synthetic Data in Snowflake

🐿️ LLMs and GenAI - When to use them

🐿️ Flink Observability with Prometheus

📡 New SQL GUI

🍫 TDD for GenAI

🕵️

🎁 Open Source Agent Framework for Production

💻 Cedit command-line editor

🏭 ServiceNow AgentLab

🎤 Snowflake Lessons Learned in Replication

🎄 Privastead

🔌 Back up iCloud with Node.js on Linux

🔌 Back up Google with Node.js on Linux

🎄 Hugging Face macOS chat source code

🎁 Ollama working with structured output

🎁 DSPy AI how-to

🔌 Piazza updater

🔌 Building a financial report with LangGraph

ColPali Notebook with Qwen2-VL

New Models

📼 Open Source Video Foundation Model by Hunyuan

🌐 Marco-o1

☁️ Amazon Foundation Models - Nova

❄️ Snowflake Arctic Instruct

🏫 Large Scale World Model Google

💻 PaliGemma Google SmOl Vision

💻 Llama 3.3 on Ollama

Upcoming

💻 Dec 19: Conf42 IoT 2024: Virtual: https://www.conf42.com/Internet_of_Things_IoT_2024_Tim_Spann_opensource_build

Recent Tim Stuff

💻 XtremePython 2024 - LLM

💻 PyData NYC

💻 Advanced RAG Techniques @ All Things Open Raleigh 2024

💻 Building Real Time LLM Models

💻 Big Data Conference EU Talk on Open Source Real-Time AI

💻 CloudX AI Real-Time

💻 BuildStuff - Adding Generative AI

🐈‍⬛ Conf42 Prompt Engineering

🥑 06 Nov 2024 AI Alliance Talk in Manhattan

💻 08 Nov 2024 PyData NYC slides

Apps, Demos, Examples, Models, Notebooks and Projects

🐍 RAG 101

🐦 Milvus Knowledgebase

👻 AIM Ghosts

🚕 Unstructured Data - Ghosts - Part 1

🤖 Multimodal RAG is not Scary Ghosts

✍🏼 Advanced RAG Techniques

Technologies

Python
Java
Snowflake
Streamlit
AWS
Google Cloud
Azure

CODE + COMMUNITY

© 2020-2024 Tim Spann https://www.youtube.com/@FLaNK-Stack
(AI + Vectors + LLM + Streaming + IoT)

Top comments (1)

Tejas Kumar

Great weekly roundup! For those interested in vector search capabilities, it's worth noting that Astra DB with its vector capabilities is another solid option, especially when integrated with streaming data pipelines. Looking forward to more updates on this space!