DEV Community

Mayank Laddha

Posted on Feb 16

My take on Agentic Object Detection

#ai #computervision #llm #python

My approach has three steps:

Segmenting everything with SAM: We segment everything in the image first and worry about filtering later.
Filtering with CLIP: Once we have all the segmented objects, we don’t want all of them. We filter out the noise and keep only the relevant objects.
Adding reasoning with a model like GPT-4o: Segmentation and filtering give us candidates, but not a final decision or any understanding of them. That’s where a strong LLM like GPT-4o comes in.
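
To make the filtering step concrete, here is a minimal sketch of how segments could be kept or dropped by comparing embeddings. It assumes the image crops and the text query have already been embedded (for example with CLIP) and are passed in as plain lists of floats. The function names and the `threshold` value are hypothetical and chosen for illustration; this is not the code from the repository.

```python
import math
from typing import List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_segments(
    segment_embeddings: Sequence[Sequence[float]],
    text_embedding: Sequence[float],
    threshold: float = 0.25,  # hypothetical cut-off; tune it on your own data
) -> List[Tuple[int, float]]:
    """Return (segment index, score) pairs scoring at least `threshold`,
    sorted from most to least similar to the text query."""
    kept = []
    for index, embedding in enumerate(segment_embeddings):
        score = cosine_similarity(embedding, text_embedding)
        if score >= threshold:
            kept.append((index, score))
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept
```

In practice each entry of `segment_embeddings` would come from an image encoder applied to the crop of one segmented region, and `text_embedding` from the matching text encoder applied to the query. A fixed threshold is the simplest choice; keeping the top few scores is a common alternative when score ranges vary between images.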

Here is what I did with SAM and CLIP. The next step is to put a good LLM on top and add some reasoning.
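
One possible shape for that reasoning step is sketched below: the filtered candidates are turned into a text prompt, and the model's JSON reply is parsed defensively. Everything here is an assumption made for illustration: the candidate fields (`id`, `box`, `clip_score`), the prompt wording, and the reply format are hypothetical, and the actual call to the LLM is left out.

```python
import json
from typing import Dict, Iterable, List


def build_reasoning_prompt(query: str, candidates: List[Dict]) -> str:
    """Describe the filtered candidates in text so an LLM can pick the final ones."""
    lines = [
        f"Task: find '{query}' in the image.",
        "Candidate regions (box is x, y, width, height):",
    ]
    for candidate in candidates:
        lines.append(json.dumps({
            "id": candidate["id"],
            "box": candidate["box"],
            "clip_score": round(candidate["clip_score"], 3),
        }))
    lines.append('Reply with JSON only: {"keep": [ids], "reason": "..."}')
    return "\n".join(lines)


def parse_reply(reply: str, valid_ids: Iterable[int]) -> List[int]:
    """Extract the kept ids from the reply, ignoring malformed output and unknown ids."""
    try:
        data = json.loads(reply)
    except json.JSONDecodeError:
        return []
    if not isinstance(data, dict):
        return []
    keep = data.get("keep", [])
    if not isinstance(keep, list):
        return []
    allowed = set(valid_ids)
    return [item for item in keep if item in allowed]
```

With a multimodal model, the image or the cropped regions would normally be sent alongside this text so the model can actually look at each candidate. Checking the returned ids against the known candidates guards against the model referring to regions that do not exist.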

Code: https://github.com/maylad31/agentic-object-detection
