DEV Community

Mayank Laddha


My take on Agentic Object Detection

Here are the steps:

  • Segmenting everything with SAM (Segment Anything Model): We segment every object in the image first and worry about filtering later.

  • Filtering with CLIP: Once we have all the segmented objects, we don’t want to keep all of them. We need to filter out the noise and keep only the relevant objects.

  • Adding reasoning with a model like GPT-4o: Okay, so we’ve segmented and filtered. But the results still need to be interpreted and finalised. That’s where a strong LLM like GPT-4o comes in.
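To make the filtering step a bit more concrete, here is a minimal sketch of the idea, not the code from the repo. It assumes each segmented crop has already been turned into an embedding (in practice by CLIP's image encoder) and that the text query has been embedded too (by CLIP's text encoder). The `Region` class, the `filter_regions` helper, the toy 3-dimensional vectors, and the 0.5 threshold are all hypothetical and only there for illustration.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Region:
    """One segmented candidate (hypothetical container, not from the repo)."""
    label: str              # readable id, used only in this toy example
    embedding: List[float]  # stand-in for a CLIP image embedding of the crop


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_regions(
    regions: List[Region],
    text_embedding: List[float],
    threshold: float = 0.5,  # assumed cutoff; a real value needs tuning
) -> List[Tuple[Region, float]]:
    """Keep regions whose similarity to the query meets the threshold,
    sorted from most to least similar."""
    scored = [(r, cosine_similarity(r.embedding, text_embedding)) for r in regions]
    kept = [(r, s) for r, s in scored if s >= threshold]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept


# Toy data: made-up vectors standing in for real CLIP embeddings.
regions = [
    Region("crop_a", [0.9, 0.1, 0.0]),
    Region("crop_b", [0.0, 1.0, 0.0]),
    Region("crop_c", [0.7, 0.7, 0.0]),
]
query = [1.0, 0.0, 0.0]

survivors = filter_regions(regions, query, threshold=0.5)
print([r.label for r, _ in survivors])  # prints ['crop_a', 'crop_c']
```

Whatever survives the filter is what would then be handed to the LLM for the reasoning step. With real CLIP embeddings the similarity scores tend to sit in a much narrower range than in this toy data, so the threshold would have to be picked by looking at actual results.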

Here is what I did with SAM and CLIP. The next step is to put a good LLM on top and add some reasoning.

Agentic Object Detection demo

Code: https://github.com/maylad31/agentic-object-detection
