AIRabbit
Flag Phishing Emails with Deepseek or GPT

The usual spam filters are no longer enough: attackers now use LLMs and GenAI to craft phishing emails that are almost impossible for humans to detect.

Fortunately, we can use the same technology to detect many of these phishing attempts before they ever reach our wallets or our private data.

In this blog post we will look at how you can easily use local LLMs (via Ollama) or OpenAI to detect phishing attempts in Apple Mail (desktop version only).

Requirements:

  • Apple Mail (Desktop version)
  • (optional) Ollama installed if you want to use the local model

Just want to try it out?

If you’re pressed for time, feel free to skip the nitty-gritty details and jump straight to trying out the script (provided below).

Ollama Version (defaults to deepseek-llm)

https://airabbit.blog/automatically-flag-phishing-emails-with-ollama-ai-source/

OpenAI Version

https://airabbit.blog/automatically-flag-phishing-emails-with-openai-ai-source/


Why Local Models?

1. Privacy

By running an LLM entirely on your own machine (e.g., with tools like Ollama), no email content or metadata is sent to a remote server. This ensures sensitive data never leaves your system.

2. Speed & Reliability

Local inference means no waiting on network calls. Your phishing detector is always available, even if you lose internet connectivity.

3. Full Control

You decide exactly how the model is updated and configured, independent of any external provider’s API changes or downtime, and you avoid being locked into a specific vendor.


Non-Invasive Detection Philosophy

Traditional anti-spam systems often delete or move emails preemptively, which can cause legitimate messages (false positives) to be missed. The approach presented here is non-invasive. Rather than automatically deleting or moving suspected phishing emails, it uses Apple Mail’s built-in flagging system:

  • Red Flag: Potential phishing
  • Green Flag: Likely legitimate

Because the original message remains intact in the inbox, the user remains in the driver’s seat. You decide which messages to trust or review more closely.


How It Works

  1. AppleScript Integration: An AppleScript mail rule triggers every time a new message arrives. The script gathers the email’s raw source and prepares it for analysis.
  2. System Prompt for the LLM:

The LLM (in this example, a model called “DeepSeek”) is given a clear, minimal directive:

You are a phishing detector. 
Analyze emails and respond with a JSON object containing two fields:
{
  "isSpam": true/false,
  "reason": "brief explanation why"
}

DO NOT write anything else. ONLY output valid JSON.

This ensures the model’s output is easy to parse and entirely machine-readable.

  3. Local API Call: A simple curl command sends the email data to the local LLM. Because everything runs on localhost, no external services are used.
  4. Interpreting the Result: The script checks the JSON response, particularly the isSpam boolean, and then flags the message in Apple Mail accordingly. The user can see a short reason for the decision in the logs, but the message itself remains untouched.
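Stripped of the AppleScript plumbing, the core loop above can be sketched in Python. The endpoint and payload follow Ollama's default local API; the function names and the JSON-extraction fallback are illustrative assumptions, not the script's actual code:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

SYSTEM_PROMPT = (
    "You are a phishing detector. Analyze emails and respond with a JSON object "
    'containing two fields: {"isSpam": true/false, "reason": "brief explanation why"} '
    "DO NOT write anything else. ONLY output valid JSON."
)

def classify_email(source: str, model: str = "deepseek-llm") -> dict:
    """Send the raw email source to a local Ollama model and parse its verdict."""
    payload = json.dumps({
        "model": model,
        "system": SYSTEM_PROMPT,
        "prompt": source,
        "stream": False,   # return one complete response instead of chunks
        "format": "json",  # ask Ollama to constrain the output to valid JSON
    }).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return parse_verdict(body["response"])

def parse_verdict(raw: str) -> dict:
    """Pull the {"isSpam": ..., "reason": ...} object out of the model's reply,
    tolerating stray text around the JSON."""
    start, end = raw.find("{"), raw.rfind("}")
    if start == -1 or end == -1:
        return {"isSpam": False, "reason": "unparseable model output"}
    return json.loads(raw[start:end + 1])
```

Defaulting to a non-spam verdict on unparseable output matches the non-invasive philosophy: when in doubt, leave the message unflagged rather than alarm the user.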

Optional Parameters & Customization

Model Selection

Ollama Models: You can choose between different locally installed models (e.g., “deepseek-llm”, “llama3”).

OpenAI (GPT-4): If you prefer, you can adapt the script to hit OpenAI’s GPT-4 API. GPT-4 often provides more reliable detection than smaller local models.
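If you adapt the script for OpenAI, the request shape differs from Ollama's. Here is a sketch of what the Chat Completions payload might look like; the function name and the model default are illustrative assumptions, not taken from the original script:

```python
SYSTEM_PROMPT = (
    "You are a phishing detector. Respond only with a JSON object: "
    '{"isSpam": true/false, "reason": "brief explanation why"}'
)

def build_openai_request(email_source: str, model: str = "gpt-4o-mini") -> dict:
    """Build a request body for POST https://api.openai.com/v1/chat/completions.

    Ollama takes separate "system"/"prompt" fields; the Chat Completions API
    expects a single "messages" list with explicit roles instead.
    """
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": email_source},
        ],
        # Forces the model to emit syntactically valid JSON, replacing
        # Ollama's "format": "json" option.
        "response_format": {"type": "json_object"},
    }
```

The request would additionally need an `Authorization: Bearer <API key>` header, which means your email content does leave your machine, in contrast to the local setup.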

Maximum Characters for Analysis

  1. By default, the script takes up to 10,000 characters from the email source.
  2. You can adjust this limit to 5,000 or 15,000, etc., based on your system’s performance and model capabilities.
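The truncation step itself is simple; a sketch with illustrative names (the 10,000-character default comes from the article, the constant and function names do not):

```python
MAX_CHARS = 10_000  # default limit; lower it for slower machines or smaller models

def truncate_source(source: str, limit: int = MAX_CHARS) -> str:
    """Keep only the first `limit` characters of the raw email source.

    Headers come first in the raw source, so SPF/DKIM/DMARC results
    survive the cut even when the HTML body is clipped.
    """
    return source[:limit]
```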

Flagging Strategy

  1. Flagging Suspicious Only: By default, only phishing emails are flagged red.
  2. Flagging Safe Messages Too: You can enable a “green flag” for legit messages if you want a quick visual cue that an email has been vetted.

System Prompt

You can change the system prompt if it does not work well for some models, or if you want to make the check more or less stringent.


Why This Matters

  1. Dynamic Analysis: LLMs excel at identifying context and patterns that simple keyword-based filters miss. Email scammers often obfuscate their true intent with cleverly worded messages or by forging header information. A context-aware model can pick up on subtle cues in the text, HTML structure, or headers.
  2. Extended Visibility: The script also checks DKIM, SPF, and DMARC indicators in the full message source. These technical validations complement the LLM’s textual analysis, making the detection more robust.
  3. User Empowerment: Marking suspicious emails instead of discarding them places the final judgment in your hands. Instead of relying on an opaque filter, you see clearly how and why an email was flagged—helping you learn to spot red flags yourself.
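A minimal sketch of what checking those authentication indicators could look like (the function name and patterns are assumptions; real Authentication-Results headers vary by provider, and the production check lives inside the script's prompt and AppleScript logic):

```python
import re

def auth_results(raw_source: str) -> dict:
    """Scan the raw message source for SPF/DKIM/DMARC outcomes.

    Returns e.g. {"spf": "pass", "dkim": "fail", "dmarc": None}, where None
    means that mechanism was not reported in the headers at all.
    """
    results = {}
    for mech in ("spf", "dkim", "dmarc"):
        # Authentication-Results headers report outcomes as e.g. "spf=pass".
        m = re.search(
            rf"\b{mech}=(pass|fail|none|softfail|neutral)",
            raw_source,
            re.IGNORECASE,
        )
        results[mech] = m.group(1).lower() if m else None
    return results
```

Feeding these results to the model alongside the email text (or letting the model read the raw headers directly, as the script does) gives the textual analysis a hard technical signal to lean on.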

Real-World Example

  • Phishing Attempt
  • Subject: "Your DHL package is waiting"
  • Headers: Forged SPF record, missing DKIM
  • Content: Urgency, suspicious link to confirm shipping details
  • LLM Output: {"isSpam": true, "reason": "Forged SPF record, missing DKIM, urgent tone with a suspicious link"}

Apple Mail: Red-flagged to alert you.

  • Legitimate Newsletter
  • Subject: "Your Weekly Updates"
  • Headers: DMARC pass, proper routing
  • Content: Familiar format, no hidden or suspicious links
  • LLM Output: {"isSpam": false, "reason": "Authentication passes and content matches a familiar newsletter format"}

Apple Mail: (Optionally) Green-flagged as safe.


Conclusion

LLM-based phishing detection is a game-changer for email security. By analyzing each message in detail, from headers to body content, the model can catch attacks that traditional spam filters often miss. Best of all, you retain complete control over your data and remain the ultimate arbiter of which emails are trustworthy.
