Over the last few days I’ve reported a few posts to the admin as spam, so I wrote this small bash script to detect possible spam posts.
INFO: You need to have httpie and jq installed. Also, an API-KEY is required.
INFO: In fact, this is more an exercise in httpie and jq filtering than a useful tool.
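Because the script depends on both tools and on the key, a small guard at the top can fail early with a readable message. This is only a sketch: the `require_cmd` and `check_prerequisites` names are hypothetical helpers of mine, `http` is the executable that httpie installs, and `API_KEY` is the variable the script reads.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the original script): succeeds only
# if the given command can be found on PATH.
require_cmd() {
  command -v "$1" >/dev/null 2>&1
}

# Returns 0 when every prerequisite is present, 1 otherwise.
check_prerequisites() {
  local cmd missing=0
  for cmd in http jq; do
    if ! require_cmd "$cmd"; then
      echo "Missing dependency: $cmd" >&2
      missing=1
    fi
  done
  # The script reads the key from the API_KEY variable.
  if [ -z "${API_KEY:-}" ]; then
    echo "API_KEY is not set" >&2
    missing=1
  fi
  return "$missing"
}
```

Calling `check_prerequisites || exit 1` before the first request would stop the script before any half-configured call is made.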
latest=$(http api-key:$API_KEY accept:application/vnd.forem.api-v1+json per_page==80)
filtered=$(jq '.[] | select(.reading_time_minutes==1 and .user.user_id > 4)' <<< "$latest")
echo Total Last articles $(jq -M -r '.id' <<< "$filtered" | wc -l)
echo '-----'
echo Number of authors $(jq -M -r '.user.user_id' <<< "$filtered" | sort -u | wc -l)
echo '-----'
# sort -u also removes duplicates that are not adjacent (plain uniq does not)
users=$(jq -M -r '.user | .user_id' <<< "$filtered" | sort -u)
for user_id in $users; do
  strjoined_at=$(http GET "$user_id" api-key:$API_KEY accept:application/vnd.forem.api-v1+json | jq -r '.joined_at')
  joined_at=$(date --date="$strjoined_at" "+%Y-%m-%d")
  # whole days since the account was created
  days=$((($(date +%s) - $(date -d "$joined_at" +%s))/86400))
  if (( ${days:-2} < 3 )); then
    echo "User $user_id is suspected of spamming, see post:"
    jq --arg jq_user_id "$user_id" '.[] | select(.user.user_id == ($jq_user_id|tonumber)) | .url' <<< "$latest"
  fi
done
| 1 | retrieve the latest articles (80 max) |
| 2 | filter by reading_time_minutes, as spam posts are usually short |
| 3 | extract the unique user_id values |
| 4 | fetch the user details for each user_id |
| 5 | check whether the account was recently created |
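Steps 2 and 3 can be tried offline, without an API key. The sketch below runs the same jq condition against a made-up sample: the ids, URLs and user ids are invented for illustration, and only the fields the filter reads are included. It pipes the data with `printf` rather than a here-string, which is just a portability choice.

```shell
#!/usr/bin/env bash
# Made-up sample standing in for the API response.
latest='[
  {"id": 1, "reading_time_minutes": 1, "url": "https://example.com/a", "user": {"user_id": 10}},
  {"id": 2, "reading_time_minutes": 7, "url": "https://example.com/b", "user": {"user_id": 11}},
  {"id": 3, "reading_time_minutes": 1, "url": "https://example.com/c", "user": {"user_id": 12}},
  {"id": 4, "reading_time_minutes": 1, "url": "https://example.com/d", "user": {"user_id": 10}},
  {"id": 5, "reading_time_minutes": 1, "url": "https://example.com/e", "user": {"user_id": 3}}
]'

# Step 2: keep one-minute reads, with the same condition as the script.
# -c prints one article per line, so counting lines counts articles.
filtered=$(printf '%s\n' "$latest" |
  jq -c '.[] | select(.reading_time_minutes == 1 and .user.user_id > 4)')

# Step 3: unique user ids. User 10 appears twice with user 12 in between,
# so the duplicates are not adjacent and need sort -u to collapse.
users=$(printf '%s\n' "$filtered" | jq -r '.user.user_id' | sort -u)

echo "Articles kept: $(printf '%s\n' "$filtered" | wc -l | tr -d ' ')"
echo "Authors:"
echo "$users"
```

With this sample, articles 1, 3 and 4 pass the filter and the author list is 10 and 12. A plain `uniq` would list user 10 twice, since it only collapses adjacent duplicates.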
Obviously, not all articles that meet these conditions are spam. Lots of people (like me) write a hello-world post right after creating their account, so the script shows the URL and I can read the post and decide whether it’s spam or not.
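Since a new account is only a hint, the cut-off is worth keeping in one place where it can be tuned. The following is a sketch under assumptions: `account_age_days`, `is_recent_account` and `MAX_AGE_DAYS` are hypothetical names of mine, the dates are sample values, and the `-d` option requires GNU date (as in the script). Unlike the script, an unparseable date makes the helper fail instead of flagging the user.

```shell
#!/usr/bin/env bash
# Hypothetical helper: whole days between a join date ($1) and a
# reference date ($2, defaults to now). Requires GNU date.
account_age_days() {
  local joined_s now_s
  joined_s=$(date -u -d "$1" +%s) || return 1
  now_s=$(date -u -d "${2:-now}" +%s) || return 1
  echo $(( (now_s - joined_s) / 86400 ))
}

# Assumed knob: the script hard-codes 3 days.
MAX_AGE_DAYS=${MAX_AGE_DAYS:-3}

# Succeeds when the account is younger than MAX_AGE_DAYS.
is_recent_account() {
  local age
  age=$(account_age_days "$1" "${2:-now}") || return 1
  [ "$age" -lt "$MAX_AGE_DAYS" ]
}

# Sample dates, two days apart.
account_age_days "2024-01-01" "2024-01-03"
```

Passing the reference date explicitly makes the check repeatable; in the script itself the second argument would simply be omitted.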
For the next version, if I have time, I would like to include some kind of "AI" to automatically read the post and decide whether it is spam.
