How long does a post have to be before the SPAM eater leaves it alone?
I am not a spammer, but my posts are very often deleted, maybe because I prefer to make short, to-the-point comments. If I knew what the critical post length was, I could pad my comments and avoid deletion.
I see all the posts that have been deleted in the folder every day, and looking at them, here are my best guesses-
There are some trigger words. I remember back in the day that any time you typed the word 'Wow' in a post you immediately got spam posts offering to sell you WoW Gold (video game currency, for those of you not in the know). Other likely triggers: ads for Viagra, words like 'selling' and 'cash', links to certain sites, short posts... It may or may not also have to do with posting a bunch of posts rapidly (I have no evidence on this, but I know it's in some other sites' software).
My guess is that each thing you tick off on that list is worth a certain number of points or fractions of points, and when you get to a certain total the filter is triggered. It's supposed to be 'smart' in that it's supposed to learn. That's why, if you post 10 copies of the same message and they are all deleted, I restore every last one of them. I don't want to 'teach' it that it's okay to block some messages, and I suspect that if I delete any it will assume that it got it right that time and score that kind of post higher next time.
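To make the points idea concrete, here is a rough sketch in Python. To be clear, this is only my guess at how such a filter might work, not our host's actual software: every word, weight, and threshold below is a made-up number for illustration.

```python
# Hypothetical weights and threshold. The real filter's values are unknown.
TRIGGER_WORDS = {"viagra": 2.0, "selling": 1.0, "cash": 1.0, "wow": 0.5}
LINK_POINTS = 1.5        # assumed penalty for containing a link
SHORT_POST_POINTS = 1.0  # assumed penalty for a very short post
SHORT_POST_WORDS = 5     # assumed cutoff: fewer words than this counts as short
SPAM_THRESHOLD = 3.0     # assumed total at which the post gets eaten


def spam_score(text):
    """Add up the points for every trigger this post ticks off."""
    words = [w.strip(".,!?'\"").lower() for w in text.split()]
    score = sum(TRIGGER_WORDS.get(w, 0.0) for w in words)
    if "http://" in text or "https://" in text:
        score += LINK_POINTS
    if len(words) < SHORT_POST_WORDS:
        score += SHORT_POST_POINTS
    return score


def is_spam(text):
    """A post is flagged once its total reaches the threshold."""
    return spam_score(text) >= SPAM_THRESHOLD
```

With these made-up numbers, a one-word post like "Wow" scores 1.5 and gets through, while "Selling cash viagra" scores 5.0 and gets eaten. A filter that learns would adjust the weights over time, which this sketch does not attempt.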
It operates across all sites run by our host (and can't be turned off), so my guess is that anytime anyone on any one of those sites marks something as spam, it uses that as a new filter. (I know email systems tried using that idea, but people would sign up for a newsletter and then just click 'it's spam' instead of removing themselves from the list, which pretty quickly would result in the newsletter, which they had signed up for, getting automatically marked and sent to everyone else's spam folder as well.)
Ours doesn't seem to have what would be the simplest and probably best filter item- the number of posts. If I were writing a filter I'd put in an option for forum members to mark posts as spam, and those posts would be reviewed. Admins could see who marked it as spam. Over time, if someone had a lot of posts that weren't marked as spam, you could raise the confidence level for the rest of the filter that this was not spam. If a new member posted a bunch of posts on his first day the algorithm would flag them, but someone who had been a member with lots of posts would only get flagged if a member reported them and then an admin told the algorithm that yes, that was in fact spam (maybe they got hacked). I'd do the same thing with the captcha. Simple rule- For your first 50 posts and for your whole first month you'd have to answer them. After that you'd only have to answer them if you posted a bunch of comments in a short time, say more than 20 a day (so if you posted a 21st and 22nd comment you'd have to answer captchas on them, etc., just to prove you hadn't been hacked). But as it is, this is the software we've got.
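The captcha rule I described could be sketched like this in Python. This is just an illustration of my wish list, not anything the real software does. The `Member` fields and constant names are my own inventions, I'm treating a month as 30 days, and I'm assuming you stay on captchas until you've passed both the 50-post mark and the one-month mark.

```python
from dataclasses import dataclass

FIRST_POSTS = 50   # captcha for your first 50 posts
FIRST_DAYS = 30    # "whole first month", approximated as 30 days
DAILY_LIMIT = 20   # posts per day allowed before captchas come back


@dataclass
class Member:
    total_posts: int     # posts made before the one being submitted
    days_as_member: int  # days since the account was created
    posts_today: int     # posts already made today, before this one


def needs_captcha(member):
    """Return True if this member must answer a captcha for the next post."""
    # New members: still inside the first 50 posts or the first month.
    if member.total_posts < FIRST_POSTS or member.days_as_member < FIRST_DAYS:
        return True
    # Established members: only once they go past 20 posts in a day.
    return member.posts_today >= DAILY_LIMIT
```

So a long-time member making a 20th post of the day gets no captcha, but the 21st and every one after it does.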