Comments Spammer 404s Revisited
Tuesday, January 4th, 2011
I’ve been meaning to revisit my WordPress 404 Errors for Spam Comments post but somehow it kept slipping through the cracks. I was at a bit of a loose end so I did a quick run through my spam comment log and my site’s error logs…
There were 424 spam comments to my site in December 2010. Of those, 74 had an associated 404 error for the comment page that WordPress would have made had the comment been approved. Every comment that paired with a 404 error came from the same IP address as that error. I’d say that suggests the spambots are checking whether they succeeded in getting a comment posted, rather than someone auditing whether the inbound links they’ve paid for exist.
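The pairing described above can be sketched roughly as follows. This is only an illustration of the idea, not the process I actually used: the `SpamComment` and `NotFoundHit` records and the `pair_spam_with_404s` function are hypothetical names, and it assumes both logs have already been reduced to a comment ID and an IP address per entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpamComment:
    """One entry from the spam comment log (hypothetical shape)."""
    comment_id: int
    ip: str


@dataclass(frozen=True)
class NotFoundHit:
    """One 404 for a comment page, reduced to its comment ID and requester IP."""
    comment_id: int
    ip: str


def pair_spam_with_404s(comments, hits):
    """Return (checked, same_ip).

    checked: spam comments with at least one 404 for their comment page.
    same_ip: the subset of checked where a 404 came from the posting IP.
    """
    # Index the 404s by comment ID so each comment is a single lookup.
    hits_by_id = {}
    for hit in hits:
        hits_by_id.setdefault(hit.comment_id, []).append(hit)

    checked = []
    same_ip = []
    for comment in comments:
        matching = hits_by_id.get(comment.comment_id, [])
        if matching:
            checked.append(comment)
            if any(hit.ip == comment.ip for hit in matching):
                same_ip.append(comment)
    return checked, same_ip
```

If `len(same_ip)` equals `len(checked)`, as it did for the 74 pairs in December, that points towards the bot itself coming back rather than a third party.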
The test for a comment page that causes the 404 error isn’t happening immediately, so the bots have been coded to allow for comments being delayed. I can’t think of a reason a comment would be delayed other than moderation. Most of the comments are so blatantly spam that I can’t see them being approved. I took a quick sample: of the ones I looked at, the comments with long lists of links weren’t tested to see if they got through, while the random garbage text with a couple of links was sometimes tested.
The most heavily checked spam comments were short like “A very interesting post thanks for writing it!” or “Thanks For the exelent info. I’ll be back in the future. Thanks again! [Link to drugs spam site]”. Maybe a few of the short ones like the second one might slip past someone who’s taken the drugs from the site linked to? I imagine some like the first one will get through on a high traffic blog with lots of comments to moderate.
WordPress 404 Errors for Spam Comments
Wednesday, May 5th, 2010
Ever since I rebuilt my website with WordPress and imported my blog from blogspot I’ve kept an eye on my server error logs. I like to make sure people who arrive here from the old blog or from other links end up where they were looking to go. While I’ve redirects that catch almost all of the possible routes in, I want to be sure, and checking the error logs gives me a way to spot any I’ve missed.
I just went through last month’s logs. Of the around 250 error pages served, 5 hits were for a rather anonymous blog page that needed a redirect setting up. 10 were typos or calls to deleted pages. 60 were bots making crude hack attempts on the site, trying known vulnerabilities in a variety of software. That left around 170 that I’d not been able to explain. Every one was a call to a comments page that, when I checked, didn’t exist. Each address was a unique combination of a comment number and the page the comment had been posted on.
2006/02/more-thoughts-on-the-game-without-a-snappy-title/comment-page-1/#comment-9940
I’ve seen 404s like this for months and couldn’t work out why they were appearing.
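As a rough illustration of how an address like that breaks down, here is a small sketch that pulls the post path, comment page and comment number out of it. The pattern is only inferred from the single example above, `parse_comment_url` is a hypothetical helper, and the `#comment-…` part is treated as optional on the assumption that it won’t always be present in a log entry.

```python
import re

# Pattern inferred from the one example address; an assumption, not a
# complete description of every comment URL WordPress can produce.
COMMENT_URL = re.compile(
    r"^/?(?P<post>.+?)/comment-page-(?P<page>\d+)/?"
    r"(?:#comment-(?P<comment_id>\d+))?$"
)


def parse_comment_url(path):
    """Split a comment-page address into its parts, or return None."""
    match = COMMENT_URL.match(path)
    if match is None:
        return None
    comment_id = match.group("comment_id")
    return {
        "post": match.group("post"),
        "page": int(match.group("page")),
        # The comment number is missing when the address has no fragment.
        "comment_id": int(comment_id) if comment_id is not None else None,
    }
```

Anything that returns `None` is some other kind of 404 (a typo, a deleted page, a hack attempt) and can be set aside.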
Then it struck me what they were. Those comment pages were for comments that Akismet caught as spam. The only way (short of brute force, which would show up in the logs) that someone could know the combination of the comment’s unique ID and the page it had been posted on was to be the original poster (or the spam system that posted it). Those 404s must either be the spam bot coming back later to see if it worked, or some other system running quality control before paying out for links to a site having been created.
I’ve not had a chance to match up spam messages to 404s because I keep my spam logs clear, but I’m going to keep an eye on it and see if they support the idea. I’m intrigued to see how often they check a comment, if they come from the same IP as the spam message and how long after posting they check.
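Those three questions could be answered per comment with something like the sketch below. It is a guess at how I might go about it rather than anything I’ve run: `summarise_checks` is a hypothetical helper, and it assumes the 404s for one comment have already been collected as (timestamp, IP) pairs.

```python
from datetime import datetime, timedelta


def summarise_checks(posted_at, posted_ip, hits):
    """Summarise the 404 checks made against one spam comment.

    posted_at: datetime the spam comment was submitted.
    posted_ip: IP address the comment was submitted from.
    hits: list of (datetime, ip) tuples for 404s on its comment page.
    """
    # Only requests made after the comment was posted count as checks.
    later = sorted(hit for hit in hits if hit[0] >= posted_at)
    if not later:
        return {"check_count": 0, "same_ip": None, "first_delay": None}
    return {
        "check_count": len(later),
        "same_ip": all(ip == posted_ip for _, ip in later),
        "first_delay": later[0][0] - posted_at,
    }


# Usage with made-up data: a comment checked twice from the posting IP.
example = summarise_checks(
    datetime(2010, 5, 1, 12, 0),
    "192.0.2.1",
    [
        (datetime(2010, 5, 2, 12, 0), "192.0.2.1"),
        (datetime(2010, 5, 1, 18, 0), "192.0.2.1"),
    ],
)
```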
