Discussion:
more on false spam detection
Stewart Midwinter
2005-10-20 22:42:21 UTC
I changed the blog settings so that you don't have to be logged in to
post comments. Then I stopped and restarted serv.py. When I tried to
post a comment, it was accepted. So perhaps the false spam detection
has to do with being logged in? enh?

One other thing - I noticed that blacklist.txt was a zero-byte file,
so I copied in an earlier version of the file before restarting the
server. Perhaps this was also a factor?

cheers,
--
Stewart Midwinter
stewart-rI9/***@public.gmane.org
stewart.midwinter-***@public.gmane.org
Skype, GoogleTalk, iChatAV, MSN, Yahoo: midtoad
AIM:midtoad1


-------------------------------------------------------
This SF.Net email is sponsored by:
Power Architecture Resource Center: Free content, downloads, discussions,
and more. http://solutions.newsforge.com/ibmarch.tmpl
Stewart Midwinter
2005-10-20 23:07:32 UTC
sorry about the frequent posts....
Post by Stewart Midwinter
One other thing - I noticed that blacklist.txt was a zero-byte file,
so I copied in an earlier version of the file before restarting the
server. Perhaps this was also a factor?
this was probably it. I later switched back to requiring users to be
logged in, and a comment was accepted. When I added a known spammer's
URL to the comment, the comment was identified as spam and rejected.
So far so good.

I suspect that if you have a zero-length blacklist.txt file, then any
comment will be interpreted as matching a spammer's URL. Note this
code from antispam/blacklist.py:

    # check the text for spam, return None if okay, errormessage if spam.
    def check(self, text):
        if text:
            for rx in self.blacklist:
                if rx.search(text):
                    #print "text to check:\n%s\nblacklist:\n%s" % (text, str(rx))
                    return "blacklisted url found"
        return None

Initially, self.blacklist is set to []. Then the blacklist.txt file is
read. If the file exists but is empty, then self.blacklist will
remain [], and all text will match the empty list. Does that
seem plausible?
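One way to test this hypothesis is a small standalone sketch. The loader is not shown in this thread, so load_patterns below is a hypothetical stand-in that assumes one regex per line, split on newlines; check is a free-function copy of the method quoted above. Under those assumptions, a truly empty list never matches anything, but an empty file split on newlines yields one empty pattern, and an empty regex matches any text.

```python
import re

def load_patterns(contents):
    # Hypothetical loader (an assumption, not the real blacklist.py code):
    # compiles one regex per line of the file contents.
    return [re.compile(line) for line in contents.split("\n")]

def load_patterns_skipping_blanks(contents):
    # Same hypothetical loader, but blank lines are ignored.
    return [re.compile(line) for line in contents.split("\n") if line.strip()]

def check(blacklist, text):
    # Same logic as the quoted check method, written as a plain function.
    if text:
        for rx in blacklist:
            if rx.search(text):
                return "blacklisted url found"
    return None

comment = "a harmless comment"
empty_list_result = check([], comment)
empty_file_result = check(load_patterns(""), comment)
skipping_result = check(load_patterns_skipping_blanks(""), comment)
```

If the real loader behaves like load_patterns, the zero-byte file would explain every comment being rejected, and skipping blank lines would be one possible fix.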

cheers,


--
Stewart Midwinter
stewart-rI9/***@public.gmane.org
stewart.midwinter-***@public.gmane.org
Skype, GoogleTalk, iChatAV, MSN, Yahoo: midtoad
AIM:midtoad1


Irmen de Jong
2005-10-20 23:23:27 UTC
Post by Stewart Midwinter
sorry about the frequent posts....
No problem, feedback is highly appreciated :)
My only suggestion is to use the snakelets-webapps list
for webapp-related issues such as these, instead of the
core snakelets mailing list.
See http://lists.sourceforge.net/lists/listinfo/snakelets-webapps
Post by Stewart Midwinter
Post by Stewart Midwinter
One other thing - I noticed that blacklist.txt was a zero-byte file,
so I copied in an earlier version of the file before restarting the
server. Perhaps this was also a factor?
this was probably it. I later switched back to requiring users to be
logged in, and a comment was accepted. When I added a known spammer's
URL to the comment, the comment was identified as spam and rejected.
So far so good.
I suspect that if you have a zero-length blacklist.txt file, then any
comment will be interpreted as matching a spammer's URL. Note this
[...]
Thanks for investigating. This seems to be the bug. I'll fix this ASAP.


About the other issue (the IP address logging): do you have a special
virtual host config set up in Snakelets?


Cheers,
--Irmen de Jong


