[ckan-dev] ckan.net spam
david.read at okfn.org
Wed Jun 8 16:58:34 BST 2011
I've been looking at a new ckan.net spam incident - see details and
It's great that with CKAN you can revert spam changes easily, but I
think we should make further improvements to combat it.
Details of the incident.
* starting 48hrs ago, we have had about 200 revisions
* changes are mostly to users, but a few packages too
* bot is inserting ad links in the user-about and package-notes
* it's all coming from an IP in Malaysia
Steps done to alleviate and questions:
* I've 'deleted' the revisions, but there is an exception when purging
them. Hopefully John will get a chance to check this out. In the
meantime you can still see the users and packages for some reason.
Could @Rufus explain this? This is one area that is not documented.
* I've added a filter to our Markdown display to stop sneaky and badly
formed links (which had been giving us exceptions) and add rel="nofollow" so
there is no Google value to the links, discouraging this. BUT we don't
have nofollow on package-url or resource-url since we actually want to
help Google find these for non-spam data. I guess we'll have to just
manually weed out spam using these fields.
* I've blocked the source IP address (on eu6, using routing tables -
* We could do with logging source IP addresses in the apache log. It
currently shows the proxy address. Anyone know how to log the original
source IP instead?
* Other instances of CKAN are vulnerable in the same way, and are
probably less often viewed than ckan.net. We don't want to allow our
many sites to turn into Google bombs when no-one is looking. We should
look at Wikipedia's tools for fighting wiki spam.
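For illustration, the Markdown link filter mentioned above could look
roughly like the following. This is only a minimal sketch in Python 3, not
the actual CKAN code: the names add_nofollow, is_well_formed, ANCHOR_RE and
ALLOWED_SCHEMES are made up for this example, and it assumes the Markdown
has already been rendered to HTML with double-quoted href attributes.

```python
import re
from urllib.parse import urlparse

# Hypothetical names for illustration only; not CKAN's real filter.
# Assumes rendered HTML with double-quoted href attributes.
ANCHOR_RE = re.compile(
    r'<a\s+([^>]*?)href="([^"]*)"([^>]*)>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)
ALLOWED_SCHEMES = {"http", "https"}


def is_well_formed(url):
    """Return True only for http(s) URLs that parse and have a host."""
    try:
        parts = urlparse(url.strip())
    except ValueError:
        # urlparse raises ValueError on some malformed URLs,
        # e.g. an unclosed '[' in the host part.
        return False
    return parts.scheme in ALLOWED_SCHEMES and bool(parts.netloc)


def add_nofollow(html):
    """Rewrite every link: keep good ones with rel="nofollow",
    replace badly formed or sneaky ones with their plain link text.
    Any other attributes on the original <a> tag are dropped."""
    def fix(match):
        url, text = match.group(2), match.group(4)
        if not is_well_formed(url):
            return text
        return '<a href="%s" rel="nofollow">%s</a>' % (url, text)

    return ANCHOR_RE.sub(fix, html)
```

With this sketch, a link such as <a href="javascript:alert(1)">x</a> is
reduced to its text, while ordinary http links keep working but carry
rel="nofollow". A real filter would more likely sit on a proper HTML
parser than on a regular expression.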
2011/6/8 Adrià Mercader <amercadero at gmail.com>:
> Also, the three most recently changed packages are spam, and these are
> the ones people first see when they land on ckan.net.
> I already flagged them with meta.spam tags.
> 2011/6/8 David Read <david.read at okfn.org>:
>> Looks like we have some spam in the user About fields...
>> e.g. http://ckan.net/user/drinkwell360
>> We can of course discourage this with rel="nofollow" (I'll cover that
>> out now). Should we consider captcha or other ideas though?
>> ckan-dev mailing list
>> ckan-dev at lists.okfn.org