Direct database time...
First, BACK UP your database.
No. First, if you are not an SQL hack, get someone who is.
Now, BACK UP your database.
Next, you must delete all rows connected with spam comments.
I'm going to derive the process whereby you do this, because I'm a
little rusty on the XWiki database schemas, and because I'm an
inveterate pedant. You can skip some of these steps that exist for
information only if you wish, especially if you already know how to do
them.
First, it's necessary to know which rows constitute comments. XWiki's
dynamic typing is a great boon to users and an equally great bane to
administrators, as you will soon see. The xwikiclasses table describes
user-defined classes (well, actually it doesn't; it names them, and
describes server-defined custom classes if there are any, which
fortunately there aren't, because I couldn't tell you what to do if
there were). The class that defines XWiki comments is named
XWiki.XWikiComments. The xwikiclassesprop table contains the list of
fields for a given class, which is identified not by its name but by the
unique ID from the xwikiclasses table. So you need to get the ID of the
row in the xwikiclasses table whose name field is 'XWiki.XWikiComments',
and retrieve the field name and type from all rows of the
xwikiclassesprop table that have that ID. The following SQL does this
less verbosely:
select xwp_name, xwp_classtype
from xwikiclassesprop p, xwikiclasses c
where xwo_name = 'XWiki.XWikiComments'
and p.xwp_id = c.xwo_id
The results of this should be:
+-----------+---------------------------------------------+
| xwp_name | xwp_classtype |
+-----------+---------------------------------------------+
| author | com.xpn.xwiki.objects.classes.StringClass |
| comment | com.xpn.xwiki.objects.classes.TextAreaClass |
| date | com.xpn.xwiki.objects.classes.DateClass |
| highlight | com.xpn.xwiki.objects.classes.TextAreaClass |
| replyto | com.xpn.xwiki.objects.classes.NumberClass |
+-----------+---------------------------------------------+
5 rows in set (0.00 sec)
You'd have to go into the XWiki source (more specifically, the Hibernate
configuration, I believe) to see how the xwp_classtype field above links
to what I'm doing below; I leave that as an exercise for the reader.
All that's really needed is to know the types of the fields and which
tables they're stored in.
I've actually misled you somewhat: this is not the actual schema of
existing comment objects but the prototype for the creation of new
comments. I've done this ostensibly to give you a fuller understanding
of how XWiki allows one to change the definition of a user-defined class
without breaking existing instances, but it's really because I got
tripped up on it myself, and I wanted to share my misery with you.
To get the actual schema for existing comments you have to look at the
xwikiobjects table, which defines actual XWiki object instances in terms
of their class names and the document to which they belong. The xwo_id
field is the field that ties all of the tables together.
So the following query:
select distinct(xwp_name), xwp_classtype
from xwikiobjects o, xwikiproperties p
where xwo_classname='XWiki.XWikiComments'
and xwo_id = xwp_id
yields a similar-looking result (in my database, anyway), because the
XWiki.XWikiComments class hasn't changed since these comments were
added:
+-----------+-------------------------------------------+
| xwp_name | xwp_classtype |
+-----------+-------------------------------------------+
| author | com.xpn.xwiki.objects.StringProperty |
| date | com.xpn.xwiki.objects.DateProperty |
| comment | com.xpn.xwiki.objects.LargeStringProperty |
| replyto | com.xpn.xwiki.objects.IntegerProperty |
| highlight | com.xpn.xwiki.objects.LargeStringProperty |
+-----------+-------------------------------------------+
5 rows in set (0.05 sec)
So now we know (without explaining how) that (for example) the contents
of the "author" field of an instance of the class named
XWiki.XWikiComments is in the xwikistrings table, in a row whose xws_id
field matches the xwo_id field of an xwikiobjects row whose
xwo_classname file is "XWiki.XWikiComments". All the field values can
be found thus:
field name table name field value
author xwikistrings xws_value
date xwikidates xws_value
comment xwikilargestrings xwl_value
replyto xwikiintegers xwi_value
highlight xwikilargestrings xwl_value
So all the fields of a comment would be retrieved, given its ID, by the
following query:
select s.xws_value, /* author name */
d.xws_value, /* comment date */
l.xws_value, /* comment text */
i.xwi_value, /* reply-to field */
h.xwl_value /* highlight field */
from xwikistrings s,
xwikidates d,
xwikilargestrings l,
xwikiintegers i,
xwikilargestrings h
where s.xws_id = <comment ID> and s.xws_name = 'author'
and d.xws_id = s.xws_id and d.xws_name = 'date'
and l.xws_id = s.xws_id and l.xws_name = 'comment'
and i.xwi_value = s.xws_id and i.xwi_name = 'replyto'
and h.xwl_value = s.xws_id and h.xwl_name = 'highlight'
Of course, the job is much simpler than this, for several reasons.
First, the replyto and highlight fields are not populated by XWiki's
default templates, so they're always null anyway; second, of the
remaining values, each one is uniquely identified by the object's ID
field anyway, so the respective name fields don't need to be specified.
To get a list of all comments in the database, with only their object ID
field, author name, and date (which produces a fairly neat display) this
query should do it.
select o.xwo_id,
s.xws_value,
d.xws_value
from xwikiobjects o,
xwikistrings s,
xwikidates d
where xwo_classname='XWiki.XWikiComments'
and s.xws_id = o.xwo_id
and d.xws_id = o.xwo_id
order by xwo_id
With all that as background, what you need to do is:
1. Identify the offending comments (by their object IDs).
If the comments were all from bogus registrations, then the above query
with an added qualifier something like:
where s.xws_value in ('XWiki.spammer1', 'XWiki.spammer2'[,...])
will give you the object IDs you need.
If you have allowed anonymous users to comment, then other criteria must
be used, although of course "s.xws_value = 'XWiki.XWikiGuest'" should
be
in your WHERE clause. Most likely, you could probably nail most of them
by saying
select o.xwo_id from xwikiobjects o, xwikilargestrings l
where o.classname = 'XWiki.XWikiComments' and l.xwl_id = o.xwo_id
and l.xwl_value like "%Viagra%"
or l.xwl_value like "%sex"
or l.xwl_value like "%stock%"
or ...
you get the idea.
The best idea, of course, is to keep running and refining your query
until you are sure that you've identified all of the offending comments
and only the offending comments.
2. After you've built a query that identifies the set of records you
want to remove, you must delete them from each table where they appear.
The easiest way to do this is to modify your query to return only the
xwikiobjects.xwo_id field in the SELECT clause and put it into a
temporary table:
create temporary table badcomments (comment_id integer);
insert into badcomments select o.xwo_id from xwikiobjects o [... where,
etc...];
Then delete every row from xwikistrings, xwikilargestrings, xwikidates,
and xwikiintegers where the respective ID fields (xws_id, xwl_id,
xws_id, and xwi_id, respectively) match the comment_id field from your
badcomments table.
You can also do it using the $xwiki.search() method, but there you have
to tie the XWiki objects together using HQL. The advantage is that you
don't have to have server access; the disadvantage is that it's
miserable to get it right (my opinion).
brain[sic]
-----Original Message-----
From: wangwh(a)att.net [mailto:wangwh@att.net]
Sent: Tuesday, October 17, 2006 12:32 AM
To: xwiki-users(a)objectweb.org
Subject: [xwiki-users] delete spam comments
Hi,
My wiki site got over a thousand spam comments, anyone know
how can I delete them quickly (better than edit object, then
delete one by one).
Wei-hsing