Hi,
On 31 May 2017, at 20:50, Guillaume Delhumeau <guillaume.delhumeau(a)xwiki.com> wrote:
2017-05-31 15:59 GMT+02:00 Vincent Massol <vincent(a)massol.net>:
> Hi Guillaume,
>
>> On 31 May 2017, at 12:15, Guillaume Delhumeau <
> guillaume.delhumeau(a)xwiki.com> wrote:
>>
>> Help me to decide!
>>
>> TL;DR:
>>
>> * I need to know if performing a query on the database for each user who
>> wants to receive an email with all the notifications is a scalability issue
>> (in a job context).
>
> Yes, whenever we do a lot of queries to the DB it’s a scalability issue. If
> we have 100K users then it’s 100K queries, so definitely a scalability issue.
Well, in that case, I don't know if sending 100K emails is scalable either.
The new mail system is made for that. There’s a single mail thread
(actually 2 but that’s a detail) and it can send an infinite number of
mails without slowing down XWiki. Of course, the only thing not guaranteed is
how long it takes to do so. But that can be fixed outside of XWiki by having a
proxy mail server which would immediately accept all mails sent by XWiki
before forwarding them to some cluster of mail servers. It may not be
enough though, and maybe sending 100K mails to the proxy mail server would
already take too long. It would be interesting to measure how long it takes
to send a single mail. I think I did some computation at some point but I
don’t remember the results.
Do you mean that the notification center would execute the DB queries one
by one?
In this case it could work indeed and it should be left to the mail module
to handle that by implementing a custom MimeMessageFactory with an
iterator. It’s important to delegate this to the mail sender API IMO. See
UsersAndGroupsMimeMessageFactory for an example. AFAIR Edy refactored the
watchlist to use a MimeMessageFactory.
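
To illustrate the "one by one" idea, here is a minimal, language-neutral sketch in Python. It is not the XWiki mail API: `iter_user_messages` and `fetch_events` are hypothetical names, and `fetch_events` only stands in for the per-user DB query. The point is that the query for a user runs only when the sender asks for the next message, so the results for all users are never held in memory at once.

```python
from typing import Callable, Dict, Iterable, Iterator, List


def iter_user_messages(
    user_ids: Iterable[str],
    fetch_events: Callable[[str], List[str]],
) -> Iterator[Dict[str, object]]:
    """Lazily yield one message per user.

    fetch_events is a hypothetical stand-in for the per-user DB query.
    It is called only when the consumer pulls the next message.
    """
    for user_id in user_ids:
        events = fetch_events(user_id)
        if not events:
            continue  # nothing to report: no mail for this user
        yield {"to": user_id, "events": events}
```

The number of queries is still one per user. What changes is that the mail sender controls the pace, and only one user's events are in memory at any time.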
Thanks
-Vincent
> We need to find a way to do a single query (or a small fixed number of
> queries independent of the # of users).
>
> If not possible then we may need to either:
> A) Add some new table in our DB to help do that
> B) Use some tool other than the DB, e.g. SOLR, etc
>
> Thanks
> -Vincent
>
>> * If it's not an issue, I can implement the "naïve" solution which requires
>> less development.
>>
>> Full message:
>>
>> Status:
>> * notifications are displayed on the top menu when you browse the wiki.
>> * notifications are displayed differently for each individual user
>> according to their preferences (filters on event type, on locations,
>> etc...).
>> * similar notifications are grouped together into "composite notifications".
>> * only a few notifications are displayed (5 by default).
>>
>> Objective:
>> * send an email periodically (every hour, every day, every week) according
>> to the user preferences with ALL events that happened during the last
>> period of time, but still according to the user preferences.
>>
>> Inspiration:
>> * the watchlist gets ALL events that happened during the last period of time
>> * then, for each user, remove the events which the user is not interested in
>> * Benefit: only one query to get the events from the database for all users
>>
>> Problems:
>> * in the notifications, I have introduced a NotificationFilter role that
>> makes it possible to inject some SQL in the query to get the events according
>> to the user preferences. I call these "pre-filters".
>> ** it means we generate a unique request for each individual user, so if we
>> send a mail to 1000 users, we will have 1000 requests to the database.
>>
>> I wonder if it's a non-problem or a big scalability issue. Because even if
>> the whole job that sends emails takes ~10 minutes, it does not matter. It's
>> not a realtime thing.
>>
>> For the record, NotificationFilter has "post-filters" too, which perform
>> checks on the event itself (for example checking the permissions, etc.).
Alternatives:
* just like the watchlist, perform a very generic query on the database to
get all the events that happened during the last period of time
* then for each user, use only the "post-filters" to remove events the user
doesn't care about
Problem:
* it means the pre-filters that make sense in the notification use-case
cannot be used for emails. Developers must be aware of this.
* it requires some refactoring of the code that groups similar
notifications.
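
For comparison, the watchlist-style alternative above could look roughly like the following Python sketch. This is only an illustration under assumptions: `events_per_user` and the `(user_id, event) -> bool` post-filter signature are hypothetical and do not reflect the actual NotificationFilter API. The list `all_events` represents the result of the single generic query.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, str]
PostFilter = Callable[[str, Event], bool]


def events_per_user(
    all_events: List[Event],
    user_ids: Iterable[str],
    post_filters: List[PostFilter],
) -> Dict[str, List[Event]]:
    """Split one shared event list into per-user lists.

    Each post-filter (hypothetical signature) returns True when the
    given user should keep the given event.
    """
    result: Dict[str, List[Event]] = {}
    for user_id in user_ids:
        kept = [
            event
            for event in all_events
            if all(keep(user_id, event) for keep in post_filters)
        ]
        if kept:  # users with no remaining events get no mail
            result[user_id] = kept
    return result
```

The database is hit once whatever the number of users, but every event of the period is loaded and the filtering work (users x events x filters) moves into the job itself.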
Question:
Should I go with the "naive" solution, i.e. for each user get all
notifications and send a mail, or should I go with the "only 1 query to the
database for all users" version?
Thanks,
--
Guillaume Delhumeau (guillaume.delhumeau(a)xwiki.com)
Research & Development Engineer at XWiki SAS
Committer on the XWiki.org project