Hi devs,
We’ll soon start XWiki 10.7 (see https://markmail.org/message/qjemnip7hjva2rjd).
Goals
=====
There are 2 goals for this release:
1) Close as many bugs as possible (note that I didn’t say “fix” ;)). The goal is really to reduce the number of open bugs, and thus to close won’t-fix issues, duplicates, etc., and also to fix low-hanging fruit (i.e. easy bugs). The goal is quantity, to try to reduce our "bug lag".
Our current status today:
* -47 bugs over 120 days (4 months), i.e. we need to close 47 bugs so that the number of closed bugs equals the number of created bugs
* -95 bugs over 365 days (1 year)
* -160 bugs over 500 days (between 1 and 2 years)
* -331 bugs over 1600 days (about 4.4 years)
A good result would be to close 47 bugs during 10.7, and an excellent one would be to close 95 bugs (i.e. have as many bugs closed as opened over the past year).
2) Improve our tests and make sure that our global TPC (Test Percentage Coverage) is increasing again and not going down. See
* http://markmail.org/message/up2gc2zzbbe4uqgn
* http://markmail.org/message/grphwta63pp5p4l7
* http://markmail.org/message/hqumkdiz7jm76ya6
I think the following activities would be good ones for 10.7:
* Increase coverage, especially for modules that have lost coverage. See all the lines in red on https://up1.xwikisas.com/#-GNXv9QYlBWPXTHNnvQD2g, which should be high-priority modules.
* Add tests for modules that don’t have any tests yet (for example, I added some functional tests last week to the xwiki-platform-menu module, which didn’t have any tests at all)
* Once coverage has been increased, raise the JaCoCo threshold wherever possible ;)
* IMPORTANT: Fix known flickering tests
* (easy, to relax ;)) Convert JUnit3 and JUnit4 tests to JUnit5 (see the sketch below)
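To make the conversion concrete, here's the typical shape of it (a minimal sketch; the class and test are made up for illustration):

import static org.junit.jupiter.api.Assertions.assertEquals;

import org.junit.jupiter.api.BeforeEach;
import org.junit.jupiter.api.Test;

public class ExampleTest
{
    private StringBuilder builder;

    // @Before becomes @BeforeEach; test classes and methods no longer
    // need to be public in JUnit5.
    @BeforeEach
    void setUp()
    {
        this.builder = new StringBuilder();
    }

    // org.junit.Test becomes org.junit.jupiter.api.Test, and assertions
    // move from org.junit.Assert to org.junit.jupiter.api.Assertions.
    @Test
    void append()
    {
        this.builder.append("xwiki");
        assertEquals("xwiki", this.builder.toString());
    }
}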
Repartition of Work
===================
Ideally we should spend half our time on BFD and half on tests. We could say that the first 15 days are on tests and the last 15 on BFD (or the opposite), or just let everyone handle their own timetable and make sure we do roughly half of each activity. I don’t think it would be good to have some devs focus only on tests and others only on BFD; I’d really prefer that each dev does half of both.
My preference is to let each dev choose when they work on BFD and when on tests, with an agreement that we will try to do half of each.
WDYT?
Thanks
-Vincent
I'm resending this message, which was accidentally sent to Sergiu only.
---------- Forwarded message ----------
From: Guillaume Delhumeau <guillaume.delhumeau(a)xwiki.com>
Date: 2018-07-02 16:06 GMT+02:00
Subject: Re: [xwiki-devs] [Brainstorm] Notifications filters capabilities and performances.
To: Sergiu Dumitriu <sergiu(a)xwiki.org>
Hi, and thank you for your answers.
2018-06-29 5:41 GMT+02:00 Sergiu Dumitriu <sergiu(a)xwiki.org>:
> The two main problems that I see are that you're mixing two separate
> things:
>
> 1. Tags which are stored in one place, and events which are stored in a
> different place
> 2. Again tags, and document locations. They may seem similar, but they
> are completely different in how they are implemented. That's why saying
> "not in this wiki, except for this tag" cannot ever be implemented in a
> sane way, since the exception is on a different plane.
>
> The logical approach would be to also store the tags in the events, but
> this is not a good thing to do.
Also, a tag could be added to a page AFTER the event has been triggered. In
that case, if you still want to fetch the events for these pages, it won't work.
> Where do we stop? For future-proofing
> filters, we would have to store the entire document in the event, in
> case someone wants to add a filter on "documents with a specific type of
> object attached", or "documents with a specific value for a specific
> property of a specific object", or "documents with attachments ending in
> .extension".
>
> On 06/28/2018 04:44 PM, Guillaume Delhumeau wrote:
> > For the tags filter, I can also:
> >
> > * perform a query to fetch all documents that correspond to the given
> >   tags (it doesn't need to be permanent, but it would be better to
> >   cache it)
> > * add a HQL filter on these pages (OR document IN :list_of_pages).
> >
> > It's a little variation of solution A. It's ugly, but it could do the job.
>
> It's the sanest proposal so far, but still impossible.
>
Apparently that's the way it's done in Activity Stream:
https://github.com/xwiki/xwiki-platform/blob/553afe7287dcc4ce8b588276e639bf283352e805/xwiki-platform-core/xwiki-platform-activitystream/xwiki-platform-activitystream-ui/src/main/resources/Main/Activity.xml#L1008
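To make it concrete, the pre-query could be something like this (a rough
sketch only; the HQL is modeled on the tag queries used elsewhere in the
platform, and the variable names are illustrative):

// Assuming an injected QueryManager 'queryManager' and the tag name in 'tag':
// pre-fetch the pages holding the tag...
List<String> taggedPages = queryManager.createQuery(
    "select distinct doc.fullName from XWikiDocument doc, BaseObject obj, "
    + "DBStringListProperty prop join prop.list item "
    + "where obj.name = doc.fullName and obj.className = 'XWiki.TagClass' "
    + "and obj.id = prop.id.id and prop.id.name = 'tags' and item = :tag",
    Query.HQL).bindValue("tag", tag).execute();

// ... and the tag filter would then inject something like:
//     OR event.page IN (:taggedPages)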
>
> Since "not in LOCATION except WITH TAG" doesn't work because it mixes
> two types of information in the same query fragment, how about making
> this a one-dimensional query as "only WITH TAG". And since this
> information is in a different place, pre-querying it is needed. However,
> if there are thousands or tens/hundreds of thousands of documents with a
> tag, won't the query crash anyway?
>
Sure, it won't scale.
>
> > 2018-06-27 12:58 GMT+02:00 Guillaume Delhumeau <guillaume.delhumeau(a)xwiki.com>:
> >
> >> Hi developers.
> >>
> >> I am trying to add a new filter to the notifications to be able to
> >> follow pages that are marked with a given tag. And it leads me to some
> >> questions about the technical implementation of the notifications.
> >>
> >> To remind the context: notifications are computed on top of the events
> >> recorded by the event stream (a.k.a. activity stream). We take events
> >> from the event stream SQL table, we apply some transformations on them,
> >> and we display them to the user.
> >>
> >> Then we have implemented the ability to filter these events: for
> >> example, "don't show events concerning the document A nor the wiki B".
> >> Filters are implemented in 2 distinct ways:
> >>
> >> 1/ SQL injection: each filter can add SQL elements to the query we
> >>    make to fetch the events from the event stream table. We made this
> >>    mechanism so we can let the database do a lot of the filtering
> >>    work. After all, it's its job and it's supposed to perform well. To
> >>    be precise, Clement has even created an Abstract Syntax Tree (AST),
> >>    so it's easier to inject some content in the query, and it creates
> >>    an abstraction over the SQL language, so we can even consider
> >>    changing the storage of the event stream someday.
> >>
> >>    The bad thing is that some complex filters are difficult or even
> >>    impossible to write in SQL (even with the AST).
> >>
> >> 2/ Post-filtering: after the events have been fetched from the
> >>    database, each filter can still decide to keep or reject them. This
> >>    is useful for complex filtering that cannot be expressed in SQL. It
> >>    is also needed by the real-time notification email sender, because
> >>    it takes the events immediately when they occur, without fetching
> >>    them from the database (so SQL filters are bypassed).
> >>
> >>    The bad thing is that some events are loaded into memory only to be
> >>    rejected, and these filters can perform costly operations such as
> >>    loading documents.
> >>
> >> Until now, this double mechanism was working quite well, with each
> >> mechanism compensating for the shortcomings of the other.
> >>
> >> However, we still have technical limitations in our design:
> >>
> >> 1/ Users who have a lot of filter preferences can end up with a giant
> >>    SQL query that is almost impossible for the database to execute.
> >>    Actually, we had a user complaining about an OutOfMemory problem in
> >>    the HQL to SQL translator!
> >>
> >> 2/ I cannot implement the tag filter!
> >>
> >> The tag filter is supposed to show events that concern pages holding a
> >> given tag, EVEN IF THE PAGE WAS EXCLUDED BY THE USER. Example use
> >> case: "I don't want to receive notifications about wiki A, except for
> >> pages marked with the tag T".
> >>
> >> And it is not working. First, because it is difficult to write a SQL
> >> query for that. It requires a join clause with the document and object
> >> tables, which our SQL injection mechanism does not support. Even if it
> >> were possible, creating a SQL join with the document table would
> >> de facto filter out events that do not concern any page, or pages that
> >> do not have any objects: many other filters would be broken. I also
> >> ruled out creating a SQL subquery; I think the whole query would
> >> become too big. So I decided to not inject any SQL code for this
> >> filter and to only implement the post-filtering mechanism.
> >>
> >> But the other filter, "EXCLUDE WIKI A", generates a SQL injection such
> >> as "WIKI <> 'WIKI A'", so the events concerning wiki A are not fetched
> >> from the database. Consequence: the tag filter never sees the events
> >> it is supposed to keep. It would actually be possible to bypass the
> >> first SQL injections by injecting something like "OR 1=1", but doing
> >> that is like dropping the whole SQL injection mechanism.
> >>
> >> I see some solutions for this problem:
> >>
> >> A/ For each tag, create a permanent list of pages that hold it. So I
> >>    can inject "OR document IN (that_list)". I think this is heavy.
>
> Definitely not. You'll have to maintain this list in parallel to the
> canonical storage for tags, the xwikidoc table itself, with all the
> headaches that data duplication brings.
>
I agree.
>
> >> B/ Drop the SQL injection mechanism and only rely on the
> >>    post-filtering mechanism. It would require loading A LOT of events
> >>    from the database, but maybe we could cache this.
>
> Nope, filtering millions of events by inspecting them one by one is a
> performance nightmare.
>
> >> C/ Don't drop the SQL injection mechanism completely, but use it as
> >>    little as possible (for example, do not use it for LOCATION
> >>    filtering). It seems hard to determine when a filter should use
> >>    this feature or not.
>
> I think location filtering should still happen in the query itself,
> since it is a lot faster.
>
> >> D/ Don't implement the "tags" filter, since it is the root of the
> >>    issue. But it is like sweeping dirt under the carpet!
>
> Well, not implementing something is always the fastest, least intrusive
> solution to any problem...
>
> Humor aside, I think that in this case it is the best approach.
>
Thanks for saying it :)
>
> >> Since we have the OutOfMemory problem with the SQL injections becoming
> >> too huge, I am more in favor of solution B or C. But I'm not sure for
> >> now, since I do not know how much it would impact the performance and
> >> scalability of the whole notifications feature.
> >>
> >> This is a complex topic, but I hope this message will inspire some
> >> suggestions or point out things I have not seen with my own eyes.
> >>
> >> Thanks for your help,
> >>
> >> Guillaume
> >>
> >
> >
> >
> Other options:
>
> E/ Drop SQL altogether, move the event stream into a nosql database, and
> do stream filtering; kind of like the B proposal
>
That's unfortunately out of scope right now.
>
> F/ Drop SQL and querying altogether, switch to a pub-sub mechanism where
> a subscriber is created for every user and his filters, and matching
> events are placed in a queue until they are all sent (consumed) in a
> notification. Obviously, this only works for email notifications, not
> for browsing past events in a webpage.
>
Indeed, it cannot replace the Activity Stream. However, it was my first
design attempt:
http://design.xwiki.org/xwiki/bin/view/Proposal/NotificationCenterforAppsImplementation#HIteration1
But then we decided to rely on what we had (the event stream) and to kill
two birds with one stone (having notifications + replacing the activity
stream).
Thanks,
Guillaume
>
> --
> Sergiu Dumitriu
> http://purl.org/net/sergiu/
>
--
Guillaume Delhumeau (guillaume.delhumeau(a)xwiki.com)
Research & Development Engineer at XWiki SAS
Committer on the XWiki.org project
Hi devs,
I’d like to give you some info about what I’ve started working on, and to check that you like the direction I’m proposing for the future of functional testing on the XWiki project.
Needs
=====
* Be able to test XWiki in multiple environments
Context
=======
* Right now we test in only 1 environment (Jetty + HSQLDB)
* I've started some Docker images in xwiki-contrib
* I’ve also started some experiments through https://jira.xwiki.org/browse/XWIKI-14929 and https://jira.xwiki.org/browse/XWIKI-14930 (see also the email thread "[Brainstorming] Implementing multi-environment tests - Take 2" and https://github.com/xwiki/xwiki-platform/compare/XWIKI-14929-14930). This email supersedes the "[Brainstorming] Implementing multi-environment tests - Take 2" thread.
* Initially I imagined doing the multi-environment testing in Jenkins thanks to the Jenkins Docker plugin/library. However, I realized that it would be better to be able to run it on dev machines too, and thus decided instead to implement it at the Maven level thanks to the Fabric8 Maven plugin.
Proposal
========
* The new proposal is to stop trying to do it at the Maven level and instead do it at the Java level, i.e. be able to control (start/stop) the various Docker images for the DB, the Servlet Container/XWiki and the Browser from within the Java JUnit/Selenium tests.
* There are several existing Java libraries to control Docker from within Java. For example: https://github.com/docker-java/docker-java
* I became convinced when I found this awesome library that combines JUnit5/Selenium and Docker for multi-browser testing: https://bonigarcia.github.io/selenium-jupiter/ (see the example after this list)
** Note that it relies on the browser Docker images provided by the Selenoid project: https://aerokube.com/selenoid/latest/
* So the idea is to extend that to be able to control the other 2 Docker containers, for the DB and the Servlet Container/XWiki.
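To give an idea, a plain selenium-jupiter test looks roughly like this
(adapted from the library's documentation, so treat the details as
approximate; nothing XWiki-specific yet):

import io.github.bonigarcia.SeleniumExtension;
import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;
import org.openqa.selenium.chrome.ChromeDriver;

@ExtendWith(SeleniumExtension.class)
public class BrowserTest
{
    // The extension instantiates and injects the driver; with Selenoid it
    // can also run the browser inside a Docker container.
    @Test
    public void loadHomePage(ChromeDriver driver)
    {
        driver.get("http://localhost:8080/xwiki/");
    }
}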
Pros
====
* Very simple setup to start/stop functional tests (and to debug them). Only requires Docker to be installed locally.
* Very simple to test any combination of DB/Servlet Container/Browser.
* Always up-to-date images (we can depend on the LATEST tag of the browser images, MySQL, Tomcat, etc.).
* Using JUnit5 and thus its latest features.
* Moving to the latest Selenium version too.
* Also supports manually executing tests against a given running XWiki instance.
Implementation
==============
Something like:

import org.junit.jupiter.api.Test;
import org.junit.jupiter.api.extension.ExtendWith;

// XWikiSeleniumExtension would extend selenium-jupiter's SeleniumExtension.
@ExtendWith(XWikiSeleniumExtension.class)
public class MyPageTest
{
    @Test
    public void xxx(XWikiWebDriver driver)
    {
        ...
    }
}
And be able to configure the DB, the Servlet Container and the packaging to use via system properties (and also from the test itself, see https://bonigarcia.github.io/selenium-jupiter/#generic-driver).
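For example, the extension could resolve its configuration like this (the
property names below are just placeholders, not a final convention):

// Hypothetical configuration resolution inside the JUnit5 extension:
String database = System.getProperty("xwiki.test.database", "hsqldb");
String container = System.getProperty("xwiki.test.servletcontainer", "jetty");
String packaging = System.getProperty("xwiki.test.packaging", "local");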
The idea is to reimplement the XWiki Packaging Maven plugin as a Java library using Aether, and to start our functional tests using pure JUnit, without anything more. All the hard work will be performed by the JUnit5 extension (create the packaging if it doesn't already exist, update parts of it if files have been modified, start/stop the DB + Servlet Container + Browser + Selenium, download the Docker images).
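For the Aether part, resolving the packaging artifact could look roughly
like this (a sketch only: the RepositorySystem/session setup is omitted and
the artifact coordinates are just an example):

import java.io.File;
import java.util.List;

import org.eclipse.aether.RepositorySystem;
import org.eclipse.aether.RepositorySystemSession;
import org.eclipse.aether.artifact.DefaultArtifact;
import org.eclipse.aether.repository.RemoteRepository;
import org.eclipse.aether.resolution.ArtifactRequest;
import org.eclipse.aether.resolution.ArtifactResult;

public class PackagingResolver
{
    public File resolveXWikiZip(RepositorySystem system,
        RepositorySystemSession session, List<RemoteRepository> repositories)
        throws Exception
    {
        // Resolve the distribution zip from a Maven repository.
        ArtifactRequest request = new ArtifactRequest();
        request.setArtifact(new DefaultArtifact(
            "org.xwiki.platform:xwiki-platform-distribution-flavor-jetty-hsqldb:zip:10.6"));
        request.setRepositories(repositories);
        ArtifactResult result = system.resolveArtifact(session, request);
        return result.getArtifact().getFile();
    }
}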
The packaging will be configurable. Some ideas for options:
* use an already running XWiki instance
* a Docker container created from the full XS zip, downloaded from a URL
* a Docker container created from the XS zip resolved from a Maven artifact
* a Docker container created from a packaging computed from the pom in the current directory
Migration
=========
Once a first version is working, it’ll be easy to start using it for a single module’s functional tests and then slowly move each module to the new way for its functional tests.
WDYT?
I’m planning to continue my investigation/development of this. So please let me know if you have feedback.
Thanks
-Vincent
Hi devs,
I'm writing a REST resource to get a list of pages and, for a query, I
want to specify an icon (as metadata) for each page in the resulting
JSON.
The problem is that the icon APIs (and more specifically the
IconManager class) only allow us to render the icon in HTML or
Velocity, and that shouldn't be put inside a JSON response.
Also, we can't hardcode the icon class or image URL to be used, as it
depends on the icon set configured for the wiki. Another possibility
would be to render the icon using JavaScript, but it would not be very
efficient.
As discussed with Marius, our proposal would be to add a new method to
the IconManager to get either the icon URL (e.g.
http://xwiki.org/xwiki/resources/icons/silk/page.png) or the icon
class (e.g. fa fa-page), depending on the specified icon set.
We could then add this new property to the icon theme definition:
## Silk
xwiki.iconset.render.json=$xwiki.getSkinFile("icons/silk/${icon}.png")
## FontAwesome
xwiki.iconset.render.json=fa fa-$icon
We could name the new method renderJSON, or something more generic (if
you have any ideas).
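In code, the addition could look like this (a sketch only; the method name
and exact signature are precisely what is up for discussion):

public interface IconManager
{
    // Existing methods such as render() and renderHTML() stay unchanged.

    /**
     * Hypothetical new method: returns the raw metadata of an icon (an
     * image URL or a CSS class, depending on the icon set configured for
     * the wiki), suitable for inclusion in a JSON response.
     */
    String renderJSON(String iconName) throws IconException;
}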
WDYT?
Thanks,
Adel
Hi devs!
I would like to propose a new extension called
application-extramacrocontent.
Currently, macros can only provide one content field, which generally
contains the main body of the macro. Parameters are used to pass
additional information to the macro, but they are not well suited for
large text.
With this application, it would be possible to add multiple content
fields inside the macro, therefore adding new possibilities to macro
creation and alleviating the problem of passing multiple big inputs to
wiki macros.
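For context, the current rendering macro contract passes a single content
string; simplified from org.xwiki.rendering.macro.Macro (quoted from
memory, so treat as approximate):

// 'content' is the macro's single content field.
List<Block> execute(P parameters, String content,
    MacroTransformationContext context) throws MacroExecutionException;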
I'm not really sure about the name of the extension itself, so if you
have any better idea, please let me know :)
Thanks,
Clément