On Thu, Sep 5, 2019 at 3:43 PM Simon Urli <simon.urli(a)xwiki.com> wrote:
Hi everyone,
I'm reopening this thread since I started closing some flicker issues as
part of BFD and got comments on them.
The last mails on this thread suggested closing the flicker issues
if we can't reproduce them locally after repeated test runs
and haven't seen them for a while.
We didn't vote on that suggestion, and I assumed a bit too quickly that I
could close some flicker issues that I personally don't remember seeing
on the CI, after having tested them locally.
My reason for doing that is the same as in the first mail I posted on
this thread: those flickers are old, and the code has changed enough for
them to have been fixed one way or another.
Being old does not always mean the code leading to those failures
changed that much.
Now I might be completely wrong and the flickers might happen again, but I
don't think that's a problem since we can easily reopen the
issues if that's the case.
The other solution IMO is to keep the issues open and, in effect,
never close them, because we just don't have time to investigate
each of them properly.
I really don't see any value in keeping things open without acting on
them; that's why I suggest closing them after doing the checks we
suggested before:
1. try to reproduce the failure locally;
This is totally useless IMO unless you make sure that your computer is
made super slow in some way, since slowness is the reason for most of the
flickering tests.
2. check that we haven't encountered those flickers since the last cycle.
This one is enough for me, but the hard part is knowing that.
So first question, do we all agree on that?
Then for the second check, Vincent suggested adding some tooling: that
would be best, but it takes time to do. So in the meantime, as Thomas
also suggested, we could add a check in the release plan to create or
update all Jira issues that concern flickers. It would allow us to keep
some information about the liveness of our flickers.
So second question, do you agree on that?
Depends on what exactly it means. Have a dedicated Jira field to
indicate when you last saw it? Comment that you just saw that test
failing again?
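To make the "dedicated field" option concrete, here is a minimal Python sketch, assuming each flicker issue carries a last-seen date. The function name, the mapping, and the dates are hypothetical (no such Jira field exists in this thread), and treating an issue with no recorded failure as inactive is an assumption too.

```python
from datetime import date


def inactive_flickers(last_seen, cycle_start):
    """Return the issue keys whose flicker was not seen since cycle_start.

    last_seen maps a Jira issue key to the date of the most recent CI
    failure, or to None when no failure was ever recorded (assumption:
    such issues count as inactive).
    """
    return sorted(
        key
        for key, seen in last_seen.items()
        if seen is None or seen < cycle_start
    )


# Hypothetical data: these dates are made up for illustration only.
last_seen = {
    "XWIKI-14399": date(2018, 11, 2),
    "XWIKI-14396": date(2019, 8, 20),
    "XWIKI-14394": None,
}
print(inactive_flickers(last_seen, cycle_start=date(2019, 7, 1)))
# prints ['XWIKI-14394', 'XWIKI-14399']
```

With something like this, the second check no longer depends on anyone's memory: an issue is a candidate for closing only if its recorded date is older than the start of the cycle.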
Other useful and slightly more automated tricks that don't require much tooling:
* increase the currently very low history (10). It is that low because
of the many performance issues we had in the past with old-style jobs,
but those most probably don't apply anymore, so we should increase the
number now IMO (30?)
* create a pipeline job which executes the platform master integration
tests once a day with cpulimit (http://cpulimit.sourceforge.net, looks fun)
and keeps a big history, but without storing stuff like videos and
images (100?)
Final question: for the flickers that I closed today, I relied mainly on
my memory for the second check and on their age: I closed the older ones.
So what should we do about them?
My concern with them is that the reason you gave for closing them (that
you cannot reproduce them locally) was not valid IMO. If you say some
test has not failed for a long time then fine, and if what some test
covers has been completely rewritten then fine too, but that's not what
you indicated :)
If your memory only relates to tests being checked just before a
release, I'm not sure this is good enough.
Thanks,
Simon
On 26/03/2019 10:58, Vincent Massol wrote:
On 26 Mar 2019, at 10:31, Simon Urli <simon.urli(a)xwiki.com> wrote:
Hi everyone,
I was checking our list of flickering tests in JIRA
(https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status…)
and I noticed that we have some rather old flickering-test issues concerning tests that
I've never seen failing.
So I propose we close some of them as inactive: the ones that we don't remember
having seen for a while. The ideal would be to have a mechanism to update the issue when
the CI fails on a flicker, but it takes time to do properly and it's not a priority.
Instead, I propose we trust our memory: if we're wrong because we closed
a flicker that is still happening, it will remind us that we have this flicker to
fix, and we can easily reopen the issue.
As Thomas mentioned on the chat, we should also update the release plan to include the
inactive flickers in the list of issues to check.
I should be able to easily create a report whenever a test fails inside our Jenkins
pipeline and make it available in a similar way to our Clover report. I could also
indicate in this report whether it's a known flicker or not. That could compensate for
the fact that we only keep 7 days of records in our jobs.
We would need to define the report format, and whether it's the same file updated at each
run or a different one. If it's the same one, then either:
* I'd need to parse it first in memory, add the new tests and overwrite the file
* or append to the bottom of the file, which will grow quite large quickly
WDYT?
Thanks
-Vincent
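The append-only option above could be sketched roughly as follows in Python. This is only an illustration, not the actual Jenkins pipeline code: the JSON-lines layout, the field names, and both function names are assumptions made for the example.

```python
import json
from pathlib import Path


def record_failures(report, build, failed_tests, known_flickers):
    """Append one JSON line per failed test; earlier lines are never rewritten."""
    with report.open("a", encoding="utf-8") as out:
        for test in failed_tests:
            entry = {
                "build": build,
                "test": test,
                # Flag whether the failure matches an already known flicker.
                "known_flicker": test in known_flickers,
            }
            out.write(json.dumps(entry) + "\n")


def last_failing_build(report):
    """Map each test name to the highest build number in which it failed."""
    latest = {}
    with report.open(encoding="utf-8") as lines:
        for line in lines:
            entry = json.loads(line)
            name, build = entry["test"], entry["build"]
            latest[name] = max(build, latest.get(name, build))
    return latest
```

Appending avoids parsing the whole file on every run, at the cost of a file that keeps growing; the "parse, merge and overwrite" option would instead store only something like the result of `last_failing_build`, which stays small but loses the per-build history.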
So for now I propose to close the following list of issues as inactive:
* XWIKI-14399: AddRemoveTagsTest#addAndDeleteTagFromTagPage is flickering
(https://jira.xwiki.org/browse/XWIKI-14399)
* XWIKI-14396: AnnotationsTest#addAndDeleteAnnotations is flickering
(https://jira.xwiki.org/browse/XWIKI-14396)
* XWIKI-14394: SectionTest.testSectionEditInWikiEditorWhenSyntax2x
(xwiki-enterprise-test-ui) is flaky (https://jira.xwiki.org/browse/XWIKI-14394)
* XWIKI-14386: appwithinminutes.AppsLiveTableTest.testEditApplication is possibly flaky
(https://jira.xwiki.org/browse/XWIKI-14386)
* XWIKI-14835: DeletePageTest#deletePageIsImpossibleWhenNoDeleteRights is flickering
(https://jira.xwiki.org/browse/XWIKI-14835)
* XWIKI-14860: LoginTest#testDataIsPreservedAfterLogin is flickering
(https://jira.xwiki.org/browse/XWIKI-14860)
And I propose in general to close the flickers we don't remember having seen after a
cycle as inactive.
WDYT?
Simon
--
Simon Urli
Software Engineer at XWiki SAS
simon.urli(a)xwiki.com
More about us at
http://www.xwiki.com
--
Thomas Mortagne