On 05/09/2019 17:24, Thomas Mortagne wrote:
On Thu, Sep 5, 2019 at 3:43 PM Simon Urli
<simon.urli(a)xwiki.com> wrote:
Hi everyone,
reopening this thread since I started to close some flicker issues as
part of BFD and got comments for those.
So the last mails on this thread suggested closing the flicker issues
if we didn't manage to reproduce them locally after repeated tests,
and if we hadn't seen them for a while.
We didn't vote on those suggestions, and I assumed a bit too quickly
that I could close some flicker issues that I personally don't remember
seeing on the CI, after having tested them locally.
My reason for doing that is the same as in the first mail I posted on
this thread: those flickers are old, and the code has changed enough for
them to have been fixed one way or another.
Being old does not always mean the code leading to those failures
changed that much.
Now I might be completely wrong, and the flickers might happen again,
but I don't think that's a problem since we can very easily reopen the
issues if that's the case.
The other solution IMO is indeed to keep the issues open and in fact to
never really close them, because we just don't have time to investigate
each of them properly.
I really don't see any value in keeping things open and not acting on
them; that's why I suggest closing them after doing the checks we
suggested before:
1. try to repeat locally the failure;
This is totally useless IMO unless you make sure that your computer is
made super slow in some way, since slowness is the reason for most of
the flickering tests.
2. check that we didn't encounter those flickers since the last cycle.
This one is enough for me, but the hard part is knowing that.
OK, so the proposal is now to check only how long it has been since we
last saw the open flickers before closing them.
So first question, do we all agree on that?
Then for the second check, Vincent suggested adding some tooling: that
would be best, but it takes time to do. So in the meantime, as Thomas
also suggested, we could add a check in the release plan to create or
update all JIRA issues that concern flickers. It would allow us to keep
some information about the liveness of our flickers.
So second question, do you agree on that?
Depends on what exactly it means. Have some dedicated JIRA field to
indicate when you saw it last? Comment that you just saw that test
failing again?
My suggestion was about a dedicated JIRA field if possible.
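As a rough illustration of the "last seen" check discussed above, here
is a minimal Python sketch. It assumes that we have, for each open
flicker issue, the date on which the test was last seen failing (for
instance copied from a dedicated JIRA field, if we add one). The function
name, the issue keys and the dates are all made up for the example, and
nothing here talks to JIRA or Jenkins.

```python
from datetime import date


def flickers_safe_to_close(last_seen, cycle_start):
    """Return the issue keys not seen failing since cycle_start.

    last_seen maps an issue key to the date the flicker was last observed
    (hypothetical data, e.g. copied from a dedicated JIRA field).
    """
    return sorted(key for key, seen in last_seen.items() if seen < cycle_start)


# Made-up example data, not real issues.
last_seen = {
    "EXAMPLE-1": date(2019, 2, 11),
    "EXAMPLE-2": date(2019, 8, 30),
    "EXAMPLE-3": date(2018, 11, 5),
}
print(flickers_safe_to_close(last_seen, cycle_start=date(2019, 6, 1)))
```

A flicker seen on the very first day of the cycle is kept open here; that
boundary choice is arbitrary and only there to make the rule explicit.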
Other useful and a little more automated tricks not requiring much tooling:
* increase the currently very low history (10). The reason it's that
low is the many performance issues we had in the past with old-style
jobs, but those most probably don't apply anymore, so we should
increase the number now IMO (30?)
+1
* create a pipeline job which executes platform master integration
tests once a day with http://cpulimit.sourceforge.net (looks fun) and
keeps a big history, but without storing stuff like videos and images
(100?)
Not sure what you want there: to have a test execution where you control
the slowness? To detect all problems we might have because of a slow
server?
Have a big history of slowly executed tests so that we can very
quickly see if some failing test in standard builds or in JIRA issues
is still a thing and is failing for speed reasons (this also helps with
fixing those tests and making sure they are actually fixed when you
try something).
Final question: for the flickers that I closed today, I relied mainly on
my memory for the second check and on their age: I closed the older ones.
So what should we do with them?
My concern with them is that the reason you gave to close them (that
you cannot reproduce them locally) was not valid IMO. If you say some
test has not failed for a long time, then fine; if what some test is
about has been completely rewritten, then fine too, but that's not what
you indicated :)
I am actually saying that, to my knowledge, the tests I closed have not
failed for a long time. I didn't check the code for the tests, except
for one, and I commented on it.
If your memory only relates to tests being checked just before a
release, I'm not sure this is good enough.
Not really the case, since I check the CI regularly. Now I'm not sure
it's good enough either :) As I said, we can also reopen later if needed.
Thanks,
Simon
On 26/03/2019 10:58, Vincent Massol wrote:
> On 26 Mar 2019, at 10:31, Simon Urli <simon.urli(a)xwiki.com> wrote:
>
> Hi everyone,
>
> I was checking our list of flickering tests in JIRA
> (https://jira.xwiki.org/issues/?jql=labels%20%3D%20flickering%20AND%20status…)
> and I noticed that we had some old flickering test issues concerning tests
> that I've never seen failing.
>
> So I propose we close some of them as inactive: the ones that we don't remember
> having seen for a while. The ideal would be to have a mechanism to update the issue
> when the CI fails on a flicker, but it takes time to do properly and it's not a
> priority.
>
> Instead, I propose to trust our memory: if we're wrong because we have closed a
> flicker that is still happening, it will remind us that we have this flicker to
> fix, and we can easily reopen the issue.
>
> As Thomas mentioned on the chat, we should also update the release plan to include
> the inactive flickers in the list of issues to check.
I should be able to easily create a report when any test fails inside our Jenkins
pipeline and make it available in a way similar to our Clover report. I could also
indicate in this report whether it's a known flicker or not. That could compensate for
the fact that we only keep 7 days of records in our jobs.
Would need to define the report format, whether it’s the same file updated at each run or
a different one. If the same one, then either:
* I’d need to parse it first in memory, add the new tests and overwrite the file
* or add to the bottom of the file which will grow quite large quickly
WDYT?
Thanks
-Vincent
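
The two options listed above for a single report file could look roughly
like the following Python sketch. It is only an illustration under
assumptions: the JSON and JSON-lines formats, the function names and the
shape of the records are invented here, since the report format was left
to be defined.

```python
import json
from pathlib import Path


def record_by_rewrite(report: Path, failed_tests, run_id):
    """Option 1: parse the whole report in memory, merge, overwrite the file."""
    data = json.loads(report.read_text(encoding="utf-8")) if report.exists() else {}
    for test in failed_tests:
        # Keep, per test, the list of runs in which it failed.
        data.setdefault(test, []).append(run_id)
    report.write_text(json.dumps(data, indent=2, sort_keys=True), encoding="utf-8")
    return data


def record_by_append(report: Path, failed_tests, run_id):
    """Option 2: append one JSON line per failure; the file only ever grows."""
    with report.open("a", encoding="utf-8") as out:
        for test in failed_tests:
            out.write(json.dumps({"test": test, "run": run_id}) + "\n")
```

The first variant keeps one entry per test, so the file stays small but
every run has to read and rewrite it. The second never reads the file,
at the cost of one extra line per failure per run.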
>
> So for now I propose to close the following list of issues as inactive:
>
> * XWIKI-14399: AddRemoveTagsTest#addAndDeleteTagFromTagPage is flickering (https://jira.xwiki.org/browse/XWIKI-14399)
> * XWIKI-14396: AnnotationsTest#addAndDeleteAnnotations is flickering (https://jira.xwiki.org/browse/XWIKI-14396)
> * XWIKI-14394: SectionTest.testSectionEditInWikiEditorWhenSyntax2x (xwiki-enterprise-test-ui) is flaky (https://jira.xwiki.org/browse/XWIKI-14394)
> * XWIKI-14386: appwithinminutes.AppsLiveTableTest.testEditApplication is possibly flaky (https://jira.xwiki.org/browse/XWIKI-14386)
> * XWIKI-14835: DeletePageTest#deletePageIsImpossibleWhenNoDeleteRights is flickering (https://jira.xwiki.org/browse/XWIKI-14835)
> * XWIKI-14860: LoginTest#testDataIsPreservedAfterLogin is flickering (https://jira.xwiki.org/browse/XWIKI-14860)
>
> And I propose in general to close as inactive the flickers we don't remember
> having seen after a cycle.
>
> WDYT?
>
> Simon
> --
> Simon Urli
> Software Engineer at XWiki SAS
> simon.urli(a)xwiki.com
> More about us at http://www.xwiki.com