Re: [xwiki-devs] Checkstyle audit audit

11 Sep 2014

Hi Jeremie,
Sorry for not replying sooner, I felt that I should give your mail a bit
more thought and a little but more background about me since we don't know
eachother as well as I know Sergiu and Marius.
On 09/09/2014 06:02 PM, Jeremie BOUSQUET wrote:
...
  Hi,
 My own humble remarks ... :)
 2014-09-09 11:00 GMT+02:00 Caleb James DeLisle &lt;cjd(a)cjdns.fr&gt;fr>:
  Firstly, thank you very much for reviewing my
code.
 When I was young I hated homework, my parents told me that if I spent
 as much time on the work as I spent bellyaching about it, I'd have it
 done already. A lot of your review seems to take the point of view that
 it's "not that hard", my counter is that no rule makes contributing
 *easier* and as contributors are not exactly lining up at the door, any
 rule that exists needs to justify itself as necessary.

 There's the implicit idea here that a rule is something created to bother
 developers. Something that limit contribution, is costly eventually,
 requires an effort, and so on ...
 I think the concept and objectives of such rules is exactly the contrary,
 they're here to help contribution (by making the code clearer), and give a
 framework that everyone knows and understands (less effort to dig into code
 you didn't write). 
Although in this mail thread I have been very critical, I should probably
show my credentials as "pro best practice". I'm a leader of a project in C
and I developed my own codestyle validator which checks, among other things,
that all publicly accessible functions are prefixed with the name of the
file where they're defined.
Although I have on occasion wanted to violate this rule, I have yet to find
what I think is a *valid* reason for doing so and I have so far kept it 100%
which makes reading the code much easier for other people who upon seeing
a function which they don't recognize, need only scroll to the top of the
file to see which header it's defined in.
...
  Of course there are always rules that can be
discussed, regarding their
 pertinence, criticality etc.

That's what I'm after, I feel that every rule imparts a certain amount of
cost and we need to periodically review them to make sure that they are
net-positive value.
...

 On 09/08/2014 07:30 PM, Sergiu Dumitriu wrote:
  I'm gonna audit your audit audit.
 On 09/08/2014 05:14 AM, Caleb James DeLisle wrote:
> 
/src/main/java/org.xwiki.contrib.websocket/internal/DefaultWebSocketConfig.java:26:
 Missing a Javadoc comment.
   Missing
javadoc on a class... whose behavior is very obvious.
 I'm not here to teach programming. 
 Once again, I don't like "DefaultXYZ" as a name for components. Either
 only one implementation will ever make sense, in which case it probably
 shouldn't be an interface+implementation, or it would make sense to have
 alternate implementations. In the latter case, there's nothing "default"

 This is the first time I've heard that. Indeed I don't like "Default"
 either but it seems we don't use "Impl". This question brings up a more
 difficult point, why do we need an interface for each thing we register
 in the CM, even if there's only ever going to be one impl?

 The issue I see here, is answering the question : "only one implementation
 will ever make sense". Other contributors may invent over implementations
 you never thought about. 
I've been here myself, I have made a lot of very nice interfaces to
"future proof" my designs, and then when a new use case actually came along,
I found my interfaces unduely difficult to use for that use case and I
promptly re-factored the code.
Since then, I've developed a strong tendency toward YAGNI and
"simplest thing that could possibly work". By the time someone finds a new
use case, the code probably needs some refactoring anyway.
...
  Following IOC and java interfaces concepts, any
component should have a
 contract - even if there will be only one, ever. 
This makes sense for public components but the case I'm describing is a
component inside of a /internal/ directory so it's really a
"contract with myself".
...
  Question is more if you are willing to allow this code
to be replaced by
 another implementation or not, more than if it makes sense or not at the
 time being, IMHO, because you don't want thousands of components everywhere
 obviously.

  about the implementation, other than it is used
in the XWiki Platform as
 the default implementation. The name should mention what's specific to
 the implementation, and so should the javadoc:
 /** Implementation for the {@link WebSocketConfig} role which allows
 configuring the {@link XWikiWebSocketScriptService websocket service}
 through the usual XWiki configuration places: space preferences, wiki
 preferences, and {@code xwiki.properties}. */
 public class WikiWebSocketConfig 
 Although I'm tempted now to copy/paste your comment into the code, I'm
 not hesitant because this is not the XWiki Configuration infra, just one
 module's config.
 If we hardcoded "xwiki.properties" all over the code, that would be well
 accepted as a disaster, why then is it ok to hardcode documentation about
 how XWiki's configuration works all over the codebase?
 This class uses ConfigurationSource, whatever that might do.

>  /src/main/java/org.xwiki.contrib.websocket/internal/WebSocketConfig.java:55:
 Expected an @return tag.
 >
> I actually wrote javadoc on all of these, without even being prodded to  do so.
 > Do you think my freeform documentation is
less useful to a reader than  it would be if it contained
 > a @return tag? 

 If I want to know what's the return value I will only check the return
 description in javadoc, maybe not the sometimes length full description
 above of it ...
 As much as I love xwiki, I must say I'm not a bit fan of old core javadoc
 ;-)
 Take these ones for example:
http://nexus.xwiki.org/nexus/service/local/repositories/releases/archive/or…
http://nexus.xwiki.org/nexus/service/local/repositories/releases/archive/or…
 I dare any contributor to find what they do without looking at the code :)

As a result of the discussion with Sergiu, I have decided that public
functions should have the same javadoc standards as usual and optional
javadoc should only apply to private or /internal/ (inside of a directory
called internal) classes and functions.
...

 I usually describe the meaning of the return value in the method
 javadoc, and the format / possible values / special cases in the @return
 javadoc:
 /**
  * The filename of the SSL private key in OpenSSL PEM format.
  *
  * @return an absolute path, or {@code null} if no certificate is
  *     configured
  */
 String getPrivateKeyFilename();
 (Why doesn't this return a File, or an InputStream, since that's safer
 than a raw path?) 
 Business logic in the configuration?
 Should the configuration also be responsible for checking if the file
 exists?

> 
/src/main/java/org.xwiki.contrib.websocket/internal/XWikiWebSocketScriptService.java:45:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/XWikiWebSocketScriptService.java:50:5:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/XWikiWebSocketScriptService.java:100:5:
 Missing a Javadoc comment.
 > >> Do you really think that a 3
line function called getUser() needs a
 javadoc comment?

 I think that getUser doesn't belong here at all. Since it just returns 
 Skirting the question.
 Do you really think that a 3 line function called
 somethingOtherThanGetUser() needs a javadoc comment?
  the current user, why is it needed? There's
already
 $xcontext.userReference, and if you want to actually return a reference
 to XWikiGuest instead of null, then that's wrong and deprecated. 
 I do want a reference to XWikiGuest, null has a different meaning in this
 context. There are of course many ways to solve a given problem but if you
 claim that the implementation detail of how the internals of this class
 behave are wrong and deprecated, the burden of proof is on you.

 All this discussion is a proof that it was actually needed to comment this
 ! :)
 On my side if I just see "XWikiWebSocketScriptService#getUser", I have
 absolutely no idea of what it could return ...
 - current user ?
 - user of the web socket ? (if it means something)
 - user that owns a thread of the web socket service ?
 ...
 IMHO something called "getXxx" or "setXxx" is implicitely described
only in
 case of a bean or pojo, in a script service or component interface there's
 nothing really obvious usually ...

This all makes perfect sense and I would agree entirely if it were not for
the fact that it's a private method.
...

 As for the other longer methods, since this is a public API it should be
 documented.
 Also, it was agreed that script services should be in a non-internal
 org.xwiki.module.script package.

 That's a good point, I'll document those.
 > I can guarantee you that the competition are
not going to slow down and  smell the flowers.
 >
>
> 
/src/main/java/org.xwiki.contrib.websocket/internal/XWikiWebSocketScriptService.java:121:79:
 The String "/" appears 4 times in the file.
 >
> And it might appear 400 times, defining a static final string called 
URL_SEPERATOR is
 > exercise for my fingers and your eyeballs,
nothing more. If I really  wanted to obfuscate
 > my code, I could use an external library to
make you go hunting for the  value of

YouWillNeverFindMe.SEPERATOR 
 We can exclude certain strings from the check. We currently exclude a
 few common strings: "", "[0-9]", " ", "]". We
could add "/" as well.
 This rule is supposed to protect us against poor-man's constants
 copy-pasted in many places, since this is error-prone (typos) and hard
 to refactor. It also forces us to define useless constants just for
 working around this rule, which I agree is almost as bad. There's a
 tradeoff between maintainability and convenience, and the XWiki
 community agreed to go for more maintainability. 
  I think what's really missing here is that they never really exposed an URL
 separator constant in the JDK ...
 But it's difficult for a checking-tool to differentiate between those url
 separators (that will never change, so who cares as you say), and using
 file paths separators hard-coded (which is pretty bad and create many
 environment specific weirdness in so many tools). I think if they did that
 with maven plugins from the start, they wouldn't be hunting for these '/'
 and '\' like they do, and they'd have lighten many JIRAs ...

I think the correct solution to windows insanity is to create a filesystem
abstraction layer which deals with their their broken slashes and magical
filenames which you can never use (eg: COM1) and their sort-of-but-not-quite
international filenames based on UTF-16 which can *almost* represent all
international characters.
If that were implemented in an abstraction layer then it would save a whole
lot of bug reports and a whole lot of pain and suffering trying to deal with
it.
...

 It's not more maintainable, it just looks more maintainable.
 If someone's hardcoding filesystem paths inside of classes then they're
 certainly doing so many other horrible things that the file paths should
 be preserved as a red flag for reviewers!
 99% of strings are not file paths, they're error messages and forward
 slashes and other things which will never change. Even configuration
 keys are impossible to change once they're deployed in the field.
 #define TEN 10

>  /src/main/java/org.xwiki.contrib.websocket/internal/WebSocketService.java:25:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/WebSocketService.java:28:5:
 Missing a Javadoc comment.
 > >> It's inside of an
/internal/ directory, if you can't infer it's meaning
>> then either you can't read or I can't code.
>
> It's a component role, but it's internal. That's a bit of a
> contradiction, but let's ignore that for the moment.
>
> I looked at that interface (ignoring the JavaDoc that's there now), and
> out of context I can't infer it's meaning.
 Because it's internal, I assume that you can just look at the
 implementation.
 But this gets back to the question of why we need to define interfaces for
 which there will never be more than one impl.

 Default scope for javadocs with maven plugin is protected ... Not public.
 Personnally I (try) to put javadocs everywhere, even on private methods
 (Hey, I don't succeed, you can blame me ;) )
 Because I describe there the contract of the method - not how it's
 implemented - there are non-javadoc comments for that.
 And often I'm the first who reads them, when I'm back to work after a long
 time : "so, what was this for again ... ?"
 Sometimes the name of the method is obvious, but almost all the time it's
 not sufficient.
 It's also a nice way to document a change - you see the change in contract,
 and impact on implementation, or you see that implementation changed, but
 not the contract.
 (when I say contract here it's more than the method signature)
 But that's mainly my point of view because I like javadocs ! :)

>
>  /src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocket.java:33:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocket.java:58:5:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocket.java:59:5:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocket.java:60:5:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocket.java:62:5:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocket.java:67:5:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocket.java:69:5:
 Missing a Javadoc comment.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocket.java:70:5:
 Missing a Javadoc comment.
 > >> It's internal and this
module is small, if a reasonable programmer
 needs this
 > granularity to understand this module, it
should be rewritten from  scratch.

 Except the type, these actually are missing an @Override. As for the  type:
 Override indeed solves the checkstyle error.

 /**
  * {@link WebSocket} implementation using the
  * <a href="http://netty.io/">Netty library</a>.
  */ 
 Does telling the reader that Netty means netty.io make the code
 fundamentally better?
 If so then I could make all of my code better by telling the reader how to
 use google.

 Personnally I don't like those kind of external links, because in some time
 from now they could easily have disappeared, leading to broken links
 everywhere in your javadoc ...
 I'd document these in the overall component/module/software documentation,
 in a wiki, in somewhere easily up-to-date.

> 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocketService.java:118:40:
 The String "SSL enabled with websocket.ssl.certChainFile set " appears 2
 times in the file.

 Easy fix:
 -             throw new RuntimeException("SSL enabled with
 websocket.ssl.certChainFile set " +
 -                 "but the pkcs8PrivateKeyFile does not seem to exist.");
 +             throw new RuntimeException(
 +                 "SSL enabled with websocket.ssl.certChainFile set but
 the pkcs8PrivateKeyFile does not seem to exist.");

 Easy fix is still worse than "not broken", I don't want to know how easy
 it is to trick this rule, I know a lot of tricks already. I want to hear
 a justification for this rule in the face of the fact that it makes
 development harder than it would otherwise be.

 I think the fix should logically bring another broken rule about appending
 strings with '+' :D
 I'm not sure where should be put the cursor, from "it makes development
 harder" compared to "it's an important rule", apart from applying
rules
 that were agreed among a community ...
 If I ever perform a modification on this code, let's say around the first
 RuntimeException, and I changed the name of the property to something else
 than "websocket.ssl.blabla", I could maybe easily miss the fact that this
 string appears somewhere else, then the user would get a different property
 name in different exceptions depending on the use-case, one good one wrong.
  (blame me for being a bad developer not searching for this string in all
 the code if you want :), but remember, I'm a contributor, not a committer, 
I won't, I believe form follows function, working code rules. But then that's
what this thread is about :)
However the example is not so great because configuration keys are impossible
to change in practice.
...
  you didn't hire me on my skills, you can just
provide me as many
 recommendations as you want, and I may never follow them.
 Your only possibility is to review my code thoroughly (which will not
 improve time for merging contributed code), or rely on some tool to
 automate some basic checks.)
 BTW IMO it should be the name of this property that should be in a
 constant, even before this exception message string.

For me this falls in the "I might do that when I refactor that code but
it's not critical" category. I see your point.
...

 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocketService.java:118:92:
 '+' should be on a new line.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocketService.java:122:92:
 '+' should be on a new line.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocketService.java:132:94:
 '+' should be on a new line.
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocketService.java:133:92:
 '+' should be on a new line.
 > >> It's arguably better to
put it on the line before because some
 languages (python)
   infer
semicolons and will fail to parse if it's on the line after. 
 This isn't Python code. But I agree, I don't have a good reason for
 putting it on the next line, and I've wondered about this rule as well. 
 I write code in a number of languages and I want to adopt habits which
 are as widely applicable as possible.
>
 > 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocketService.java:235:9:
 Executable statement count is 37 (max allowed is 30).
 > >> Statements per function is
almost useless as a measure of code
 complexity.
   I claim
negative usefulness because it is misleading. 
 Would you say the same if the limit was at 5000 statements, and you were
 over that limit? We have to draw the line somewhere, and it was randomly
 agreed to set it at 30. This might be too low for some usecases, but in
 general it is a good enough rule IMHO. You can use your judgment and
 explicitly disable that rule for that function.
 But looking at the code, I think it could use some simplifying, I find
 it rather long.
 The rule isn't just about code complexity, but the attention span of a
 developer reading that code. How long does it take to put that whole
 method in the brain? Can most developers even fit that much code in
 memory? If not, then the method must be simplified / split. 
 I've written functions much longer and much easier to read as well as much
 shorter functions which, due to the complexity of the algorithm, push my
 own mental capacity to it's limit.
 Surely I still have much room for improvement in code craftsmanship but
 I don't think a checkstyle failure is any more likely to make me suddenly
 discover a clearer way to express a function any more than a rejected paper
 would suddenly make me a mathematician.
 On the contrary, fixing these types of errors will almost certainly make
 the code *worse* because the original state of the function is an
 expression
 of the mental model of the author and in response to failures, he will
 break
 it up along arbitrary lines in order to make the errors stop, but the code
 and the programmer will not and cannot be fundamentally better than they
 were 20 minutes before.

 I mostly agree with you on this. My personal idea is that there should be a
 limit, but far greater than the usual or expected limit.
 But maybe overriding the rule is enough on a case per case basis ...
 In my case I usually write much more statements than most developers would
 do, because I can't do another way :) I prefer to write all little steps,
 and let the compiler do its job, and it's the way I think it.
 I think my code would most of the time break this rule ... :/

>
> 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocketService.java:246:58:
 The String "?k=" appears 2 times in the file.
 >
> And you don't need to look anything up to understand that line of code, 
any
   change
would be for the worse. 
 String key = StringUtils.substringAfter(uri, "?k=");
 uri = StringUtils.substringBefore(uri, "?"); 
 I don't want to see the tricks, I want to see compelling justification for
 the rule.

 As any rule has exceptions, there are exceptions to the compelling
 justification for a rule ...
 Maybe some rules are bad, or badly specified, or there's just some rare
 cases where they're more annoying than useful, but would you disable them
 completely in all cases ?

My opinion is that we need to periodically review our rules,
(as well as rules which we don't have and perhaps should!)
We should ask a few questions:
How much does a rule help when it's helpful?
    if the rule, for example, prevents deadlock conditions 1% of the time
    then maybe we're willing to accept more pain to keep using it.
How much does it hurt when it misdiagnoses a non-problem as a problem?
    Is it hard to "fix"?
    Does "fixing" the code make it worse?
How often is it accurate?
How often is it inaccurate or overbearing?
    This is a tricky question because often a check triggers for one
    developer and not for another and the other developer is tempted
    to think that his associate is wrong. We need to be careful to
    differentiate "unacceptable" from "could be better" or
    "not the way I would have done it".
Thanks,
Caleb
...

 (Why would any serious library use strings for representing URIs instead
 of java.net.URI?) 
 If I gave you a string and a handful of URI manipulating functions then
 even
 if you have no knowledge of those functions, you can still understand it as
 a plain old string, if I give you a java.net.URI then you have to go look
 up
 API documentation no matter what, an obvious cost increase.
 If there are no solid justifications for this cost then I call
 overengineering.

> 
/src/main/java/org.xwiki.contrib.websocket/internal/NettyWebSocketService.java:300:38:
 'nws' hides a field.
 >
> I always use 'this.' prefix when accessing object fields, I would 
prefer that
   as a rule
but in the absence of that rule, this one is fairly important. 
 That's not an absence, the "always use this." rule is there. 
 It's not enforced by checkstyle, ftr this is one rule which I would like
 to see
 added (but only if we can at least balance the development pain budget, if
 not
 lower it).

 Having both rules is a bit redundant, since the this rule is supposed to
 allow the field hiding that the second rule is complaining about.
 Still, why do you need to declare nws instead of assigning to this.nws?
 As I can see, there's no exception that could stop the method between
 the nwsLocal declaration and the this.nws assignment.
 And I think you just introduced a bug,

https://github.com/xwiki-contrib/xwiki-contrib-websocket/blob/master/src/ma…
  now references "nws" which became
this.nws instead of nwsLocal.

 It will "probably never happen" but I fixed it for the next version,
 thanks for
 spotting that. Since it's used in an inner class, I though I needed it to
 be
 final but indeed using the field works, fixed.
 So to summarize, most painful rules:
 1. Multiple identical string constants in a file
 2. Javadoc /internal/ classes and methods
 3. (not a checkstyle rule) interfaces with one impl.
 4. There are 3 or 4 rules which force you to completely re-factor a
 function or
 class, every time one of these trips and the developer "fixes it", we're
 creating
 bad code.
 Thanks,
 Caleb
 _______________________________________________
 devs mailing list
 devs(a)xwiki.org
 http://lists.xwiki.org/mailman/listinfo/devs
  _______________________________________________
 devs mailing list
 devs(a)xwiki.org
 http://lists.xwiki.org/mailman/listinfo/devs

--
Caleb James DeLisle
XWiki SAS
calebjamesdelisle(a)xwiki.com

2025

2024

2023

2022

2021

2020

2019

2018

2017

2016

2015

2014

2013

2012

2011

2010

2009

2008

2007

2006

2005

Re: [xwiki-devs] Checkstyle audit audit