Hello Paul,
Thanks for the robot link, I will use it as a reference!
As you said, the random part is only a random sort, so to make my scenario
work I need to do exactly what you described: the default query with a
random sort. Or follow the tutorial I linked, where the author shows how to
use a random index in a random dynamic field (defined in schema.xml) to
generate a random value.
Anyway, in this case, after trying Solr I am using HQL, because the query
returns quickly and there is no need to complicate a simple problem =)
Thanks all.
2014-08-26 18:54 GMT-03:00 Paul Libbrecht <paul(a)hoplahup.net>:
Hello Danilo,
to keep GoogleBot from trying all these fancy (linked) actions, I'd suggest
you make use of robots.txt.
We've hidden almost all actions from robots that way:
http://www.curriki.org/robots.txt
Of course, you can also write Apache rewrite rules… this is finer grained
(it can even check the identity of the client).
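For reference, a minimal robots.txt sketch in that spirit; the paths below are assumptions modelled on the tag URLs quoted in this thread, not a copy of the Curriki file:

```
# Hypothetical sketch: keep crawlers away from wiki action URLs.
# Googlebot honors * wildcards in Disallow rules.
User-agent: *
Disallow: /bin/edit/
Disallow: /bin/delete/
Disallow: /bin/*?do=
```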
On the Solr random queries, I am a bit surprised your scenario works… what
random value would you take?
Is the random only for sorting, while you use the default query (*:*)? I
guess that would work (it's not a random query then, it's a random
ordering, something you don't want the users to intentionally formulate, I
think).
paul
On 26 August 2014, at 23:33, Danilo Oliveira <daniloa.oliveira(a)gmail.com>
wrote:
Hello Clemens,
I have checked the xwikilists table and noticed that the genre, country and
language lists of my movies, which are defined in my movieClass, are
recorded in this table. Do you think that is the cause of the slowness?
But I discovered who is generating these queries: GoogleBot. See:
66.249.69.197 - - [26/Aug/2014:18:07:09 -0300] "GET
/bin/view/Main/Tags?do=viewTag&tag=tang-breakfast-drink HTTP/
There are GoogleBot requests trying to delete my tags too...
Well, I am blocking them according to this doc [0].
Rodrigues,
I visited the Neo4j site. This DB looks very interesting and I think it is
applicable to my application. However, my app is in the proof-of-concept
phase, so for now XWiki meets my needs. I will absolutely consider it if my
application grows. Thanks for the tip!
Well, I changed my queries to Solr and now my application is working
perfectly, even better than at the beginning.
But I have just one more need: a random query.
I checked how to make a random query in Solr and found this article [2].
In its "Additional Configuration" section you can read that we need the two
entries below in the schema configuration. However, in XWiki's schema.xml
we only have the first one [1]:
<fieldType name="random" class="solr.RandomSortField" indexed="true" />
<dynamicField name="random_*" type="random" />
I am no expert on Solr: if I just add the second entry, will it work, or do
I need to worry about other things?
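Since RandomSortField derives its ordering from the field name, the usual trick (once the `random_*` dynamicField is in place) is to put a fresh number in the field name on each request. A minimal sketch in Python; the helper name and parameter dict are illustrations, not XWiki or Solr client API:

```python
import random

def random_sort_params(query="*:*", rows=10, seed=None):
    """Build Solr request parameters that sort on a random_* dynamic field.

    Solr's RandomSortField derives its ordering from the field name (plus
    the index version), so a fixed seed gives a stable, pageable order and
    a fresh seed gives a new shuffle on each request.
    """
    if seed is None:
        seed = random.randint(0, 2**31 - 1)
    return {"q": query, "rows": rows, "sort": "random_%d asc" % seed}

# Example: a stable shuffle, e.g. for paging through the same random order.
params = random_sort_params(seed=42)
```

Reusing the same seed across page requests keeps the shuffled order consistent while the user pages through results.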
[0] http://platform.xwiki.org/xwiki/bin/view/AdminGuide/Performances
[1] https://github.com/xwiki-contrib/xwiki-platform-solr/blob/master/solr/conf/…
[2] http://solr.pl/en/2013/04/02/random-documents-from-result-set-giveaway-resu…
Thanks everyone for the attention!
Danilo
2014-08-26 4:33 GMT-03:00 Clemens Klein-Robbenhaar <c.robbenhaar(a)espresto.com>:
>
> This query looks much like it is generated by the tag service when
> searching for documents with a given tag (the code is in class
> TagQueryUtils, method getDocumentsWithTag, in the
> xwiki-platform-core/xwiki-platform-tag/xwiki-platform-tag-api module).
>
> This query might be triggered by any kind of UI element (panel, macro,
> etc.). I do not think it is used to update any search index or the like.
> Instead it is used on some pages, e.g. Main.Tags, when clicking on a tag
> to see its list of documents.
>
> I wonder why this query takes so long. Even 100K docs should not be that
> much (I mean, 5 minutes query time, huh?). Is there any chance some
> binary data of the movie objects or the like ended up in the
> xwikilistitems table or any other table used in the query?
> Clemens
>
>> Hello,
>>
>> As I mentioned, I discovered that the queries that are hogging my DB are
>> similar to:
>>
>> '102', 'xwiki', 'localhost:52614', 'xwiki', 'Query', '372',
>> 'Creating sort index',
>> 'select xwikidocum0_.XWD_FULLNAME as col_0_0_
>> from xwikidoc xwikidocum0_
>> cross join xwikiobjects baseobject1_
>> cross join xwikilists dbstringli2_
>> inner join xwikiproperties dbstringli2_1_
>>   on dbstringli2_.XWL_ID=dbstringli2_1_.XWP_ID
>>   and dbstringli2_.XWL_NAME=dbstringli2_1_.XWP_NAME
>> inner join xwikilistitems list3_
>>   on dbstringli2_.XWL_ID=list3_.XWL_ID
>>   and dbstringli2_.XWL_NAME=list3_.XWL_NAME
>> where (xwikidocum0_.XWD_HIDDEN<>1 or xwikidocum0_.XWD_HIDDEN is null)
>>   and baseobject1_.XWO_CLASSNAME=\'XWiki.TagClass\'
>>   and baseobject1_.XWO_NAME=xwikidocum0_.XWD_FULLNAME
>>   and baseobject1_.XWO_ID=dbstringli2_.XWL_ID
>>   and dbstringli2_.XWL_NAME=\'tags\'
>>   and lower(list3_.XWL_VALUE)=lower(\'shock-rock\')
>> order by xwikidocum0_.XWD_FULLNAME'
>>
>> Does anyone know which component is responsible for this query? Is this
>> kind of query executed to create a sort index for each new tag?
>>
>> Thanks
>>
> 2014-08-23 3:46 GMT-03:00 O.J. Sousa Rodrigues <osoriojaques(a)gmail.com
:
>>
>>> Wouldn't this be a perfect case for a NoSQL-DB like Neo4J?
>>> On 22.08.2014 23:13, "Paul Libbrecht" <paul(a)hoplahup.net> wrote:
>>>
>>>> Danilo,
>>>>
>>>> have you checked the MySQL process list?
>>>> I'd suspect something is hogging.
>>>> For search, I'd recommend leveraging Solr… but with a number of
>>>> customizations. There are some hooks in the solr-plugin, I believe.
>>>>
>>>> hope it helps.
>>>>
>>>> paul
>>>>
>>>>
>>>> On 22 August 2014, at 22:54, Danilo Oliveira
>>>> <daniloa.oliveira(a)gmail.com> wrote:
>>>>
>>>>> Hello Devs,
>>>>>
>>>>> I am developing an application based on XWiki that is mapping,
>>>>> connecting, relating and graphically displaying movie information, in
>>>>> order to make it possible for users to explore the movies' trailers.
>>>>>
>>>>> At the beginning, with a light data set (<5k movies), the application
>>>>> was running well, but today I started to populate my database (MySQL)
>>>>> and the application became unusable: the queries are taking more than
>>>>> 5 minutes to complete. It currently has more than 15k movies
>>>>> (1 movie = 1 doc) and I need to upload 100k more.
>>>>>
>>>>> I have already checked the cache and performance pages, but I don't
>>>>> know if they [1][2] solve my problem. I think this is an architecture
>>>>> challenge.
>>>>>
>>>>> My AS-IS process is:
>>>>> - the user inserts a movie;
>>>>> - the application searches for the movie and its related films based
>>>>>   on its characteristics (a lot of joins and other algorithms)
>>>>>   (bottleneck);
>>>>> - the application returns the results as a map.
>>>>>
>>>>> I am wondering if I could use custom mapping [3] to solve my problem,
>>>>> given that the relationship information for each movie does not, at
>>>>> first, need to change often. Each movie has X related movies, sorted
>>>>> by similarity. So I could create some relationship algorithm that runs
>>>>> on a schedule (once a week) and populates this new table. I am
>>>>> thinking of using Python pandas DataFrames to talk directly to MySQL
>>>>> and do the data analysis; any other suggestion?
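The scheduled precompute step described above can be sketched in a few lines; plain Python here for brevity (the same aggregation is easy with a pandas DataFrame plus `to_sql()`). The movie names, genre sets and the `related_movies` helper are invented for illustration, not XWiki data or API:

```python
from itertools import permutations

# Toy catalog: document name -> set of genres (invented example data).
movies = {
    "Movie.A": {"horror", "rock"},
    "Movie.B": {"horror"},
    "Movie.C": {"comedy", "rock"},
}

def related_movies(catalog, top_n=5):
    """Score every ordered pair by shared genres; keep each movie's top N."""
    table = {}
    for a, b in permutations(catalog, 2):
        score = len(catalog[a] & catalog[b])
        if score:
            table.setdefault(a, []).append((b, score))
    # Sort each movie's candidates by descending similarity, keep the best N.
    return {a: sorted(pairs, key=lambda p: -p[1])[:top_n]
            for a, pairs in table.items()}

related = related_movies(movies)
# The weekly job would then write this mapping into the "MoviesRelated"
# custom table, so the live request path becomes a simple select.
```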
>>>>>
>>>>> So I would create a custom mapping for my movie-relationship class,
>>>>> run the algorithm, and populate the new table, so my TO-BE process
>>>>> would be:
>>>>>
>>>>> TO BE:
>>>>> - the user inserts the movie info;
>>>>> - a simple select on the custom table "MoviesRelated";
>>>>> - the application returns the results.
>>>>>
>>>>> I would appreciate some opinion. Thank you very much.
>>>>>
>>>>> [1] http://platform.xwiki.org/xwiki/bin/view/AdminGuide/Performances
>>>>> [2] http://extensions.xwiki.org/xwiki/bin/view/Extension/Cache+Module
>>>>> [3] http://platform.xwiki.org/xwiki/bin/view/DevGuide/CustomMapping
>>> Danilo
>>> --
>>> Danilo Amaral de Oliveira
>>> Engenheiro de Computação
>>> celular (32) 9111 - 6867
>>> _______________________________________________
>>> devs mailing list
>>> devs(a)xwiki.org
>>> http://lists.xwiki.org/mailman/listinfo/devs
Kind regards,
Clemens Klein-Robbenhaar
--
Clemens Klein-Robbenhaar
Software Development
EsPresto AG
Breite Str. 30-31
10178 Berlin/Germany
Tel: +49.(0)30.90 226.763
Fax: +49.(0)30.90 226.760
robbenhaar(a)espresto.com
HRB 77554 B - Berlin-Charlottenburg
Management board: Maya Biersack, Peter Biersack
Chairman of the supervisory board: Dipl.-Wirtsch.-Ing. Winfried Weber
Certified according to ISO 9001:2008