Hi,
Since we use EmbeddedSolrServer how do we handle clustering? One
instance per wiki instance? How do they reconcile their indexes? Need an
architecture diagram for our solution for heavy loads.
Maybe you could use an embedded Infinispan
<http://www.jboss.org/infinispan/> (in replication mode) to store the
index and handle the clustering.
Regards,
Sasinda.
On Tue, Aug 28, 2012 at 5:34 PM, Vincent Massol <vincent(a)massol.net> wrote:
Hi Savitha,
I've started quickly reviewing the SOLR code in preparation for an
integration in the platform, and I have some questions which I jotted
down below as I was reviewing the code. Sorry for the terse format; I
actually wrote the questions for myself and then decided to send them as is :)
General:
* Need an architecture diagram showing the main modules and threads and
how they interact with the platform
Search-api:
* Is the Search API supposed to be independent of SOLR?
* The Search interface is strange: it exposes implementation details such as
getImplementation() and initialize().
* It also has other concerns such as getStatus(), getStatusAsJson(),
getVelocityUtils(), getSearchRequest()
* Why do we need a Search interface? Why not instead use the Query module
and introduce a new query type? (Note: the List returned by Query.execute()
probably needs to be clarified.) Replace SearchRequest with a Query impl.
* The naming of the interfaces is a bit strange. For example: BuildIndex; should
it be IndexBuilder instead? What about DeleteIndex, should it be
IndexDeleter?
* I don't think we need deleteDocumentIndex(), deleteWikiIndex(),
deleteSpaceIndex(), etc. We need a single deleteEntity(EntityReference
reference, EntityType type). Same for IndexBuilder.
* Why is there a DocumentIndexer interface? Why is a Document different
from other entities? For example, I can see DocumentIndexer.deleteIndex(); why not
IndexDeleter.deleteEntity(documentRef)?
* Why is there a need for RebuildIndex (which I assume is IndexRebuilder)
and why can't we use the IndexBuilder?
* Why the need for SearchIndex?
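
To make the single-method suggestion above concrete, here is a minimal sketch of what IndexBuilder and IndexDeleter could look like. Everything in it is an assumption for illustration: EntityType and EntityReference are simplified stand-ins for the platform classes, the slash-separated path format is invented, and InMemoryIndex is a toy, not the SOLR-backed implementation.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical, simplified stand-ins for XWiki's EntityType and EntityReference.
enum EntityType { WIKI, SPACE, DOCUMENT }

final class EntityReference {
    // Slash-separated path, e.g. "xwiki/Main/WebHome" (a format assumed for this sketch).
    final String path;

    EntityReference(String path) {
        this.path = path;
    }
}

// One method per concern instead of one method per entity kind.
interface IndexBuilder {
    void indexEntity(EntityReference reference, EntityType type);
}

interface IndexDeleter {
    void deleteEntity(EntityReference reference, EntityType type);
}

// Toy in-memory index used only to show the two interfaces in action.
final class InMemoryIndex implements IndexBuilder, IndexDeleter {
    private final Set<String> indexedPaths = new TreeSet<>();

    @Override
    public void indexEntity(EntityReference reference, EntityType type) {
        indexedPaths.add(reference.path);
    }

    @Override
    public void deleteEntity(EntityReference reference, EntityType type) {
        if (type == EntityType.DOCUMENT) {
            // A document is a leaf: remove only the exact entry.
            indexedPaths.remove(reference.path);
        } else {
            // A wiki or space also removes everything nested under it.
            String prefix = reference.path + "/";
            indexedPaths.removeIf(p -> p.equals(reference.path) || p.startsWith(prefix));
        }
    }

    Set<String> indexedPaths() {
        return indexedPaths;
    }
}
```

With this shape, deleting a space is deleteEntity(new EntityReference("xwiki/Main"), EntityType.SPACE), and no separate deleteSpaceIndex() or deleteWikiIndex() is needed.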
Search-solrj:
* solrj server in embedded mode is used.
* Shouldn't use system property but the xwiki configuration instead for
the solrj home (see below in misc)
* EmbeddedSolrServer depends on Servlet API? "Also, if using
EmbeddedSolrServer, keep in mind that Solr depends on the Servlet API."
(from http://wiki.apache.org/solr/Solrj)
* EmbeddedSolrServer should be started by listening to the app started
event instead of lazily in Initializable IMO
* Since we use EmbeddedSolrServer how do we handle clustering? One
instance per wiki instance? How do they reconcile their indexes? Need an
architecture diagram for our solution for heavy loads.
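
As a rough illustration of the "start on the app started event instead of lazily" point above, the sketch below uses made-up, simplified types: Event, ApplicationStartedEvent, EventListener and SearchServer here are stand-ins defined in the snippet itself, not the actual observation or SolrJ APIs. It only shows the control flow: the server is started once, when the startup event arrives, rather than on first use.

```java
// Hypothetical marker interface and event type, defined here for the sketch only.
interface Event { }

final class ApplicationStartedEvent implements Event { }

// Hypothetical listener contract; the real observation API may look different.
interface EventListener {
    void onEvent(Event event);
}

// Stand-in for the embedded search server; counts how many times it was started.
final class SearchServer {
    private int startCount;

    void start() {
        startCount++;
    }

    boolean isStarted() {
        return startCount > 0;
    }

    int startCount() {
        return startCount;
    }
}

// Starts the server eagerly when the application-started event is received.
final class SearchServerStarter implements EventListener {
    private final SearchServer server;

    SearchServerStarter(SearchServer server) {
        this.server = server;
    }

    @Override
    public void onEvent(Event event) {
        // Ignore unrelated events and repeated startup notifications.
        if (event instanceof ApplicationStartedEvent && !server.isStarted()) {
            server.start();
        }
    }
}
```

The benefit of this arrangement is that startup failures surface when the application boots, not during the first search request.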
Misc:
* all API to review and improve/stabilize
* typos to fix
* licenses to fix
* pom to fix
* missing class javadoc (eg BuildIndex, DeleteIndex, etc)
* exception handling to verify (ex: SolrjSearchEngine)
* Remove unneeded javadoc when @Override is used
* Need to use the XWiki Permanent Directory for storing SOLR configuration
data (the solr home) - Need to move the data currently in solr/ into a
solr-configuration jar module which gets used as a fallback if the data
doesn't exist in the solr home dir.
* Idea: use solr JMX to provide admin features (
http://wiki.apache.org/solr/SolrJmx)
* TODO: Think about how to migrate users to use SOLR instead of Lucene or
DB Searches. Need a plan.
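
For the permanent-directory item in the list above, the fallback could be sketched roughly as follows. This is only one possible reading, under stated assumptions: the class name SolrHomeInitializer, the "solr" subdirectory name, and passing the defaults as an in-memory map (instead of reading them from a solr-configuration jar) are all invented for the example. Defaults are copied per file and existing files are never overwritten.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;

// Hypothetical helper: prepares the solr home inside the permanent directory.
final class SolrHomeInitializer {
    private SolrHomeInitializer() {
    }

    /**
     * @param permanentDirectory the permanent directory of the wiki
     * @param defaultFiles relative file path to default content (stand-in for jar resources)
     * @return the solr home directory, created if needed
     */
    static Path ensureSolrHome(Path permanentDirectory, Map<String, String> defaultFiles) {
        Path solrHome = permanentDirectory.resolve("solr");
        try {
            Files.createDirectories(solrHome);
            for (Map.Entry<String, String> entry : defaultFiles.entrySet()) {
                Path target = solrHome.resolve(entry.getKey());
                // Fall back to the bundled default only when the file is missing.
                if (Files.notExists(target)) {
                    Files.createDirectories(target.getParent());
                    Files.write(target, entry.getValue().getBytes(StandardCharsets.UTF_8));
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return solrHome;
    }
}
```

Calling ensureSolrHome() again after an admin has edited a file leaves that file untouched, so local customizations survive restarts.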
Thanks!
-Vincent
_______________________________________________
devs mailing list
devs(a)xwiki.org
http://lists.xwiki.org/mailman/listinfo/devs