Hello Savitha,
Sorry for the slow response.
On 25 May 2012 at 22:48, savitha sundaramurthy wrote:
I have come up with a basic set of APIs
for the Solr search component.
I have made it generic so that it can use Solr or Lucene as the backend.
I have also documented it in the design proposal.
http://dev.xwiki.org/xwiki/bin/view/Design/SOLRSearchIntegration
*getBackend*
public String getBackend()
*Returns:*
Returns the name of the backend which is currently in use. It could be
Lucene or Solr.
It would be useful to give use cases for this method here.
Wouldn't it make more sense to call it getImplementation?
*rebuildIndex*
*public int rebuildIndex(com.xpn.xwiki.api.XWiki wiki,
com.xpn.xwiki.api.Context context)*
Number of documents scheduled for indexing. -1 in case of errors
This is insufficient.
On a big wiki such as
Curriki.org, re-indexing can take several days.
Aside from Sergiu's proposal about the possible parameters, I'd suggest you create
an "IndexerProcess" class with APIs such as:
- getQueueSize
- getPreQueueSize (sometimes, indexing processes have multiple queues)
- getNextIndexerProcess (if another request was filed)
- getEstimatedCompletionDate
- getIndexingSpeed
- getLastTenIndexedDocuments
I also think we want to re-index based on an iterator of doc-names.
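To make the suggestion above concrete, here is a rough Java sketch of what such an "IndexerProcess" could look like. Only the method names come from the list above; the return types, the Optional wrappers, the InMemoryIndexerProcess class and its indexNext method are my assumptions for illustration, not an existing XWiki API. It is fed from an iterator of doc-names, as proposed.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.Iterator;
import java.util.List;
import java.util.Optional;

/** Hypothetical progress view of a long-running re-index (return types are assumptions). */
interface IndexerProcess {
    int getQueueSize();
    int getPreQueueSize();
    Optional<IndexerProcess> getNextIndexerProcess();
    double getIndexingSpeed(); // documents per second
    Optional<Instant> getEstimatedCompletionDate();
    List<String> getLastTenIndexedDocuments();
}

/** Toy single-queue implementation; a real one would hand documents to Solr or Lucene. */
final class InMemoryIndexerProcess implements IndexerProcess {
    private final Deque<String> queue = new ArrayDeque<>();
    private final Deque<String> recent = new ArrayDeque<>();
    private final Instant startedAt;
    private Instant lastIndexedAt;
    private int indexedCount;

    InMemoryIndexerProcess(Iterator<String> docNames, Instant startedAt) {
        docNames.forEachRemaining(queue::add);
        this.startedAt = startedAt;
        this.lastIndexedAt = startedAt;
    }

    /** Marks the head of the queue as indexed at the given time; returns false when the queue is empty. */
    boolean indexNext(Instant now) {
        String doc = queue.poll();
        if (doc == null) {
            return false;
        }
        indexedCount++;
        lastIndexedAt = now;
        recent.addLast(doc);
        if (recent.size() > 10) {
            recent.removeFirst(); // keep only the last ten names
        }
        return true;
    }

    @Override public int getQueueSize() { return queue.size(); }

    @Override public int getPreQueueSize() { return 0; } // no pre-queue in this sketch

    @Override public Optional<IndexerProcess> getNextIndexerProcess() { return Optional.empty(); }

    @Override public double getIndexingSpeed() {
        long millis = Duration.between(startedAt, lastIndexedAt).toMillis();
        return millis <= 0 ? 0.0 : indexedCount * 1000.0 / millis;
    }

    @Override public Optional<Instant> getEstimatedCompletionDate() {
        double speed = getIndexingSpeed();
        if (speed <= 0.0) {
            return Optional.empty(); // nothing indexed yet, so no estimate
        }
        long remainingMillis = Math.round(queue.size() / speed * 1000.0);
        return Optional.of(lastIndexedAt.plusMillis(remainingMillis));
    }

    @Override public List<String> getLastTenIndexedDocuments() { return new ArrayList<>(recent); }
}
```

With something like this, rebuildIndex could return the process object instead of a bare int, and an admin page could poll it while the re-index runs.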
*getSearchResults*
public SearchResponse *getSearchResults*(java.lang.String query,
java.lang.String languages,
com.xpn.xwiki.api.XWiki wiki)*Returns:*
public SearchResponse *getSearchResults*(java.lang.String query, java.lang.String
virtualWikiNames, java.lang.String languages, com.xpn.xwiki.api.XWiki wiki)
Searches the configured Indexes using the specified query for documents in
the given languages belonging to one of the given virtual wikis.
*Parameters:*
query - query string given by the user
languages - comma-separated list of language codes to search in; may be
null to search all languages. Language codes can be:
- default, for content having no specific language information
- lowercase two-letter language codes such as en, es, fr.
virtualWikiNames - Names of the virtual wikis to search in. May be null for
global search.
*Returns:*
A SearchResponse instance containing the results, i.e. the response objects.
[...]
A SearchResponse instance containing the results, i.e. the response objects.
You need to document things here.
I believe a simple source of inspiration is the structure of a SolrResponse:
methods such as getStart, getTotalResults, getSolrDocument(int) (a SolrDocument!),
getXWikiDocument, next.
Think about how this would be used in Velocity.
Consider the PageTool of SolrItas (which supports views that display, or not, links to
previous or next pages of results).
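As a rough illustration of the kind of response object I have in mind, here is a minimal Java sketch. The class name SearchResponsePage, the generic parameter D (standing in for a SolrDocument or an XWiki document) and the paging helpers are assumptions of mine; only getStart and getTotalResults are taken from the list above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Hypothetical page of results; D stands in for SolrDocument or an XWiki document. */
final class SearchResponsePage<D> {
    private final int start;
    private final int pageSize;
    private final long totalResults;
    private final List<D> documents;

    SearchResponsePage(int start, int pageSize, long totalResults, List<D> documents) {
        if (start < 0 || pageSize <= 0 || totalResults < 0) {
            throw new IllegalArgumentException("start >= 0, pageSize > 0 and totalResults >= 0 are required");
        }
        this.start = start;
        this.pageSize = pageSize;
        this.totalResults = totalResults;
        this.documents = Collections.unmodifiableList(new ArrayList<>(documents));
    }

    /** Offset of the first document of this page in the full result set. */
    int getStart() { return start; }

    /** Number of matches in the index, not just on this page. */
    long getTotalResults() { return totalResults; }

    /** Number of documents actually held by this page. */
    int size() { return documents.size(); }

    D getDocument(int index) { return documents.get(index); }

    /** True when a "previous" link should be shown. */
    boolean hasPrevious() { return start > 0; }

    /** True when a "next" link should be shown. */
    boolean hasNext() { return start + documents.size() < totalResults; }

    int getPreviousStart() { return Math.max(0, start - pageSize); }

    int getNextStart() { return start + pageSize; }
}
```

Bean-style getters and boolean helpers like these are easy to call from a Velocity template, e.g. to decide whether to render the previous and next links.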
I'm not sure whether to expose the startIndex as API.
Now that Sergiu has written, I understand the question!!
It is crucial that each method has a start int and a maxResults!
Maybe one or two convenience query methods can ignore them (and default to 0 and 100), but do
not let buffers get filled by huge result sets (and discourage any attempt to "go
through all documents").
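A small Java sketch of how the start/maxResults rule could be enforced in one place. The defaults 0 and 100 are the ones mentioned above; the ResultWindow class and the hard limit of 1000 are hypothetical values picked only for illustration.

```java
/** Hypothetical value object for the (start, maxResults) pair that every search method takes. */
final class ResultWindow {
    static final int DEFAULT_START = 0;
    static final int DEFAULT_MAX_RESULTS = 100;
    /** Assumed upper bound; a real value would come from configuration. */
    static final int HARD_LIMIT = 1000;

    final int start;
    final int maxResults;

    private ResultWindow(int start, int maxResults) {
        this.start = start;
        this.maxResults = maxResults;
    }

    /** Window used by convenience methods that take no paging arguments. */
    static ResultWindow defaults() {
        return new ResultWindow(DEFAULT_START, DEFAULT_MAX_RESULTS);
    }

    /** Validates the caller's values and caps maxResults so buffers stay small. */
    static ResultWindow of(int start, int maxResults) {
        if (start < 0) {
            throw new IllegalArgumentException("start must be >= 0");
        }
        if (maxResults <= 0) {
            throw new IllegalArgumentException("maxResults must be > 0");
        }
        return new ResultWindow(start, Math.min(maxResults, HARD_LIMIT));
    }
}
```

A caller asking for a million results would silently get at most HARD_LIMIT; whether to cap or to reject such a request is a design choice left open here.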
Also, please provide search methods that take query objects.
This is the only way to write an application whose search can be really
tweaked while preventing query-part injection (similar to SQL injection).
search(SolrQuery query, String wikis, List<String> langs, int start, int maxDocs)
(... or Object query??)
parseQuery(String q, List<String> langs, String queryModule)
(queryModule would correspond to a part of solrconfig.xml; maybe this is
too much, but it allows different Solr tunings to be exploited for different
search types)
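To show what I mean by query-part injection, here is a small Java sketch that builds the query from parts and escapes the user-supplied value instead of concatenating it raw. The SafeQuery class and its methods are hypothetical; the escaping is modelled loosely on what SolrJ's ClientUtils.escapeQueryChars does, and in real code I would call that method (or pass a SolrQuery object) rather than maintain my own list of special characters.

```java
/** Hypothetical helper: composes query parts so user input cannot add clauses of its own. */
final class SafeQuery {
    // Characters with a meaning in the Lucene/Solr query syntax (assumed list, see lead-in).
    private static final String SPECIAL = "\\+-!():^[]\"{}~*?|&;/";

    private SafeQuery() { }

    /** Backslash-escapes query-syntax characters and whitespace in user input. */
    static String escape(String userInput) {
        StringBuilder sb = new StringBuilder(userInput.length() + 8);
        for (int i = 0; i < userInput.length(); i++) {
            char c = userInput.charAt(i);
            if (SPECIAL.indexOf(c) >= 0 || Character.isWhitespace(c)) {
                sb.append('\\');
            }
            sb.append(c);
        }
        return sb.toString();
    }

    /** A single field:value clause; the field name is trusted, the value is not. */
    static String fieldEquals(String field, String userInput) {
        return field + ":" + escape(userInput);
    }

    /** Joins already-built clauses with AND, each wrapped in parentheses. */
    static String and(String left, String right) {
        return "(" + left + ") AND (" + right + ")";
    }
}
```

For example, a user typing `foo OR space:Private` into a title search yields the single clause `title:foo\ OR\ space\:Private` rather than a query that also reaches into another space.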
*Admin Module:*
I have some thoughts on APIs for admin settings where one could check the
precision and recall results based on some known documents and could tweak the boost
values accordingly.
I believe I will need more customizability in the API but maybe that can be offered in the
Admin module where such things as "ReIndexingPolicy" or
"IndexDataFactory" can be configured for particular applications to be either
Java classes or Groovy pages.
paul