At the moment, documents are indexed via the index API, which chunks, embeds, and indexes them in Solr. Unfortunately, this provides no insight into how many indexing tasks are pending or which documents have already been indexed. Furthermore, if the provider of the embedding model is temporarily down, documents that failed to index for that reason are not retried once the provider comes back. It therefore seems inevitable to implement a custom indexing queue for the LLM application and a custom worker that indexes the documents.

Features to implement:
- Wait between indexing requests when the embedding model is failing (back off on errors)
- Track and display which documents had an error during indexing
- Display how many outstanding index requests there are for each collection and each embedding model
- Support queuing whole collections for re-indexing when the collection's configuration changes (asking the user for confirmation)
- Allow re-triggering the indexing of documents that failed to be indexed
- Store the queue in the database such that it persists across instance restarts
- Submit multiple chunks to the embedding model in parallel, including chunks from different documents, with a configurable limit per embedding model
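A minimal sketch of what the persistent queue and retrying worker could look like, assuming SQLite as the backing store; all names here (`index_queue`, `enqueue`, `process_next`, the `status` values, etc.) are hypothetical and not part of any existing API:

```python
import sqlite3
import time

# Hypothetical schema: one row per queued document. Because the queue lives
# in the database, it survives instance restarts.
SCHEMA = """
CREATE TABLE IF NOT EXISTS index_queue (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    collection      TEXT NOT NULL,
    document_id     TEXT NOT NULL,
    embedding_model TEXT NOT NULL,
    status          TEXT NOT NULL DEFAULT 'pending',  -- pending | done | error
    error           TEXT,
    attempts        INTEGER NOT NULL DEFAULT 0
)
"""


def enqueue(conn, collection, document_id, model):
    """Queue a document for (re-)indexing with a given embedding model."""
    conn.execute(
        "INSERT INTO index_queue (collection, document_id, embedding_model)"
        " VALUES (?, ?, ?)",
        (collection, document_id, model),
    )
    conn.commit()


def pending_counts(conn):
    """Outstanding requests per collection/model pair, for display."""
    rows = conn.execute(
        "SELECT collection || '/' || embedding_model, COUNT(*)"
        " FROM index_queue WHERE status = 'pending' GROUP BY 1"
    )
    return dict(rows)


def retry_failed(conn, collection):
    """Re-trigger indexing of all documents that previously errored."""
    conn.execute(
        "UPDATE index_queue SET status = 'pending', error = NULL"
        " WHERE collection = ? AND status = 'error'",
        (collection,),
    )
    conn.commit()


def process_next(conn, embed, max_attempts=3, backoff_seconds=0.0):
    """Take one pending task; on failure, wait and retry up to max_attempts.

    `embed` stands in for the real chunk-embed-index call. Returns False
    when the queue has no pending work left.
    """
    row = conn.execute(
        "SELECT id, document_id, attempts FROM index_queue"
        " WHERE status = 'pending' ORDER BY id LIMIT 1"
    ).fetchone()
    if row is None:
        return False
    task_id, doc_id, attempts = row
    try:
        embed(doc_id)
        conn.execute(
            "UPDATE index_queue SET status = 'done' WHERE id = ?", (task_id,)
        )
    except Exception as exc:
        if attempts + 1 >= max_attempts:
            # Give up and record the error so it can be shown and retried later.
            conn.execute(
                "UPDATE index_queue SET status = 'error', error = ?,"
                " attempts = attempts + 1 WHERE id = ?",
                (str(exc), task_id),
            )
        else:
            # Leave the task pending and wait before the next attempt.
            conn.execute(
                "UPDATE index_queue SET attempts = attempts + 1 WHERE id = ?",
                (task_id,),
            )
            time.sleep(backoff_seconds)
    conn.commit()
    return True
```

The per-model parallelism limit from the list above could then be layered on top, e.g. one worker pool (or semaphore) per `embedding_model` value, sized from configuration; queuing a whole collection for re-indexing would just be a bulk `enqueue` after the user confirms.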