Solrcloud 4 inconsistent results count-commitwithinms issue:
Issue description:
I have four nodes solrcloud setup version 4.10 and my collection has 4 shards, 2 replicas and 5 zookeeper nodes. My application provide the search ability with realtime data ingestion, both data ingestion and search processes runs in parallel.
Every day the data load is around 2~3MM records(insert/update operations) and total documents count is 80MM+.
The problem I was facing is that the solr returns very inconsistent records count during peak time of data ingestion.
Sample query:
for i in `seq 1 50`; do curl 'http://localhost:8888/solr/OPTUM/select?q=\*:\*&wt=json&indent=true'|grep numFound|rev|cut -d'{' -f1 |rev done
The response numfound
variable shows sometime very less documents count then actually present in solr.
Solution:
I could not find exact root cause of this problem but temporarily I did a work around to resolve this error.
I had been using solrj4.x softcommit method(UpdateRequest.setCommitWithin( commitWithinMs )) which I commented and used all commit strategy at solr end.
<autocommit>
<maxtime>15000</maxtime>
<opensearcher>false</opensearcher>
</autocommit><autosoftcommit>
<maxtime>2000</maxtime>
</autosoftcommit>
I am getting consistent result from solr but still I am not sure why solrj client side commit isn’t working.