List: solr-user
Subject: Re: Missing Cores on one node and error running SPLIT command?
From: Jan Høydahl <jan.asf@cominvent.com>
Date: 2021-09-22 16:23:07
Message-ID: CE1A7AF8-772D-4B74-BFF2-E1D04ED7D644@cominvent.com
According to CLUSTERSTATUS, all the replicas/shards for all collections are located
on the 7574 node, and all of them are "down". So I suspect the reason for the failed
SPLITSHARD command is that the node is not healthy. I would first try to reboot both
nodes, and then see if they come up correctly.
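If I remember the internals correctly, SPLITSHARD's disk-space check reads the leader core's index size from the metrics API, and a core that is down typically cannot report it, which would explain the "missing index size information" message. As a sketch (host and port taken from your CLUSTERSTATUS output, adjust as needed), you could check what the node actually reports:

```shell
# Query the per-core INDEX.sizeInBytes metric that the SPLITSHARD
# disk-space check relies on. If the customer_shard1 core is down,
# this metric will likely be missing for it.
curl "http://172.30.1.104:7574/solr/admin/metrics?group=core&prefix=INDEX.sizeInBytes"
```

If that metric is absent or empty for customer_shard1_replica_n1, it would match the error you saw.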
I suppose this is a test environment, since you deploy two nodes on the same physical
box. Note that when doing so, you should ideally separate them into two completely
separate installs, or at least completely separate SOLR_HOME folders, e.g.
/var/solr1/data and /var/solr2/data. If you start two nodes from the same install,
you may end up in a bad state.
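For example, something like this (a sketch only — the ZooKeeper address is an assumption, adjust to your setup):

```shell
# Two nodes sharing one Solr install, but with completely separate
# SOLR_HOME directories so their cores and state never collide.
# The ZooKeeper host:port below is a placeholder, not from your setup.
bin/solr start -cloud -p 8983 -s /var/solr1/data -z localhost:2181
bin/solr start -cloud -p 7574 -s /var/solr2/data -z localhost:2181
```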
Jan
> On 22 Sep 2021, at 15:38, Charlie Hubbard <charlie.hubbard@gmail.com> wrote:
>
> Hi Jan,
>
> Here is a link to the image:
>
> https://app.box.com/s/mxidpcbm2lezm9ts47k2wgqwm34tytce
>
> From the logs:
>
> Collection: customer operation: splitshard
> failed:org.apache.solr.common.SolrException: missing index size
> information for parent shard leader
> at org.apache.solr.cloud.api.collections.SplitShardCmd.checkDiskSpace(SplitShardCmd.java:657)
> at org.apache.solr.cloud.api.collections.SplitShardCmd.split(SplitShardCmd.java:159)
> at org.apache.solr.cloud.api.collections.SplitShardCmd.call(SplitShardCmd.java:102)
> at org.apache.solr.cloud.api.collections.OverseerCollectionMessageHandler.processMessage(OverseerCollectionMessageHandler.java:270)
> at org.apache.solr.cloud.OverseerTaskProcessor$Runner.run(OverseerTaskProcessor.java:524)
> at org.apache.solr.common.util.ExecutorUtil$MDCAwareThreadPoolExecutor.lambda$execute$0(ExecutorUtil.java:218)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>
> CLUSTERSTATUS output:
>
> {
>   "responseHeader": {
>     "status": 0,
>     "QTime": 78
>   },
>   "cluster": {
>     "collections": {
>       "ultipro_audit": {
>         "pullReplicas": "0",
>         "replicationFactor": "1",
>         "shards": {
>           "shard1": {
>             "range": "80000000-ffffffff",
>             "state": "active",
>             "replicas": {
>               "core_node3": {
>                 "core": "ultipro_audit_shard1_replica_n1",
>                 "node_name": "172.30.1.104:7574_solr",
>                 "base_url": "http://172.30.1.104:7574/solr",
>                 "state": "down",
>                 "type": "NRT",
>                 "force_set_state": "false",
>                 "leader": "true"
>               }
>             },
>             "health": "RED"
>           },
>           "shard2": {
>             "range": "0-7fffffff",
>             "state": "active",
>             "replicas": {
>               "core_node4": {
>                 "core": "ultipro_audit_shard2_replica_n2",
>                 "node_name": "172.30.1.104:7574_solr",
>                 "base_url": "http://172.30.1.104:7574/solr",
>                 "state": "down",
>                 "type": "NRT",
>                 "force_set_state": "false",
>                 "leader": "true"
>               }
>             },
>             "health": "RED"
>           }
>         },
>         "router": {
>           "name": "compositeId"
>         },
>         "maxShardsPerNode": "-1",
>         "autoAddReplicas": "false",
>         "nrtReplicas": "1",
>         "tlogReplicas": "0",
>         "health": "RED",
>         "znodeVersion": 15,
>         "configName": "ultipro_audit"
>       },
>       "audit": {
>         "pullReplicas": "0",
>         "replicationFactor": "1",
>         "shards": {
>           "shard1": {
>             "range": "80000000-ffffffff",
>             "state": "active",
>             "replicas": {
>               "core_node3": {
>                 "core": "audit_shard1_replica_n1",
>                 "node_name": "172.30.1.104:7574_solr",
>                 "base_url": "http://172.30.1.104:7574/solr",
>                 "state": "down",
>                 "type": "NRT",
>                 "force_set_state": "false",
>                 "leader": "true"
>               }
>             },
>             "health": "RED"
>           },
>           "shard2": {
>             "range": "0-7fffffff",
>             "state": "active",
>             "replicas": {
>               "core_node4": {
>                 "core": "audit_shard2_replica_n2",
>                 "node_name": "172.30.1.104:7574_solr",
>                 "base_url": "http://172.30.1.104:7574/solr",
>                 "state": "down",
>                 "type": "NRT",
>                 "force_set_state": "false",
>                 "leader": "true"
>               }
>             },
>             "health": "RED"
>           }
>         },
>         "router": {
>           "name": "compositeId"
>         },
>         "maxShardsPerNode": "-1",
>         "autoAddReplicas": "false",
>         "nrtReplicas": "1",
>         "tlogReplicas": "0",
>         "health": "RED",
>         "znodeVersion": 13,
>         "configName": "audit"
>       },
>       "customer": {
>         "pullReplicas": "0",
>         "replicationFactor": "1",
>         "shards": {
>           "shard1": {
>             "range": "80000000-ffffffff",
>             "state": "active",
>             "replicas": {
>               "core_node3": {
>                 "core": "customer_shard1_replica_n1",
>                 "node_name": "172.30.1.104:7574_solr",
>                 "base_url": "http://172.30.1.104:7574/solr",
>                 "state": "down",
>                 "type": "NRT",
>                 "force_set_state": "false",
>                 "leader": "true"
>               }
>             },
>             "health": "RED"
>           },
>           "shard2": {
>             "range": "0-7fffffff",
>             "state": "active",
>             "replicas": {
>               "core_node4": {
>                 "core": "customer_shard2_replica_n2",
>                 "node_name": "172.30.1.104:7574_solr",
>                 "base_url": "http://172.30.1.104:7574/solr",
>                 "state": "down",
>                 "type": "NRT",
>                 "force_set_state": "false",
>                 "leader": "true"
>               }
>             },
>             "health": "RED"
>           }
>         },
>         "router": {
>           "name": "compositeId"
>         },
>         "maxShardsPerNode": "-1",
>         "autoAddReplicas": "false",
>         "nrtReplicas": "1",
>         "tlogReplicas": "0",
>         "health": "RED",
>         "znodeVersion": 15,
>         "configName": "customer"
>       },
>       "fusearchiver": {
>         "pullReplicas": "0",
>         "replicationFactor": "1",
>         "shards": {
>           "shard1": {
>             "range": "80000000-ffffffff",
>             "state": "active",
>             "replicas": {
>               "core_node3": {
>                 "core": "fusearchiver_shard1_replica_n1",
>                 "node_name": "172.30.1.104:7574_solr",
>                 "base_url": "http://172.30.1.104:7574/solr",
>                 "state": "down",
>                 "type": "NRT",
>                 "force_set_state": "false",
>                 "leader": "true"
>               }
>             },
>             "health": "RED"
>           },
>           "shard2": {
>             "range": "0-7fffffff",
>             "state": "active",
>             "replicas": {
>               "core_node4": {
>                 "core": "fusearchiver_shard2_replica_n2",
>                 "node_name": "172.30.1.104:7574_solr",
>                 "base_url": "http://172.30.1.104:7574/solr",
>                 "state": "down",
>                 "type": "NRT",
>                 "force_set_state": "false",
>                 "leader": "true"
>               }
>             },
>             "health": "RED"
>           }
>         },
>         "router": {
>           "name": "compositeId"
>         },
>         "maxShardsPerNode": "-1",
>         "autoAddReplicas": "false",
>         "nrtReplicas": "1",
>         "tlogReplicas": "0",
>         "health": "RED",
>         "znodeVersion": 12,
>         "configName": "fusearchiver"
>       }
>     },
>     "live_nodes": [
>       "172.30.1.104:7574_solr",
>       "172.30.1.104:8983_solr"
>     ]
>   }
> }
>
>
> On Wed, Sep 22, 2021 at 8:48 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>
> > The image did not make it to the list. Please try uploading it elsewhere,
> > or copy the text only.
> >
> > Can you check the solr.log and paste relevant section from it?
> > Was the "empty" node supposed to have a core from the collection? Can you
> > do a CLUSTERSTATUS command and paste the output here?
> >
> > Jan
> >
> > > On 22 Sep 2021, at 13:41, Charlie Hubbard <charlie.hubbard@gmail.com> wrote:
> > >
> > > Hi,
> > >
> > > I have a simple 2-node Solr cluster running 8.9 (1 node on 8983 and 1
> > > node on 7574), but one of my nodes (8983) shows collections, but no cores
> > > in the admin interface. When I run the following command I get the error
> > > "missing index size information for parent shard leader":
> > >
> > > curl "http://localhost:7574/solr/admin/collections?action=SPLITSHARD&collection=customer&shard=shard1&wt=xml"
> > >
> > > Here is the error:
> > >
> > > Any ideas why the cores are missing on my 8983 node? I tried moving it
> > to 8984 because I had some trouble with caching before, but it had the same
> > results.
> > > TIA
> > > Charlie
> >
> >