List: solr-dev
Subject: [jira] Issue Comment Edited: (SOLR-1855) Script to monitor Solr
From: "Shawn Smith (JIRA)" <jira () apache ! org>
Date: 2010-03-30 21:58:27
Message-ID: 1385855717.590701269986307380.JavaMail.jira () brutus ! apache ! org
[ https://issues.apache.org/jira/browse/SOLR-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12851462#action_12851462 ]
Shawn Smith edited comment on SOLR-1855 at 3/30/10 9:58 PM:
------------------------------------------------------------
I've attached a first-pass implementation of this script as the 'checksolr'
attachment. It's basically the script we use in our production environment to
monitor Solr health. As such, it's not completely generic, but it should be a
decent start:
* bash script tested only on Linux
* dependencies on curl, xmllint, xmlstarlet (curl, libxml2, xmlstarlet packages)
* assumes the URL structure of the default multi-core Solr configuration
(http://<host>:<port>/solr/admin/cores, .../solr/<core>/admin/ping,
.../solr/<core>/replication?command=details)
* checks slave replication health assuming Solr 1.4 Java replication
* dynamically determines the set of Solr cores, so it's useful in a multi-core
deployment where the set of cores may change relatively often
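
To illustrate the core discovery and ping steps listed above, here is a minimal
sketch. It is not the attached checksolr script. The function names (fetch,
core_names, ping_status, check_cores) are hypothetical, and to stay
dependency-free it pulls values out of the XML with grep and sed instead of
xmllint/xmlstarlet. It assumes each core shows up in the cores status response
as <str name="name">CORE</str> and that the ping response carries
<str name="status">OK</str>.

```shell
#!/usr/bin/env bash
# Sketch only: not the attached checksolr script.

# Fetch a URL quietly and fail on HTTP errors (hypothetical helper).
fetch() {
    curl --silent --fail --max-time 10 "$1"
}

# Read a cores status response on stdin; print one core name per line.
# Assumption: each core is listed as <str name="name">CORE</str>.
core_names() {
    grep -o '<str name="name">[^<]*</str>' |
        sed -e 's/^<str name="name">//' -e 's/<\/str>$//'
}

# Read a ping response on stdin; print the value of its "status" field.
ping_status() {
    grep -o '<str name="status">[^<]*</str>' |
        sed -e 's/^<str name="status">//' -e 's/<\/str>$//' |
        head -n 1
}

# Ping every core on HOST PORT; return 1 if any core does not answer "OK".
check_cores() {
    local base="http://$1:$2/solr" xml cores core status rc=0
    xml=$(fetch "$base/admin/cores?action=STATUS") || {
        echo "Could not list cores on $1:$2."
        return 1
    }
    cores=$(printf '%s\n' "$xml" | core_names)
    if [ -z "$cores" ]; then
        echo "No cores found on $1:$2."
        return 1
    fi
    for core in $cores; do
        status=$(fetch "$base/$core/admin/ping" | ping_status)
        echo "Core \"$core\" returned \"$status\"."
        [ "$status" = "OK" ] || rc=1
    done
    return "$rc"
}
```

Calling check_cores localhost 8983 would then print one line per core and
return non-zero if any ping fails. A real XML parser is more robust than the
grep/sed extraction used here.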
Example usage:
{noformat}
$ ./checksolr -?
Usage:
    checksolr [OPTIONS]
Options:
    --help | -h
        Print the brief help message and exit.
    --man
        Print the manual page and exit.
    --host | -H HOST
        Check this host instead of localhost.
    --port | -P Port
        Use this port instead of the default(8983) to connect.
    --diff | -D Time difference between now and when solr last replicated
        Use this option to set the maximum difference in seconds between the
        time when the solr slave replicated and now.
    --slave
        Perform slave checks on the host instead of ping tests.

$ ./checksolr --host solrmaster1
Core "core0" returned "OK".
Core "core1" returned "OK".
Core "core2" returned "OK".
$ echo $?
0

$ ./checksolr --slave --host solrslave1
Core "core0" is up to date.
Core "core1" is up to date.
Core "core2" is having trouble replicating.
$ echo $?
1
{noformat}
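
As a rough illustration of what a --diff style check could look like, the
sketch below compares the time of the last replication against the current
time. It is not taken from the attached script. The function names are
hypothetical, and it assumes the replication details response exposes the last
replication time as <str name="indexReplicatedAt">...</str> in a format that
GNU date can parse (so Linux only).

```shell
#!/usr/bin/env bash
# Sketch only: illustrates a --diff style staleness check,
# not the logic of the attached checksolr script.

# Print the value of the first <str name="FIELD"> element read from stdin.
xml_str_field() {
    grep -o "<str name=\"$1\">[^<]*</str>" |
        head -n 1 |
        sed -e 's/^<str[^>]*>//' -e 's/<\/str>$//'
}

# Convert a timestamp such as "Tue Mar 30 21:50:00 UTC 2010" to epoch seconds.
# Relies on GNU date (-d).
to_epoch() {
    date -d "$1" +%s 2>/dev/null
}

# Succeed if the last replication is at most MAX_DIFF seconds old.
# Arguments: LAST_EPOCH NOW_EPOCH MAX_DIFF
is_fresh() {
    [ $(( $2 - $1 )) -le "$3" ]
}

# Read one core's replication details XML on stdin and report on it.
# Arguments: CORE MAX_DIFF [NOW_EPOCH]
check_replication() {
    local core="$1" max_diff="$2" now="${3:-$(date +%s)}" stamp last
    # Assumption: the slave reports its last replication as indexReplicatedAt.
    stamp=$(xml_str_field indexReplicatedAt)
    if [ -n "$stamp" ] && last=$(to_epoch "$stamp") &&
        is_fresh "$last" "$now" "$max_diff"; then
        echo "Core \"$core\" is up to date."
    else
        echo "Core \"$core\" is having trouble replicating."
        return 1
    fi
}
```

For example, piping the output of
curl --silent "http://solrslave1:8983/solr/core0/replication?command=details"
into check_replication core0 3600 would flag core0 if it last replicated more
than an hour ago. A missing or unparseable timestamp is treated as a failure.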
> Script to monitor Solr health including replication status
> ----------------------------------------------------------
>
> Key: SOLR-1855
> URL: https://issues.apache.org/jira/browse/SOLR-1855
> Project: Solr
> Issue Type: New Feature
> Components: replication (java)
> Affects Versions: 1.4
> Reporter: Shawn Smith
> Attachments: checksolr
>
>
> It would be useful to have a simple monitor script that checks the health of
> all cores on a solr server.
> # Call the "ping" command and verify success.
> # Check for replication failures, for replication slaves.
> The script should return a non-zero exit code if any serious errors are
> discovered. This should make it easy to plug the script into a monitoring
> framework (Nagios, etc.)
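
One way the non-zero exit code could be plugged into Nagios is a thin wrapper
that maps it onto the Nagios plugin return codes (0 = OK, 2 = CRITICAL) and
puts a one-line summary first. This is only a sketch. The nagios_wrap name and
the summary wording are made up for illustration and are not part of the
attached script.

```shell
#!/usr/bin/env bash
# Sketch only: adapts any health-check command to Nagios plugin conventions.

# Run the given command, print a one-line summary followed by the command's
# own output, and return 0 (OK) or 2 (CRITICAL).
nagios_wrap() {
    local output rc
    output=$("$@" 2>&1) && rc=0 || rc=$?
    if [ "$rc" -eq 0 ]; then
        echo "SOLR OK - all checks passed"
        printf '%s\n' "$output"
        return 0
    fi
    echo "SOLR CRITICAL - check exited with status $rc"
    printf '%s\n' "$output"
    return 2
}
```

A Nagios command definition could then call something like
nagios_wrap ./checksolr --slave --host solrslave1, and the per-core lines
would appear as the long plugin output.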
--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.