List:       drill-dev
Subject:    Re: Review Request 27271: DRILL-1592: Detect drillbit node failure and cancel affected running queri
From:       "Jinfeng Ni" <jni () maprtech ! com>
Date:       2014-10-31 18:08:09
Message-ID: 20141031180809.7138.72355 () reviews ! apache ! org


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/27271/
-----------------------------------------------------------

(Updated Oct. 31, 2014, 11:08 a.m.)


Review request for drill and Jacques Nadeau.


Changes
-------

Revise code based on new comments.

1. Split the notification into two: drillbitRegistered() and drillbitUnregistered(). For now, the cluster will only take action for drillbitUnregistered().
2. Remove the unused map in LocalClusterCoordinator.


Repository: drill-git


Description
-------

Use the cluster's ZooKeeper (ZK) to keep track of the available drillbit nodes. Whenever the ClusterCoordinator detects a node change, it notifies either the Foreman or the non-root fragment's FragmentExecutor. The notification leads the drillbit to cancel the affected queries.
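
As a rough illustration of the notification step (not the code in the patch), a coordinator could diff each new membership snapshot against the previous one and invoke the two callbacks named in the Changes section. The class names, the use of plain strings for endpoints, and the update() method are assumptions made for this sketch only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class NodeChangeSketch {

    // Only the two method names come from the review; the parameter type is assumed.
    interface Listener {
        void drillbitRegistered(Set<String> registered);
        void drillbitUnregistered(Set<String> unregistered);
    }

    // Hypothetical coordinator: remembers the last membership snapshot and
    // reports the difference whenever a new snapshot arrives (e.g. from ZK).
    static class Coordinator {
        private final List<Listener> listeners = new ArrayList<>();
        private Set<String> active = new HashSet<>();

        void addListener(Listener listener) {
            listeners.add(listener);
        }

        void update(Set<String> snapshot) {
            Set<String> added = new HashSet<>(snapshot);
            added.removeAll(active);      // present now, absent before
            Set<String> removed = new HashSet<>(active);
            removed.removeAll(snapshot);  // present before, absent now
            active = new HashSet<>(snapshot);

            for (Listener listener : listeners) {
                if (!added.isEmpty()) {
                    listener.drillbitRegistered(added);
                }
                if (!removed.isEmpty()) {
                    listener.drillbitUnregistered(removed);
                }
            }
        }
    }

    public static void main(String[] args) {
        Coordinator coordinator = new Coordinator();
        coordinator.addListener(new Listener() {
            @Override
            public void drillbitRegistered(Set<String> registered) {
                System.out.println("registered: " + new TreeSet<>(registered));
            }

            @Override
            public void drillbitUnregistered(Set<String> unregistered) {
                System.out.println("unregistered: " + new TreeSet<>(unregistered));
            }
        });

        coordinator.update(new HashSet<>(Arrays.asList("node1", "node2")));
        coordinator.update(new HashSet<>(Arrays.asList("node2")));
    }
}
```

Running main reports node1 and node2 as registered on the first snapshot and node1 as unregistered on the second.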

The basic design is to register a DrillbitStatusListener with the ClusterCoordinator. QueryManager and FragmentExecutor each implement a different DrillbitStatusListener.
1. The Foreman's DrillbitStatusListener checks whether the active drillbits still contain all the drillbits running the query. If not, it sends a cancel request to the non-root fragments.
2. The non-root fragment's DrillbitStatusListener checks whether the foreman drillbit is among the active drillbits. If not, the fragment cancels itself.
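
The two checks above can be sketched as follows. This is a simplified illustration under stated assumptions, not the patch itself: drillbits are represented as plain strings, the callback name activeDrillbitsChanged() is hypothetical, and setting a boolean flag stands in for the real cancel logic.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

class StatusListenerSketch {

    // Hypothetical callback; it receives the current set of active drillbits.
    interface Listener {
        void activeDrillbitsChanged(Set<String> active);
    }

    // Foreman side: the query is affected if any drillbit that runs one of
    // its fragments is no longer active.
    static class ForemanListener implements Listener {
        private final Set<String> fragmentHosts;
        boolean cancelSent = false;

        ForemanListener(Set<String> fragmentHosts) {
            this.fragmentHosts = fragmentHosts;
        }

        @Override
        public void activeDrillbitsChanged(Set<String> active) {
            if (!active.containsAll(fragmentHosts)) {
                cancelSent = true; // stands in for sending cancel requests to non-root fragments
            }
        }
    }

    // Non-root fragment side: the fragment is affected if its foreman is gone.
    static class FragmentListener implements Listener {
        private final String foremanHost;
        boolean cancelled = false;

        FragmentListener(String foremanHost) {
            this.foremanHost = foremanHost;
        }

        @Override
        public void activeDrillbitsChanged(Set<String> active) {
            if (!active.contains(foremanHost)) {
                cancelled = true; // stands in for the fragment cancelling itself
            }
        }
    }

    public static void main(String[] args) {
        ForemanListener foreman =
            new ForemanListener(new HashSet<>(Arrays.asList("node1", "node2")));
        FragmentListener fragment = new FragmentListener("node0");

        // node2 has dropped out; the foreman's node (node0) is still active.
        Set<String> active = new HashSet<>(Arrays.asList("node0", "node1"));
        foreman.activeDrillbitsChanged(active);
        fragment.activeDrillbitsChanged(active);

        System.out.println("foreman cancels query: " + foreman.cancelSent);      // true
        System.out.println("fragment cancels itself: " + fragment.cancelled);    // false
    }
}
```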


Diffs (updated)
-----

  exec/java-exec/src/main/java/org/apache/drill/exec/coord/ClusterCoordinator.java 508a5b2
  exec/java-exec/src/main/java/org/apache/drill/exec/coord/local/LocalClusterCoordinator.java 035c1aa
  exec/java-exec/src/main/java/org/apache/drill/exec/coord/zk/ZKClusterCoordinator.java 7f538d2
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/FragmentContext.java 9b78c1d
  exec/java-exec/src/main/java/org/apache/drill/exec/ops/QueryContext.java ea48b05
  exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/DrillbitStatusListener.java PRE-CREATION
  exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/Foreman.java 0163f55
  exec/java-exec/src/main/java/org/apache/drill/exec/work/foreman/QueryManager.java b200edc
  exec/java-exec/src/main/java/org/apache/drill/exec/work/fragment/FragmentExecutor.java 37074d8

Diff: https://reviews.apache.org/r/27271/diff/


Testing
-------

Unit test. 

Pending: functional / TPC-H SF100.


Thanks,

Jinfeng Ni



