
List:       incubator-cvs
Subject:    [Incubator Wiki] Update of "SliderProposal" by SteveLoughran
From:       Apache Wiki <wikidiffs () apache ! org>
Date:       2014-03-31 18:45:40
Message-ID: 20140331184540.82380.44574 () eos ! apache ! org

Dear Wiki user,

You have subscribed to a wiki page or wiki category on "Incubator Wiki" for change \
notification.

The "SliderProposal" page has been changed by SteveLoughran:
https://wiki.apache.org/incubator/SliderProposal

Comment:
slider proposal -successor to the hoya proposal

New page:
= Slider Proposal =

== Abstract ==

'''Slider''' is a collection of tools and technologies to package, deploy, and manage \
long-running applications on Apache Hadoop YARN clusters.

=== Background ===

Slider is a framework to support deployment and management of arbitrary applications \
on YARN and leverage YARN's resource management capabilities without having to \
rewrite the applications. Slider is actively being worked on to expand the ecosystem \
of applications that can be easily deployed and managed on Apache Hadoop YARN \
clusters.

The core Slider technologies were initially developed at Hortonworks as part of the \
''Hoya'' project -- an effort to support the deployment of HBase and later Accumulo \
clusters in YARN. This work showed the value of supporting more applications on YARN, \
showed that the client should be an API rather than just a command line, and \
identified the key issues that need to be addressed.

Slider is an evolution of the previous proposal: it now includes agent-based \
deployment and makes packaging applications so that they are deployable and \
manageable a core area of work.

== Rationale ==

Hadoop YARN offers the following key capabilities: 

''Availability (always-on)'' - YARN works with the application to ensure recovery or \
restart of running application components.

''Flexibility (dynamic scaling)'' - YARN provides the application with the facilities \
to allow for scale-up or scale-down.

''Resource Management'' - YARN handles allocation of cluster resources, and hence \
the scheduling of work across a Hadoop cluster.

Today, developers need to design or re-engineer their applications to operate in a \
YARN cluster using the YARN APIs and YARN's application architecture.
Slider's objective is to make it easy for existing distributed applications to be \
deployed on a YARN cluster without changes and with little or no custom code.

== Proposal Details ==

Slider allows users to deploy distributed applications across a Hadoop cluster, \
leveraging the YARN Resource Manager to allocate and distribute components of an \
application across the cluster. Key characteristics of Slider:
 * No need to change the application code [as long as the application follows \
developer guidelines]
 * No need to develop a custom Application Master or other YARN code
 * Slider leverages YARN facilities to manage:
  * Application recovery in cases of container failure
  * Resource allocation and flexing (adding/removing containers)

== Initial Goals ==
 1. Donate the Slider source code and documentation to the Apache Software Foundation
 1. Set up and standardize the open governance of the Slider project
 1. Build a user and developer community
 1. Tie in better with Apache HBase, Apache Accumulo, and other projects -- both ASF \
and external -- that can be deployed in a YARN cluster without any code changes
 1. Improve Slider capabilities to expand the list of apps that can be deployed on YARN \
using Slider

== Longer Term Goals ==
There are some longer term possibilities that could improve Slider:
 1. Implement a reusable management API for managing Slider applications by tools \
such as Apache Ambari.
 1. Provide a Java API to ease creation and manipulation of Slider-deployed clusters \
by other programs.
 1. Address the service registration and discovery problem, to aid discovery and \
binding to YARN applications.
 1. Explore load-driven cluster sizing.
 1. Collaborate with other YARN applications, libraries, and frameworks to develop \
better libraries for YARN applications and their clients, monitoring and management, \
and configuration.

Slider is driving YARN service support via YARN-896. We intend to evolve features and \
get practical experience using them before merging them into the Hadoop codebase.

== Current Status ==

Slider is currently under active development and functions end to end following the \
Slider specifications. 

=== Meritocracy ===

The core of Slider was originally driven by Steve Loughran, who has long-standing \
experience in Apache projects, and is being advanced with significant contributions \
from Ted Yu, Josh Elser, Billie Rinaldi, Sumit Mohanty, and Jon Maron, who have deep \
experience architecting and implementing key parts of Apache projects including \
HBase, Accumulo, and Ambari, as well as other open source projects.

=== Community ===

We are happy to report that there are folks in Accumulo, HBase, and some users \
outside Hortonworks who are closely involved in the project already.

We hope to extend the user and developer base further in the future and build a solid \
open source community around Slider, growing the community and adding committers \
following the Apache meritocracy model.

=== Alignment ===

The project is completely aligned with Apache, from its build process up. It depends \
on Apache Hadoop, and it currently deploys HBase and Accumulo.

Slider and Apache Samza are driving the work of supporting long-lived services in \
YARN. While much of this work relates to service longevity, there is also the \
challenge of having low-latency table lookups co-exist with CPU- and IO-intensive \
analytics workloads.

=== Relationship with Apache Twill ===

Twill is a library that one can use to write YARN applications. Slider aims to \
provide a general framework with which one can take existing applications (HBase and \
Accumulo to start with) and make them run well in a YARN cluster, without intruding \
at all into their internals.

The key differentiators are:
 * '''Long-lived static applications''': the application's containers are expected to \
be relatively stable, with their termination being an unexpected event to which \
Slider must react.
 * '''No application code changes''': The only glue between the App and Slider is a \
Slider interface that the App needs to implement for it to be deployable/manageable \
by Slider.

Twill and Slider are therefore very different: the former is a convenience library \
for new YARN applications, the latter a YARN framework to adapt existing applications \
to YARN.

While Slider can be written using Twill libraries (which is something we should \
pursue as part of medium- to long-term collaboration between the two projects), the \
goals of the two projects are different: Twill will continue to make YARN application \
developers' lives easier, while Slider is a framework that can deploy distributed \
applications easily in a YARN cluster and perform basic management operations.

Capabilities such as dynamic patching of the application's configuration to run in \
the YARN cluster, failure detection, reacting to failures, storing application state \
to facilitate better application restart behavior, etc. are under the purview of \
Slider. 

Management frameworks could use Slider as a tool to start/stop/shrink/expand an \
instance of an application.

=== Relationship with Apache Helix ===
Slider shares some common goals with Apache Helix. Helix is more sophisticated and is \
designed to work standalone. Slider is designed to work only in the context of a YARN \
cluster, and focuses on that YARN integration.

We have discussed Slider with the Helix team, and feel that the work we are doing in \
YARN integration, and driving YARN changes, will be of direct benefit to Helix. We \
plan to collaborate on features which can be shared across both projects.

=== Relationship with Apache Accumulo and Apache HBase ===

Slider offers Accumulo and HBase flexible operation in a YARN cluster. As such, it \
should expand the uses of these applications and their user base.

There may be some changes that the applications can make to help them live more \
easily in a YARN cluster, and to be managed by Slider. To date, changes have focused \
on supporting dynamic port allocations and reporting of the values.

In future we may encounter situations where other changes to the applications can \
help them work even better in Slider-managed deployments. If these arise, we would \
hope to work with the relevant teams to get the changes adopted, knowing up front \
that neither of these project teams would countenance any changes that interfered \
with classic static application deployments.

The initial Slider committer list includes committers for both Accumulo and HBase, \
who can maintain cross-project collaboration.

== Known Risks ==

The biggest risk is getting the critical mass of use needed to build a broad \
development team. We don't expect to have or need many full-time developers, but \
active engagement from the HBase and Accumulo developers would significantly aid \
adoption and governance.

The other risk is YARN not having the complete feature set needed for long-lived \
services: restarting, security token renewal, log capture, and other issues. We are \
working with the YARN developers to address these issues, which are shared with other \
long-lived services on YARN.

=== Orphaned Products ===

Steve, Sumit, Jon, and Billie will continue to work on Slider 100% of the time for \
the foreseeable future, with others from Hortonworks and a growing community \
contributing as well.

=== Inexperience with Open Source ===

All of the core developers have long-standing experience in open source. Two of them \
are Accumulo committers and two are HBase committers. Steve Loughran has been a \
committer on various ASF projects since 2001 (Ant, Axis), a mentor to Incubated \
projects, a Hadoop committer since 2008, and full-time developer on HP's open-source \
SmartFrog project from 2005-2012. Sumit and Billie are committers on Ambari. Jon \
Maron has worked extensively with Ambari APIs and has contributed to the OpenStack \
Savanna (now Sahara) project.

=== Homogeneous Developers ===

The current core developers are all from Hortonworks. However, we hope to establish a \
developer community that includes users of Slider and developers on the applications \
themselves - HBase, Accumulo, etc.

=== Reliance on Salaried Developers ===

Currently, the developers are paid to do work on Slider. A key goal for the \
incubation process will be to broaden the developer base.

=== Relationships with Other Apache Products ===

This is covered in the Alignment section.

=== An Excessive Fascination with the Apache Brand ===

While we respect the reputation of the Apache brand and have no doubts that it will \
attract contributors and users, our interest is primarily to give Slider a solid home \
as an open source project with a broad developer base, and to encourage adoption by \
the related ASF projects.

== Documentation ==

All Slider documentation is currently in \
[[https://github.com/hortonworks/slider/blob/develop/src/site/markdown/slider_specs/index.md|markdown-formatted \
text files in the source repository]]; they will be delivered as part of the initial \
source donation.

== Initial Source ==

The initial source, all ASF-licensed, can be found at \
[[https://github.com/hortonworks/slider]]

Slider is written in Java. Its source tree is entirely self-contained and relies on \
Apache Maven as its build system. Alongside the application, it contains unit, \
localhost, and functional tests, the latter for use with remote clusters.

== Source and IP Submission Plan ==

 1. All source will be moved to Apache Infrastructure.
 1. All outstanding issues in our in-house JIRA infrastructure will be replicated \
into the Apache JIRA system.
 1. We have pre-emptively acquired a currently-unused Twitter handle, @apacheslider, \
which would be passed to the PMC.

== External Dependencies ==

Slider has no external dependencies except for some Java libraries that are \
considered ASF-compatible (JUnit, SLF4J, JCommander, Groovy), BSD-licensed Jinja, and \
Apache artifacts: Hadoop, Log4J, and the transitive dependencies of all these \
artifacts.

== Required Resources ==

Mailing Lists:
 1. slider-dev
 1. slider-commits
 1. slider-private

Infrastructure:
 1. Git repository
 1. JIRA Slider (Slider)
 1. Gerrit for reviewing patches

The existing code includes localhost integration tests, so we would like a Jenkins \
instance to run them whenever a new patch is submitted.

== Initial Committers ==
 1. Steve Loughran (stevel at a.o)
 1. Jon Maron
 1. Sumit Mohanty
 1. Billie Rinaldi (billie at a.o)
 1. Ted Yu (tedyu at a.o)
 1. Josh Elser (elserj at a.o)

== Sponsors ==
Champion: Vinod Kumar Vavilapalli

Nominated Mentors:
 1. Jean-Baptiste Onofré
 1. Mahadev Konar
 1. Arun Murthy
 1. Devaraj Das (ddas at a.o)

== Sponsoring Entity ==

Incubator PMC


