List: hadoop-user
Subject: Re: Any thoughts making Submarine a separate Apache project?
From: <dujunping () gmail ! com>
Date: 2019-07-30 10:01:28
Message-ID: CADaA6KvmGPCLq=CiZXKVSheempoo8-pue70n+Bs5ni3Ty+Mp1w () mail ! gmail ! com
Thanks Vinod for these great suggestions. I agree with most of your comments
above.
"For the Apache Hadoop community, this will be treated simply as
code-change and so need a committer +1?". IIUC, this should be treated as a
feature branch merge, so maybe three committer +1s are needed here according to
https://hadoop.apache.org/bylaws.html?
bq. Can somebody who has cycles and has been on the ASF lists for a while look
into the process here?
I can check with ASF members who have experience with this if no one has done
so yet.
Thanks,
Junping
Vinod Kumar Vavilapalli <vinodkv@apache.org> wrote on Mon, Jul 29, 2019 at 9:46 PM:
> Looks like there's a meaningful push behind this.
>
> Given the desire is to fork off Apache Hadoop, you'd want to make sure
> this enthusiasm turns into building a real, independent but more
> importantly a sustainable community.
>
> Given that there were two official releases off the Apache Hadoop project,
> I doubt if you'd need to go through the incubator process. Instead you can
> directly propose a new TLP to the ASF board. The last few times this happened
> was with ORC, and long before that with Hive, HBase etc. Can somebody who
> has cycles and has been on the ASF lists for a while look into the process
> here?
>
> For the Apache Hadoop community, this will be treated simply as
> code-change and so need a committer +1? You can be more gentle by formally
> doing a vote once a process doc is written down.
>
> Back to the sustainable community point, as part of drafting this
> proposal, you'd definitely want to make sure all of the Apache Hadoop
> PMC/Committers can exercise their will to join this new project as
> PMC/Committers respectively without any additional constraints.
>
> Thanks
> +Vinod
>
> > On Jul 25, 2019, at 1:31 PM, Wangda Tan <wheeleast@gmail.com> wrote:
> >
> > Thanks everybody for sharing your thoughts. I saw positive feedback from
> > 20+ contributors!
> >
> > So I think we should move it forward. Any suggestions about what we should
> > do?
> >
> > Best,
> > Wangda
> >
> > On Mon, Jul 22, 2019 at 5:36 PM neo <neo@pingcap.com> wrote:
> >
> >> +1. This is neo from the TiDB & TiKV community.
> >> Thanks Xun for bringing this up.
> >>
> >> In TiKV, our CNCF project and an open source distributed KV storage system,
> >> Hadoop Submarine's machine learning engine helps us optimize data storage
> >> and solve some problems with data hotspots and data shuffling.
> >>
> >> We are also preparing to use the Hadoop Submarine machine learning engine
> >> to improve the performance of TiDB, our open source distributed relational
> >> database.
> >>
> >> I think if Submarine can be independent, it will develop faster and better.
> >> Thanks to the Hadoop community for developing Submarine!
> >>
> >> Best Regards,
> >> neo
> >> www.pingcap.com / https://github.com/pingcap/tidb /
> >> https://github.com/tikv
> >>
> >> Xun Liu <liuxun@apache.org> wrote on Mon, Jul 22, 2019 at 4:07 PM:
> >>
> >>> @adam.antal
> >>>
> >>> The Submarine development team has completed the following preparations:
> >>> 1. Established a temporary test repository on GitHub.
> >>> 2. Changed the package name of Hadoop Submarine from org.hadoop.submarine
> >>> to org.submarine.
> >>> 3. Combined the Linkedin/TonY code into the Hadoop Submarine module.
> >>> 4. Connected the GitHub repository to Travis CI, where all test cases
> >>> have been run.
> >>> 5. Several Hadoop Submarine users completed the system test using the code
> >>> in this repository.
> >>>
> >>> 赵欣 <xinzhao@seu.edu.cn> wrote on Mon, Jul 22, 2019 at 9:38 AM:
> >>>
> >>>> Hi
> >>>>
> >>>> I am a teacher at Southeast University (https://www.seu.edu.cn/). Our
> >>>> major is electrical engineering. Our teaching teams and students use
> >>>> Hadoop Submarine for big data analysis and automation control of
> >>>> electrical equipment.
> >>>>
> >>>> Many thanks to the Hadoop community for providing us with machine learning
> >>>> tools like Submarine.
> >>>>
> >>>> I hope Hadoop Submarine keeps getting better and better.
> >>>>
> >>>>
> >>>> ==============================
> >>>> 赵欣
> >>>> 东南大学电气工程学院
> >>>>
> >>>> -----------------------------------------------------
> >>>>
> >>>> Zhao XIN
> >>>>
> >>>> School of Electrical Engineering
> >>>>
> >>>> ==============================
> >>>> 2019-07-18
> >>>>
> >>>>
> >>>> *From:* Xun Liu <liuxun@apache.org>
> >>>> *Date:* 2019-07-18 09:46
> >>>> *To:* xinzhao <xinzhao@seu.edu.cn>
> >>>> *Subject:* Fwd: Re: Any thoughts making Submarine a separate Apache
> >>>> project?
> >>>>
> >>>>
> >>>> ---------- Forwarded message ---------
> >>>> From: dashuiguailuyun@gmail.com <dashuiguailuyun@gmail.com>
> >>>> Date: Wed, Jul 17, 2019 at 3:17 PM
> >>>> Subject: Re: Re: Any thoughts making Submarine a separate Apache project?
> >>>> To: Szilard Nemeth <snemeth@cloudera.com.invalid>, runlin zhang
> >>>> <runlin512@gmail.com>
> >>>> Cc: Xun Liu <liuxun@apache.org>, common-dev
> >>>> <common-dev@hadoop.apache.org>, yarn-dev <yarn-dev@hadoop.apache.org>,
> >>>> hdfs-dev <hdfs-dev@hadoop.apache.org>, mapreduce-dev
> >>>> <mapreduce-dev@hadoop.apache.org>, submarine-dev
> >>>> <submarine-dev@hadoop.apache.org>
> >>>>
> >>>>
> >>>> +1 ,Good idea, we are very much looking forward to it.
> >>>>
> >>>> ------------------------------
> >>>> dashuiguailuyun@gmail.com
> >>>>
> >>>>
> >>>> *From:* Szilard Nemeth <snemeth@cloudera.com.INVALID>
> >>>> *Date:* 2019-07-17 14:55
> >>>> *To:* runlin zhang <runlin512@gmail.com>
> >>>> *CC:* Xun Liu <liuxun@apache.org>; Hadoop Common
> >>>> <common-dev@hadoop.apache.org>; yarn-dev <yarn-dev@hadoop.apache.org>;
> >>>> Hdfs-dev <hdfs-dev@hadoop.apache.org>; mapreduce-dev
> >>>> <mapreduce-dev@hadoop.apache.org>; submarine-dev
> >>>> <submarine-dev@hadoop.apache.org>
> >>>> *Subject:* Re: Any thoughts making Submarine a separate Apache project?
> >>>> +1, this is a great idea.
> >>>> As the Hadoop repository has already grown huge and contains many
> >>>> projects, I think in general it's a good idea to separate projects in the
> >>>> early phase.
> >>>>
> >>>>
> >>>> On Wed, Jul 17, 2019, 08:50 runlin zhang <runlin512@gmail.com> wrote:
> >>>>
> >>>>> +1 ,That will be great !
> >>>>>
> >>>>>> On Jul 10, 2019, at 3:34 PM, Xun Liu <liuxun@apache.org> wrote:
> >>>>>>
> >>>>>> Hi all,
> >>>>>>
> >>>>>> This is Xun Liu, a contributor to the Submarine project for running deep
> >>>>>> learning workloads together with big data workloads on Hadoop clusters.
> >>>>>>
> >>>>>> A number of integrations of Submarine with other projects are finished
> >>>>>> or in progress, such as Apache Zeppelin, TonY, and Azkaban. The next step
> >>>>>> for Submarine is to integrate with more projects like Apache Arrow,
> >>>>>> Redis, MLflow, etc., to handle end-to-end machine learning use cases
> >>>>>> like model serving, notebook management, and advanced training
> >>>>>> optimizations (like auto parameter tuning, memory cache optimizations
> >>>>>> for large datasets for training, etc.), and to run on other platforms
> >>>>>> like Kubernetes or natively on Cloud. LinkedIn also wants to donate the
> >>>>>> TonY project to Apache so we can put Submarine and TonY together in the
> >>>>>> same codebase (page #30:
> >>>>>> https://www.slideshare.net/xkrogen/hadoop-meetup-jan-2019-tony-tensorflow-on-yarn-and-beyond#30
> >>>>>> ).
> >>>>>>
> >>>>>> This expands the scope of the original Submarine project in exciting new
> >>>>>> ways. Toward that end, would it make sense to create a separate Submarine
> >>>>>> project at Apache? This could speed up adoption of Submarine and allow
> >>>>>> Submarine to grow into a full-blown machine learning platform.
> >>>>>>
> >>>>>> There will be lots of technical details to work out, but any initial
> >>>>>> thoughts on this?
> >>>>>>
> >>>>>> Best Regards,
> >>>>>> Xun Liu
> >>>>>
> >>>>>
> >>>>> ---------------------------------------------------------------------
> >>>>> To unsubscribe, e-mail: common-dev-unsubscribe@hadoop.apache.org
> >>>>> For additional commands, e-mail: common-dev-help@hadoop.apache.org
> >>>>>
> >>>>>
> >>>>
> >>>>
> >>>
> >>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: yarn-dev-unsubscribe@hadoop.apache.org
> For additional commands, e-mail: yarn-dev-help@hadoop.apache.org
>
>