
List:       solr-dev
Subject:    Re: [DISCUSS][SOLR] Multiple Repos For Contributions
From:       Jason Gerlowski <gerlowskija () gmail ! com>
Date:       2020-11-25 18:45:42
Message-ID: CAPCX2-KA87wmtgeuDcmeUEy_Zz26F+v6HR2S+98K1UxfSK55vg () mail ! gmail ! com

> then I'll start putting together individual proposals for a few repos to exercise the process and get more contributions going

solr-operator is a great example of PMC-maintained code that makes
sense to have in a separate repository.  It's written primarily in a
different language, it's an integration with 3rd party software, etc.

But there are downsides to managing multiple repositories that make me
hesitant about the idea more generally.  There's no easy way to
prevent changes in one repo from unintentionally breaking another.
There's at least some duplication in the maintenance of things all
repos need (build systems, etc.).  It may add overhead on release
volunteers and the PMC if there are more releases.

I'm not sure how much those'll cause problems in practice.  Hopefully
they'll be minimal, but it's possible they won't be.  They might end
up outweighing the benefits.  I'm not saying we should be afraid of
additional repositories where it makes sense for the domain.  But
maybe it'd make sense to use solr-operator as a test case for a few
releases before putting in the effort to move out our current contribs
or change our process of adopting new ones.  Since this is more about
long term management, and less about getting in a particular feature
or value for users, we've got a cool opportunity to let the
solr-operator experiment play out before we necessarily need to decide
how to handle similar scenarios.

Just my two cents.

Jason

On Wed, Nov 25, 2020 at 8:23 AM Jan Høydahl <jan.asf@cominvent.com> wrote:
>
> Let each sub project decide for themselves. PYLUCENE has its own svn repo and its own Jira space.
> Solr-operator should be allowed to continue with GH issues and PRs i.m.o. No need to force them into JIRA as long as the ASF allows projects to choose.
>
> Jan
>
> On 24 Nov 2020 at 20:59, David Smiley <dsmiley@apache.org> wrote:
>
>> > Q: Should we create a separate JIRA for these contribs... or ditch JIRA entirely for them, relying on GitHub alone?
>>
>> I'd start with the same JIRA, with a separate component or label. I don't think GH issues would be good because it becomes harder to link between core and contrib issues in case of compat or tandem feature development.
>
>
> By "hard to link", are you basically saying pasting URLs is hard? ;-)  There was a committer meeting in Montreal where some folks like Jan Høydahl and Varun (if I'm not mistaken; I may be) advocated for considering more GitHub-centric issue tracking.  I was not in favor of that... however, for contribs/modules that get their own separate repos, it affords an opportunity for a break with the past in the interests of simplicity and of what contributors are already familiar with.
>
> ~ David Smiley
> Apache Lucene/Solr Search Developer
> http://www.linkedin.com/in/davidwsmiley
>
>
> On Tue, Nov 17, 2020 at 7:42 PM Mike Drob <mdrob@apache.org> wrote:
>>
>> Thank you for the replies so far.
>>
>> I think that each contrib would necessarily have to have its own release schedule and release vote. I suspect that there might be frequent releases at first, and then these will smooth out into basically once per major release. I also think that contribs having releases could reduce the number of minor releases that we need to do, if a certain feature is well contained.
>>
>> Compatibility breaks will happen, but I feel like we should try to avoid them. Sometimes they're inevitable though, and we'll need to clearly mark that version X of the contrib is only compatible with version X of Solr, and for newer versions of Solr you have to use Y. Maybe we'll be able to release contrib Y first, and have it bridge the Solr releases. I think we'll need to invest in CI tooling to catch these kinds of situations sooner.
>>
>> > * More build files; copying the rules/setup/standards of the Solr mothership and will become divergent over time no doubt.  Or just KISS principle; no sharing; simple Maven projects.
>>
>> I wonder what the Gradle equivalent would be here. In Maven-land, we can define a parent POM and attach a bunch of configuration and rules and plugins to it, and reuse across repositories and projects. Maybe the Gradle build rules turn into an externally referenced project as well. I don't know what we'll need, but being able to apply all of our validation and precommit rules consistently to the contribs seems important.
>>
>> > Q: Could & should many contribs live in one repo (no more internal contribs), yet each still have its own release cycle?  This could make sharing build infrastructure easier, and detecting Solr compatibility with them easier.  Although it would mean sharing GitHub project area, thus sharing issues/PRs.
>>
>> I don't know. It would make source releases more complicated, which are what the ASF releases provide. I think it would make testing a contrib against multiple versions of Solr more difficult as well.
>>
>> > Q: Should we create a separate JIRA for these contribs... or ditch JIRA entirely for them, relying on GitHub alone?
>>
>> I'd start with the same JIRA, with a separate component or label. I don't think GH issues would be good because it becomes harder to link between core and contrib issues in case of compat or tandem feature development.
>>
>> > Q: Would contribs be treated as first class citizens in the Solr Reference Guide (they are still in the ASF after all), or would they be banished like the DIH was?
>> Probably a link in the reference guide to a list of contribs, and then each contrib has its own documentation.
>>
>> On Tue, Nov 17, 2020 at 10:00 AM Anshum Gupta <anshum@apache.org> wrote:
>>>
>>> Thanks for sending this email, Mike and thanks for the follow up, David.
>>>
>>> The idea of having multiple repos under the project seems like the reasonable way to go for our project. This allows us to support more features/tooling/etc. without having to link them to Solr or Lucene releases.
>>>
>>> An important thing here is to understand that if it comes from under the same umbrella, it should be treated with the same care and respect - at least we should attempt to.
>>>
>>>> Q: Is it "okay" to release new Solr versions that break any of these external contribs?  Knowingly or unknowingly -- does it matter?
>>>
>>> I think it's really important to understand that breaking compat here should be a well thought out thing, especially as that's the differentiating factor for code that resides under the project vs. external repos. It doesn't mean that compat breaks can't happen, it's just that there would be more responsibility to provide a smooth upgrade path for users in case of compat breaks.
>>>
>>> From my perspective, the code in the external repos here would be just like the code in the core repo, just with a different release cadence.
>>>
>>>> Q: Would contribs be treated as first class citizens in the Solr Reference Guide (they are still in the ASF after all), or would they be banished like the DIH was?
>>>
>>> The repos are supposed to grow, and with that, adding more to the current ref guide would just be a bad user experience. In addition, the different release cadence would make it difficult to support documentation for the code in these repos via the ref guide that would be released with the core. We should certainly aim for the same quality of documentation, but not make it a part of the ref guide.
>>>
>>>
>>>
>>> On Sat, Nov 14, 2020 at 8:54 PM David Smiley <dsmiley@apache.org> wrote:
>>>>
>>>> Thanks for shining a spotlight on this Mike.
>>>> I have some questions to consider.  I'll call these additional repos, "external contribs", or just contribs for short here; perhaps our internal contribs would migrate.
>>>>
>>>> Q: Would each contrib be released at its own cadence unrelated to Solr?  I suppose so.
>>>> Q: Would each contrib have its own release vote?  I suppose so, as it has its own artifact.  I think the ASF requires this.
>>>> Q: Is it "okay" to release new Solr versions that break any of these external contribs?  Knowingly or unknowingly -- does it matter?
>>>> Q: What technical work is needed to extricate an internal contrib to an external?
>>>> * source control history.  (note: I've done this extraction of a single folder's git history before, with a popular Stack Overflow answer)
>>>> * mandatory ASF files, e.g. license, notice
>>>> * more files that we may want: CHANGES.txt
>>>> * More build files; copying the rules/setup/standards of the Solr mothership and will become divergent over time no doubt.  Or just KISS principle; no sharing; simple Maven projects.
>>>> Q: Could & should many contribs live in one repo (no more internal contribs), yet each still have its own release cycle?  This could make sharing build infrastructure easier, and detecting Solr compatibility with them easier.  Although it would mean sharing GitHub project area, thus sharing issues/PRs.
>>>> Q: Should we create a separate JIRA for these contribs... or ditch JIRA entirely for them, relying on GitHub alone?
>>>> Q: Would contribs be treated as first class citizens in the Solr Reference Guide (they are still in the ASF after all), or would they be banished like the DIH was?
>>>>
>>>> ~ David Smiley
>>>> Apache Lucene/Solr Search Developer
>>>> http://www.linkedin.com/in/davidwsmiley
>>>>
>>>>
>>>> On Thu, Nov 12, 2020 at 6:40 PM Mike Drob <mdrob@apache.org> wrote:
>>>>>
>>>>> Solr Devs,
>>>>>
>>>>> We've slowly been moving into a multi-repository model, and I wanted to bring some more attention to it and have a more focused discussion. We've recently embarked upon the acceptance of solr-operator as a distinct repo[1] under the care of the Lucene (soon to be Solr) PMC. I expect that there will be more cases of this as we transition additional contribs out of core, or as more plugins, packages, and integrations mature. Some will make sense as externally maintained code bases, but I believe other contributions may benefit our community more as part of the Apache Foundation.
>>>>>
>>>>> I think there was a very insightful comment[2] made by GP regarding adopting a similar model to Apache Commons governance, bringing attention to it here because I fear it may have gotten lost deep in the thread. Based on observations of Commons and a few other Apache projects with multi-repo setups, there thankfully does not appear to be a limit on how many repositories a PMC can maintain. The size and scope of each individual repository can vary greatly. I see potential ideas for anything that could be standalone and not tied to a release cycle (Admin UI, DIH, etc...), or anything that bridges integrations between Solr and other systems (k8s, HDFS, etc...).
>>>>>
>>>>> The risks that new repos face are similar to the risks they would have encountered as contrib modules, but I don't think they should dissuade us. Each project would need to start with a champion or sponsor and a discussion on the mailing list. From there, we can vote to accept the code, or just the idea if there is no code yet, as a community and create the repo. As part of a natural lifecycle, if there's not enough momentum or adoption over time, then we can update the README and docs and "retire" certain projects. The exact mechanisms can be undetermined for now; maybe it's a repo rename, maybe it's marking the repo read-only, maybe it's something else.
>>>>>
>>>>> The Commons model is that everyone is a committer on everything. There are other governance models, like Hadoop, with "area committers" who are limited to the specific repositories they have contributed frequently to. I'm not sure which model ultimately suits us better, but I think that leveraging area committers would allow us to recognize and empower contributors sooner and more frequently. Releases would still need to be voted on and approved by the singular PMC.
>>>>>
>>>>> There are no real action items here, it's more of a discussion prompt. If it looks like we have general consensus to this approach, then I'll start putting together individual proposals for a few repos to exercise the process and get more contributions going. I'll probably put the proposals together even if there are no replies here, but I'd much rather have some acknowledgement from the community that I'm headed in a sustainable direction!
>>>>>
>>>>> Mike
>>>>>
>>>>> [1]:https://lists.apache.org/thread.html/rb90f530155dc6edc6f1ccd5f056db1618142fdfcbd32d83f539d984b%40%3Cdev.lucene.apache.org%3E
>>>>> [2]:https://lists.apache.org/thread.html/r9965cb693369d927a942f805c134bfeb45c5e80f447ad0fe2f663fae%40%3Cdev.lucene.apache.org%3E
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@lucene.apache.org
For additional commands, e-mail: dev-help@lucene.apache.org
