
List:       subversion-dev
Subject:    Re: merge logs
From:       Nick Thompson <nickthompson () agere ! com>
Date:       2005-09-20 16:02:27
Message-ID: 200509201702.28084.nickthompson () agere ! com

I think svnmerge does a very good job. My only concern about compatibility was
simply the way the merge information is stored, not the content of the
information. If the information that needs to be stored about merges could be
agreed on and added to the repository as part of its underlying format, you
could quickly get some benefits that can be further augmented later, without
throwing away today's information. I don't think there is a perfect solution
to this, so I suggest deciding on an approach (albeit for a medium-term plan)
and identifying what data is required. I think it's a very small amount of
data that is required and easy to identify - but I'm no expert in Subversion
(I'm a ClearCase admin and user).

In the interests of furthering the discussion, I thought I would try to
contribute a bit more with the attached txt file. It shows my thoughts from
an svn user's perspective. It poses a few questions for svn experts and
proposes a few ideas on what a merge log would contain and how it might be
implemented. I'm happy to work on it a bit more if you have suggestions. (oh
my, it's 9KB already)

I know your response was a partial cut and paste, but this one is on the road
map, isn't it? Humbly, if it's only there to make the roadmap look good, I need
to continue my search.

Hopefully that penguin's beak will be feeling the pressure a little more :-)

Best regards,
Nick.

On Tuesday 20 Sep 2005 03:30, kfogel@collab.net wrote:
> Nick Thompson <nickthompson@agere.com> writes:
> > I've read much about the ideas for merge nirvana that are on the SVN
> > road map and it looks like there is lots to do. It looks like fixing
> > up renaming is the first on the list.
> >
> > But looking from the outside in, what about adding some kind of merge
> > logs to the repository as a relative priority? For my purposes
> > svnmerge looks like a good improvement over standard svn merge, but
> > it's a shame that the way it records merges is probably not compatible
> > with any future solution. If merge logs were added to svn base, then
> > add-on tools like svnmerge could make use of them today, "best
> > practices" that require non-expert users to remember to write a sane
> > log entry for a merge could be relaxed, and merges initiated from GUIs
> > would be captured, all in a compatible way. A way to list these merge
> > logs would of course be required as well. This might even meet some
> > group's requirement for merge tracking, in that it is at least
> > formally recorded and manually traceable.
> >
> > This step, I guess, should be fairly easy to scope and provide good
> > extra features. It will also allow developers who don't want to fiddle
> > with repository formats to start looking at improving merge so as to
> > start moving along the roadmap to that distant nirvana.
> >
> > Just an outsider looking in. Thanks for listening,
>
> Here's something I just wrote about another proposal:
>  | Well, my initial reaction is: neat idea, but this specification barely
>  | scratches the tip of the beak of a penguin standing on the tip of the
>  | iceberg.  Designing such a feature would require a lot of discussion &
>  | thought; it will turn out to be made almost entirely of edge cases.  I
>  | personally don't have time to work on it, unfortunately.  If you can
>  | find developers who do, then it may go somewhere.  But please note
>  | that there are always hundreds of ideas floating around.  Most will
>  | never get implemented.  It's not because they're bad, it's just
>  | because there are always more ideas than hands and heads to code them.
>
> I think it applies here too, even more so in fact.
>
> This would not be easy to scope at all.  Whatever those merge logs
> are, once they enter our APIs, we must support them forever (well, at
> least until 2.0, and even then we don't want to toss things unless we
> absolutely must -- compatibility is an obsession around here).
>
> So, that means we'd have to design them very carefully, making sure
> that they can efficiently express everything they might need to
> express.  If you think about this, you will see that it is equivalent
> to designing merge tracking in the first place.
>
> If coming up with universally useful merge logs were easy, then
> svnmerge would have done so, and you wouldn't be worried about its
> logs not being compatible with whatever Subversion comes up with in
> the future.  As it is, you are right to worry, but this is not because
> svnmerge has done anything wrong; it's because it's a hard, hard
> problem.
>
> -Karl

["merge.txt" (text/plain)]

Merge logging for subversion:
=================================

Purposes of merging:
1. "copy" features/fixes from one branch to another.
2. undo changes (reverse difference)
3. resurrection of deleted items (better to use copy?)

Purposes of merge tracking:
1. identifies which features/fixes have been merged to another branch.
2. allows automated common ancestor determination (for merging)

Requirements of merge tracking:
1. Identify which revision a merge was from and to (bi-directionally).
2. Identify which range of revisions a merge is from.
3. Logs required per merged file and for the whole repository.
4. Must be able to generate merged change set.

Commits can be removed from a branch by applying a reverse difference using
the merge command. Is this a hack? Should it be recorded as a merge? Is it a
merge at all? Is a new "svn uncommit" command needed?

Reverse differences are possible because we have committed revisions in the
repository that can then be reverse differenced. It has been mentioned that a
reverse merge should also be possible and that the merge and the subsequent
conflict-resolution edits should be separately identifiable items. This is
partly possible, since the merge differences can be recalculated at any time
(given an accurate log entry), but it may be impossible to fully separate
conflict-resolution edits and other manual changes without an intervening
commit. Such a commit would be undesirable (and even impossible in
subversion), since conflicts would have to be committed unresolved. Why is a
reverse merge desirable? Can it even be considered possible?

Since merging is performed to a working copy, it is always possible to merge
more than once, from different revisions, into a working copy before a commit
is performed. It is obviously also the case that one revision can be merged
to more than one working copy and committed. Merge logs must therefore be
able to store more than one merged-to and merged-from record.

Best practice requires the user to record merge information in the commit
log. This requirement will in many cases be forgotten and is in any case
difficult to analyze, either manually or automatically, and is therefore
inherently unreliable. What is needed is an automated merge log which
contains the same information: the merge source (example:
"http://svn.example.com/repos/calc/trunk") and the merged revisions
(examples: "343" or "343:344"). This information can be represented in a
standardized form:
"http://svn.example.com/repos/calc/trunk@343" and
"http://svn.example.com/repos/calc/trunk@343:http://svn.example.com/repos/calc/trunk@344".
There is no further useful information to be stored, and all of this can be
automatically generated from "svn merge"'s parameters.
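
To make the standardized form above concrete, here is a minimal Python sketch
of how such a record could be written and read back. The names PegRev,
MergeSource, format_merge_source and parse_merge_source are hypothetical and
are not part of Subversion; the only thing taken from the text is the
"URL@REV" and "URL@REV:URL@REV" string form.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PegRev:
    url: str   # merge source URL
    rev: int   # revision number after the "@"


@dataclass(frozen=True)
class MergeSource:
    start: PegRev                  # single revision, or start of a range
    end: Optional[PegRev] = None   # end of the range; None for one revision


# "URL@REV" optionally followed by ":URL@REV". The first URL is matched
# lazily so that the ":" inside "http://" is not mistaken for the separator.
_RECORD = re.compile(r"(.+?)@(\d+)(?::(.+)@(\d+))?")


def format_merge_source(src: MergeSource) -> str:
    text = f"{src.start.url}@{src.start.rev}"
    if src.end is not None:
        text += f":{src.end.url}@{src.end.rev}"
    return text


def parse_merge_source(text: str) -> MergeSource:
    match = _RECORD.fullmatch(text)
    if match is None:
        raise ValueError(f"not a merge source record: {text!r}")
    url1, rev1, url2, rev2 = match.groups()
    end = PegRev(url2, int(rev2)) if url2 is not None else None
    return MergeSource(PegRev(url1, int(rev1)), end)


TRUNK = "http://svn.example.com/repos/calc/trunk"
single = parse_merge_source(f"{TRUNK}@343")
ranged = parse_merge_source(f"{TRUNK}@343:{TRUNK}@344")
```

Keeping a URL on both ends of a range follows the form given above; whether
the two URLs must always be equal is left open here.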

One possible solution is to have a merged-to record list and a merged-from
record list (with zero or more entries - preferably with zero overhead if it
contains zero entries) for each revision of each file and directory in the
repository. Each record in a list should have a matching record at the other
end of the respective merge operation, and each record takes either the form
"http://svn.example.com/repos/calc/trunk@343" or the form
"http://svn.example.com/repos/calc/trunk@343:http://svn.example.com/repos/calc/trunk@344"
at both ends.
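
As an illustration only, the paired record lists could behave like the small
in-memory Python sketch below. MergeLog and its methods are hypothetical
names, not an existing Subversion API, and the sketch only handles
single-revision records; how a revision range should be keyed at the source
end is left open.

```python
from typing import Dict, List


class MergeLog:
    """In-memory stand-in for per-revision merged-from / merged-to lists."""

    def __init__(self) -> None:
        # Only nodes that took part in a merge get an entry, so a node
        # with zero records costs nothing.
        self._merged_from: Dict[str, List[str]] = {}
        self._merged_to: Dict[str, List[str]] = {}

    def record_merge(self, source: str, target: str) -> None:
        # Write the matching record at both ends of the merge.
        self._merged_from.setdefault(target, []).append(source)
        self._merged_to.setdefault(source, []).append(target)

    def merged_from(self, node: str) -> List[str]:
        return list(self._merged_from.get(node, []))

    def merged_to(self, node: str) -> List[str]:
        return list(self._merged_to.get(node, []))


REPO = "http://svn.example.com/repos/calc"
log = MergeLog()
# One revision merged to two different targets ...
log.record_merge(f"{REPO}/trunk@343", f"{REPO}/branches/stable@350")
log.record_merge(f"{REPO}/trunk@343", f"{REPO}/branches/feature@351")
# ... and a second source merged into the same target before its commit.
log.record_merge(f"{REPO}/trunk@340", f"{REPO}/branches/stable@350")
```

The example shows why both lists need to hold more than one entry: one
source revision has two merged-to records, and one target has two
merged-from records.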

An alternative solution is to store "merge objects", each with a unique
merge-id, as completely separate entities in the repository which are easily
referenced from the appropriate file revisions. Could this use properties,
such as "svn:merge-from" and "svn:merge-to" with a merge-id list as the
value? The "merge objects" would then contain only a single
"http://svn.example.com/repos/calc/trunk@343" or
"http://svn.example.com/repos/calc/trunk@343:http://svn.example.com/repos/calc/trunk@344"
range reference. This solution is in fact similar to the ClearCase approach.

ClearCase represents merges (for the purpose of merge tracking) using a
directional "hyperlink" object with a type "Merged", which identifies the
from and to revisions of each file that is merged. This information is
indicative only. There is no mechanism to stop users from altering the merged
files (to fix conflicts or for unrelated changes), or from rejecting or
reverting some of the merged data, before file check-in (commit), and
therefore no way to separate merged data from user-modified data.

It is either impossible or at least impractical to make merged information
uniquely identifiable, and there is no perfect solution available from any
VCS or SCM system. ClearCase, for example, tries only to identify a common
ancestor, searching "paths" across merges and branches, to automatically find
the latest ancestral point from which to start generating differences.

. Common ancestor.
=====================

The following diagram shows a problematic scenario for "svn merge" which
works well in ClearCase:

See original details at:
http://svn.collab.net/repos/svn/trunk/notes/merge-tracking.txt

                    1     
                    2     
                    3     
                  /   \   
                 /     \  
                /       \ 
            one           1   
            two           2.5 
            three         3   
             |     \      |
             |      \     |   
             |       \    |            
             |        \   |            
             |         \ one                ## This node is a human's
             |           two-point-five     ## merge of two sides.
             |           three        
             |            |
             |            |
             |            |
            one          one
            Two          two-point-five
            three        newline       
               \         three  
                \         |   
                 \        |
                  \       |
                   \      |
                    \     |
                     \    |
                      \   |
                       \  |
                         one                ## This node is a human's
                         Two-point-five     ## merge of the changes
                         newline            ## since the last merge.
                         three

Creating a merge log at the "one\ntwo-point-five\nthree" node, which
references the "one\ntwo\nthree" node, allows the second merge to identify
the "one\ntwo\nthree" node as a common ancestor and thereby only apply the
(-two\n+Two) difference, leading to the single expected conflict, rather than
"svn merge"'s whole-file conflict. Subversion can handle this today, using a
range restriction on the merge; however, it's up to the user to figure out the
common ancestor. (Add a new revision identifier? Example: -rCOMMON:HEAD.) The
common ancestor has to be identified per merged file.
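
The effect of the merge log on the diagram can be sketched in a few lines of
Python. This is only an illustration under stated assumptions: the node names
(root, left-1, right-2 and so on) are made up to label the diagram, the
breadth-first search is a simple stand-in for a real ancestor search, and
nothing here is how Subversion or ClearCase actually implements it.

```python
from collections import deque
from typing import Dict, List, Optional, Set


def ancestors(start: str, edges: Dict[str, List[str]]) -> List[str]:
    """Return start and everything reachable from it, nearest first."""
    order: List[str] = []
    seen: Set[str] = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for parent in edges.get(node, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return order


def common_ancestor(source: str, target: str,
                    edges: Dict[str, List[str]]) -> Optional[str]:
    """Nearest ancestor of source that is also an ancestor of target."""
    reachable = set(ancestors(target, edges))
    for node in ancestors(source, edges):
        if node in reachable:
            return node
    return None


# Plain branch history from the diagram: each node lists its predecessor.
history = {
    "left-1": ["root"],      # one / two / three
    "left-2": ["left-1"],    # one / Two / three
    "right-1": ["root"],     # 1 / 2.5 / 3
    "right-2": ["right-1"],  # first human merge
    "right-3": ["right-2"],  # newline added
}

# The merge log adds one extra edge: right-2 was merged from left-1.
with_merge_log = {node: list(parents) for node, parents in history.items()}
with_merge_log["right-2"].append("left-1")

without_log = common_ancestor("left-2", "right-3", history)
with_log = common_ancestor("left-2", "right-3", with_merge_log)
```

Without the logged merge the search falls back to "root" (the "1\n2\n3"
node); with it, the search stops at "left-1" (the "one\ntwo\nthree" node), so
only the -two/+Two difference would need to be applied.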

The limitation here is that if there were prior changes on the left branch
before the first merge that for some reason were not merged during the first
merge (due to a revision range restriction), they will not be considered for
the second merge either. Perhaps, though, this is desirable. It is of course
possible to merge these earlier changes with a separate merge, which would
then lead to the "1\n2\n3" node being used as the common ancestor (probably a
bad idea, but it is the user's choice).

. Conclusion.
=================

The intention of this text is not to show how merging should be done, but
only to show what information is required to improve merge capabilities.
Hopefully it can be seen that the requirements are low and the benefits high.
Merge tracking doesn't make merging better, but it does provide information
that can make merging better.

The information that can be stored in a merge log must be automatically
determined from the merge command parameters or from elsewhere within
subversion - you can't rely on users to do this. The amount of information
available is small, but this is actually all that is required. The merge-from
and merge-to ends of the merge are all that is needed, but an indication of a
range restriction in the merge would probably be useful as well.

Merge logging can't lead to a perfect merge system that gets every merge
correct. It is only required to aim for a realistic goal. A reliably
identifiable common ancestor is the only requirement of the best-of-breed
merge tools that currently exist. Merge logging itself doesn't need to be
perfect either, but it should capture the information that the user makes
available. There is no mechanism available in subversion, and there need not
be, to allow for perfect separation of merged data and user modifications in
the same working copy before a commit.

Separating merges that apply reversed differences from merges that copy
updates between branches makes the idea simpler, but requires a new command
(uncommit?) or new parameters to merge. Creation of a revision alias
(COMMON?) might be the only other required change from the user's point of
view. This would allow the current merge tool to provide a better merge
solution, with no change to its fundamentals. (Diff variance would be nice
too, of course, but that also needs common ancestor functionality.)

Blame is a difficult issue, considering that one or more merges can be
completed, and the user can edit the files manually at any time, before a
commit occurs. I have no ideas to solve that.



---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@subversion.tigris.org
For additional commands, e-mail: dev-help@subversion.tigris.org
