[prev in list] [next in list] [prev in thread] [next in thread] 

List:       git
Subject:    [PATCH] git-repack: generational repacking (and example hook script)
From:       Sam Vilain <sam.vilain () catalyst ! net ! nz>
Date:       2007-06-30 8:56:21
Message-ID: 1183193782608-git-send-email-sam.vilain () catalyst ! net ! nz
[Download RAW message or body]

Add an option to git-repack that makes the repack run suitable for
running very often.  The idea is that packs get given a "generation",
and that the number of packs in each generation (except the last one)
is bounded.

The useful invocation of this is git-repack -d -g

The -a option then becomes a degenerate case of generative repacking.

Signed-off-by: Sam Vilain <sam.vilain@catalyst.net.nz>
---
 Documentation/git-repack.txt |    6 +++
 git-repack.sh                |   74 +++++++++++++++++++++++++++++++++++-------
 templates/hooks--post-commit |   14 +++++++-
 3 files changed, 81 insertions(+), 13 deletions(-)

diff --git a/Documentation/git-repack.txt b/Documentation/git-repack.txt
index be8e5f8..d458377 100644
--- a/Documentation/git-repack.txt
+++ b/Documentation/git-repack.txt
@@ -42,6 +42,12 @@ OPTIONS
 	existing packs redundant, remove the redundant packs.
 	Also runs gitlink:git-prune-packed[1].
 
+-g::
+	Enable "generational" repacking.  This attempts to keep the
+	number of packs under control when repacking very often.  Most
+	useful when called from the `post-commit` hook (see
+	link:hooks.html[hooks] for more information).
+
 -l::
         Pass the `--local` option to `git pack-objects`, see
         gitlink:git-pack-objects[1].
diff --git a/git-repack.sh b/git-repack.sh
index 8c32724..3d253fa 100755
--- a/git-repack.sh
+++ b/git-repack.sh
@@ -3,19 +3,21 @@
 # Copyright (c) 2005 Linus Torvalds
 #
 
-USAGE='[-a] [-d] [-f] [-l] [-n] [-q] [--max-pack-size=N] [--window=N] [--depth=N]'
+USAGE='[-a] [-d] [-f] [-l] [-n] [-q] [-g] [--max-pack-size=N] [--window=N] [--depth=N]'
 SUBDIRECTORY_OK='Yes'
 . git-sh-setup
 
-no_update_info= all_into_one= remove_redundant=
-local= quiet= no_reuse= extra=
+no_update_info= generations= remove_redundant=
+local= quiet= no_reuse= extra= generation_width=
 while case "$#" in 0) break ;; esac
 do
 	case "$1" in
 	-n)	no_update_info=t ;;
-	-a)	all_into_one=t ;;
+	-a)	generations=0 ;;
 	-d)	remove_redundant=t ;;
 	-q)	quiet=-q ;;
+	-g)	generations=3 generation_width=10 ;;
+	-G)	generations=$2; generation_width=5; shift ;;
 	-f)	no_reuse=--no-reuse-object ;;
 	-l)	local=--local ;;
 	--max-pack-size=*) extra="$extra $1" ;;
@@ -40,24 +42,69 @@ PACKTMP="$GIT_OBJECT_DIRECTORY/.tmp-$$-pack"
 rm -f "$PACKTMP"-*
 trap 'rm -f "$PACKTMP"-*' 0 1 2 3 15
 
+generation=
+redundant=
+
 # There will be more repacking strategies to come...
-case ",$all_into_one," in
+case ",$generations," in
 ,,)
 	args='--unpacked --incremental'
 	;;
-,t,)
+,*,)
 	if [ -d "$PACKDIR" ]; then
+		max_gen=0
 		for e in `cd "$PACKDIR" && find . -type f -name '*.pack' \
 			| sed -e 's/^\.\///' -e 's/\.pack$//'`
 		do
 			if [ -e "$PACKDIR/$e.keep" ]; then
 				: keep
 			else
-				args="$args --unpacked=$e.pack"
 				existing="$existing $e"
+				if [ -e "$PACKDIR/$e.gen" ]; then
+					gen=`cat $PACKDIR/$e.gen`
+				else
+					gen=1
+				fi
+				[ "$max_gen" -lt $gen ] && max_gen=$gen
+				eval "gen_${gen}=\"\$gen_${gen} $e\"";
+				eval "c_gen_${gen}=\$((\$c_gen_${gen} + 1))";
 			fi
 		done
+		i=$max_gen
+		packing=
+		while [ $i -gt 0 ]
+		do
+			eval "c_gen=\$c_gen_$i"
+			eval "packs=\$gen_$i"
+			if [ -n "$c_gen" -a $i -gt "$generations" ]
+			then
+				echo "saw $c_gen packs at generation $i"
+				echo "therefore, repacking everything"
+				packing=1
+				[ -z "$generation" ] && generation=$(($i + 1))
+			elif [ -n "$c_gen" -a "$c_gen" -ge "$generation_width" -a "$i" -lt "$generations" ]
+			then
+				echo -n "generation $i has too many packs "
+				echo "($c_gen >= $generation_width)"
+				echo "repacking at this level and below"
+				packing=1
+				[ -z "$generation" ] && generation=$(($i + 1))
+			fi
+			if [ -n "$packing" ]
+			then
+				for x in $packs; do
+					args="$args --unpacked=$x.pack"
+					redundant="$redundant $x"
+				done
+			fi
+			i=$(($i - 1))
+		done
+		if [ -n "$generation" ]; then
+			[ "$generation" -gt "$generations" ] && generation=$generations
+			[ "$generation" -eq 0 ] && generation=1
+		fi
 	fi
+
 	[ -z "$args" ] && args='--unpacked --incremental'
 	;;
 esac
@@ -95,20 +142,23 @@ for name in $names ; do
 		exit 1
 	}
 	rm -f "$PACKDIR/old-pack-$name.pack" "$PACKDIR/old-pack-$name.idx"
+	[ -n "$generation" ] && echo $generation > "$PACKDIR/pack-$name.gen"
 done
 
 if test "$remove_redundant" = t
 then
-	# We know $existing are all redundant.
-	if [ -n "$existing" ]
+	echo "removing redundant packs"
+	# We know $redundant are all redundant.
+	if [ -n "$redundant" ]
 	then
 		sync
 		( cd "$PACKDIR" &&
-		  for e in $existing
+		  for e in $redundant
 		  do
 			case " $fullbases " in
-			*" $e "*) ;;
-			*)	rm -f "$e.pack" "$e.idx" "$e.keep" ;;
+			*" $e "*) echo "ignoring $e" ;;
+			*)	echo "removing $e.pack etc";
+				rm -f "$e.pack" "$e.idx" "$e.keep" ;;
 			esac
 		  done
 		)
diff --git a/templates/hooks--post-commit b/templates/hooks--post-commit
index 8be6f34..669f1fc 100644
--- a/templates/hooks--post-commit
+++ b/templates/hooks--post-commit
@@ -5,4 +5,16 @@
 #
 # To enable this hook, make this file executable.
 
-: Nothing
+threshold=`git-config gc.threshold`
+threshold=${threshold-250}
+
+gd=`git-rev-parse --git-dir`
+found=$(find $gd/objects/?? -type f | head -$threshold | wc -l)
+
+if [ $found -ge $threshold ]
+then
+    echo "At least $threshold loose objects, running generational repack"
+    git-repack -g -d
+else
+    echo "Found only $found loose objects, less than $threshold"
+fi
-- 
1.5.2.1.1131.g3b90

-
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic