List:       git
Subject:    Re: hosting git on a nfs
From:       Linus Torvalds <torvalds () linux-foundation ! org>
Date:       2008-11-13 21:05:04
Message-ID: alpine.LFD.2.00.0811131252040.3468 () nehalem ! linux-foundation ! org



On Thu, 13 Nov 2008, Linus Torvalds wrote:
> 
> If I were still using NFS (I gave up on it years ago exactly because it 
> was so painful for software development - and that was when I was using 
> CVS) I'd surely have done it long since.
> 
> But I'll think about it again. 

THIS IS TOTALLY UNTESTED! It's trivial, it's hacky, and it would need to 
be cleaned up before being merged, but I'm not even going to bother 
_trying_ to make it anything cleaner unless somebody can test it on a nice 
NFS setup.

Because it's entirely possible that there are various directory inode 
locks etc that means that doing parallel lookups in the same directory 
won't actually be doing a whole lot. Whatever. It's kind of cute, and it 
really isn't all that many lines of code, and even if it doesn't work due 
to some locking reason, maybe it's worth looking at.

It arbitrarily caps the number of threads to 10, and it has no way to turn 
it on or off. It actually _does_ seem to work in the sense that when 
cached, it can take advantage of the fact that I have a nice multi-core 
thing, but let's face it, it's not really worth it for that reason only. 

Before:

	[torvalds@nehalem linux]$ /usr/bin/time git diff > /dev/null 
	0.03user 0.04system 0:00.07elapsed 100%CPU (0avgtext+0avgdata 0maxresident)k

After:

	0.02user 0.07system 0:00.04elapsed 243%CPU (0avgtext+0avgdata 0maxresident)k
	0inputs+0outputs (0major+2241minor)pagefaults 0swaps

ie it actually did cut elapsed time from 7 hundredths of a second to just 
4. And the CPU usage went from 100% to 243%. Ooooh. Magic.

But it's still hacky as hell. Who has NFS? Can you do the same thing over 
NFS and test it? I'm not going to set up NFS to test this, and as I 
suspected, on a local disk, the cold-cache case makes no difference 
whatsoever, because whatever seek optimizations can be done are still 
totally irrelevant.

And if there are per-directory locking etc that screws this up, we can 
look at using different heuristics for the lstat() patterns. Right now I 
divvy it up so that thread 0 gets entries 0, 10, 20, 30.. and thread 1 
does entries 1, 11, 21, 31.., but we could easily split it up differently, 
and do 0,1,2,3.. and 1000,1001,1002,1003.. instead. That might avoid some 
per-directory locks. Dunno.

Anyway, it was kind of fun writing this. The reason it threads so well is 
that all the lstat() code already works on private data, so there are no 
shared data structures being modified that we'd need to worry about. 

So no locking necessary - just fire it up in parallel, and wait for the 
results. We already had everything else in place (ie the per-cache_entry 
flag to say "I've checked this entry on disk").

Of course, if a lot of entries do _not_ match, then this will actually 
generate more work (because we'll do the parallel thing to verify that 
the on-disk version matches, and if it doesn't match, then we'll end up 
re-doing it linearly later more carefully).

			Linus

---
 diff-lib.c |   67 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 files changed, 67 insertions(+), 0 deletions(-)

diff --git a/diff-lib.c b/diff-lib.c
index ae96c64..7d972c9 100644
--- a/diff-lib.c
+++ b/diff-lib.c
@@ -1,6 +1,7 @@
 /*
  * Copyright (C) 2005 Junio C Hamano
  */
+#include <pthread.h>
 #include "cache.h"
 #include "quote.h"
 #include "commit.h"
@@ -54,6 +55,70 @@ static int check_removed(const struct cache_entry *ce, struct stat *st)
 	return 0;
 }
 
+/* Hacky hack-hack start */
+#define MAX_PARALLEL (10)
+static int parallel_lstats = 1;
+
+struct thread_data {
+	pthread_t pthread;
+	struct index_state *index;
+	struct rev_info *revs;
+	int i, step;
+};
+
+static void *preload_thread(void *_data)
+{
+	int i;
+	struct thread_data *p = _data;
+	struct index_state *index = p->index;
+	struct rev_info *revs = p->revs;
+
+	for (i = p->i; i < index->cache_nr; i += p->step) {
+		struct cache_entry *ce = index->cache[i];
+		struct stat st;
+
+		if (ce_stage(ce))
+			continue;
+		if (ce_uptodate(ce))
+			continue;
+		if (!ce_path_match(ce, revs->prune_data))
+			continue;
+		if (lstat(ce->name, &st))
+			continue;
+		if (ie_match_stat(index, ce, &st, 0))
+			continue;
+		ce_mark_uptodate(ce);
+	}
+	return NULL;
+}
+
+static void preload_uptodate(struct rev_info *revs, struct index_state *index)
+{
+	int i;
+	int threads = index->cache_nr / 100;
+	struct thread_data data[MAX_PARALLEL];
+
+	if (threads < 2)
+		return;
+	if (threads > MAX_PARALLEL)
+		threads = MAX_PARALLEL;
+	for (i = 0; i < threads; i++) {
+		struct thread_data *p = data+i;
+		p->index = index;
+		p->revs = revs;
+		p->i = i;
+		p->step = threads;
+		if (pthread_create(&p->pthread, NULL, preload_thread, p))
+			die("unable to create threaded lstat");
+	}
+	for (i = 0; i < threads; i++) {
+		struct thread_data *p = data+i;
+		if (pthread_join(p->pthread, NULL))
+			die("unable to join threaded lstat");
+	}
+}
+/* Hacky hack-hack mostly ends */
+
 int run_diff_files(struct rev_info *revs, unsigned int option)
 {
 	int entries, i;
@@ -68,6 +133,8 @@ int run_diff_files(struct rev_info *revs, unsigned int option)
 	if (diff_unmerged_stage < 0)
 		diff_unmerged_stage = 2;
 	entries = active_nr;
+	if (parallel_lstats)
+		preload_uptodate(revs, &the_index);
 	symcache[0] = '\0';
 	for (i = 0; i < entries; i++) {
 		struct stat st;
--