
List:       git
Subject:    Re: [PATCH] Added sub get_owner_file which checks if there's a file
From:       Nagy Balázs <js () iksz ! hu>
Date:       2008-01-30 15:59:14
Message-ID: 47A09ED2.6070407 () iksz ! hu

Jakub Narebski wrote:
> Nagy Balázs wrote:
>   
>> Are you talking about I/O of an all-in CGI script?  
>>     
>
> I am talking there about the I/O difference between the situation
> (configuration) when $projects_list is a directory (default),
> or is a file. If $projects_list is a directory, gitweb scans
> directory structure to find git repositories, which for large
> number of repositories might take time, even with filesystem
> cache, and with depth of searching bound by $project_maxdepth.
> Add to that finding symbolic name of the owner of repository
> directory, or (with the patch) reading a file per repo with repo
> owner.
>   
We have two configurable options here: $projectroot and $projects_list.
If $projects_list is a directory, we end up using one directory to get
the project list info and another one to actually handle the projects.
With a small number of repositories it's safe to leave $projects_list
empty.  This is why I refer to $projects_list as a file.

If $projects_list is a file, we rely on a file which was generated some
time ago and may not reflect the latest changes in $projectroot (but
see below).

>> We can tune the performance of this script, but changing the GIT_DIR
>> structure just because of a simple script is a bit overkill to me.
>>
>> What if this script creates the $projects_list file, for example when 
>> the $projectroot's mtime changes?  We can even hold mtime info for every 
>> project's config file.
>>     
>
> I don't understand what you wanted to say here. $projects_list file
> lists only project path (project name) and project owner.
>   
I mean it would be better to add this kind of metadata, like the
description and the owner's shoesize, to the config instead of a raw
file.  I understand that raw files are easy to read, but reading a
single cache file and doing some stat()s is much easier.  I can think
of $projects_list as a cache file name, which can be maintained by
gitweb.cgi, and these mtime values could be saved to $projects_list to
verify the entries' validity.
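
To make the idea concrete, here is a minimal sketch in Python (gitweb
itself is Perl) of how a reader could validate cached entries with one
stat() per project.  The cache layout (tab-separated path, owner and
mtime of the project's config file) and the names Entry, parse_cache
and is_fresh are assumptions made for illustration; they are not the
existing $projects_list format or any gitweb API.

```python
import os
from typing import NamedTuple


class Entry(NamedTuple):
    path: str   # project path relative to projectroot
    owner: str  # owner recorded when the entry was cached
    mtime: int  # mtime of the project's config file at caching time


def parse_cache(text: str) -> list[Entry]:
    """Parse the hypothetical cache format: path<TAB>owner<TAB>mtime."""
    entries = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        path, owner, mtime = line.split("\t")
        entries.append(Entry(path, owner, int(mtime)))
    return entries


def is_fresh(entry: Entry, projectroot: str) -> bool:
    """True if the project's config file still has the cached mtime."""
    config = os.path.join(projectroot, entry.path, "config")
    try:
        return int(os.stat(config).st_mtime) == entry.mtime
    except OSError:
        return False  # repository removed or config unreadable
```

An entry that fails the check would have to be re-read from the
project's config; an entry that passes costs only the stat().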

All we have to do is keep $projects_list up to date.  The best would be
to have a separate project list maintainer script which handles two
scenarios:

1| repo addition/deletion
2| repo config changes

I don't have a solution for the first scenario which would be a speed
improvement in gitweb.cgi, which is why I suggest putting the
$projects_list updater into a separate script.  The second scenario
could be handled by gitweb.cgi, though that would be mere code
duplication.
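
A rough sketch of such a maintainer script, again in Python and only
under stated assumptions: a directory counts as a repository if it
contains both a HEAD and a config file (a simplified heuristic), the
cache is held as a dict mapping project path to (owner, mtime), and
read_owner is a hypothetical callback that extracts the owner from a
project's config.  None of these names exist in gitweb.

```python
import os


def find_repos(projectroot):
    """Return relative paths of directories that look like repositories."""
    repos = set()
    for dirpath, dirnames, filenames in os.walk(projectroot):
        if "HEAD" in filenames and "config" in filenames:
            repos.add(os.path.relpath(dirpath, projectroot))
            dirnames[:] = []  # do not descend into a repository
    return repos


def update_cache(cache, projectroot, read_owner):
    """Return a new cache dict: path -> (owner, config mtime).

    Scenario 1: repositories found on disk but missing from the cache
    are added; cached paths no longer on disk are dropped.
    Scenario 2: entries whose config mtime changed are re-read.
    """
    updated = {}
    for path in sorted(find_repos(projectroot)):
        config = os.path.join(projectroot, path, "config")
        mtime = int(os.stat(config).st_mtime)
        old = cache.get(path)
        if old is not None and old[1] == mtime:
            updated[path] = old  # unchanged, keep the cached owner
        else:
            updated[path] = (read_owner(path), mtime)
    return updated
```

The directory walk is the expensive part, which is the reason to keep
it out of gitweb.cgi and run it from a separate script.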

Regards:
-- 
Balazs Nagy