
List:       unison-users
Subject:    Re: [unison-users] Using Unison on millions of files
From:       Matthias-Christian Ott <ott () mirix ! org>
Date:       2013-12-31 15:37:02
Message-ID: 52C2E49E.9000501 () mirix ! org

On 12/31/13 04:51, worley@alum.mit.edu wrote:
>> From: Matthias-Christian Ott <ott@mirix.org>
> 
>> I want to use Unison on 1.8E6 files on hard disks. Unfortunately, it is
>> quite slow.
>>
>> I traced Unison with strace and observed that it calls stat on each file
>> to get its st_mtime. I measured on one system that the system call
>> overhead of a cached stat call is lower than 3 µs and that 1.8E6 stat
>> calls take less than 5 s. So spreading the system calls across multiple
>> processors will only marginally improve performance.
>>
>> If the number of inodes to be checked is a big fraction of the
>> filesystem's inodes, a linear inode scan could speed up the modification
>> time comparisons. XFS has XFS_IOC_FSBULKSTAT and ext2 or later has
>> ext2fs_open_inode_scan.
>> XFS_IOC_FSBULKSTAT requires the CAP_SYS_ADMIN capability and
>> ext2fs_open_inode_scan requires permissions for the underlying block
>> device. Both methods are relatively fast (verified with dump and xfsdump).
> 
> Unfortunately, if the total time for the stat calls is only 5s, then
> speeding up the stat calls can't make Unison run much faster.

As mentioned, this is the running time of stat(".", &sb), so it is cached.
In reality the stat calls cause random reads and are therefore an order
of magnitude slower. This is why I looked at linear inode scans.
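
For reference, the cached measurement can be reproduced with something
like the sketch below. It is not the exact program I ran; the helper name
time_stat_calls and the use of CLOCK_MONOTONIC are assumptions made for
illustration. It only measures the cached case (the same path is stat'ed
over and over), so it says nothing about the cost of random reads on a
cold cache.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <sys/stat.h>
#include <time.h>

/*
 * Hypothetical helper: time n back-to-back stat() calls on one path.
 * After the first call the inode is cached, so this measures mostly
 * system call overhead. Returns elapsed seconds, or -1.0 on error.
 */
static double time_stat_calls(const char *path, long n)
{
    struct stat sb;
    struct timespec start, end;

    assert(n > 0);

    if (clock_gettime(CLOCK_MONOTONIC, &start) != 0)
        return -1.0;

    for (long i = 0; i < n; i++) {
        if (stat(path, &sb) != 0)
            return -1.0;
    }

    if (clock_gettime(CLOCK_MONOTONIC, &end) != 0)
        return -1.0;

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}
```

Calling time_stat_calls(".", 1800000) from a small main and dividing the
result by the call count gives the per-call overhead for the cached case.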

Regards,
Matthias-Christian




