'Re: Review Request 107198: Support permanent glSwapBuffer path'


List:       kwin
Subject:    Re: Review Request 107198: Support permanent glSwapBuffer path
From:       Fredrik_Höglund <fredrik () kde ! org>
Date:       2013-02-28 5:22:14
Message-ID: 20130228052214.3183.32533 () vidsolbach ! de

> On Feb. 25, 2013, 1 p.m., Martin Gräßlin wrote:
> > Is there anything special I should watch out for when running it? I have hardly
> > noticed tearing on my system with the old implementation...
> 
> Thomas Lübking wrote:
> Enable vsync and run glxgears. Maximize the window. See whether you can spot a
> steady tearline (usually in the upper region). If not, deactivate the subbuffer copy
> branch (i.e. force using copy pixels) and look again. Then test with the patched version.
> 
> Ralf Jung wrote:
> Also, try downloading the teartest.mp4 from http://ompldr.org/iYXBldg-hide and run
> it in mplayer/VLC with various backends and in fullscreen and maximized.
> 
> Fredrik Höglund wrote:
> By the way, the obvious solution to the copy-from-front-buffer problem is to render
> to a user allocated renderbuffer and do a full blit to the backbuffer followed by a
> swap.
> 
> Ralf Jung wrote:
> However, if we have a full-screen repaint, we absolutely want to do a pageflip to
> the buffer we rendered to. So we could only use that renderbuffer for partial
> renders, which would make the accounting for the current state in there quite
> difficult.
> 
> Thomas Lübking wrote:
> This would basically get us a ("reliable" as in "present") handcrafted buffer_age
> implementation with n = 1.
> 
> Major issues:
> 1. how unproblematic is the invocation of an FBO (featurewise, eg. afaik msaa would
>    rather not work?! - not a particular problem here, though)
> 2. how much overhead is it gonna introduce
>    a. painting into an FBO instead of GL_BACK (driver depending, i found a mesa
>       commit to bypass an intel issue about n*512 or n*1024 dimensioned FBOs)
>    b. blitting the FBO to GL_BACK
> 
> The charming part is that you only have to paint the diff; the not-so-charming part
> is the extra full-size blit (esp. for IGP memory throughput, I assume).
> 
> To maintain the state you only need to invalidate or increment the age counter for
> every (FBO-bypassing) full scene paint (if it is invalid and you get a partial update,
> you'll have to render the full scene into the buffer once to get back a known base).
> 
> Fredrik Höglund wrote:
> > However, if we have a full-screen repaint, we absolutely want to do a pageflip to
> > the buffer we rendered to. So we could only use that renderbuffer for partial
> > renders, which would make the accounting for the current state in there quite
> > difficult.
> 
> Well, we always swap at the beginning of painting, so the first time we do a partial
> update we have an opportunity to blit from the backbuffer to the renderbuffer
> before we swap. But yeah, it's not ideal.
> 
> > This would basically get us a ("reliable" as in "present") handcrafted buffer_age
> > implementation with n = 1.
> 
> No it wouldn't. The point of EXT_buffer_age is that there is never any copy.
> 
> > Major issues:
> > 1. how unproblematic is the invocation of an FBO (featurewise, eg. afaik msaa
> > would rather not work?! - not a particular problem here, though)
> 
> It makes MSAA support easier, because it's just a matter of attaching an MSAA
> buffer to the FBO. The MSAA buffers are resolved by glBlitFramebuffer(). It also
> makes it possible to only render with MSAA when there are transformed windows on
> the screen.
> 
> > 2. how much overhead is it gonna introduce
> > a. painting into an FBO instead of GL_BACK (driver depending, i found a mesa
> > commit to bypass an intel issue about n*512 or n*1024 dimensioned FBOs)
> > b. blitting the FBO to GL_BACK
> 
> I don't know which commit you're referring to, but in theory there should be no
> difference.
> 
> > The charming part is that you only have to paint the diff, the not-so-charming
> > part is the extra fullsize blit (esp. for IGP mem throughput, i assume)
> 
> It's definitely not great from a memory bandwidth point of view, but it's not worse
> than blitting between the back/front buffers since you copy the same number of
> pixels.
> 
> Thomas Lübking wrote:
> > No it wouldn't.
> "basically" ... "handcrafted" ... ;-)
> -> we could re-use the code.
> 
> 
> > I don't know which commit
> 
> http://lists.x.org/archives/xorg/2009-October/047346.html
> ^^^^ yes, i've seen that but the comment sounds HW related and was more general
> Mesa Git commit bcdaed2c0a4e70c3dd7c4648442c97540f3c9f1f:
> 
> 	 /* XXX: At least the i915 seems very upset when the pitch is a multiple
> 	  * of 1024 and sometimes 512 bytes - performance can drop by several
> 	  * times. Go to the next multiple of 64 for now.
> 	  */
> 	 if (!(mt->pitch & 511))
> 	    mt->pitch += 64;
> 
> > since you copy the same number of pixels.
> ... plus the damaged ones (to sync FBO and GL_BACK, or we'd have to paint them
> twice) which can be quite some.
> -> Needs some estimation between "f"(bo) and "e"(xtend)

> Mesa Git commit bcdaed2c0a4e70c3dd7c4648442c97540f3c9f1f:

All GPUs have alignment and pitch requirements for surfaces.
But this is a problem for the driver and not something that kwin needs to concern
itself with.
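
For readers who have not seen this kind of workaround before, here is a minimal
standalone sketch of what the quoted pitch adjustment does. adjustPitch and the
64-byte rounding step are illustrative assumptions, not Mesa's actual code; only
the "& 511" check and the "+= 64" bump are taken from the quoted snippet.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper for illustration; not part of Mesa or kwin.
constexpr std::uint32_t adjustPitch(std::uint32_t rowBytes)
{
    // Assumption: the row size is first rounded up to a multiple of 64 bytes.
    std::uint32_t pitch = (rowBytes + 63u) & ~63u;

    // As in the quoted snippet: a pitch that is a multiple of 512 bytes
    // is moved on to the next multiple of 64.
    if (!(pitch & 511u))
        pitch += 64u;

    return pitch;
}

// A 1024 byte row is padded to 1088 bytes, a 128 byte row is left alone.
static_assert(adjustPitch(1024) == 1088, "multiple of 512 gets bumped");
static_assert(adjustPitch(128) == 128, "other pitches are unchanged");
```

All of this happens below the GL API, which is why the compositor never sees it.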

> > since you copy the same number of pixels.
> ... plus the damaged ones (to sync FBO and GL_BACK, or we'd have to paint them
> twice) which can be quite some.
> -> Needs some estimation between "f"(bo) and "e"(xtend)

Oh right, I overlooked the fact that the region that's about to be painted is
excluded from the copy.
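
To put rough numbers on that difference, a small sketch of the pixel accounting
follows. The Size struct and both function names are made up for this example,
and it assumes the damage is given as a plain pixel count that lies fully inside
the screen. It only counts copied pixels and says nothing about real driver cost.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical types and names, for illustration only.
struct Size {
    std::uint64_t width;
    std::uint64_t height;
};

constexpr std::uint64_t pixelCount(Size s)
{
    return s.width * s.height;
}

// Front/back copy after a swap: the region that is about to be repainted
// is excluded from the copy.
constexpr std::uint64_t frontCopyPixels(Size screen, std::uint64_t damagedPixels)
{
    return damagedPixels >= pixelCount(screen) ? 0
                                               : pixelCount(screen) - damagedPixels;
}

// FBO path: the whole offscreen buffer is blitted to GL_BACK before the swap.
constexpr std::uint64_t fboBlitPixels(Size screen)
{
    return pixelCount(screen);
}

// With a 200x100 damage rect on a 1920x1080 screen the FBO path copies
// exactly the damaged pixels on top of what the front/back copy moves.
static_assert(fboBlitPixels(Size{1920, 1080})
                  - frontCopyPixels(Size{1920, 1080}, 200 * 100) == 200 * 100,
              "extra cost of the FBO blit equals the damaged area");
```

So the extra cost of the FBO path grows with the size of the damage, which is
the point Thomas made above.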

By the way, use an enum with readable names. I don't want to have to look up what 'a'
'c' 'p' 'e' and 'f' are supposed to be every time I look at the code.
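
Something along these lines would do. This is only a sketch: the enumerator names
are invented here, the meanings of 'e' and 'f' follow the "e"(xtend) and "f"(bo)
hints in the comment above, and the meanings given for 'a', 'c' and 'p' are pure
guesses that need to be replaced with whatever the patch really means by them.

```cpp
#include <cassert>

// Hypothetical names; the patch under review only uses the raw characters.
enum class SwapStrategy {
    Auto,            // 'a' (guessed meaning)
    CopyFrontBuffer, // 'c' (guessed meaning)
    PaintFullScreen, // 'p' (guessed meaning)
    ExtendDamage,    // 'e', "e"(xtend) in the discussion
    UseFbo,          // 'f', "f"(bo) in the discussion
    Invalid          // anything else
};

// Translate the character once, where the config value is read, so the rest
// of the code only ever deals with the readable names.
constexpr SwapStrategy swapStrategyFromChar(char c)
{
    switch (c) {
    case 'a': return SwapStrategy::Auto;
    case 'c': return SwapStrategy::CopyFrontBuffer;
    case 'p': return SwapStrategy::PaintFullScreen;
    case 'e': return SwapStrategy::ExtendDamage;
    case 'f': return SwapStrategy::UseFbo;
    default:  return SwapStrategy::Invalid;
    }
}

static_assert(swapStrategyFromChar('f') == SwapStrategy::UseFbo, "f maps to FBO");
```

The characters can stay in the config file; they just should not leak into the
painting code.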

- Fredrik

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
http://git.reviewboard.kde.org/r/107198/#review28012
-----------------------------------------------------------

On Feb. 21, 2013, 8:56 p.m., Thomas Lübking wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> http://git.reviewboard.kde.org/r/107198/
> -----------------------------------------------------------
> 
> (Updated Feb. 21, 2013, 8:56 p.m.)
> 
> 
> Review request for kwin, Martin Gräßlin and Ralf Jung.
> 
> 
> Description
> -------
> 
> This refactors the flushBuffer of glxBackend to handle the requirements of swapping
> and copying independently. A config option allows swapping and then copying the
> damaged areas back to align the front and back buffers, thereby using glSwapBuffer
> all the time (even for minor screen updates).
> 
> The patch has a minor optimization for the fullscreen painting case to shortcut
> into a plain buffer swap.
> 
> Rationale:
> glWaitVideoSync is reported to have never been supported by the nvidia blob and
> is no longer supported on SNA either, which effectively means kwin does not provide
> GL vsyncing for those GPUs / drivers.
> 
> Pending issue:
> the effectframes currently don't perform a clipped repaint (neither does beclock,
> but that's my problem), so this needs to be changed (when the option is enabled)
> to fix the various effectframe paints.
> 
> This addresses bug 307965.
> http://bugs.kde.org/show_bug.cgi?id=307965
> 
> 
> Diffs
> -----
> 
> kwin/composite.cpp e6cb0d4 
> kwin/eglonxbackend.cpp 01d97c0 
> kwin/glxbackend.cpp be11497 
> kwin/options.h b6de1d5 
> kwin/options.cpp 893b1fa 
> kwin/scene.h f06d150 
> kwin/scene.cpp 685254b 
> kwin/scene_opengl.h 7971c83 
> kwin/scene_opengl.cpp 3185c9e 
> 
> Diff: http://git.reviewboard.kde.org/r/107198/diff/
> 
> 
> Testing
> -------
> 
> Enabled and disabled, for OpenGL 2 and 1.3
> 
> 
> Thanks,
> 
> Thomas Lübking
> 
> 


_______________________________________________
kwin mailing list
kwin@kde.org
https://mail.kde.org/mailman/listinfo/kwin
