'Re: [Mesa-dev] [PATCH 3/5] i965/vec4: Use swizzle() to swizzle immediates during constant propagatio'

[prev in list] [next in list] [prev in thread] [next in thread] 

List:       mesa3d-dev
Subject:    Re: [Mesa-dev] [PATCH 3/5] i965/vec4: Use swizzle() to swizzle immediates during constant propagatio
From:       Matt Turner <mattst88 () gmail ! com>
Date:       2016-02-29 23:16:33
Message-ID: CAEdQ38Ej66424Ok5qdyVy_oBEAGLBrtzM6cMzw1E8S4Rh4MMRQ () mail ! gmail ! com
[Download RAW message or body]

On Mon, Feb 29, 2016 at 2:20 PM, Francisco Jerez <currojerez@riseup.net> wrote:
> Matt Turner <mattst88@gmail.com> writes:
> 
> > On Mon, Feb 29, 2016 at 7:30 AM, Iago Toral <itoral@igalia.com> wrote:
> > > On Fri, 2016-02-26 at 22:02 -0800, Francisco Jerez wrote:
> > > > swizzle() works for vector immediates other than VF.
> > > > ---
> > > > .../drivers/dri/i965/brw_vec4_copy_propagation.cpp    | 19 \
> > > > +------------------ 1 file changed, 1 insertion(+), 18 deletions(-)
> > > > 
> > > > diff --git a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp \
> > > > b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp index \
> > > >                 6bd9928..5c25164 100644
> > > > --- a/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> > > > +++ b/src/mesa/drivers/dri/i965/brw_vec4_copy_propagation.cpp
> > > > @@ -76,22 +76,6 @@ is_channel_updated(vec4_instruction *inst, src_reg \
> > > > *values[4], int ch) inst->dst.writemask & (1 << BRW_GET_SWZ(src->swizzle, \
> > > > ch))); }
> > > > 
> > > > -static unsigned
> > > > -swizzle_vf_imm(unsigned vf4, unsigned swizzle)
> > > > -{
> > > > -   union {
> > > > -      unsigned vf4;
> > > > -      uint8_t vf[4];
> > > > -   } v = { vf4 }, ret;
> > > > -
> > > > -   ret.vf[0] = v.vf[BRW_GET_SWZ(swizzle, 0)];
> > > > -   ret.vf[1] = v.vf[BRW_GET_SWZ(swizzle, 1)];
> > > > -   ret.vf[2] = v.vf[BRW_GET_SWZ(swizzle, 2)];
> > > > -   ret.vf[3] = v.vf[BRW_GET_SWZ(swizzle, 3)];
> > > > -
> > > > -   return ret.vf4;
> > > > -}
> > > > -
> > > > static bool
> > > > is_logic_op(enum opcode opcode)
> > > > {
> > > > @@ -144,8 +128,7 @@ try_constant_propagate(const struct brw_device_info \
> > > > *devinfo, }
> > > > }
> > > > 
> > > > -   if (value.type == BRW_REGISTER_TYPE_VF)
> > > > -      value.ud = swizzle_vf_imm(value.ud, inst->src[arg].swizzle);
> > > > +   value = swizzle(value, inst->src[arg].swizzle);
> > > 
> > > so I guess V/UV was broken before this?
> > 
> > My question is in the same vein -- what does swizzling V/UV even mean?
> > Swizzles operate on 4-component things, but V/UV have 8 components.
> > 
> swizzle() operates on 8-component things 99% of the time.  Its semantics
> are determined by the equivalence of:

Swizzle operates on vec4s. The fact that SIMD8 instructions operate on
2x vec4s doesn't matter as far as I can tell.

This seems analogous to SSE instructions that operate on the top and
bottom halves of a 128-bit register independently. Yes, the
instruction operates on 128-bits, but the operation treats the halves
completely independently.

But all of this is beside the point. I really can't believe we have
these sorts of disagreements.

> 
> OP ..., swizzle(x, swz)
> 
> to:
> 
> MOV tmp, x
> OP ..., tmp.swz
> 
> where the MOV has the same execution type and width as the arbitrary
> instruction OP.  With this in mind there are no restrictions on what x
> can be, as long as it can be read from, you just do what the hardware
> does.
> 
> > Concretely, for an immediate of 0x1234567UV, what does applying a
> > swizzle of .yxzw do? Does it swap two dwords (each containing two
> > values) and produce 0x34125678UV? Or does it operate on both halves
> > individually and produce 0x21346578UV?
> > 
> At the hardware level a swizzle specifies a permutation of 4 which is
> cyclically extended to whatever the vector width is.  So assuming your
> immediate is 0x01234567UV the correct answer is 0x01324576UV.

Sorry, right. I meant 0x01234567UV.

> 
> > Moreover, using one of these immediate types requires a W destination,
> 
> I don't think that's right.

The IVB PRM (3.3.4 Immediate) says

Restriction: When an immediate vector is used in an instruction, the
destination must be 128-bit aligned with destination horizontal stride
equivalent to a word for an immediate integer vector (v) and
equivalent to a dword for an immediate float vector (vf).

So you could use a byte destination with a stride of 2. But my point
is that you can't use them with a destination type of D/UD.

> 
> > which isn't possible in align16 instructions -- so swizzling V/UV
> > seems pointless.
> 
> We use V *and* a W destination in Align16 mode in
> gen6_gs_visitor.cpp:617.  Admittedly the instruction looks rather weird
> and I considered replacing it with something else, but there's no harm

Oh, weird. I definitely was not aware of that.

> in swizzle() handling its whole domain consistently, and I didn't feel
> comfortable about leaving this one case unimplemented and letting the
> program crash whenever a V/UV immediate comes up in some optimization
> pass that happens to use swizzle(), which may happen as long as
> something in the vec4 back-end continues using V/UV.

That's fair given that there is an actual use in the vec4 backend.
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/mesa-dev

[prev in list] [next in list] [prev in thread] [next in thread]