List: linux1394-devel
Subject: Re: [PATCH v2] firewire: Enable physical DMA above 4GB
From: Peter Hurley <peter () hurleysoftware ! com>
Date: 2013-04-30 10:50:44
Message-ID: 1367319044.3494.62.camel () thor ! lan
On Mon, 2013-04-29 at 22:17 +0200, Stefan Richter wrote:
> On Apr 29 Peter Hurley wrote:
> > On Sun, 2013-04-28 at 19:43 +0200, Stefan Richter wrote:
> > > On Mar 27 Peter Hurley wrote:
> > > > Quadlet reads to memory above 4GB are painfully slow when serviced
> > > > by the AR DMA context. In addition, the CPU(s) may be locked-up,
> > > > preventing any transfer at all.
> [...]
> > > Question to the 1st paragraph of your changelog: Aren't both reasons that
> > > you give purely theoretical at the moment?
> >
> > Definitely not theoretical. I have a user-space tool that makes a core
> > image over Firewire. Orders-of-magnitude slower without this patch.
> [...]
> > > Is there any other actual effect of this patch?
> >
> > I don't understand this question. The patch does exactly what it
> > purports to do; namely, enable physical DMA above 4GB.
>
> We have two applications of physical DMA on Linux at the moment:
>
> - The SBP-2 initiator. This one does not benefit from a higher Physical
> Upper Bound, because all the buffers (SCSI I/O buffers, ORB buffers,
> s/g tables) are mapped into a 4 GB range anyway.
> Correct me if I'm mistaken.
>
> - CONFIG_PROVIDE_OHCI1394_DMA_INIT and CONFIG_FIREWIRE_OHCI_REMOTE_DMA.
> The former of these could be made to benefit from raised Physical
> Upper Bound like the latter, although I am not sure whether there is
> interesting stuff at high addresses during boot. The latter benefits
> from your patch, but _not_ as a matter of better throughput, but rather
> as a matter of doesn't work -> works.
>
> Are you possibly alluding to a third application which is similar to
> CONFIG_FIREWIRE_OHCI_REMOTE_DMA but is able to fall back to AR-req +
> AT-resp DMA when getting requests above Physical Upper Bound? If such an
> application exists, then let's say so in the changelog because this would
> be news not only to me, and it would clarify why you are talking about
> performance rather than enablement.

Ah, I see now why you're asking.

Yes, I had a simple but fragile patchset that handled AR read requests
to phys memory above 4GB. I doubt I'll ever submit it because it's slow,
delicate and of limited utility.

I'll remove the reference to that in the changelog for the v3 patch.

> > > Questions to the 2nd paragraph: dma_get_required_mask(dev)+1 is not
> > > exactly end-of-memory; it is the smallest power-of-two which is greater than
> > > or equal to end-of-memory. For example, on a PC equipped with 2 GB RAM and
> > > running an x86-32 kernel I get 0x8000'0000 (2 G), but on a PC with 16 GB
> > > RAM and x86-64 kernel I get 0x8'0000'0000 (32 G).
> >
> > Yes, that's true. In fact, on some arches, dma_get_required_mask()
> > simply returns a mask of all addressable virtual memory.
> >
> > > So, shouldn't the
> > > changelog say perhaps "Write the PhyUpperBound register to cover all
> > > available memory (up to 128 TB of it)."?
> >
> > Ok.
> >
> > > But why are you reading
> > > dma_get_required_mask in the first place rather than just setting it to
> > > 128 TB straight away?
> >
> > If you feel it's superfluous now, that's fine. In the future, I plan to
> > make a pitch/patches for the dma engine to expose the actual end of
> > physical memory.
> >
> > If you'd prefer, I can re-submit the equivalent change in that patchset
> > instead. However, it's easier to convince arch maintainers of necessary
> > changes if drivers already exhibit the required use case.
>
> At least as far as the OHCI-1394 spec says, physUpperBoundOffset (if
> implemented) can be set higher than we ever even want to. So the only
> motivation for downwards optimizing the value which we write there is to
> reduce possible conflict with AR based applications, isn't it? If so, and
> given firewire-core's current address handler design, then this downwards
> optimization needs to happen in fw_core_init actually.
>
> Until these things are worked out, I do prefer that you omit the call to
> dma_get_required_mask in firewire-ohci.

Ok.
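
To make the arithmetic discussed above concrete, here is a rough user-space
sketch in C. It is not the kernel's implementation of dma_get_required_mask()
and not code from the patch. The function names required_mask_model() and
phy_upper_bound_reg() are hypothetical, and the second function assumes that
the Physical Upper Bound register holds the upper 32 bits of a 48-bit bus
offset.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical user-space model (not the kernel's implementation):
 * smallest all-ones mask that covers every byte address below
 * end_of_memory, which must be nonzero.
 */
static uint64_t required_mask_model(uint64_t end_of_memory)
{
	uint64_t m;

	assert(end_of_memory != 0);
	m = end_of_memory - 1;	/* highest valid byte address */

	/* Smear the highest set bit into every lower bit position. */
	m |= m >> 1;
	m |= m >> 2;
	m |= m >> 4;
	m |= m >> 8;
	m |= m >> 16;
	m |= m >> 32;
	return m;
}

/*
 * Assumption: the Physical Upper Bound register holds the upper 32 bits
 * of a 48-bit bus offset, so the register value is the bound shifted
 * right by 16 bits.
 */
static uint32_t phy_upper_bound_reg(uint64_t bound)
{
	assert(bound < (UINT64_C(1) << 48));	/* must fit in 48 bits */
	return (uint32_t)(bound >> 16);
}
```

Under these assumptions, required_mask_model(2ULL << 30) + 1 gives
0x8000'0000 (2 G). An end of memory at 17 GB (a made-up figure standing in
for a 16 GB machine with part of its RAM remapped above the 4 GB hole) gives
0x8'0000'0000 (32 G). phy_upper_bound_reg(128ULL << 40) gives 0x8000'0000
for a 128 TB bound.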

Regards,
Peter Hurley