[prev in list] [next in list] [prev in thread] [next in thread]
List: linux-fpga
Subject: Re: [PATCH v4 3/3] fpga: dfl: fme: add power management support
From: Guenter Roeck <linux () roeck-us ! net>
Date: 2019-06-29 16:21:29
Message-ID: 7149d96b-da6d-8e03-997f-0611c1654058 () roeck-us ! net
[Download RAW message or body]
On 6/28/19 5:33 PM, Wu Hao wrote:
> On Fri, Jun 28, 2019 at 10:55:14AM -0700, Guenter Roeck wrote:
> > On Thu, Jun 27, 2019 at 12:53:38PM +0800, Wu Hao wrote:
> > > This patch adds support for power management private feature under
> > > FPGA Management Engine (FME). This private feature driver registers
> > > a hwmon for power (power1_input), thresholds information, e.g.
> > > (power1_max / crit / max_alarm / crit_alarm) and also read-only sysfs
> > > interfaces for other power management information. For configuration,
> > > user could write threshold values via above power1_max / crit sysfs
> > > interface under hwmon too.
> > >
> > > Signed-off-by: Luwei Kang <luwei.kang@intel.com>
> > > Signed-off-by: Xu Yilun <yilun.xu@intel.com>
> > > Signed-off-by: Wu Hao <hao.wu@intel.com>
> > > ---
> > > v2: create a dfl_fme_power hwmon to expose power sysfs interfaces.
> > > move all sysfs interfaces under hwmon
> > > consumed --> hwmon power1_input
> > > threshold1 --> hwmon power1_cap
> > > threshold2 --> hwmon power1_crit
> > > threshold1_status --> hwmon power1_cap_status
> > > threshold2_status --> hwmon power1_crit_status
> > > xeon_limit --> hwmon power1_xeon_limit
> > > fpga_limit --> hwmon power1_fpga_limit
> >
> > How do those limits differ from the other limits ?
> > We do have powerX_cap and powerX_cap_max, and from the context
> > it appears that you could possibly at least use power1_cap_max
> > (and power1_cap instead of power1_max) instead of
> > power1_fpga_limit.
>
> Thanks a lot for the review and comments.
>
> Actually xeon/fpga_limit are introduced for different purpose. It shows
> the power limit of CPU and FPGA, that may be useful in some integrated
> solution, e.g. CPU and FPGA shares power. We should never these
> interfaces as throttling thresholds.
>
Ok, your call. Just keep in mind that non-standard attributes won't show
up with the sensors command, and won't be visible for libsensors.
> >
> > > ltr --> hwmon power1_ltr
> > > v3: rename some hwmon sysfs interfaces to follow hwmon ABI.
> > > power1_cap --> power1_max
> > > power1_cap_status --> power1_max_alarm
> > > power1_crit_status --> power1_crit_alarm
> >
> > power1_cap is standard ABI, and since the value is enforced by HW,
> > it should be usable.
>
> As you see, in thermal management, threshold1 and threshold2 are
> mapped to temp1_max_alarm and temp1_crit_alarm. So we feel that if
> it will be friendly to user that we keep using max_alarm and crit_alarm
> in power management for threshold1 and threshold2 too.
>
> Do you think if we can keep this, or it's better to switch back to
> power1_cap?
>
Again, your call.
>
> >
> > > update sysfs doc for above sysfs interface changes.
> > > replace scnprintf with sprintf in sysfs interface.
> > > v4: use HWMON_CHANNEL_INFO.
> > > update date in sysfs doc.
> > > ---
> > > Documentation/ABI/testing/sysfs-platform-dfl-fme | 67 +++++++
> > > drivers/fpga/dfl-fme-main.c | 221 +++++++++++++++++++++++
> > > 2 files changed, 288 insertions(+)
> > >
> > > diff --git a/Documentation/ABI/testing/sysfs-platform-dfl-fme \
> > > b/Documentation/ABI/testing/sysfs-platform-dfl-fme index 2cd17dc..a669548 \
> > > 100644
> > > --- a/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > +++ b/Documentation/ABI/testing/sysfs-platform-dfl-fme
> > > @@ -127,6 +127,7 @@ Contact: Wu Hao <hao.wu@intel.com>
> > > Description: Read-Only. Read this file to get the name of hwmon device, it
> > > supports values:
> > > 'dfl_fme_thermal' - thermal hwmon device name
> > > + 'dfl_fme_power' - power hwmon device name
> > >
> > > What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/temp1_input
> > > Date: June 2019
> > > @@ -183,3 +184,69 @@ Description: Read-Only. Read this file to get the policy \
> > > of hardware threshold1 (see 'temp1_max'). It only supports two values \
> > > (policies): 0 - AP2 state (90% throttling)
> > > 1 - AP1 state (50% throttling)
> > > +
> > > +What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_input
> > > +Date: June 2019
> > > +KernelVersion: 5.3
> > > +Contact: Wu Hao <hao.wu@intel.com>
> > > +Description: Read-Only. It returns current FPGA power consumption in uW.
> > > +
> > > +What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_max
> > > +Date: June 2019
> > > +KernelVersion: 5.3
> > > +Contact: Wu Hao <hao.wu@intel.com>
> > > +Description: Read-Write. Read this file to get current hardware power
> > > + threshold1 in uW. If power consumption rises at or above
> > > + this threshold, hardware starts 50% throttling.
> > > + Write this file to set current hardware power threshold1 in uW.
> > > + As hardware only accepts values in Watts, so input value will
> > > + be round down per Watts (< 1 watts part will be discarded).
> > > + Write fails with -EINVAL if input parsing fails or input isn't
> > > + in the valid range (0 - 127000000 uW).
> > > +
> > > +What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_crit
> > > +Date: June 2019
> > > +KernelVersion: 5.3
> > > +Contact: Wu Hao <hao.wu@intel.com>
> > > +Description: Read-Write. Read this file to get current hardware power
> > > + threshold2 in uW. If power consumption rises at or above
> > > + this threshold, hardware starts 90% throttling.
> > > + Write this file to set current hardware power threshold2 in uW.
> > > + As hardware only accepts values in Watts, so input value will
> > > + be round down per Watts (< 1 watts part will be discarded).
> > > + Write fails with -EINVAL if input parsing fails or input isn't
> > > + in the valid range (0 - 127000000 uW).
> > > +
> > > +What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_max_alarm
> > > +Date: June 2019
> > > +KernelVersion: 5.3
> > > +Contact: Wu Hao <hao.wu@intel.com>
> > > +Description: Read-only. It returns 1 if power consumption is currently at or
> > > + above hardware threshold1 (see 'power1_max'), otherwise 0.
> > > +
> > > +What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_crit_alarm
> > > +Date: June 2019
> > > +KernelVersion: 5.3
> > > +Contact: Wu Hao <hao.wu@intel.com>
> > > +Description: Read-only. It returns 1 if power consumption is currently at or
> > > + above hardware threshold2 (see 'power1_crit'), otherwise 0.
> > > +
> > > +What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_xeon_limit
> > > +Date: June 2019
> > > +KernelVersion: 5.3
> > > +Contact: Wu Hao <hao.wu@intel.com>
> > > +Description: Read-Only. It returns power limit for XEON in uW.
> > > +
> > > +What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_fpga_limit
> > > +Date: June 2019
> > > +KernelVersion: 5.3
> > > +Contact: Wu Hao <hao.wu@intel.com>
> > > +Description: Read-Only. It returns power limit for FPGA in uW.
> > > +
> > > +What: /sys/bus/platform/devices/dfl-fme.0/hwmon/hwmonX/power1_ltr
> > > +Date: June 2019
> > > +KernelVersion: 5.3
> > > +Contact: Wu Hao <hao.wu@intel.com>
> > > +Description: Read-only. Read this file to get current Latency Tolerance
> > > + Reporting (ltr) value. This ltr impacts the CPU low power
> > > + state in integrated solution.
> >
> > Does that attribute add any value without any kind of unit or an explanation
> > of its meaning ? What is userspace supposed to do with that information ?
> > Without context, it is just a meaningless number.
>
> I should add more description here, will fix it in the next version.
>
> >
> > Also, it appears that the information is supposed to be passed to power
> > management via the set_latency_tolerance() callback. If so, it would be
> > reported there. Would it possibly make more sense to use that interface ?
>
> If I remember correctly set_latency_tolerance is used to communicate a tolerance
> to device, but actually this is a read-only value, to indicate latency tolerance
> requirement for memory access from FPGA device, as you know FPGA could be
> programmed for different workloads, and different workloads may have different
> latency requirements, if workloads in FPGA don't have any need for immediate
> memory access, then it would be safe for deeper sleep state of system memory.
>
Hmm, you are correct. Yes, this attribute could definitely benefit from a more
detailed explanation.
Thanks,
Guenter
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic