List:       kde-core-devel
Subject:    Re: OCS Providers File - High Numbers of Requests
From:       Shantanu Tushar Jha <shantanu () kde ! org>
Date:       2021-09-24 6:33:38
Message-ID: CABQ4Km-=gjTZRD3qR1YePJ6ppsGoZKupMNn-ZErzrs4fuqRYGQ () mail ! gmail ! com

Hi,

Just an idea - do we have enough logs on the server to see the request
history by date? If so, we could check whether there was a spike in
requests after a particular date, which would in turn give us a hint about
which release might contain a bug that's making more calls than expected.
Of course this is not useful if the request count increased gradually
rather than spiking.
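
Something along these lines could do it, assuming the rotated access logs
are still kept around (the filenames and paths below are a guess based on
Ben's listing, and the standard Apache combined log format where the
timestamp is the 4th field):

# requests per day, across plain and gzipped rotated logs (paths assumed)
zcat -f /var/log/apache2/autoconfig.kde.org.log.* | \
  awk '{ split($4, t, ":"); print substr(t[1], 2) }' | sort | uniq -c

A sudden jump in the daily counts around a release date would point at
that release.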

Shantanu

On Thu, 23 Sept 2021 at 22:13, Nicolás Alvarez <nicolas.alvarez@gmail.com>
wrote:

> On Thu, Sep 23, 2021 at 08:55, Aleix Pol (aleixpol@kde.org)
> wrote:
> >
> > On Thu, Sep 23, 2021 at 11:52 AM Ben Cooksley <bcooksley@kde.org> wrote:
> > >
> > > Hi all,
> > >
> > > It has recently come to our attention that the number of queries being
> handled for the endpoint https://autoconfig.kde.org/ocs/providers.xml on
> a day to day basis has gotten to the point where it is causing issues with
> server responsiveness to other traffic. This is perhaps best summarised by
> the following:
> > >
> > > root@nicoda /var/log/apache2 # ls -lah ...
> > > -rw-r----- 1 root adm 458M Sep 23 06:25 autoconfig.kde.org.log.1
> > > -rw-r----- 1 root adm 381M Sep 23 06:25 networkcheck.kde.org.log.1
> > > -rw-r----- 1 root adm 143M Sep 23 06:25 www.kde.org.log.1
> > >
> > > root@nicoda /var/log/apache2 # cat autoconfig.kde.org.log.1 | wc -l
> > > 4,222,343
> > >
> > > Based on those numbers we're looking at 48-49 requests per second (on
> average - peaks are higher by orders of magnitude), which seems extremely
> excessive given that this file is only supposed to be retrieved by KDE
> software when GHNS functionality is triggered. That is supported by the
> substantial size difference it has over networkcheck.kde.org - which is
> used by plasma-nm and NetworkManager (on Neon) to check whether they
> have a working internet connection - which I'd expect to be the site
> receiving the most traffic.
> > >
> > > As such, I suspect we have bug(s) in software that makes use of GHNS
> functionality.
> > >
> > > It would therefore be appreciated if we could please review the
> software in question to determine whether it is operating correctly. Given
> that it usually runs in the background on user systems, I'd especially
> appreciate it if a detailed review could be conducted on Discover and other
> software that conducts package management operations or assists in managing
> updates.
> > >
> > > Unfortunately, all these applications submit a fairly useless user
> agent (Mozilla/5.0), so it is impossible for Sysadmin to ascertain any
> further information. It would be extremely helpful if information
> identifying the originating software could be added to the user agent,
> to assist in investigating these issues in the future.
> > >
> > > Thanks,
> > > Ben
> >
> > That's correct. Discover fetches them at startup. It's necessary to be
> > able to check if there are updates on KNS-provided resources.
> >
> > Incidentally, I was looking into this yesterday. We could see if
> > caching is broken somehow. A request will still be needed, though, to
> > check whether the cache is out of date.
>
> Caching seems to be working, since the vast majority of the requests
> are returning 304 Not Modified.
>
> However, in *many* cases I see a single IP making multiple requests in
> the same second, and doing it again the next minute. Here's one IP
> address picked randomly:
>
> [22/Sep/2021:06:25:41 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:25:41 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:25:41 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:25:41 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:27:57 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:27:58 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:27:58 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:28:32 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:28:32 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:28:32 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:28:32 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:28:59 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:28:59 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:28:59 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:28:59 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:30:11 +0000] "GET /ocs/providers.xml HTTP/1.1" 200
> [22/Sep/2021:06:30:11 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:30:11 +0000] "GET /ocs/providers.xml HTTP/1.1" 200
> [22/Sep/2021:06:30:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:30:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:30:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:30:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:31:19 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:31:19 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:31:19 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:31:19 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:31:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:31:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:31:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
> [22/Sep/2021:06:31:38 +0000] "GET /ocs/providers.xml HTTP/1.1" 304
>
> This continues for hours. And it's not an isolated case; again, the IP
> I searched for was a random one from the log.
>
> There are 120 IP addresses that *each* made more than 10,000 requests
> in a 24h period.
>
> I tried a few GHNS things on my own system and I couldn't reproduce it...
>
> --
> Nicolás
>
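
For reference, a rough way to reproduce the per-IP numbers Nicolás mentions
(again assuming the log path from Ben's listing and the client IP in the
first field of each log line):

# number of IPs with more than 10,000 requests in the 24h log
awk '{ print $1 }' /var/log/apache2/autoconfig.kde.org.log.1 \
  | sort | uniq -c | awk '$1 > 10000' | wc -l

Dropping the final "wc -l" (and adding "sort -rn") lists the worst
offenders with their request counts. Ben's 48-49 requests per second also
checks out: 4,222,343 requests / 86,400 seconds is roughly 48.9.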


-- 
Shantanu Tushar
shantanu.io
