List:       tomcat-user
Subject:    Re: Tomcat 9 -> Intermittent 404 (3-4 fails in 20-30 million requests daily sometimes )
From:       Christopher Schultz <chris () christopherschultz ! net>
Date:       2023-10-19 2:39:17
Message-ID: 32e6c3eb-47d3-4ce0-bdba-a4b57d30a727 () christopherschultz ! net

Anurag,

On 10/17/23 10:01, Anurag Kumar wrote:
> Thanks, Christopher, for looking into this issue.

Wait until I actually help before thanking me. I'm mostly trying to get 
more information so people smarter than I am can maybe help you. ;)

> Tomcat version:
> Server version: Apache Tomcat/9.0.74
> Server built: Apr 13, 2023 08:10:39 UTC
> Server number: 9.0.74.0

Would it be at all possible to upgrade to 9.0.latest on one or more of 
your cluster members? You'd have to read the changelog to see if there 
are any incompatibilities, but I suspect you should be okay. There are 
constant improvements, and it's possible that something in the 6 months 
of releases since 9.0.74 makes a difference for you.

> We became aware of this issue a few days ago when it was reported by a 
> customer due to a critical internal API failure, where the possibility 
> of unexpected characters was none. Upon investigating the Splunk logs, 
> we discovered that this issue had been occurring for at least the past 3 
> months, based on the available three-month log data.
> 
> We have a single servlet mapped for all URL patterns, and we log the 
> requests from this servlet. Internally, we always return a 200 response 
> code with the appropriate error page and never throw a 404 response.

Do you ever re-deploy your application while Tomcat is running? Which 
file/log contains the 404 responses?

> Here is our Connector configuration:
> 
> <Connector protocol="HTTP/1.1" port="8443" proxyPort="443" 
> scheme="https" secure="true" executor="httpsThreadPool" 
> acceptCount="250" SSLEnabled="false" connectionTimeout="20000" 
> URIEncoding="utf-8" enableLookups="false" 
> relaxedQueryChars="{}|&lt;&gt;&quot;" compression="on"/>
> 
> 
> These 404 issues have been observed on requests created from Chrome, 
> HttpURLConnection in Java, and AsyncHttpClient in Java. Our servers are 
> behind an Amazon Load Balancer (ALB), and while ALB operates on HTTP2, 
> our Tomcat servers are configured for HTTP1.
> 
> This issue has been reported on all nine different clusters running the 
> same Tomcat version. Our test environment closely mirrors the production 
> environment, but we have been unable to reproduce the issue so far, even 
> after increasing the number of requests.
> 
> It's challenging to identify any specific patterns as the occurrences 
> appear to be distributed randomly and happen with very simple GET 
> requests. There was one instance where I was able to reproduce the issue 
> in production with a straightforward GET request after making 45,000 
> calls, but it was never reproduced afterwards through my automation.

:/
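
If you want to double-check that the failures really are spread evenly 
across your URI space, it might be worth tallying the 404s per path 
straight from the Tomcat access logs. Here is a minimal sketch, assuming 
your AccessLogValve uses the default "common" pattern 
(%h %l %u %t "%r" %s %b). The class name and the log-file argument are 
made up for illustration, so adjust the regex if your pattern differs.

```java
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class AccessLog404Tally {
    // Assumes the "common" access log pattern: %h %l %u %t "%r" %s %b
    // Group 1 = method, group 2 = request URI, group 3 = status code.
    private static final Pattern LINE =
        Pattern.compile("\"(\\S+) (\\S+)[^\"]*\" (\\d{3}) ");

    /** Counts 404 responses per request path (query string stripped). */
    public static Map<String, Integer> tally(List<String> lines) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = LINE.matcher(line);
            if (!m.find() || !"404".equals(m.group(3))) {
                continue; // unparseable line, or not a 404
            }
            String uri = m.group(2);
            int q = uri.indexOf('?');
            String path = q >= 0 ? uri.substring(0, q) : uri;
            counts.merge(path, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: AccessLog404Tally <access-log-file>");
            return;
        }
        // args[0] is a hypothetical path to one access log file.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        tally(lines).forEach((path, n) -> System.out.println(n + "\t" + path));
    }
}
```

With only 3-4 failures a day the counts will be tiny, so this is only 
useful if you run it over all the logs you still have.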

> Request capture on ALB:-
> image.png

This list strips images out. Please send plain-text only.

-chris

> On Mon, Oct 16, 2023 at 6:16 PM Christopher Schultz 
> <chris@christopherschultz.net> wrote:
> 
>     Anurag,
> 
>     On 10/15/23 04:48, Anurag Kumar wrote:
>      >
>      > Hi, we are experiencing intermittent 404 errors with both GET and
>      > POST calls. These errors are quite rare and have proven difficult
>      > to reproduce in our testing environment. However, on our
>      > production system, we encounter 3-4 cases daily out of 20-30
>      > million requests where a 404 error appears in the Tomcat access
>      > logs, and the corresponding call fails to reach the mapped
>      > servlet. Interestingly, the same calls work perfectly just a few
>      > milliseconds before and after on the same node. This
>      > inconsistency is causing significant issues, especially when
>      > critical API calls fail and are not automatically retried.
>      >
>      > Is there any open issue related to this problem that we should be
>      > aware of?
> 
>     None that I know of personally.
> 
>     Can you post your exact Tomcat version, your <Connector> configuration
>     with any secrets removed and a little more background on the type of
>     traffic you are seeing (e.g. HTTP/1.1 v h2, TLS or not, etc.). Are you
>     able to tell if these failed requests are part of any kind of pipelined
>     requests (HTTP Keep-Alive) or h2 single channels?
> 
>     Understanding the network topology may be relevant, though it's unlikely
>     that any lb/rp is doing this, as you can see the logs on the Tomcat
>     node. But it may change the way the requests are being handled based
>     upon the type of connection between the lb/rp and Tomcat.
> 
>     Have you double-checked that the URIs are clean and don't contain
>     anything unexpected such as lookalike characters, etc.? I suspect this
>     is not an issue since you said "critical API calls fail" which leads me
>     to understand that you have legitimate customers reporting these
>     failures, instead of just investigating unexpected entries in your log
>     files.
> 
>     Is your testing environment reasonably similar to production? What
>     would happen if you were to replay a whole day's worth of
>     production requests through your testing environment?
> 
>     Is there any pattern whatsoever in the failed requests? If you look at
>     every failed request for all time, are they randomly distributed
>     throughout your URI space, or do you find that some URIs are
>     over-represented in your failure data? You may have so few failures
>     that you can't draw any conclusions.
> 
>     -chris
> 
>     ---------------------------------------------------------------------
>     To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
>     For additional commands, e-mail: users-help@tomcat.apache.org
> 

---------------------------------------------------------------------
To unsubscribe, e-mail: users-unsubscribe@tomcat.apache.org
For additional commands, e-mail: users-help@tomcat.apache.org
