[prev in list] [next in list] [prev in thread] [next in thread] 

List:       tomcat-dev
Subject:    Re: FYI: jk 1.2.20 core on iSeries v5R3
From:       Eric Wertman <wertmane () comcast ! net>
Date:       2007-01-31 23:45:05
Message-ID: 45C12A01.8000403 () comcast ! net
[Download RAW message or body]

Connector:

<Connector port="8010" protocol="AJP/1.3" enableLookups="false" 
maxThreads="800" minThreads="100" bufferSize="8192" backlog="256" />

I've tried all sorts of permutations of this... None of the values make 
much difference.   My ab test is -n 1000 -c 200.

---------------------------------------

workers.properties:

# PlatformServer
worker.PlatformServer.type=ajp13
worker.PlatformServer.host=localhost
worker.PlatformServer.port=8010
worker.PlatformServer.connection_pool_timeout=60
worker.PlatformServer.socket_timeout=60
worker.PlatformServer.retries=5

I've been through a lot of different settings here too.  I've tried some 
almost insane values for maxThreads
---------------------------------------

httpd.conf (Using virtual servers, forgot to mention that before)

<IfModule jk_module>
JkWorkersFile conf/workers.properties
JkShmFile     logs/mod_jk.shm
JkShmSize     8192
JkLogFile     logs/mod_jk.log
JkLogLevel    debug
</IfModule>

---------------------------------------

mod_jk startup at debug level.  I only included it because it shows what 
mod_jk thinks of my connector.

[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (239): 
creating worker PlatformServer
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (144): 
about to create instance PlatformServer of ajp13
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_worker.c (157): 
about to validate and init PlatformServer
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(1971): worker PlatformServer contact is 'localhost:8010'
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2098): setting endpoint options:
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2101): keepalive:        0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2105): timeout:          60
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2109): buffer size:      0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2113): pool timeout:     60
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2117): connect timeout:  0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2121): reply timeout:    0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2125): prepost timeout:  0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2129): recovery options: 0
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2133): retries:          5
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2137): max packet size:  8192
[Wed Jan 31 18:26:15 2007] [2281674:0000] [debug] jk_ajp_common.c 
(2008): setting connection pool size to 25 with min 13

---------------------------------------

Can't reproduce at debug level.  This is info (got it on the first try).

[Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
(1504): Unknown AJP protocol code: 41
[Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1842): (PlatformServer) request failed, because of server error without 
recovery in send loop attempt=0
[Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c (970): 
wrong message format 0x3837 from 127.0.0.1:8010
[Wed Jan 31 18:31:10 2007] [2539586:0000] [error] jk_ajp_common.c 
(1566): (PlatformServer) Tomcat is down or refused connection. No 
response has been sent to the client (yet)
[Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1877): (PlatformServer) receiving from tomcat failed, recoverable 
operation attempt=0
[Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1916): (PlatformServer) sending request to tomcat failed,  recoverable 
operation attempt=1
[Wed Jan 31 18:31:10 2007] [2539586:0000] [info]  mod_jk.c (2142): 
Service error=-5 for worker=PlatformServer
[Wed Jan 31 18:31:11 2007] [2539586:0000] [info]  mod_jk.c (401): Write 
without start, starting with defaults
[Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c (970): 
wrong message format 0x0400 from 127.0.0.1:8010
[Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
(1566): (PlatformServer) Tomcat is down or refused connection. No 
response has been sent to the client (yet)
[Wed Jan 31 18:31:12 2007] [2539586:0000] [error] jk_ajp_common.c 
(1504): Unknown AJP protocol code: 02
[Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1877): (PlatformServer) receiving from tomcat failed, recoverable 
operation attempt=0
[Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1916): (PlatformServer) sending request to tomcat failed,  recoverable 
operation attempt=1
[Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  jk_ajp_common.c 
(1842): (PlatformServer) request failed, because of server error without 
recovery in send loop attempt=0
[Wed Jan 31 18:31:12 2007] [2539586:0000] [info]  mod_jk.c (2142): 
Service error=-5 for worker=PlatformServer

---------------------------------------

Here's the Tomcat log:

Jan 31, 2007 6:31:10 PM org.apache.jk.common.HandlerRequest invoke
INFO: Unknown message 0
Jan 31, 2007 6:31:12 PM org.apache.jk.common.HandlerRequest invoke
INFO: Unknown message 0

---------------------------------------


And here's the only thing I know about dbx:

dbx ../bin/httpd core
Type 'help' for help.
warning: The core file is not a fullcore. Some info may not be available.
[using memory image in core]
reading symbolic information ...


Segmentation fault in sig_coredump at 0x10037590
0x10037590 (sig_coredump+0x84) 80410014        lwz   r2,0x14(r1)
(dbx) where
sig_coredump() at 0x10037590
malloc_y.malloc_y(0x2000, 0x0, 0x1fe8, 0x30411700, 0xf879614d, 
0x7858c51, 0x17, 0x0) at 0xd03002c4
malloc_common.malloc_common_53_36(??) at 0xd02fd8b8
jk_pool_dyn_alloc() at 0xdc05a4c0
jk_pool_alloc() at 0xdc05a5b4
jk_b_set_buffer_size() at 0xdc06c0fc
ajp_service() at 0xdc064c28
jk_handler() at 0xdc050e60
ap_run_handler() at 0x10003a24
ap_invoke_handler() at 0x10004810
ap_process_request() at 0x1002fd88
ap_process_http_connection() at 0x1004d0dc
ap_run_process_connection() at 0x10013dc0
ap_process_connection() at 0x100144f0
process_socket() at 0x1000b03c
worker_thread() at 0x1000a8e0
dummy_worker() at 0xdbdb8384

---------------------------------------

I'll have a look and see if I can't get more information out of tomcat.. 
there must be a way to increase the logging there.  I am using 
mpm=worker.  If you still think it's worth the time I will re-compile 
with prefork and take another shot.

Thanks again for your help.

Eric

Rainer Jung wrote:
> Hi Eric,
>
> maybe your observation is in fact related to Henri's, since you also 
> get core dumps (segmentation faults), which is quite unusual. See 
> further comments inline.
>
> Eric Wertman wrote:
>> Sorry to jump in, I'm new here.  I started watching this list because 
>> of a problem I'm having with the mod_jk 1.2.20 as well.  I'm not 
>> getting core files, but I do have problems that I can't reproduce at 
>> log level trace or debug.
>
> Apache logs segmentation faults. You might be able to produce a core 
> (which would be very helpful) by setting the coredumpdirectory in 
> apache and maybe tweaking your os config. One point that might help to 
> produce cores, would be to not run apache as root.
>
>> I'm running it on AIX 5.3 (ml05).  I compiled the apache 2.2.4 and 
>> apr-1.2.8 along with the mod_jk, and using tomcat 5.5.20 with an IBM 
>> 1.5 JRE.
>>
>> I compiled them all using the IBM  cc_r compiler.  I've actually 
>> tried a number of different flags trying to resolve my problem, 
>> assuming initially that I was doing something incorrectly.  The 
>> behavior persists, though, and my compilations have all been clean.  
>> Apache without mod_jk doesn't give me any problems.
>>
>> Superficially it seems to work fine.  Once I run tests with ab, it 
>> starts to get a little ugly.  I get a number of failed requests, and 
>> these types of errors:
>>
>>
>>
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1504): Unknown AJP protocol code: 41
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (970): wrong message format 0x3837 from 127.0.0.1:8010
>
> hose mean, taht something is fundamentally wrong in the answer mopd_jk 
> receives from tomcat. It doesn't really look like AJP/1.3. Is your MPM 
> worker or prefork? Could you try again with prefork, in case it is 
> worker?
>
> Do you get error message on the tomcat side?
>
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1566): (PlatformServer) Tomcat is down or refused connection. No 
>> response has been sent to the client (yet)
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>> operation attempt=0
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1916): (PlatformServer) sending request to tomcat failed,  
>> recoverable operation attempt=1
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1842): (PlatformServer) request failed, because of server error 
>> without recovery in send loop attempt=0
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
>> Service error=-5 for worker=PlatformServer
>
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (401): 
>> Write without start, starting with defaults
>
> Ths one is very unusual, and it is the log statement, that is the only 
> one, that could have produced Henri's core dump, although we still do 
> not know why. It correlates to a protocol error.
>
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (970): wrong message format 0x031a from 127.0.0.1:8010
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1592): (PlatformServer) Tomcat is down or network problems. Part of 
>> the response has already been sent to the cli
>> ent
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>> operation attempt=1
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1504): Unknown AJP protocol code: D8
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1842): (PlatformServer) request failed, because of server error 
>> without recovery in send loop attempt=0
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  mod_jk.c (2142): 
>> Service error=-5 for worker=PlatformServer
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1916): (PlatformServer) sending request to tomcat failed,  
>> recoverable operation attempt=2
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (915): sendfull returned -14 with errno=14
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (948): (PlatformServer) can't receive the response message from 
>> tomcat, network problems or tomcat is down (127.0.
>> 0.1:8010), err=-9
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1566): (PlatformServer) Tomcat is down or refused connection. No 
>> response has been sent to the client (yet)
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1877): (PlatformServer) receiving from tomcat failed, recoverable 
>> operation attempt=0
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (1292): (PlatformServer) failed resending request body (812760852)
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1916): (PlatformServer) sending request to tomcat failed,  
>> recoverable operation attempt=3
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1916): (PlatformServer) sending request to tomcat failed,  
>> recoverable operation attempt=1
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (915): sendfull returned -9 with errno=9
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [info]  jk_ajp_common.c 
>> (1201): (PlatformServer) error sending request. Will try another 
>> pooled connection
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (915): sendfull returned -14 with errno=14
>> [Tue Jan 30 03:03:53 2007] [2203720:0000] [error] jk_ajp_common.c 
>> (948): (PlatformServer) can't receive the response message from 
>> tomcat, network problems or tomcat is down (127.0.
>>
>> And these in the apache error_log:
>>
>> [Tue Jan 30 03:02:28 2007] [info] Shared memory session cache 
>> initialised
>> [Tue Jan 30 03:02:28 2007] [info] Init: Initializing (virtual) 
>> servers for SSL
>> [Tue Jan 30 03:02:28 2007] [info] Server: Apache/2.2.4, Interface: 
>> mod_ssl/2.2.4, Library: OpenSSL/0.9.8d
>> [Tue Jan 30 03:02:28 2007] [notice] Apache/2.2.4 (Unix) mod_jk/1.2.20 
>> mod_ssl/2.2.4 OpenSSL/0.9.8d configured -- resuming normal operations
>> [Tue Jan 30 03:02:28 2007] [info] Server built: Jan 29 2007 23:44:30
>> [Tue Jan 30 03:02:28 2007] [debug] worker.c(1740): AcceptMutex: 
>> sysvsem (default: sysvsem)
>> [Tue Jan 30 03:03:54 2007] [notice] child pid 2203720 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 03:04:00 2007] [debug] worker.c(1083): the listener 
>> thread didn't exit
>> [Tue Jan 30 03:04:01 2007] [debug] worker.c(1083): the listener 
>> thread didn't exit
>> [Tue Jan 30 03:04:13 2007] [debug] worker.c(1354): taking over 
>> scoreboard slot from 1552520 (quiescing)
>> [Tue Jan 30 03:04:19 2007] [notice] child pid 2576392 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 03:04:23 2007] [notice] child pid 2310294 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 03:04:26 2007] [notice] child pid 2576394 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 03:29:30 2007] [notice] child pid 1552524 exit signal 
>> Segmentation fault (11)
>> [Tue Jan 30 04:09:22 2007] [notice] child pid 2584796 exit signal 
>> Segmentation fault (11)
>>
>>
>> Again,  I thought I would be able to track them down by changing the 
>> log level to "trace", but the problem disappears at that log level.  
>> I had assumed it was slowing the connection rate down enough to 
>> mitigate the problem.  Not sure if any of that helps.. if you can 
>> help me or want more info let me know.
>
> With JkLogLevel debug (which logs more than info but less than trace) 
> we would see, which packets you receive from tomcat. Maybe debug still 
> shows the problem?
>
> Could you please post your config (Jk* from apache, workers.roperties 
> and Connector from tomcat).
>
>>
>> Thanks!
>>
>> Eric
>
> Regards,
>
> Rainer
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
> For additional commands, e-mail: dev-help@tomcat.apache.org
>
>

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@tomcat.apache.org
For additional commands, e-mail: dev-help@tomcat.apache.org

[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic