List: linux-ha-dev
Subject: Re: [Linux-ha-dev] ERROR: socket_resume_io_write() on Sun
From: Guochun Shi <gshi () ncsa ! uiuc ! edu>
Date: 2004-12-20 0:22:30
Message-ID: 5.1.0.14.2.20041219182121.05e41898 () pop ! ncsa ! uiuc ! edu
Not yet, but I plan to do so if it does not break anything.
-Guochun
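
The shape of the quoted 17 Dec patch below (find the client in the list, leave the loop with break, and only then send the leave status and free the entry) can be sketched on its own. This is a minimal illustration under stated assumptions, not the heartbeat code: client_t, client_list, add_client(), notify_leave() and remove_client() are hypothetical stand-ins for client_proc_t, the API client list, api_send_client_status() and api_remove_client_int().

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-in for heartbeat's client_proc_t. */
typedef struct client {
    long pid;
    struct client *next;
} client_t;

static client_t *client_list = NULL;
static int leave_notices = 0;   /* number of leave notifications sent */

/* Hypothetical stand-in for api_send_client_status(req, LEAVESTATUS, reason). */
static void notify_leave(const client_t *c, const char *reason)
{
    printf("client %ld left: %s\n", c->pid, reason);
    ++leave_notices;
}

/* Push a new client onto the front of the list. */
static client_t *add_client(long pid)
{
    client_t *c = calloc(1, sizeof(*c));
    if (c == NULL) {
        return NULL;
    }
    c->pid = pid;
    c->next = client_list;
    client_list = c;
    return c;
}

/* Returns 1 if req was found and removed, 0 otherwise. */
static int remove_client(client_t *req, const char *reason)
{
    client_t *prev = NULL;
    client_t *client;

    assert(req != NULL);

    for (client = client_list; client != NULL; client = client->next) {
        if (client == req) {
            /* Unlink only; notification and free happen after the loop. */
            if (prev == NULL) {
                client_list = client->next;
            } else {
                prev->next = client->next;
            }
            break;
        }
        prev = client;
    }

    if (client == NULL) {
        /* Loop ran to the end: req was never in the list. */
        fprintf(stderr, "remove_client: could not find pid [%ld]\n", req->pid);
        return 0;
    }

    /* Found: send the leave notice exactly once, then zap the entry. */
    notify_leave(client, reason);
    memset(client, 0, sizeof(*client));
    free(client);
    return 1;
}
```

Leaving the loop with break instead of returning from inside it gives the function a single place where the leave notice is sent, and the not-found branch never sends one.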
At 10:54 PM 12/19/2004 +0100, you wrote:
>gshi, did this patch also go into CVS?
>
>On Fri, 17 Dec 2004 14:36:20 -0600, Guochun Shi <gshi@ncsa.uiuc.edu> wrote:
>> Mark,
>>
>> The previous patch breaks the client leave callback; please apply the following patch instead:
>>
>>
>> Index: hb_api.c
>> ===================================================================
>> RCS file: /home/cvs/linux-ha/linux-ha/heartbeat/hb_api.c,v
>> retrieving revision 1.124
>> diff -u -2 -r1.124 hb_api.c
>> --- hb_api.c 14 Dec 2004 22:12:31 -0000 1.124
>> +++ hb_api.c 17 Dec 2004 20:31:28 -0000
>> @@ -415,4 +415,8 @@
>>
>> if (client->removereason && !client->isindispatch) {
>> + if (ANYDEBUG){
>> + cl_log(LOG_DEBUG, "api_remove_client_pid: client is "
>> + "%s", client->client_id);
>> + }
>> api_remove_client_pid(client->pid
>> , client->removereason);
>> @@ -1526,5 +1530,4 @@
>>
>>
>> - api_send_client_status(req, LEAVESTATUS, reason);
>>
>> --total_client_count;
>> @@ -1552,20 +1555,28 @@
>>
>>
>> -#if MAKEITCRASH
>> /* Drop the source - that will destroy the 'chan' */
>> if (client->gsource) {
>> G_main_del_IPC_Channel(client->gsource);
>> }
>> -#endif
>> -
>> - /* Zap! */
>> - memset(client, 0, sizeof(*client));
>> - ha_free(client); client = NULL;
>> - return;
>> +
>> + break;
>> }
>> prev = client;
>> }
>> - cl_log(LOG_ERR, "api_remove_client_int: could not find pid [%ld]"
>> - , (long) req->pid);
>> +
>> +
>> + if (req == client){
>> +
>> + api_send_client_status(req, LEAVESTATUS, reason);
>> +
>> + /* Zap! */
>> + memset(client, 0, sizeof(*client));
>> + ha_free(client); client = NULL;
>> + }else{
>> + cl_log(LOG_ERR, "api_remove_client_int: could not find pid [%ld]"
>> + , (long) req->pid);
>> + }
>> +
>> + return;
>> }
>>
>>
>> -Guochun
>>
>>
>> At 12:46 PM 12/16/2004 -0600, you wrote:
>> >At 10:27 AM 12/16/2004 +0100, you wrote:
>> >>Hello,
>> >>
>> >>> Alan Robertson wrote:
>> >>> > Guochun Shi wrote:
>> >>> >> try this patch:
>> >>> >>
>> >>> >> Index: hb_api.c
>> >>> >> ===================================================================
>> >>> >> RCS file: /home/cvs/linux-ha/linux-ha/heartbeat/hb_api.c,v
>> >>> >> retrieving revision 1.94.2.3
>> >>> >> diff -u -r1.94.2.3 hb_api.c
>> >>> >> --- hb_api.c 11 Sep 2004 20:52:32 -0000 1.94.2.3
>> >>> >> +++ hb_api.c 15 Dec 2004 21:19:33 -0000
>> >>> >> @@ -1042,7 +1042,7 @@
>> >>> >> }
>> >>> >> if (strcmp(status, LEAVESTATUS) == 0) {
>> >>> >> /* Make sure they know they're signed off... */
>> >>> >> - api_send_client_msg(client, msg);
>> >>> >> + /*api_send_client_msg(client, msg);*/
>> >>> >
>> >>> >
>> >>> > We need to send them that message - provided that they're still
>> >>> > connected. But, we don't have to print that message - if we're now
>> >>> > really disconnected.
>> >>> >
>> >>
>> >>OK, I have now tried this patch as well. It reduced the number of error lines from
>> >>2 to 1. Here is a new extract from the debug log (I patched cl_log to
>> >>display all messages in the debug log, thanks to Andrew).
>> >>
>> >>heartbeat: 2004/12/16_10:17:42 debug: }/*ProcessAnAPIRequest*/;
>> >>heartbeat: 2004/12/16_10:17:42 debug: return 0;
>> >>heartbeat: 2004/12/16_10:17:42 debug: }/*APIclients_input_dispatch*/
>> >>heartbeat: 2004/12/16_10:17:42 debug: G_remove_client(pid=13785, reason='signoff' gsource=0x1dfd88) {
>> >>heartbeat: 2004/12/16_10:17:42 debug: process_clustermsg: node [cluster-db0]
>> >>heartbeat: 2004/12/16_10:17:42 ERROR: socket_resume_io_write() failure
>> >>heartbeat: 2004/12/16_10:17:42 debug: Queueing remote resource request (hook = 0x6ff70) hbapi-clstat
>> >>heartbeat: 2004/12/16_10:17:42 info: MSG: Dumping message with 12 fields
>> >>
>> >>I should also mention that this is a 4-CPU machine; maybe it's related.
>> >
>> >
>> >At least we have fewer error messages :)
>> >
>> >This error comes from send_cluster_msg(), where the message is looped back to
>> >process_clustermsg() and sent to the client because the client's status is still IPC_CONNECTED.
>> >
>> >Try the following patch:
>> >
>> >Index: hb_api.c
>> >===================================================================
>> >RCS file: /home/cvs/linux-ha/linux-ha/heartbeat/hb_api.c,v
>> >retrieving revision 1.124
>> >diff -u -r1.124 hb_api.c
>> >--- hb_api.c 14 Dec 2004 22:12:31 -0000 1.124
>> >+++ hb_api.c 16 Dec 2004 18:31:55 -0000
>> >@@ -1524,8 +1524,7 @@
>> > client_proc_t* prev = NULL;
>> > client_proc_t* client;
>> >
>> >-
>> >- api_send_client_status(req, LEAVESTATUS, reason);
>> >+
>> >
>> > --total_client_count;
>> >
>> >@@ -1551,12 +1550,10 @@
>> > }
>> >
>> >
>> >-#if MAKEITCRASH
>> > /* Drop the source - that will destroy the 'chan' */
>> > if (client->gsource) {
>> > G_main_del_IPC_Channel(client->gsource);
>> > }
>> >-#endif
>> >
>> > /* Zap! */
>> > memset(client, 0, sizeof(*client));
>> >@@ -1565,6 +1562,9 @@
>> > }
>> > prev = client;
>> > }
>> >+
>> >+ api_send_client_status(req, LEAVESTATUS, reason);
>> >+
>> > cl_log(LOG_ERR, "api_remove_client_int: could not find pid [%ld]"
>> > , (long) req->pid);
>> > }
>> >
>> >
>> >
>> >
>> >
>> >-Guochun
>> >
>> >
>> >
>> >
>> >
>> >
>> >_______________________________________________________
>> >Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>> >http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>> >Home Page: http://linux-ha.org/
>>