
List:       linux-ha-dev
Subject:    Re: [Linux-ha-dev] ERROR: socket_resume_io_write() on Sun
From:       Guochun Shi <gshi () ncsa ! uiuc ! edu>
Date:       2004-12-20 0:22:30
Message-ID: 5.1.0.14.2.20041219182121.05e41898 () pop ! ncsa ! uiuc ! edu

Not yet, but I plan to commit it if it does not break anything.

-Guochun

At 10:54 PM 12/19/2004 +0100, you wrote:
>gshi, did this patch also go into CVS?
>
>On Fri, 17 Dec 2004 14:36:20 -0600, Guochun Shi <gshi@ncsa.uiuc.edu> wrote:
>> Mark,
>> 
>> The previous patch breaks the client leave callback. Please apply the following patch instead:
>> 
>> 
>> Index: hb_api.c
>> ===================================================================
>> RCS file: /home/cvs/linux-ha/linux-ha/heartbeat/hb_api.c,v
>> retrieving revision 1.124
>> diff -u -2 -r1.124 hb_api.c
>> --- hb_api.c    14 Dec 2004 22:12:31 -0000      1.124
>> +++ hb_api.c    17 Dec 2004 20:31:28 -0000
>> @@ -415,4 +415,8 @@
>> 
>>                         if (client->removereason && !client->isindispatch) {
>> +                               if (ANYDEBUG){
>> +                                       cl_log(LOG_DEBUG, "api_remove_client_pid: client is "
>> +                                              "%s", client->client_id);
>> +                               }
>>                                 api_remove_client_pid(client->pid
>>                                                       , client->removereason);
>> @@ -1526,5 +1530,4 @@
>> 
>> 
>> -       api_send_client_status(req, LEAVESTATUS, reason);
>> 
>>         --total_client_count;
>> @@ -1552,20 +1555,28 @@
>> 
>> 
>> -#if MAKEITCRASH
>>                         /* Drop the source - that will destroy the 'chan' */
>>                         if (client->gsource) {
>>                                 G_main_del_IPC_Channel(client->gsource);
>>                         }
>> -#endif
>> -
>> -                       /* Zap! */
>> -                       memset(client, 0, sizeof(*client));
>> -                       ha_free(client); client = NULL;
>> -                       return;
>> +
>> +                       break;
>>                 }
>>                 prev = client;
>>         }
>> -       cl_log(LOG_ERR, "api_remove_client_int: could not find pid [%ld]"
>> -       ,       (long) req->pid);
>> +
>> +
>> +       if (req == client){
>> +
>> +               api_send_client_status(req, LEAVESTATUS, reason);
>> +
>> +               /* Zap! */
>> +               memset(client, 0, sizeof(*client));
>> +               ha_free(client); client = NULL;
>> +       }else{
>> +               cl_log(LOG_ERR, "api_remove_client_int: could not find pid [%ld]"
>> +                      ,        (long) req->pid);
>> +       }
>> +
>> +       return;
>>  }
>> 
>> 
>> -Guochun
>> 
>> 
>> At 12:46 PM 12/16/2004 -0600, you wrote:
>> >At 10:27 AM 12/16/2004 +0100, you wrote:
>> >>Hello,
>> >>
>> >>> Alan Robertson wrote:
>> >>> > Guochun Shi wrote:
>> >>> >> try this patch:
>> >>> >>
>> >>> >> Index: hb_api.c
>> >>> >> ===================================================================
>> >>> >> RCS file: /home/cvs/linux-ha/linux-ha/heartbeat/hb_api.c,v
>> >>> >> retrieving revision 1.94.2.3
>> >>> >> diff -u -r1.94.2.3 hb_api.c
>> >>> >> --- hb_api.c    11 Sep 2004 20:52:32 -0000      1.94.2.3
>> >>> >> +++ hb_api.c    15 Dec 2004 21:19:33 -0000
>> >>> >> @@ -1042,7 +1042,7 @@
>> >>> >>         }
>> >>> >>         if (strcmp(status, LEAVESTATUS) == 0) {
>> >>> >>                 /* Make sure they know they're signed off... */
>> >>> >> -               api_send_client_msg(client, msg);
>> >>> >> +               /*api_send_client_msg(client, msg);*/
>> >>> >
>> >>> >
>> >>> > We need to send them that message, provided that they're still
>> >>> > connected. But we don't have to print that message if we're now
>> >>> > really disconnected.
>> >>> >
>> >>
>> >>OK, now I have also tried this patch. It reduced the number of error lines from
>> >>2 to 1. Here is a new extract from the debug log (I patched cl_log to
>> >>display all messages in the debug log, thanks to Andrew).
>> >>
>> >>heartbeat: 2004/12/16_10:17:42 debug: }/*ProcessAnAPIRequest*/;
>> >>heartbeat: 2004/12/16_10:17:42 debug: return 0;
>> >>heartbeat: 2004/12/16_10:17:42 debug: }/*APIclients_input_dispatch*/
>> >>heartbeat: 2004/12/16_10:17:42 debug: G_remove_client(pid=13785, reason='signoff' gsource=0x1dfd88) {
>> >>heartbeat: 2004/12/16_10:17:42 debug: process_clustermsg: node [cluster-db0]
>> >>heartbeat: 2004/12/16_10:17:42 ERROR: socket_resume_io_write() failure
>> >>heartbeat: 2004/12/16_10:17:42 debug: Queueing remote resource request (hook = 0x6ff70) hbapi-clstat
>> >>heartbeat: 2004/12/16_10:17:42 info: MSG: Dumping message with 12 fields
>> >>
>> >>I should also mention that this is a 4-CPU machine; maybe it's related.
>> >
>> >
>> >At least we have fewer error messages :)
>> >
>> >This error comes from send_cluster_msg(): the message is looped back to
>> >process_clustermsg(), which tries to send it to the client because the client's
>> >status is still IPC_CONNECTED.
>> >
>> >Try the following patch:
>> >
>> >Index: hb_api.c
>> >===================================================================
>> >RCS file: /home/cvs/linux-ha/linux-ha/heartbeat/hb_api.c,v
>> >retrieving revision 1.124
>> >diff -u -r1.124 hb_api.c
>> >--- hb_api.c    14 Dec 2004 22:12:31 -0000      1.124
>> >+++ hb_api.c    16 Dec 2004 18:31:55 -0000
>> >@@ -1524,8 +1524,7 @@
>> >        client_proc_t*  prev = NULL;
>> >        client_proc_t*  client;
>> >
>> >-
>> >-       api_send_client_status(req, LEAVESTATUS, reason);
>> >+
>> >
>> >        --total_client_count;
>> >
>> >@@ -1551,12 +1550,10 @@
>> >                        }
>> >
>> >
>> >-#if MAKEITCRASH
>> >                        /* Drop the source - that will destroy the 'chan' */
>> >                        if (client->gsource) {
>> >                                G_main_del_IPC_Channel(client->gsource);
>> >                        }
>> >-#endif
>> >
>> >                        /* Zap! */
>> >                        memset(client, 0, sizeof(*client));
>> >@@ -1565,6 +1562,9 @@
>> >                }
>> >                prev = client;
>> >        }
>> >+
>> >+       api_send_client_status(req, LEAVESTATUS, reason);
>> >+
>> >        cl_log(LOG_ERR, "api_remove_client_int: could not find pid [%ld]"
>> >        ,       (long) req->pid);
>> > }
>> >
>> >
>> >
>> >
>> >
>> >-Guochun
>> >
>> >
>> >
>> >
>> >
>> >
>> >_______________________________________________________
>> >Linux-HA-Dev: Linux-HA-Dev@lists.linux-ha.org
>> >http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
>> >Home Page: http://linux-ha.org/
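
For readers following the patches quoted above, here is a minimal standalone sketch of the pattern the 17 Dec patch moves toward: find and unlink the client inside the loop, then do the notify-and-free work once, after the loop, and log an error only when the client was never found. It also folds in Alan's point that the leave message should go out only while the client is still connected. This is not heartbeat code. The struct, the `connected` flag, and the names `add_client`, `send_leave_status`, and `remove_client` are hypothetical stand-ins for `client_proc_t`, the IPC channel state, and `api_remove_client_int()`. Unlike the patch, the sketch decrements the count only when the client is found.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical, simplified stand-in for heartbeat's client_proc_t. */
struct client {
    long pid;
    int connected;            /* stand-in for an IPC_CONNECTED channel state */
    struct client *next;
};

static struct client *client_list = NULL;
static int total_client_count = 0;
static int leave_msgs_sent = 0;   /* counters so the behaviour can be checked */
static int errors_logged = 0;

/* Allocate a client and push it onto the front of the list. */
static struct client *add_client(long pid, int connected)
{
    struct client *c = calloc(1, sizeof(*c));
    if (c == NULL) {
        abort();
    }
    c->pid = pid;
    c->connected = connected;
    c->next = client_list;
    client_list = c;
    ++total_client_count;
    return c;
}

/* Stand-in for api_send_client_status(): skip clients that are already gone. */
static void send_leave_status(struct client *c, const char *reason)
{
    if (!c->connected) {
        return;               /* nothing to write to, so no write error */
    }
    printf("pid %ld: leave (%s)\n", c->pid, reason);
    ++leave_msgs_sent;
}

/* Returns 1 if req was on the list and has been removed, 0 otherwise. */
static int remove_client(struct client *req, const char *reason)
{
    struct client *prev = NULL;
    struct client *cur;

    /* Step 1: only locate and unlink inside the loop. */
    for (cur = client_list; cur != NULL; prev = cur, cur = cur->next) {
        if (cur == req) {
            if (prev != NULL) {
                prev->next = cur->next;
            } else {
                client_list = cur->next;
            }
            break;
        }
    }

    /* Step 2: act once, after the loop, on the outcome of the search. */
    if (cur == NULL) {
        fprintf(stderr, "remove_client: could not find pid [%ld]\n", req->pid);
        ++errors_logged;
        return 0;
    }
    --total_client_count;
    send_leave_status(cur, reason);
    free(cur);
    return 1;
}
```

Calling `remove_client()` on a connected client sends one leave message, calling it on a disconnected client sends none, and calling it on a client that is not on the list logs the "could not find pid" error without touching the list.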