[prev in list] [next in list] [prev in thread] [next in thread]
List: kde-core-devel
Subject: More crash-proof KCrash (was Re: Konq crash backtraces)
From: Lubos Lunak <l.lunak () suse ! cz>
Date: 2006-01-11 16:40:43
Message-ID: 200601111740.43565.l.lunak () suse ! cz
[Download RAW message or body]
On Saturday 07 January 2006 13:13, Lubos Lunak wrote:
> Dne pátek 06 leden 2006 22:31 Thiago Macieira napsal(a):
> > Dik Takken wrote:
> > >It just happened again...
> > >
> > >Konqueror crashed without leaving a single trace. I compiled KDE 3.5
> > > with full debug info, which gives me useful backtraces when a crash
> > > occurs. Konqueror is a weird exception. One of its windows can just
> > > disappear in front of your nose, without DrKonqi popping up to get a
> > > backtrace.
> > >
> > Distributions are shipping glibc with some malloc checks, which cause it
> > to abort the program when a double-free or corruption occurs, but without
> > raising SIGABRT. There must be a way of making glibc raise the signal,
> > but I don't know how. It should be enabled by default in KDE code so that
> > the crash handler can catch them.
>
> Not really. The error checking does raise SIGABRT, it's just that our
> crash handler depends on malloc() working and locks up already in
> DCOPClient::emergencyClose(). If we want to get a crash handler for heap
> corruptions too, we need to stay away from using the corrupted malloc. I
> can give it a shot.
Ok, it wasn't as trivial, but I have a patch. The changes are:
- no qstrdup() of static strings or actually any strings , no other things
that result in malloc() like using QCString, i18n() or whatever. Results in
some ugly code like KAboutData::translateInternalProgramName() but oh well.
- no DCOPClient::emergencyClose() - what's the point anyway, I'm quite sure it
doesn't get called when I do Ctrl+C and KDE seems to work just fine, so why
should it be called when crashing - it calls malloc() and who knows what else
- no fork() - this one doesn't actually call malloc(), but there are some
functions called at pre-fork time that do something with malloc locks (with
glibc at least) - that leads to lockups as well. Since no fork() means no
executing of other executables, the only other option I see is feeding the
data to some other already running process via a socket. Fortunately we
incidentally have already this kdeinit thingy running that's supposed to
launch executables when it gets told via a socket, so I copy&pasted some code
from there. That means some more syscalls, but that should be still more
safe. There's still a fallback to use fork() in case using kdeinit fails for
any reason.
This way KCrash should be more reliable and we should (hopefully) always get
DrKonqi now. And I can confirm that after one gets used to DrKonqi just
having an application disappearing without a reason really doesn't feel nice.
I'd like to commit this to 3.5 branch, so it'd be nice if somebody could at
least peek at it.
PS: One more benefit would be that we could now make drkonqi a kdeinit module
in order to make our crashes faster.
PPS: A good way of testing is just doing a double free().
--
Lubos Lunak
KDE developer
---------------------------------------------------------------------
SuSE CR, s.r.o. e-mail: l.lunak@suse.cz , l.lunak@kde.org
Drahobejlova 27 tel: +420 2 9654 2373
190 00 Praha 9 fax: +420 2 9654 2374
Czech Republic http://www.suse.cz/
["kdecore.patch" (text/x-diff)]
--- kdecore/kcrash.h.sav 2005-09-29 21:31:56.000000000 +0200
+++ kdecore/kcrash.h 2006-01-11 16:50:42.000000000 +0100
@@ -117,6 +117,10 @@ class KDECORE_EXPORT KCrash
* Pointer to the emergency save function.
*/
static HandlerType _emergencySaveFunction;
+
+ private:
+ static void startDrKonqi( const char* argv[], int argc );
+ static void startDirectly( const char* argv[], int argc );
};
#endif
--- kdecore/kcrash.cpp.sav 2005-09-29 21:31:56.000000000 +0200
+++ kdecore/kcrash.cpp 2006-01-11 16:50:39.000000000 +0100
@@ -38,6 +38,9 @@
#include <sys/time.h>
#include <sys/resource.h>
#include <sys/wait.h>
+#include <sys/un.h>
+#include <sys/socket.h>
+#include <errno.h>
#include <qwindowdefs.h>
#include <kglobal.h>
@@ -47,6 +50,8 @@
#include <kapplication.h>
#include <dcopclient.h>
+#include <../kinit/klauncher_cmds.h>
+
#if defined Q_WS_X11
#include <X11/Xlib.h>
#endif
@@ -128,19 +133,15 @@ KCrash::defaultCrashHandler (int sig)
crashRecursionCounter++; //
}
- // Close dcop connections
- DCOPClient::emergencyClose();
// Close all remaining file descriptors except for stdin/stdout/stderr
struct rlimit rlp;
getrlimit(RLIMIT_NOFILE, &rlp);
for (int i = 3; i < (int)rlp.rlim_cur; i++)
close(i);
- bool shuttingDown = false;
- // don't load drkonqi during shutdown
- if ( !shuttingDown )
- {
+ // this code is leaking, but this should not hurt cause we will do a
+ // exec() afterwards. exec() is supposed to clean up.
if (crashRecursionCounter < 3)
{
if (appName)
@@ -152,105 +153,81 @@ KCrash::defaultCrashHandler (int sig)
fprintf(stderr, "KCrash: Application '%s' crashing...\n", appName ? appName : "<unknown>");
#endif
- pid_t pid = fork();
-
- if (pid <= 0) {
- // this code is leaking, but this should not hurt cause we will do a
- // exec() afterwards. exec() is supposed to clean up.
- char * argv[24]; // don't forget to update this
+ const char * argv[24]; // don't forget to update this
int i = 0;
// argument 0 has to be drkonqi
- argv[i++] = qstrdup("drkonqi");
+ argv[i++] = "drkonqi";
#if defined Q_WS_X11
// start up on the correct display
- argv[i++] = qstrdup("-display");
+ argv[i++] = "-display";
if ( qt_xdisplay() )
argv[i++] = XDisplayString(qt_xdisplay());
else
argv[i++] = getenv("DISPLAY");
#elif defined(Q_WS_QWS)
// start up on the correct display
- argv[i++] = qstrdup("-display");
+ argv[i++] = "-display";
argv[i++] = getenv("QWS_DISPLAY");
#endif
// we have already tested this
- argv[i++] = qstrdup("--appname");
- argv[i++] = qstrdup(appName);
+ argv[i++] = "--appname";
+ argv[i++] = appName;
if (KApplication::loadedByKdeinit)
- argv[i++] = qstrdup("--kdeinit");
+ argv[i++] = "--kdeinit";
// only add apppath if it's not NULL
if (appPath) {
- argv[i++] = qstrdup("--apppath");
- argv[i++] = qstrdup(appPath);
+ argv[i++] = "--apppath";
+ argv[i++] = appPath;
}
// signal number -- will never be NULL
- QCString tmp;
- tmp.setNum(sig);
- argv[i++] = qstrdup("--signal");
- argv[i++] = qstrdup(tmp.data());
+ char sigtxt[ 10 ];
+ sprintf( sigtxt, "%d", sig );
+ argv[i++] = "--signal";
+ argv[i++] = sigtxt;
- // pid number -- only include if this is the child
- // the debug stuff will be disabled if we was not able to fork
- if (pid == 0) {
- tmp.setNum(getppid());
- argv[i++] = qstrdup("--pid");
- argv[i++] = qstrdup(tmp.data());
- }
+ char pidtxt[ 10 ];
+ sprintf( pidtxt, "%d", getpid());
+ argv[i++] = "--pid";
+ argv[i++] = pidtxt;
const KInstance *instance = KGlobal::_instance;
const KAboutData *about = instance ? instance->aboutData() : 0;
if (about) {
- if (!about->version().isNull()) {
- argv[i++] = qstrdup("--appversion");
- argv[i++] = qstrdup(about->version().utf8());
+ if (about->internalVersion()) {
+ argv[i++] = "--appversion";
+ argv[i++] = about->internalVersion();
}
- if (!about->programName().isNull()) {
- argv[i++] = qstrdup("--programname");
- argv[i++] = qstrdup(about->programName().utf8());
+ if (about->internalProgramName()) {
+ argv[i++] = "--programname";
+ argv[i++] = about->internalProgramName();
}
- if (!about->bugAddress().isNull()) {
- argv[i++] = qstrdup("--bugaddress");
- argv[i++] = qstrdup(about->bugAddress().utf8());
+ if (about->internalBugAddress()) {
+ argv[i++] = "--bugaddress";
+ argv[i++] = about->internalBugAddress();
}
}
if ( kapp && !kapp->startupId().isNull()) {
- argv[i++] = qstrdup("--startupid");
- argv[i++] = qstrdup(kapp->startupId());
+ argv[i++] = "--startupid";
+ argv[i++] = kapp->startupId().data();
}
if ( safer )
- argv[i++] = qstrdup("--safer");
+ argv[i++] = "--safer";
// NULL terminated list
- argv[i++] = NULL;
-
- setgid(getgid());
- setuid(getuid());
-
- execvp("drkonqi", argv);
-
- // we could clean up here
- // i = 0;
- // while (argv[i])
- // free(argv[i++]);
- }
- else
- {
-
- alarm(0); // Seems we made it....
+ argv[i] = NULL;
- // wait for child to exit
- waitpid(pid, NULL, 0);
+ startDrKonqi( argv, i );
_exit(253);
- }
+
}
else {
fprintf(stderr, "Unknown appname\n");
@@ -261,8 +238,279 @@ KCrash::defaultCrashHandler (int sig)
{
fprintf(stderr, "Unable to start Dr. Konqi\n");
}
- }
#endif //Q_OS_UNIX
_exit(255);
}
+
+#ifdef Q_OS_UNIX
+
+// Since we can't fork() in the crashhandler, we cannot execute any external code
+// (there can be functions registered to be performed before fork(), for example
+// handling of malloc locking, which doesn't work when malloc crashes because of heap corruption).
+
+static int write_socket(int sock, char *buffer, int len);
+static int read_socket(int sock, char *buffer, int len);
+static int openSocket();
+
+void KCrash::startDrKonqi( const char* argv[], int argc )
+{
+ int socket = openSocket();
+ if( socket < -1 )
+ {
+ startDirectly( argv, argc );
+ return;
+ }
+ klauncher_header header;
+ header.cmd = LAUNCHER_EXEC_NEW;
+ const int BUFSIZE = 8192; // make sure this is big enough
+ char buffer[ BUFSIZE + 10 ];
+ int pos = 0;
+ long argcl = argc;
+ memcpy( buffer + pos, &argcl, sizeof( argcl ));
+ pos += sizeof( argcl );
+ for( int i = 0;
+ i < argc;
+ ++i )
+ {
+ int len = strlen( argv[ i ] ) + 1; // include terminating \0
+ if( pos + len > BUFSIZE )
+ {
+ fprintf( stderr, "BUFSIZE in KCrash not big enough!\n" );
+ startDirectly( argv, argc );
+ return;
+ }
+ memcpy( buffer + pos, argv[ i ], len );
+ pos += len;
+ }
+ long env = 0;
+ memcpy( buffer + pos, &env, sizeof( env ));
+ pos += sizeof( env );
+ long avoid_loops = 0;
+ memcpy( buffer + pos, &avoid_loops, sizeof( avoid_loops ));
+ pos += sizeof( avoid_loops );
+ header.arg_length = pos;
+ write_socket(socket, (char *) &header, sizeof(header));
+ write_socket(socket, buffer, pos);
+ if( read_socket( socket, (char *) &header, sizeof(header)) < 0
+ || header.cmd != LAUNCHER_OK )
+ {
+ startDirectly( argv, argc );
+ return;
+ }
+ long pid;
+ read_socket(socket, buffer, header.arg_length);
+ pid = *((long *) buffer);
+
+ alarm(0); // Seems we made it....
+
+ for(;;)
+ {
+ if( kill( pid, 0 ) < 0 )
+ _exit(253);
+ sleep( 1 ); // the debugger should stop this process anyway
+ }
+}
+
+// If we can't reach kdeinit we can still at least try to fork()
+void KCrash::startDirectly( const char* argv[], int )
+{
+ fprintf( stderr, "KCrash cannot reach kdeinit, launching directly.\n" );
+ pid_t pid = fork();
+ if (pid <= 0)
+ {
+ setgid(getgid());
+ setuid(getuid());
+ execvp("drkonqi", const_cast< char** >( argv ));
+ }
+ else
+ {
+ alarm(0); // Seems we made it....
+ // wait for child to exit
+ waitpid(pid, NULL, 0);
+ _exit(253);
+ }
+}
+
+// From now on this code is copy&pasted from kinit/wrapper.c :
+
+extern char **environ;
+
+static char *getDisplay()
+{
+ const char *display;
+ char *result;
+ char *screen;
+ char *colon;
+ char *i;
+/*
+ don't test for a value from qglobal.h but instead distinguish
+ Qt/X11 from Qt/Embedded by the fact that Qt/E apps have -DQWS
+ on the commandline (which in qglobal.h however triggers Q_WS_QWS,
+ but we don't want to include that here) (Simon)
+#ifdef Q_WS_X11
+ */
+#if !defined(QWS)
+ display = getenv("DISPLAY");
+#else
+ display = getenv("QWS_DISPLAY");
+#endif
+ if (!display || !*display)
+ {
+ display = ":0";
+ }
+ result = (char*)malloc(strlen(display)+1);
+ if (result == NULL)
+ return NULL;
+
+ strcpy(result, display);
+ screen = strrchr(result, '.');
+ colon = strrchr(result, ':');
+ if (screen && (screen > colon))
+ *screen = '\0';
+ while((i = strchr(result, ':')))
+ *i = '_';
+ return result;
+}
+
+/*
+ * Write 'len' bytes from 'buffer' into 'sock'.
+ * returns 0 on success, -1 on failure.
+ */
+static int write_socket(int sock, char *buffer, int len)
+{
+ ssize_t result;
+ int bytes_left = len;
+ while ( bytes_left > 0)
+ {
+ result = write(sock, buffer, bytes_left);
+ if (result > 0)
+ {
+ buffer += result;
+ bytes_left -= result;
+ }
+ else if (result == 0)
+ return -1;
+ else if ((result == -1) && (errno != EINTR) && (errno != EAGAIN))
+ return -1;
+ }
+ return 0;
+}
+
+/*
+ * Read 'len' bytes from 'sock' into 'buffer'.
+ * returns 0 on success, -1 on failure.
+ */
+static int read_socket(int sock, char *buffer, int len)
+{
+ ssize_t result;
+ int bytes_left = len;
+ while ( bytes_left > 0)
+ {
+ result = read(sock, buffer, bytes_left);
+ if (result > 0)
+ {
+ buffer += result;
+ bytes_left -= result;
+ }
+ else if (result == 0)
+ return -1;
+ else if ((result == -1) && (errno != EINTR) && (errno != EAGAIN))
+ return -1;
+ }
+ return 0;
+}
+
+static int openSocket()
+{
+ kde_socklen_t socklen;
+ int s;
+ struct sockaddr_un server;
+#define MAX_SOCK_FILE 255
+ char sock_file[MAX_SOCK_FILE + 1];
+ const char *home_dir = getenv("HOME");
+ const char *kde_home = getenv("KDEHOME");
+ char *display;
+
+ sock_file[0] = sock_file[MAX_SOCK_FILE] = 0;
+
+ if (!kde_home || !kde_home[0])
+ {
+ kde_home = "~/.kde/";
+ }
+
+ if (kde_home[0] == '~')
+ {
+ if (!home_dir || !home_dir[0])
+ {
+ fprintf(stderr, "Warning: $HOME not set!\n");
+ return -1;
+ }
+ if (strlen(home_dir) > (MAX_SOCK_FILE-100))
+ {
+ fprintf(stderr, "Warning: Home directory path too long!\n");
+ return -1;
+ }
+ kde_home++;
+ strncpy(sock_file, home_dir, MAX_SOCK_FILE);
+ }
+ strncat(sock_file, kde_home, MAX_SOCK_FILE - strlen(sock_file));
+
+ /** Strip trailing '/' **/
+ if ( sock_file[strlen(sock_file)-1] == '/')
+ sock_file[strlen(sock_file)-1] = 0;
+
+ strncat(sock_file, "/socket-", MAX_SOCK_FILE - strlen(sock_file));
+ if (gethostname(sock_file+strlen(sock_file), MAX_SOCK_FILE - strlen(sock_file) - 1) != 0)
+ {
+ perror("Warning: Could not determine hostname: ");
+ return -1;
+ }
+ sock_file[sizeof(sock_file)-1] = '\0';
+
+ /* append $DISPLAY */
+ display = getDisplay();
+ if (display == NULL)
+ {
+ fprintf(stderr, "Error: Could not determine display.\n");
+ return -1;
+ }
+
+ if (strlen(sock_file)+strlen(display)+strlen("/kdeinit_")+2 > MAX_SOCK_FILE)
+ {
+ fprintf(stderr, "Warning: Socket name will be too long.\n");
+ return -1;
+ }
+ strcat(sock_file, "/kdeinit_");
+ strcat(sock_file, display);
+ free(display);
+
+ if (strlen(sock_file) >= sizeof(server.sun_path))
+ {
+ fprintf(stderr, "Warning: Path of socketfile exceeds UNIX_PATH_MAX.\n");
+ return -1;
+ }
+
+ /*
+ * create the socket stream
+ */
+ s = socket(PF_UNIX, SOCK_STREAM, 0);
+ if (s < 0)
+ {
+ perror("Warning: socket() failed: ");
+ return -1;
+ }
+
+ server.sun_family = AF_UNIX;
+ strcpy(server.sun_path, sock_file);
+ socklen = sizeof(server);
+ if(connect(s, (struct sockaddr *)&server, socklen) == -1)
+ {
+ perror("Warning: connect() failed: ");
+ close(s);
+ return -1;
+ }
+ return s;
+}
+
+#endif // Q_OS_UNIX
--- kdecore/kaboutdata.cpp.sav 2005-09-29 21:31:56.000000000 +0200
+++ kdecore/kaboutdata.cpp 2006-01-11 17:34:36.000000000 +0100
@@ -106,6 +106,7 @@ KAboutData::KAboutData( const char *appN
const char *bugsEmailAddress
) :
mProgramName( programName ),
+ mTranslatedProgramName( 0 ),
mVersion( version ),
mShortDescription( shortDescription ),
mLicenseKey( licenseType ),
@@ -131,6 +132,7 @@ KAboutData::~KAboutData()
{
if (mLicenseKey == License_File)
delete [] mLicenseText;
+ delete[] mTranslatedProgramName;
delete d;
}
@@ -179,6 +181,10 @@ void
KAboutData::setProgramName( const char* programName )
{
mProgramName = programName;
+ delete[] mTranslatedProgramName;
+ mTranslatedProgramName = 0;
+ if( KGlobal::locale())
+ mTranslatedProgramName = qstrdup( i18n( mProgramName ).utf8());
}
void
@@ -253,6 +259,27 @@ KAboutData::programName() const
return QString::null;
}
+const char*
+KAboutData::internalProgramName() const
+{
+ if (mTranslatedProgramName)
+ return mTranslatedProgramName;
+ else
+ return mProgramName;
+}
+
+// KCrash should call as few things as possible and should avoid e.g. malloc()
+// because it may deadlock. Since i18n() needs it, when KLocale is available
+// the i18n() call will be done here in advance.
+void
+KAboutData::translateInternalProgramName() const
+{
+ delete[] mTranslatedProgramName;
+ mTranslatedProgramName = 0;
+ if( KGlobal::locale())
+ mTranslatedProgramName = qstrdup( i18n( mProgramName ).utf8());
+}
+
QImage
KAboutData::programLogo() const
{
--- kdecore/kaboutdata.h.sav 2005-09-29 21:31:55.000000000 +0200
+++ kdecore/kaboutdata.h 2006-01-11 17:34:20.000000000 +0100
@@ -464,6 +464,15 @@ class KDECORE_EXPORT KAboutData
QString programName() const;
/**
+ * @internal
+ */
+ const char* internalProgramName() const;
+ /**
+ * @internal
+ */
+ void translateInternalProgramName() const;
+
+ /**
* Returns the program logo image.
* @return the program logo data or null image if there is
* no custom application logo defined.
@@ -478,6 +487,11 @@ class KDECORE_EXPORT KAboutData
QString version() const;
/**
+ * @internal
+ */
+ const char* internalVersion() const { return mVersion; }
+
+ /**
* Returns a short, translated description.
* @return the short description (translated). Can be
* QString::null if not set.
@@ -496,6 +510,11 @@ class KDECORE_EXPORT KAboutData
* @return the email address where to report bugs.
*/
QString bugAddress() const;
+
+ /**
+ * @internal
+ */
+ const char* internalBugAddress() const { return mBugEmailAddress; }
/**
* Returns a list of authors.
@@ -591,6 +610,7 @@ class KDECORE_EXPORT KAboutData
private:
const char *mAppName;
const char *mProgramName;
+ mutable const char *mTranslatedProgramName;
const char *mVersion;
const char *mShortDescription;
int mLicenseKey;
--- kdecore/kglobal.cpp.sav 2005-09-29 21:31:56.000000000 +0200
+++ kdecore/kglobal.cpp 2006-01-11 17:33:57.000000000 +0100
@@ -28,7 +28,7 @@
#include "kglobal.h"
#include <kapplication.h>
-
+#include <kaboutdata.h>
#include <kdebug.h>
#include <kconfig.h>
#include <klocale.h>
@@ -94,6 +94,8 @@ KLocale *KGlobal::locale()
// will set _locale if it works - otherwise 0 is returned
KLocale::initInstance();
+ if( _instance->aboutData())
+ _instance->aboutData()->translateInternalProgramName();
}
return _locale;
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic