[prev in list] [next in list] [prev in thread] [next in thread]
List: subversion-dev
Subject: [Patch] Workaround for SQLITE open() race condition
From: Stefan Fuhrmann <stefan.fuhrmann () wandisco ! com>
Date: 2014-01-15 17:01:35
Message-ID: CA+t0gk33OXaa7LW61si8UHqMnqN-=YfNpNqKZYk=z4KSQzj9TQ () mail ! gmail ! com
[Download RAW message or body]
Hi,
We see random failure on our Windows build bot
when it tries to open different working copies at
the same time. This is a simple retry patch for that.
On IRC, Bert spoke out in favour of debugging and
fixing sqlite itself - hence I won't commit the patch
right now. However, I've also been unable to reproduce
the problem under LINUX so far and can't really help
with the effort.
-- Stefan^2.
[[[
Attempt to address the "unable to open database file" error we
see at least on our test suite with multi-threaded SQLITE code.
http://www.mail-archive.com/sqlite-users@sqlite.org/msg81284.html
hints at a problem with temporary files used by SQLITE internally,
i.e. the root cause lying in the SQLITE code itself.
Hence, we add a workaround for now that simply retries the open()
operation for a reasonable amount of time.
* subversion/libsvn_subr/sqlite.c
(BUSY_TIMEOUT): Document that we use this timeout for failing
open() calls as well.
(internal_open): Retry if we couldn't open the DB for some
potentially transient reason.
]]]
--- subversion/libsvn_subr/sqlite.c (revision 1555441)
+++ subversion/libsvn_subr/sqlite.c (working copy)
@@ -174,10 +174,11 @@ struct svn_sqlite__value_t
} while (0)
-/* Time (in milliseconds) to wait for sqlite locks before giving up. */
+/* Time (in milliseconds) to wait for sqlite locks before giving up.
+ We use the same timeout for handling other concurrency issues in
+ sqlite's open function. */
#define BUSY_TIMEOUT 10000
-
/* Convenience wrapper around exec_sql2(). */
#define exec_sql(db, sql) exec_sql2((db), (sql), SQLITE_OK)
@@ -838,6 +839,28 @@ internal_open(sqlite3 **db3, const char
do this manually. */
/* ### SQLITE_CANTOPEN */
int err_code = sqlite3_open_v2(path, db3, flags, NULL);
+
+ /* SQLITE seems to have race condition that prevent separate threads
+ from opening separate DBs at the same time. */
+ if (err_code == SQLITE_CANTOPEN)
+ {
+ /* Retry for approx. 10 seconds while we get the standard "can't
+ open" error return. */
+ apr_time_t start = apr_time_now();
+ while ( (err_code == SQLITE_CANTOPEN)
+ && (apr_time_now() - start < BUSY_TIMEOUT * 1000))
+ {
+ /* The db object is in an undefined state - clean it up.
+ We don't catch the error here, since we only care about
the
+ open error at this point. */
+ sqlite3_close(*db3);
+
+ /* Retry. */
+ err_code = sqlite3_open_v2(path, db3, flags, NULL);
+ }
+ }
+
+ /* Now back to normal error handling. */
if (err_code != SQLITE_OK)
{
/* Save the error message before closing the SQLite handle. */
[Attachment #3 (text/html)]
<div dir="ltr"><div>Hi,<br><br>We see random failure on our Windows build \
bot<br></div>when it tries to open different working copies at<br>the same time. This \
is a simple retry patch for that.<br><div><div><br></div><div> On IRC, Bert spoke out \
in favour of debugging and<br>fixing sqlite itself - hence I won't commit the \
patch<br></div><div>right now. However, I've also been unable to \
reproduce<br></div><div>the problem under LINUX so far and can't really help<br> \
with the effort.<br><br></div><div>-- Stefan^2.<br></div><div><br>[[[<br>Attempt to \
address the "unable to open database file" error we<br>see at least on our \
test suite with multi-threaded SQLITE code.<br><br><a \
href="http://www.mail-archive.com/sqlite-users@sqlite.org/msg81284.html">http://www.mail-archive.com/sqlite-users@sqlite.org/msg81284.html</a><br>
hints at a problem with temporary files used by SQLITE internally,<br>i.e. the root \
cause lying in the SQLITE code itself.<br><br>Hence, we add a workaround for now that \
simply retries the open()<br>operation for a reasonable amount of time.<br> <br>* \
subversion/libsvn_subr/sqlite.c<br> (BUSY_TIMEOUT): Document that we use this \
timeout for failing<br> open() calls as well.<br> (internal_open): \
Retry if we couldn't open the DB for some<br> potentially \
transient reason.<br> ]]]<br><br>--- subversion/libsvn_subr/sqlite.c (revision \
1555441)<br>+++ subversion/libsvn_subr/sqlite.c (working copy)<br>@@ -174,10 \
+174,11 @@ struct svn_sqlite__value_t<br> } while (0)<br> <br> <br>-/* Time (in \
milliseconds) to wait for sqlite locks before giving up. */<br> +/* Time (in \
milliseconds) to wait for sqlite locks before giving up.<br>+ We use the same \
timeout for handling other concurrency issues in<br>+ sqlite's open function. \
*/<br> #define BUSY_TIMEOUT 10000<br> <br>-<br> /* Convenience wrapper around \
exec_sql2(). */<br> #define exec_sql(db, sql) exec_sql2((db), (sql), SQLITE_OK)<br> \
<br>@@ -838,6 +839,28 @@ internal_open(sqlite3 **db3, const char<br> do this \
manually. */<br> /* ### SQLITE_CANTOPEN */<br> int err_code = \
sqlite3_open_v2(path, db3, flags, NULL);<br>+<br>+ /* SQLITE seems to have race \
condition that prevent separate threads<br>+ from opening separate DBs at the \
same time. */<br>+ if (err_code == SQLITE_CANTOPEN)<br> + {<br>+ \
/* Retry for approx. 10 seconds while we get the standard "can't<br>+ \
open" error return. */<br>+ apr_time_t start = apr_time_now();<br>+ \
while ( (err_code == SQLITE_CANTOPEN)<br> + && \
(apr_time_now() - start < BUSY_TIMEOUT * 1000))<br>+ {<br>+ \
/* The db object is in an undefined state - clean it up.<br>+ We \
don't catch the error here, since we only care about the<br> + \
open error at this point. */<br>+ sqlite3_close(*db3);<br>+<br>+ \
/* Retry. */<br>+ err_code = sqlite3_open_v2(path, db3, flags, \
NULL);<br>+ }<br>+ }<br> +<br>+ /* Now back to normal error \
handling. */<br> if (err_code != SQLITE_OK)<br> {<br> /* Save \
the error message before closing the SQLite handle. */<br></div></div></div>
[prev in list] [next in list] [prev in thread] [next in thread]
Configure |
About |
News |
Add a list |
Sponsored by KoreLogic