[prev in list] [next in list] [prev in thread] [next in thread] 

List:       subversion-dev
Subject:    [Patch] Workaround for SQLITE open() race condition
From:       Stefan Fuhrmann <stefan.fuhrmann () wandisco ! com>
Date:       2014-01-15 17:01:35
Message-ID: CA+t0gk33OXaa7LW61si8UHqMnqN-=YfNpNqKZYk=z4KSQzj9TQ () mail ! gmail ! com
[Download RAW message or body]

Hi,

We see random failure on our Windows build bot
when it tries to open different working copies at
the same time. This is a simple retry patch for that.

On IRC, Bert spoke out in favour of debugging and
fixing sqlite itself - hence I won't commit the patch
right now. However, I've also been unable to reproduce
the problem under LINUX so far and can't really help
with the effort.

-- Stefan^2.

[[[
Attempt to address the "unable to open database file" error we
see at least on our test suite with multi-threaded SQLITE code.

http://www.mail-archive.com/sqlite-users@sqlite.org/msg81284.html
hints at a problem with temporary files used by SQLITE internally,
i.e. the root cause lying in the SQLITE code itself.

Hence, we add a workaround for now that simply retries the open()
operation for a reasonable amount of time.

* subversion/libsvn_subr/sqlite.c
  (BUSY_TIMEOUT): Document that we use this timeout for failing
                  open() calls as well.
  (internal_open): Retry if we couldn't open the DB for some
                   potentially transient reason.
]]]

--- subversion/libsvn_subr/sqlite.c    (revision 1555441)
+++ subversion/libsvn_subr/sqlite.c    (working copy)
@@ -174,10 +174,11 @@ struct svn_sqlite__value_t
 } while (0)


-/* Time (in milliseconds) to wait for sqlite locks before giving up. */
+/* Time (in milliseconds) to wait for sqlite locks before giving up.
+   We use the same timeout for handling other concurrency issues in
+   sqlite's open function. */
 #define BUSY_TIMEOUT 10000

-
 /* Convenience wrapper around exec_sql2(). */
 #define exec_sql(db, sql) exec_sql2((db), (sql), SQLITE_OK)

@@ -838,6 +839,28 @@ internal_open(sqlite3 **db3, const char
          do this manually. */
       /* ### SQLITE_CANTOPEN */
       int err_code = sqlite3_open_v2(path, db3, flags, NULL);
+
+      /* SQLITE seems to have race condition that prevent separate threads
+         from opening separate DBs at the same time. */
+      if (err_code == SQLITE_CANTOPEN)
+        {
+          /* Retry for approx. 10 seconds while we get the standard "can't
+             open" error return. */
+          apr_time_t start = apr_time_now();
+          while (   (err_code == SQLITE_CANTOPEN)
+                 && (apr_time_now() - start < BUSY_TIMEOUT * 1000))
+            {
+              /* The db object is in an undefined state - clean it up.
+                 We don't catch the error here, since we only care about
the
+                 open error at this point. */
+              sqlite3_close(*db3);
+
+              /* Retry. */
+              err_code = sqlite3_open_v2(path, db3, flags, NULL);
+            }
+        }
+
+      /* Now back to normal error handling. */
       if (err_code != SQLITE_OK)
         {
           /* Save the error message before closing the SQLite handle. */

[Attachment #3 (text/html)]

<div dir="ltr"><div>Hi,<br><br>We see random failure on our Windows build \
bot<br></div>when it tries to open different working copies at<br>the same time. This \
is a simple retry patch for that.<br><div><div><br></div><div> On IRC, Bert spoke out \
in favour of debugging and<br>fixing sqlite itself - hence I won&#39;t commit the \
patch<br></div><div>right now. However, I&#39;ve also been unable to \
reproduce<br></div><div>the problem under LINUX so far and can&#39;t really help<br> \
with the effort.<br><br></div><div>-- Stefan^2.<br></div><div><br>[[[<br>Attempt to \
address the &quot;unable to open database file&quot; error we<br>see at least on our \
test suite with multi-threaded SQLITE code.<br><br><a \
href="http://www.mail-archive.com/sqlite-users@sqlite.org/msg81284.html">http://www.mail-archive.com/sqlite-users@sqlite.org/msg81284.html</a><br>
 hints at a problem with temporary files used by SQLITE internally,<br>i.e. the root \
cause lying in the SQLITE code itself.<br><br>Hence, we add a workaround for now that \
simply retries the open()<br>operation for a reasonable amount of time.<br> <br>* \
subversion/libsvn_subr/sqlite.c<br>  (BUSY_TIMEOUT): Document that we use this \
timeout for failing<br>                  open() calls as well.<br>  (internal_open): \
Retry if we couldn&#39;t open the DB for some<br>                   potentially \
transient reason.<br> ]]]<br><br>--- subversion/libsvn_subr/sqlite.c    (revision \
1555441)<br>+++ subversion/libsvn_subr/sqlite.c    (working copy)<br>@@ -174,10 \
+174,11 @@ struct svn_sqlite__value_t<br> } while (0)<br> <br> <br>-/* Time (in \
milliseconds) to wait for sqlite locks before giving up. */<br> +/* Time (in \
milliseconds) to wait for sqlite locks before giving up.<br>+   We use the same \
timeout for handling other concurrency issues in<br>+   sqlite&#39;s open function. \
*/<br> #define BUSY_TIMEOUT 10000<br> <br>-<br>  /* Convenience wrapper around \
exec_sql2(). */<br> #define exec_sql(db, sql) exec_sql2((db), (sql), SQLITE_OK)<br> \
<br>@@ -838,6 +839,28 @@ internal_open(sqlite3 **db3, const char<br>          do this \
manually. */<br>       /* ### SQLITE_CANTOPEN */<br>  int err_code = \
sqlite3_open_v2(path, db3, flags, NULL);<br>+<br>+      /* SQLITE seems to have race \
condition that prevent separate threads<br>+         from opening separate DBs at the \
same time. */<br>+      if (err_code == SQLITE_CANTOPEN)<br> +        {<br>+          \
/* Retry for approx. 10 seconds while we get the standard &quot;can&#39;t<br>+        \
open&quot; error return. */<br>+          apr_time_t start = apr_time_now();<br>+     \
while (   (err_code == SQLITE_CANTOPEN)<br> +                 &amp;&amp; \
(apr_time_now() - start &lt; BUSY_TIMEOUT * 1000))<br>+            {<br>+             \
/* The db object is in an undefined state - clean it up.<br>+                 We \
don&#39;t catch the error here, since we only care about the<br> +                 \
open error at this point. */<br>+              sqlite3_close(*db3);<br>+<br>+         \
/* Retry. */<br>+              err_code = sqlite3_open_v2(path, db3, flags, \
NULL);<br>+            }<br>+        }<br> +<br>+      /* Now back to normal error \
handling. */<br>       if (err_code != SQLITE_OK)<br>         {<br>           /* Save \
the error message before closing the SQLite handle. */<br></div></div></div>



[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic