
List:       sbcl-devel
Subject:    Re: [Sbcl-devel] [Sbcl-commits] master: Implement fcb-threads test for win32
From:       Stas Boukarev <stassats () gmail ! com>
Date:       2020-08-16 16:02:05
Message-ID: CAF63=12pWsgSmcG3PuFow=eSUQXT0ABnMp7AdSGsuT0KMmUrGw () mail ! gmail ! com

That's consistent with my view of sb-alien, that it's a mess, a slow mess.

On Sun, Aug 16, 2020 at 8:14 AM Douglas Katzman <dougk@google.com> wrote:
> 
> The octet decoding error points to a fundamental bug in WITH-ALIEN that is \
> orthogonal to foreign callbacks. Minimal test:
> 
> (defun test (args-pointer)
>   (with-alien ((g471 c-string :LOCAL
>                      (deref (sap-alien (truly-the system-area-pointer args-pointer)
>                                        (* C-STRING)))))
>     (format t "after the deref~%")
>     (opaque-identity g471)
>     (format t "after the actual use~%")))
> 
> 
> which, with the aid of some extra printing injected into \
> #'sb-impl::read-from-c-string/utf-8 and #'sb-impl::output-to-c-string/utf-8 says: 
> read-c-string/utf8 0x7f9d5be04120 -> 0x100271044f
> output-to-c-string/utf8 0x100271044f -> 0x10027104c0
> after the deref
> read-c-string/utf8 0x10027104c0 -> 0x10027104ff
> after the actual use
> 
> 
> The second read-c-string (the one highlighted in yellow) is reading from an
> (unsigned-byte 8) vector on the lisp heap, but WITH-ALIEN does not insert a
> WITH-PINNED-OBJECTS around it. Incidentally, it also seems terribly inefficient,
> but that is irrelevant given that the brokenness is by far the greater concern.
>
> In other words: firstly, we assume that if the lisp string is modified, the
> changes should propagate back to the C string, hence the
> output-to-c-string/utf-8 call, even though at that point there was no chance
> for user code to modify the lisp string; and secondly, if we assume that the
> authoritative string is the (unsigned-byte 8) vector, then it needs to be
> pinned when re-reading from it, and it isn't.
>
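The pinning requirement described above can be sketched roughly as follows. This is only an illustration of the general idea, not the WITH-ALIEN expansion and not a proposed fix: READ-PINNED-C-STRING is a hypothetical helper name, and the sketch assumes SBCL with the exported SB-SYS and SB-EXT operators it uses.

```lisp
;; Illustrative sketch only. READ-PINNED-C-STRING is a hypothetical helper,
;; not part of SBCL and not the code that WITH-ALIEN expands into.
(defun read-pinned-c-string (octets)
  "Scan the NUL-terminated UTF-8 bytes in OCTETS through a raw pointer, then decode them."
  (declare (type (simple-array (unsigned-byte 8) (*)) octets))
  ;; Without the pin, a moving GC may relocate OCTETS while SAP still
  ;; refers to its old address, so the scan could read stale memory.
  (sb-sys:with-pinned-objects (octets)
    (let* ((sap (sb-sys:vector-sap octets))
           ;; Find the terminating NUL through the raw pointer,
           ;; never reading past the end of the vector.
           (length (loop for i from 0 below (length octets)
                         until (zerop (sb-sys:sap-ref-8 sap i))
                         finally (return i))))
      (sb-ext:octets-to-string octets :external-format :utf-8 :end length))))
```

For example, (read-pinned-c-string (sb-ext:string-to-octets "hello" :external-format :utf-8 :null-terminate t)) should hand back the original string. The point of the sketch is only that the raw pointer is taken and used entirely inside the WITH-PINNED-OBJECTS body.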
> On Thu, Aug 13, 2020 at 8:56 PM Douglas Katzman <dougk@google.com> wrote:
> > 
> > and I've gotten this error even though all the C strings are constant and in C \
> > data space. I can't imagine how this is possible.
> > 
> > > > > Running :CALL-ME-FROM-MANY-THREADS-AND-GC
> > Trial 1: GC'd 96 times
> > Trial 2: GC'd 79 times
> > Trial 1: GC'd 44 times
> > Trial 1: GC'd 953 times
> > Trial 2: GC'd 479 times
> > Trial 3: GC'd 485 times
> > Unhandled SB-INT:C-STRING-DECODING-ERROR in thread #<SB-THREAD:FOREIGN-THREAD \
> > "callback" RUNNING {D9705FB9}>:
> > > UTF-8 c-string decoding error:
> > the octet sequence #(217 11) cannot be decoded.
> > 
> > Backtrace for: #<SB-THREAD:FOREIGN-THREAD "callback" RUNNING {D9705FB9}>
> > 0: (SB-DEBUG::DEBUGGER-DISABLED-HOOK #<SB-INT:C-STRING-DECODING-ERROR {D9720011}> \
> >                 #<unused argument> :QUIT T)
> > 1: (SB-DEBUG::RUN-HOOK *INVOKE-DEBUGGER-HOOK* #<SB-INT:C-STRING-DECODING-ERROR \
> >                 {D9720011}>)
> > 2: (INVOKE-DEBUGGER #<SB-INT:C-STRING-DECODING-ERROR {D9720011}>)
> > 3: (ERROR SB-INT:C-STRING-DECODING-ERROR :EXTERNAL-FORMAT :UTF-8 :OCTETS #(217 \
> >                 11))
> > 4: (SB-IMPL::READ-FROM-C-STRING/UTF-8 #.(SB-SYS:INT-SAP #XD9713008) CHARACTER)
> > 
> > On Thu, Aug 13, 2020 at 8:32 PM Stas Boukarev <stassats@gmail.com> wrote:
> > > 
> > > Getting
> > > 
> > > 2020-08-13T23:47:19.2787865Z ::: UNEXPECTED-FAILURE
> > > > CALL-ME-FROM-MANY-THREADS-AND-GC due to SIMPLE-ERROR:
> > > 2020-08-13T23:47:19.2795868Z         "The assertion
> > > 2020-08-13T23:47:19.2796621Z          (EQL (ALIEN-FUNCALL TESTFUN
> > > (ALIEN-SAP TESTCB) N-THREADS N-CALLS) 1)
> > > 2020-08-13T23:47:19.2797171Z          failed with
> > > 2020-08-13T23:47:19.2797648Z          (ALIEN-FUNCALL TESTFUN
> > > (ALIEN-SAP TESTCB) N-THREADS N-CALLS) = 0."
> > > 
> > > on linux-x86
> > > 
> > > And also a heap exhaustion on darwin.
> > > 
> > > On Fri, Aug 14, 2020 at 2:29 AM Stas Boukarev <stassats@gmail.com> wrote:
> > > > 
> > > > 32-bit windows is still intermittently crashing, but I can't reproduce \
> > > > locally. 
> > > > On Fri, Aug 14, 2020 at 12:08 AM Douglas Katzman <dougk@google.com> wrote:
> > > > > 
> > > > > Thanks for the fix. I couldn't figure out that "tls_impersonate" thing. +1 \
> > > > > to the rename 
> > > > > On Thu, Aug 13, 2020 at 4:10 PM Stas Boukarev <stassats@gmail.com> wrote:
> > > > > > 
> > > > > > I seem to have it working with the latest commit.
> > > > > > 
> > > > > > On Thu, Aug 13, 2020 at 11:09 PM Douglas Katzman <dougk@google.com> \
> > > > > > wrote:
> > > > > > > 
> > > > > > > yes, the breaking change is the one which removed pthread_getspecific.
> > > > > > > I think in the windows platform, once a call from C to lisp is made, \
> > > > > > > the thread remains a lisp thread until it exits, across many transfers \
> > > > > > > out of lisp into C and back. I probably didn't know that at the time I \
> > > > > > > made this change. So at worst this only started failing yesterday or \
> > > > > > > the day before. I'll think about whether windows and non-windows should \
> > > > > > > be more similar. I also can't see where it ever calls \
> > > > > > > free_thread_struct when it finally exits, but maybe I'm not looking \
> > > > > > > hard enough. 
> > > > > > > 
> > > > > > > 
> > > > > > > On Thu, Aug 13, 2020 at 3:00 PM Stas Boukarev <stassats@gmail.com> \
> > > > > > > wrote:
> > > > > > > > 
> > > > > > > > So, it fails because there's an exception but there's no ->vm_thread
> > > > > > > > in the current thread.
> > > > > > > > callback_wrapper_trampoline calls     pthread_np_notice_thread();
> > > > > > > > which does something with
> > > > > > > > RegisterWaitForSingleObject(&pth->wait_handle,
> > > > > > > > pth->handle,
> > > > > > > > pthreads_win32_unnotice,
> > > > > > > > pth,
> > > > > > > > INFINITE,
> > > > > > > > WT_EXECUTEONLYONCE);
> > > > > > > > 
> > > > > > > > pthreads_win32_unnotice seems to be the place it's crashing, as
> > > > > > > > tls_impersonate(pth) returns 0.
> > > > > > > > 
> > > > > > > > 
> > > > > > > > On Wed, Aug 12, 2020 at 10:16 PM Stas Boukarev <stassats@gmail.com> \
> > > > > > > > wrote:
> > > > > > > > > 
> > > > > > > > > I think the path forward is to figure out somehow what's going on \
> > > > > > > > > now, not going back.
> > > > > > > > > 
> > > > > > > > > On Wed, Aug 12, 2020 at 10:14 PM Douglas Katzman <dougk@google.com> \
> > > > > > > > > wrote:
> > > > > > > > > > 
> > > > > > > > > > Ok, I've caught a failure with no output.
> > > > > > > > > > It's super sensitive to load on the physical machine. It passes \
> > > > > > > > > > for me when no GC occurs while the foreign threads are running, \
> > > > > > > > > > and fails otherwise. 
> > > > > > > > > > So the next thing to try to understand is whether this is \
> > > > > > > > > > strictly worse than the state of things prior to my spate of \
> > > > > > > > > > changes. In the worst case scenario, I would have to go all the \
> > > > > > > > > > way back to July 8th ("Speed up foreign-callback entry") to see \
> > > > > > > > > > if the test passed. But my guess is that everything since then \
> > > > > > > > > > has been forward progress and there have not been regressions. \
> > > > > > > > > > Would you be inclined to agree? 
> > > > > > > > > > On Wed, Aug 12, 2020 at 2:10 PM Stas Boukarev \
> > > > > > > > > > <stassats@gmail.com> wrote:
> > > > > > > > > > > 
> > > > > > > > > > > On Wed, Aug 12, 2020 at 8:48 PM Douglas Katzman \
> > > > > > > > > > > <dougk@google.com> wrote:
> > > > > > > > > > > > 
> > > > > > > > > > > > Do you think there was a point in time at which had the \
> > > > > > > > > > > > fcb-threads test existed in its current state it would have \
> > > > > > > > > > > > passed on win32 for you?
> > > > > > > > > > > If I were to bet I would say that it wouldn't be passing.
> > > > > > > > > > > 
> > > > > > > > > > > > I'm trying to figure out how to identify a code regression if \
> > > > > > > > > > > > there was a regression, but there has never been a test. The \
> > > > > > > > > > > > windows installation I'm using is from \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > > \
> > > > > > > > > > > I have a similar windows 10 vm downloaded from microsoft a \
> > > > > > > > > > > couple of years ago, but it also fails on the github CIs and on \
> > > > > > > > > > > the scymtym's CIs.


_______________________________________________
Sbcl-devel mailing list
Sbcl-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/sbcl-devel

