[prev in list] [next in list] [prev in thread] [next in thread] 

List:       openjdk-serviceability-dev
Subject:    Re: RFR 8170541: serviceability/jdwp/AllModulesCommandTest.java fails intermittently on Windows and 
From:       Daniil Titov <daniil.x.titov () oracle ! com>
Date:       2018-02-28 5:08:33
Message-ID: B68339E1-0C4C-4339-B7B2-1129EF35834A () oracle ! com
[Download RAW message or body]

Thank you, David!

Best regards,
Daniil

On 2/26/18, 8:20 PM, "David Holmes" <david.holmes@oracle.com> wrote:

    The two-step send came in with:
    
    https://bugs.openjdk.java.net/browse/JDK-6401245
    
    "Small JDWP packets with the socket transport causes slow debugging on 
    linux 2.6.15 kernel and newer"
    
    David
    -----
    
    On 27/02/2018 9:29 AM, serguei.spitsyn@oracle.com wrote:
    > On 2/26/18 15:06, Chris Plummer wrote:
    >> On 2/26/18 3:00 PM, daniil.x.titov@oracle.com wrote:
    >>>
    >>>
    >>> On 2/26/18 12:16 PM, Chris Plummer wrote:
    >>>> On 2/26/18 11:51 AM, daniil.x.titov@oracle.com wrote:
    >>>>> Hi David and Sergei,
    >>>>>
    >>>>> On 2/20/18 10:16 PM, serguei.spitsyn@oracle.com wrote:
    >>>>>> Hi David,
    >>>>>>
    >>>>>>
    >>>>>> On 2/20/18 20:02, David Holmes wrote:
    >>>>>>> Hi Daniil,
    >>>>>>>
    >>>>>>> Good find on this!
    >>>>>>>
    >>>>>>> What does the actual spec say about the length of things and how 
    >>>>>>> they may be split across multiple packets? Are we guaranteed that 
    >>>>>>> at most two packets will be involved?
    >>>>>
    >>>>> The JDWP spec 
    >>>>> (https://docs.oracle.com/javase/9/docs/specs/jdwp/jdwp-spec.html) 
    >>>>> says nothing about splitting JDWP reply packets at all but the 
    >>>>> implementation limits the max number of the sent packets to two 
    >>>>> packets max. The implementation is dated back to the initial load 
    >>>>> that happened in 2007 and the information about the related Jira 
    >>>>> issue is missing.
    >>>>>
    >>>>> open/src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c
    >>>>>
    >>>>> 836    data = packet->type.cmd.data;
    >>>>> 837    /* Do one send for short packets, two for longer ones */
    >>>>> 838    if (data_len <= MAX_DATA_SIZE) {
    >>>>> 839        memcpy(header + JDWP_HEADER_SIZE, data, data_len);
    >>>>> 840        if (send_fully(socketFD, (char *)&header, 
    >>>>> JDWP_HEADER_SIZE + data_len) !=
    >>>>> 841            JDWP_HEADER_SIZE + data_len) {
    >>>>> 842            RETURN_IO_ERROR("send failed");
    >>>>> 843        }
    >>>>> 844    } else {
    >>>>> 845        memcpy(header + JDWP_HEADER_SIZE, data, MAX_DATA_SIZE);
    >>>>> 846        if (send_fully(socketFD, (char *)&header, 
    >>>>> JDWP_HEADER_SIZE + MAX_DATA_SIZE) !=
    >>>>> 847            JDWP_HEADER_SIZE + MAX_DATA_SIZE) {
    >>>>> 848            RETURN_IO_ERROR("send failed");
    >>>>> 849        }
    >>>>> 850        /* Send the remaining data bytes right out of the data 
    >>>>> area. */
    >>>>> 851        if (send_fully(socketFD, (char *)data + MAX_DATA_SIZE,
    >>>>> 852                       data_len - MAX_DATA_SIZE) != data_len - 
    >>>>> MAX_DATA_SIZE) {
    >>>>> 853            RETURN_IO_ERROR("send failed");
    >>>>> 854        }
    >>>>> 855    }
    >>>>>
    >>>> Curious. First packet is limited to MAX_DATA_SIZE, 2nd packet has no 
    >>>> size limit. What's the point then of splitting it then? Is there a 
    >>>> desire to get the header transmitted in a smaller packet.
    >>>>
    >>>> Chris
    >>>
    >>> It looks as the goal was to somehow improve the responsiveness in 
    >>> case of the large data but I am not sure about this. I could not 
    >>> locate any traces in Jira related to this implementation.
    >> I was thinking it might be something like that too. Get the header 
    >> across the wire quickly. Maybe the user just wants the header (with 
    >> size info) initially, and will allocate a large buffer for the rest if 
    >> necessary.
    > 
    > It was my guess too.
    > At least, it is the best explanation for this design that looks 
    > reasonable to me.
    > 
    > 
    >> Chris
    >>> Probably Serguei has some info what is the history behind this design.
    > 
    > I don't know the history here.
    > This was implemented in very early days, most likely, before JDK 1.5 or 
    > even 1.4.2.
    > 
    > Thanks,
    > Serguei
    > 
    >>>>>>>
    >>>>>>>  68     protected byte[] readJdwpString(DataInputStream ds) 
    >>>>>>> throws IOException {
    >>>>>>>   69         byte[] str = null;
    >>>>>>>   70         int len = ds.readInt();
    >>>>>>>   71         if (len > 0) {
    >>>>>>>   72             str = new byte[len];
    >>>>>>>   73             ds.read(str, 0, len);
    >>>>>>>   74         }
    >>>>>>>
    >>>>>>> might we get a short-read of the string if it is split across 
    >>>>>>> multiple packets?
    >>>>>>
    >>>>> This and all other reads happen not directly from the socket input 
    >>>>> stream but rather from the  DataInputStream object that is 
    >>>>> constructed in JdwpReply.initFromStream(InputStream) method. With 
    >>>>> the proposed fix we do ensure that the created DataInputStream 
    >>>>> object contains data from both packets in cases when the reply was 
    >>>>> split in two packets.
    >>>>>
    >>>>>> Nice catch!
    >>>>>> Even though this fix is enough to resolve this problem now, there 
    >>>>>> is a chance,
    >>>>>> it can fail in the future when more modules are added to the 
    >>>>>> platform.
    >>>>>>
    >>>>>>
    >>>>>>> I'm wondering if all these reads should be loops, ensuring we 
    >>>>>>> read the expected amount of data.
    >>>>>>>
    >>>>> Since the implementation of the socket transport limits the max 
    >>>>> number of packets the reply might be split in to two packets I 
    >>>>> don't think we really need it here.
    >>>>>>> One further comment - not sure why we need the print out for when 
    >>>>>>> we do read multiple packets?
    >>>>>>> That would seem to be a debugging aid.
    >>>>>>
    >>>>>> Yes, it helps to understand what happens.
    >>>>>> Many tests have a lack of tracing which makes it harder to debug 
    >>>>>> and understand failures.
    >>>>> That is correct.  This additional tracing was added to help to 
    >>>>> understand the possible failures in the future.
    >>>>>>
    >>>>>>
    >>>>>> Thanks,
    >>>>>> Serguei
    >>>>>>
    >>>>>>>
    >>>>>>> Thanks,
    >>>>>>> David
    >>>>>>>
    >>>>> Thanks,
    >>>>> Daniil
    >>>>>
    >>>>>>> On 21/02/2018 10:14 AM, Daniil Titov wrote:
    >>>>>>>> Hi Serguei,
    >>>>>>>>
    >>>>>>>> A new version of the webrev that has these strings reformatted 
    >>>>>>>> is at http://cr.openjdk.java.net/~dtitov/8170541/webrev.02/
    >>>>>>>>
    >>>>>>>> Thank you!
    >>>>>>>>
    >>>>>>>> Best regards,
    >>>>>>>>
    >>>>>>>> Daniil
    >>>>>>>>
    >>>>>>>> *From: *"serguei.spitsyn@oracle.com" <serguei.spitsyn@oracle.com>
    >>>>>>>> *Date: *Tuesday, February 20, 2018 at 3:00 PM
    >>>>>>>> *To: *Daniil Titov <daniil.x.titov@oracle.com>, 
    >>>>>>>> "serviceability-dev@openjdk.java.net" 
    >>>>>>>> <serviceability-dev@openjdk.java.net>
    >>>>>>>> *Subject: *Re: RFR 8170541: 
    >>>>>>>> serviceability/jdwp/AllModulesCommandTest.java fails 
    >>>>>>>> intermittently on Windows and Solaris
    >>>>>>>>
    >>>>>>>> Hi Daniil,
    >>>>>>>>
    >>>>>>>> Interesting issue...
    >>>>>>>> Thank you for finding to the root cause so quickly!
    >>>>>>>>
    >>>>>>>> The fix looks good.
    >>>>>>>> Could I ask you to reformat these lines to make the L54 shorter ?:
    >>>>>>>>
    >>>>>>>>    54                 System.out.println("[" + 
    >>>>>>>> getClass().getName() + "] Only " + bytesRead + " bytes of " + 
    >>>>>>>> dataLength +
    >>>>>>>>
    >>>>>>>>    55                         " were read in the first packet. 
    >>>>>>>> Reading the rest...");
    >>>>>>>>
    >>>>>>>> Thanks,
    >>>>>>>> Serguei
    >>>>>>>>
    >>>>>>>>
    >>>>>>>> On 2/20/18 09:24, Daniil Titov wrote:
    >>>>>>>>
    >>>>>>>>     Please review the changes that fix intermittent failure of
    >>>>>>>>     serviceability/jdwp/AllModulesCommandTest.java test.
    >>>>>>>>
    >>>>>>>>     The problem here is that for a large data the JDWP agent
    >>>>>>>>     (socketTransport_writePacket() method in
    >>>>>>>> src/jdk.jdwp.agent/share/native/libdt_socket/socketTransport.c )
    >>>>>>>>     sends 2 packets and in some cases only the first packet is 
    >>>>>>>> received
    >>>>>>>>     at the time when the test reads the reply from the JDWP 
    >>>>>>>> agent. Since
    >>>>>>>>     the test does not check that all data is received in the first
    >>>>>>>>     packet the correlation between commands and replies became 
    >>>>>>>> broken
    >>>>>>>>     (the unread second packet is read by the next command and 
    >>>>>>>> the reply
    >>>>>>>>     for the next command is read by the next after next command 
    >>>>>>>> and so on).
    >>>>>>>>
    >>>>>>>>     Bug: https://bugs.openjdk.java.net/browse/JDK-8170541
    >>>>>>>>
    >>>>>>>>     Webrev: http://cr.openjdk.java.net/~dtitov/8170541/webrev.01
    >>>>>>>>
    >>>>>>>>     The tests ran successfully with Mach5.
    >>>>>>>>
    >>>>>>>>     Best regards,
    >>>>>>>>
    >>>>>>>>     Daniil
    >>>>>>>>
    >>>>>>>>
    >>>>>>>>
    >>>>>>
    >>>>>
    >>>>
    >>>>
    >>>
    >>
    >>
    > 
    


[prev in list] [next in list] [prev in thread] [next in thread] 

Configure | About | News | Add a list | Sponsored by KoreLogic