
List:       freebsd-hackers
Subject:    Is this procstat -v output valid/expected? Explanation?
From:       Mark Millard <markmi () dsl-only ! net>
Date:       2017-03-31 3:54:45
Message-ID: D526FEE8-E234-4219-8850-70EB7C689946 () dsl-only ! net

The following is based on a test-case program that:

A) Allocates lots of 14KiByte "regions" with malloc,
   initializing each byte of each region to a
   never-zero pattern of bytes. "Lots" uses up
   most of 256 MiBytes across the regions. Once
   initialized none of these bytes are written
   again (not even by the later child process).

B) Tests the byte patterns (SIGABRT if the pattern
   test fails).

C) Forks.

D) The parent waits for child; child sleeps 60 sec.

   Note: In the full test, both processes are forced
         to swap out during the sleep, but that is
         not being done here.

E) The child checks the byte patterns and exits
   (SIGABRT if the pattern test fails). The child
   does not write to any of the allocation regions.

F) The (former) parent checks the byte patterns
   and exits (SIGABRT if the pattern test fails).
   It does not write to any of the allocated
   regions during this activity.

[The context happens to be arm64.]

Note the two instances of "67306" in the below from
what will become the parent process:

# procstat -v 6310
  PID              START                END PRT  RES PRES REF SHD FLAG TP PATH
 6310            0x10000            0x11000 r--    1   51   3   1 CN-- vn /root/c_tests/swaptesting2
 6310            0x20000            0x21000 r-x    1   51   3   1 CN-- vn /root/c_tests/swaptesting2
 6310            0x30000            0x40000 rw-   16    0   1   0 C--- vn /root/c_tests/swaptesting2
 6310            0x40000            0x41000 r--    1   38   2   0 ---- df 
 6310            0x41000            0x75000 rw-   37   38   2   0 ---- df 
 6310         0x40030000         0x4004b000 r-x   27   29  59  27 CN-- vn /libexec/ld-elf.so.1
 6310         0x4004b000         0x40052000 rw-    7    7   1   0 ---- df 
 6310         0x4005a000         0x4005b000 rw-    1    0   1   0 C--- vn /libexec/ld-elf.so.1
 6310         0x4005b000         0x4005c000 rw-    1    1   1   0 ---- df 
 6310         0x4005c000         0x401b4000 r-x  344  376  59  27 CN-- vn /lib/libc.so.7
 6310         0x401b4000         0x401c3000 ---    0    0   1   0 ---- df 
 6310         0x401c3000         0x401cf000 rw-   12    0   1   0 C--- vn /lib/libc.so.7
 6310         0x401cf000         0x40202000 rw-   22 67306   2   0 ---- df 
 6310         0x40400000         0x50e00000 rw- 67284 67306   2   0 ---- df 
 6310     0xfffffffdf000     0xfffffffff000 rw-    3    3   1   0 ---D df 
 6310     0xfffffffff000    0x1000000000000 r-x    1    1  35   0 ---- ph 

Later, after the fork (with the child sleeping and the parent
waiting), it has turned into:

# procstat -v 6310
  PID              START                END PRT  RES PRES REF SHD FLAG TP PATH
 6310            0x10000            0x11000 r--    1   51   5   1 CN-- vn /root/c_tests/swaptesting2
 6310            0x20000            0x21000 r-x    1   51   5   1 CN-- vn /root/c_tests/swaptesting2
 6310            0x30000            0x40000 rw-   16    0   1   0 C--- vn /root/c_tests/swaptesting2
 6310            0x40000            0x41000 r--    1    1   2   0 CN-- df 
 6310            0x41000            0x75000 rw-   37   37   2   0 CN-- df 
 6310         0x40030000         0x4004b000 r-x   27   29  60  27 CN-- vn /libexec/ld-elf.so.1
 6310         0x4004b000         0x40052000 rw-    7    0   1   0 C--- df 
 6310         0x4005a000         0x4005b000 rw-    1    0   2   0 CN-- vn /libexec/ld-elf.so.1
 6310         0x4005b000         0x4005c000 rw-    1    0   1   0 C--- df 
 6310         0x4005c000         0x401b4000 r-x  344  376  60  27 CN-- vn /lib/libc.so.7
 6310         0x401b4000         0x401c3000 ---    0    0   2   0 CN-- df 
 6310         0x401c3000         0x401cf000 rw-   12    0   2   0 CN-- vn /lib/libc.so.7
 6310         0x401cf000         0x40202000 rw-   22   22   2   0 CN-- df 
 6310         0x40400000         0x50e00000 rw- 67284 67284   2   0 CN-- df 
 6310     0xfffffffdf000     0xfffffffff000 rw-    3    0   1   0 C--D df 
 6310     0xfffffffff000    0x1000000000000 r-x    1    1  36   0 ---- ph 

The child never shows the large PRES figure for the range:

0x401cf000         0x40202000

But for the size of that range the earlier PRES==67306 seems odd,
as if it spans the following:

0x40400000         0x50e00000

In fact 22+67284==67306.


Another point that I noticed: SHD stays zero on the
memory area spanning the allocations
(0x40400000 0x50e00000) and more.
(This was during the child's sleep.)

# procstat -v 6313
  PID              START                END PRT  RES PRES REF SHD FLAG TP PATH
 6313            0x10000            0x11000 r--    1   51   5   1 CN-- vn /root/c_tests/swaptesting2
 6313            0x20000            0x21000 r-x    1   51   5   1 CN-- vn /root/c_tests/swaptesting2
 6313            0x30000            0x40000 rw-   16    0   1   0 C--- vn /root/c_tests/swaptesting2
 6313            0x40000            0x41000 r--    1    1   2   0 CN-- df 
 6313            0x41000            0x75000 rw-   37   37   2   0 CN-- df 
 6313         0x40030000         0x4004b000 r-x   27   29  60  27 CN-- vn /libexec/ld-elf.so.1
 6313         0x4004b000         0x40052000 rw-    7    0   1   0 C--- df 
 6313         0x4005a000         0x4005b000 rw-    1    0   2   0 CN-- vn /libexec/ld-elf.so.1
 6313         0x4005b000         0x4005c000 rw-    1    0   1   0 C--- df 
 6313         0x4005c000         0x401b4000 r-x  344  376  60  27 CN-- vn /lib/libc.so.7
 6313         0x401b4000         0x401c3000 ---    0    0   2   0 CN-- df 
 6313         0x401c3000         0x401cf000 rw-   12    0   2   0 CN-- vn /lib/libc.so.7
 6313         0x401cf000         0x40202000 rw-   22   22   2   0 CN-- df 
 6313         0x40400000         0x50e00000 rw- 67284 67284   2   0 CN-- df 
 6313     0xfffffffdf000     0xfffffffff000 rw-    3    0   1   0 C--D df 
 6313     0xfffffffff000    0x1000000000000 r-x    1    1  36   0 ---- ph 

For:

0x40400000         0x50e00000
(and more)

my first thought was that forking would shadow for copy-on-write
and so the shadow page count would be non-zero in one or both
of the parent vs. child. But I've never seen procstat -v report
such a figure for the range holding the allocations.

The REF==2 also seems odd: it holds from before the fork through
after it, while both parent and child processes still exist.
It would seem that the REFs are not per-process.


Context details:


# uname -paKU
FreeBSD pine64 12.0-CURRENT FreeBSD 12.0-CURRENT  r315914M  arm64 aarch64 1200027 1200027


FYI: the source code is. . .
(Ignore comments tied to swapping and
its/the "problem" for this question.)

# more swap_testing2.c
// swap_testing2.c

// Built via (c++ was clang++ 4.0 in my case):
//
// cc -g -std=c11 -Wpedantic -o swaptesting2 swap_testing2.c
// -O0 and -O2 also get the problem.

#include <unistd.h>     // for fork(), sleep(.)
#include <sys/types.h>  // for pid_t
#include <sys/wait.h>   // for wait(.)

#include <signal.h>     // for raise(.), SIGABRT

extern void test_setup(void);         // Sets up the memory byte patterns.
extern void test_check(void);         // Tests the memory byte patterns.
extern void partial_test_check(void); // Tests just [0] of dyn_regions[0]

int main(void) {
    test_setup();
    test_check(); // Before fork() [passes]

    pid_t pid = fork();
    int wait_status = 0;

    // After fork; before wait/sleep/swap-out.

    //if (0==pid) partial_test_check();
                     // Even the above is sufficient by
                     // itself to prevent failure for
                     // region_size 1u through
                     // 4u*1024u!
                     // But 4u*1024u+1u and above fail
                     // with this access to memory.
                     // The failing test is of
                     // (*dyn_regions[0]).array[4096u].
                     // This test never fails here.

    if (0<pid) partial_test_check(); // This never prevents
                                     // later failures (and
                                     // never fails here).

    if (0<pid) { wait(&wait_status); }

    if (-1!=wait_status && 0<=pid) {
        if (0==pid) {
            sleep(60);

            // During this manually force this process to
            // swap out. I use something like:

            // stress -m 1 --vm-bytes 1800M

            // in another shell and ^C'ing it after top
            // shows the swapped status desired. 1800M
            // just happened to work on the Pine64+ 2GB
            // that I was using. I watch with top -PCwaopid .
        }

        test_check(); // After wait/sleep [fails for small-enough region_sizes]

        // raise(SIGABRT);
    }
}

// The memory and test code follows.

#include <stddef.h>     // for size_t, NULL
#include <stdlib.h>     // for malloc(.), free(.)

#define region_size (14u*1024u)
                        // Bad dyn_regions patterns, parent and child
                        // processes:
                        //     256u,  2u*1024u,  4u*1024u, 8u*1024u,
                        // 9u*1024u, 12u*1024u, 14u*1024u
                        // (but see the partial_test_check() call
                        //  notes above).

                        // Works:
                        // 14u*1024u+1u, 15u*1024u, 16u*1024u,
                        // 32u*1024u, 256u*1024u*1024u
#define num_regions (256u*1024u*1024u/region_size)

typedef volatile unsigned char value_type;

struct region_struct { value_type array[region_size]; };
typedef struct region_struct region;

static region * volatile dyn_regions[num_regions] = {NULL,};

static value_type value(size_t v) { return (value_type)((v&0xFEu)|0x1u); }
                  // value now avoids the zero value since the failures
                  // are zeros.

void test_setup(void) {
    for(size_t i=0u; i<num_regions; i++) {
        dyn_regions[i] = malloc(sizeof(region));
        if (!dyn_regions[i]) raise(SIGABRT);

        for(size_t j=0u; j<region_size; j++) {
            (*dyn_regions[i]).array[j] = value(j);
        }
    }
}

void partial_test_check(void) {
    if (value(0u)!=(*dyn_regions[0u]).array[0]) raise(SIGABRT);
}

static volatile size_t first_failure_idx = 0u;
static volatile size_t first_failure_pos = 0u;
static volatile size_t after_bad_idx   = 0u;
static volatile size_t after_bad_pos   = 0u;

void test_check(void) {
    first_failure_idx = first_failure_pos = 0u;
    while (first_failure_idx < num_regions) {
        while (  first_failure_pos < region_size
              && (  value(first_failure_pos)
                 == (*dyn_regions[first_failure_idx]).array[first_failure_pos]
                 )
              ) {
            first_failure_pos++;
        }

        if (region_size != first_failure_pos) break;

        first_failure_idx++;
        first_failure_pos = 0u;
    }

    if (num_regions == first_failure_idx) return;

    after_bad_idx = first_failure_idx;
    after_bad_pos = first_failure_pos;

    while (after_bad_idx < num_regions) {
        while (  after_bad_pos < region_size
              && (  value(after_bad_pos)
                 != (*dyn_regions[after_bad_idx]).array[after_bad_pos]
                 )
              ) {
            after_bad_pos++;
        }

        if(region_size != after_bad_pos) break;

        after_bad_idx++;
        after_bad_pos = 0u;
    }

    raise(SIGABRT);
}



===
Mark Millard
markmi at dsl-only.net
