
List:       postgresql-general
Subject:    Re: [GENERAL] Linux max on shared buffers?
From:       "Glen Parker" <glenebob () nwlink ! com>
Date:       2002-07-20 20:17:21

Here's a ridiculous hack of Martijn's program that runs on Windows
(Win2K in my case), using the sorta-mmap-like calls in Windows.

Several runs on my box produced errors at offsets 0x048D and 0x159E.
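
To see where those two offsets fall in the run, the loop's rotation can be replayed on its own. This is only a sketch of the offset arithmetic in the attached program, assuming a 32-bit unsigned int; the helper names offset_at_iteration and first_iteration_for_offset are made up for illustration and are not part of the attachment. It locates the offsets but says nothing about why they fail.

```c
#include <stdint.h>

/* Offset used on a given iteration of the test loop: the seed 0x12345678
 * rotated left by `iteration` bits, then masked to the 64K file size.
 * (Hypothetical helper, mirrors the rotate in main().) */
unsigned int offset_at_iteration(unsigned int iteration)
{
    uint32_t v = 0x12345678u;
    unsigned int k = iteration % 32;   /* a 32-bit rotate repeats every 32 steps */

    if (k != 0)
        v = (v << k) | (v >> (32 - k));
    return v & 0xFFFFu;
}

/* First iteration (0-based, within the 64 the test runs) that touches
 * `offset`, or -1 if the loop never reaches it. (Hypothetical helper.) */
int first_iteration_for_offset(unsigned int offset)
{
    unsigned int i;

    for (i = 0; i < 64; i++)
        if (offset_at_iteration(i) == offset)
            return (int)i;
    return -1;
}
```

Counting from 0, offset 0x048D is reached on iterations 14 and 46, and 0x159E on iterations 30 and 62, so each failing offset is written twice per run.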

Glen Parker.


> 
> Whoopsie. Here's the program :)
> 
> On Sun, Jul 21, 2002 at 12:19:43AM +1000, Martijn van Oosterhout wrote:
> > On Sat, Jul 20, 2002 at 09:09:59AM -0400, Tom Lane wrote:
> > > Martijn van Oosterhout <kleptog@svana.org> writes:
> > > > Well, you would have to deal with the fact that writing changes to a
> > > > mmap() is allowed, but you have no guarantee when it will be finally
> > > > written. Given WAL I would suggest using mmap() for reading only and
> > > > using write() to update the file.
> > > 
> > > This is surely NOT workable; every mmap man page I've looked at is very
> > > clear that you cannot expect predictable behavior if you use both
> > > filesystem and mmap access to the same file.  For instance, HP says
> > > 
> > >      It is also unspecified whether write references to a memory region
> > >      mapped with MAP_SHARED are visible to processes reading the file and
> > >      whether writes to a file are visible to processes that have mapped the
> > >      modified portion of that file, except for the effect of msync().
> > > 
> > > So unless you want to msync after every write I do not think this can fly.
> > 
> > Well of course. The entire speed improvement is based on the fact that mmap()
> > is giving you a window into the system disk cache. If the OS isn't built
> > that way then it's not going to work. It does work on Linux and is fairly
> > easy to test for. I've even attached a simple program to try it out.
> > 
> > Of course it's not complete. You'd need to try multiple processes to see what
> > happens, but I'd be interested how diverse the mmap() implementations are.
> > 
> > > > If in that process the kernel needed
> > > > to throw out another page, who cares?
> > > 
> > > We do, because we have to control write ordering.
> > 
> > Which is why you use write() to control that
> > 
> > > > It is different. I believe you would still need some form of shared memory
> > > > to co-ordinate write()s.
> > > 
> > > The whole idea becomes less workable the more we look at it.
> > 
> > I guess this is one of those cases where working code would be needed to
> > convince anybody. In the hypothetical case someone had time, the appropriate
> > place to add this would be src/backend/storage/buffer, since all buffer
> > loads go through there, right?
> > 
> > The only other question is whether there is any way to know when a buffer
> > will be modified. I get the impression sometimes bits are twiddled without
> > the buffer being marked dirty.
> 
> -- 
> Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> > There are 10 kinds of people in the world, those that can do binary
> > arithmetic and those that can't.
> 
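
For reference, the msync() discipline Tom mentions in the quoted thread looks roughly like this on a POSIX system. It is a minimal sketch, not anything from the PostgreSQL tree: the function name store_then_msync, the 4096-byte mapping length, and the 64-byte readback limit are all assumptions chosen for illustration.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define MAP_LEN 4096   /* assumed mapping size for the sketch */

/* Store `len` bytes through a MAP_SHARED mapping, msync() them, then read
 * them back through the file descriptor.
 * Returns 1 if read() sees the stored bytes, 0 if not, -1 on failure. */
int store_then_msync(const char *path, const void *data, size_t len)
{
    unsigned char readback[64];
    unsigned char *map;
    int fd, result = -1;

    if (len > sizeof(readback))
        return -1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, MAP_LEN) != 0)
        goto out_close;

    map = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        goto out_close;

    memcpy(map, data, len);                      /* store via the mapping */

    /* msync() is the point after which file reads must see the store. */
    if (msync(map, MAP_LEN, MS_SYNC) == 0 &&
        lseek(fd, 0, SEEK_SET) == 0 &&
        read(fd, readback, len) == (ssize_t)len)
        result = (memcmp(readback, data, len) == 0);

    munmap(map, MAP_LEN);
out_close:
    close(fd);
    return result;
}
```

A call such as store_then_msync("/tmp/test_msync.dat", "0123456789abcdef", 16) exercises one store/msync/read cycle. The cost of calling msync() on every update is exactly the objection raised above.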

["test_mmap.c" (application/octet-stream)]

#include <io.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <stdlib.h>
#include <windows.h>

char *filename = "\\temp\\test_mmap.dat";

/* Wait for a keypress so the console window stays open, then exit */
void err_exit(int ecode)
{
  getchar();
  exit(ecode);
}

void test_mmap( int fd, unsigned char *ptr, int offset )
{
  char buffer[16], before[16];
  int i;

  /* Fill buffer with some pseudo random bytes */
  for( i = 0; i < 4; i++ )
  {
    *(int*)(buffer + i*4) = rand();
  }
  
  printf( "Offset: 0x%04X  ", offset );
  memcpy( before, ptr+offset, 16 );           /* Save current value */

  lseek( fd, offset, SEEK_SET );              /* Write() to file */
  write( fd, buffer, 16 );
  
  if( memcmp( before, buffer, 16 ) == 0  ||   /* Check mmap() changed */
      memcmp( ptr+offset, buffer, 16 ) != 0 )
  {
    printf( "Error\n" );
  }
  else
  {
    printf( "OK\n" );
  }
}


/* Make a 64K file to test */
void make_file()
{
  FILE *file = fopen( filename, "wb" );
  char buffer[256];
  int i;
  
  if( file == NULL )
  {
    perror( "Couldn't create test file" );
    err_exit(1);
  }
  
  memset( buffer, 0, sizeof(buffer) );
  for( i=0; i<256; i++ )
    fwrite( buffer, sizeof(buffer), 1, file );
    
  fclose( file );
}

#define PROT_READ 0
#define MAP_SHARED 0

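/* Minimal mmap() stand-in for Windows. start, prot, flags and offset are
   ignored: the view is always read/write and starts at the beginning of
   the file. Returns NULL on failure (not MAP_FAILED). */
void* mmap(void *start, size_t length, int prot, int flags, int fd, off_t offset)
{
  HANDLE fm;

  fm = CreateFileMapping((HANDLE)_get_osfhandle(fd), NULL, PAGE_READWRITE, 0, (DWORD)length, NULL);
  if (! fm) {
    return NULL;
  }
  return MapViewOfFile(fm, FILE_MAP_WRITE, 0, 0, length);
}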

int main()
{
  unsigned int offset = 0x12345678;
  int i, fd;
  unsigned char *ptr;
  
  make_file();
  
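  fd = open( filename, O_RDWR | O_BINARY );   /* Binary mode: no LF -> CRLF translation in write() */
  if( fd < 0 )
  {
    perror( "Couldn't open test file" );
    err_exit(1);
  }

  ptr = mmap( NULL, 65536, PROT_READ, MAP_SHARED, fd, 0 );
  if( ptr == NULL )
  {
    printf( "Couldn't map test file\n" );
    err_exit(1);
  }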

  for( i=0; i<64; i++ )    /* We go 64 times so each offset is modified more than once */
  {
    test_mmap( fd, ptr, offset & 0xFFFF );
    offset = (offset << 1) | (offset >> 31);  /* Rotate offset */
  }

  printf( "done..." ); fflush( stdout );
  getchar();
  return 0;
}
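
For comparison, the same check can be written against the real POSIX calls. The following is a rough sketch rather than Martijn's original program: the function name write_visible_via_mmap and the fixed test pattern are assumptions, and it only tests a single offset from a single process, so it shows what one platform does and proves nothing about portability.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

#define TEST_SIZE 65536   /* same 64K file size as the attached test */

/* Returns 1 if a write() at `offset` is immediately visible through a
 * read-only MAP_SHARED mapping of the same file, 0 if it is not,
 * and -1 if the setup fails. */
int write_visible_via_mmap(const char *path, off_t offset)
{
    unsigned char pattern[16];
    unsigned char *map;
    int fd, i, result;

    if (offset < 0 || offset > (off_t)(TEST_SIZE - sizeof(pattern)))
        return -1;

    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (ftruncate(fd, TEST_SIZE) != 0) {         /* zero-filled 64K file */
        close(fd);
        return -1;
    }

    map = mmap(NULL, TEST_SIZE, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED) {
        close(fd);
        return -1;
    }

    for (i = 0; i < 16; i++)                     /* nonzero, so it differs */
        pattern[i] = (unsigned char)(i + 1);     /* from the zeroed file   */

    if (lseek(fd, offset, SEEK_SET) != offset ||
        write(fd, pattern, sizeof(pattern)) != (ssize_t)sizeof(pattern))
        result = -1;
    else                                         /* no msync() on purpose */
        result = (memcmp(map + offset, pattern, sizeof(pattern)) == 0);

    munmap(map, TEST_SIZE);
    close(fd);
    return result;
}
```

Calling write_visible_via_mmap("/tmp/test_mmap.dat", 0x048D) repeats the check at one of the offsets that failed on my box. Whether it returns 1 or 0 depends on how the OS ties mappings to its file cache, which is the behavior the thread is arguing about.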

