
List:       taglib-devel
Subject:    Re: question and method about taglib performance
From:       Indy Sams <indy () driftsolutions ! com>
Date:       2012-09-22 21:42:23
Message-ID: 3372115.20120922174223 () driftsolutions ! com

Hello,

I haven't done any real benchmarks, but I thought I would contribute an IOStream
implementation that uses mmap on Linux and MapViewOfFile on Windows, for people to use
and test (attached to this email). Usage is shown at the end of the file. Only reading
is implemented; the write functions are empty stubs.

I may have overlooked it, but I didn't see an equivalent of FileRef that accepts an
IOStream. I had to create a format-specific File object (e.g. TagLib::MPEG::File)
instead of being able to use a generic tag interface like FileRef.
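
Until something like that exists, the caller has to pick the format-specific class itself. A rough sketch of one way to do that by file extension follows. The enum and the function name are made up for illustration and are not part of TagLib; the only constructor confirmed in this mail is the TagLib::MPEG::File one shown in the attachment, so the other formats are assumptions to check against the headers.

```cpp
#include <cctype>
#include <string>

// Hypothetical format tag; not a TagLib type.
enum TagFormat { FormatUnknown, FormatMPEG, FormatFLAC, FormatOggVorbis, FormatMP4 };

// Guess the tag format from the file extension (case-insensitive).
static TagFormat guessFormat(const std::string &fileName)
{
    std::string::size_type dot = fileName.rfind('.');
    if (dot == std::string::npos)
        return FormatUnknown;

    // Upper-case the extension so "mp3" and "MP3" compare equal.
    std::string ext = fileName.substr(dot + 1);
    for (std::string::size_type i = 0; i < ext.size(); ++i)
        ext[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(ext[i])));

    if (ext == "MP3")  return FormatMPEG;       // e.g. TagLib::MPEG::File(&io, ...)
    if (ext == "FLAC") return FormatFLAC;       // assumed: a matching IOStream constructor
    if (ext == "OGG")  return FormatOggVorbis;  // assumed, as above
    if (ext == "M4A" || ext == "MP4") return FormatMP4;  // assumed, as above
    return FormatUnknown;
}
```

The caller would then switch on the result and construct the matching File object on top of the same IOStream.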

P.S. I also noticed that the 1.8 download still has "#define TAGLIB_MINOR_VERSION 7"
in taglib.h instead of "#define TAGLIB_MINOR_VERSION 8".

Wednesday, September 19, 2012, 5:05:10 AM, you wrote:

YWCEC> Hi,
YWCEC>
YWCEC> We use taglib to parse MP3 files. The number of files we need to parse is
YWCEC> large, about 5000, and we need to finish the task as quickly as possible, so
YWCEC> we ran into a performance problem.
YWCEC>
YWCEC> We looked into it and found that much of the time was spent in fopen and
YWCEC> fread, so we think file I/O is the cause. We switched to mmap instead of
YWCEC> fopen/fread/fwrite when the OS is Linux, and performance improved by about 25%.
YWCEC>
YWCEC> But, as we all know, when a file is mmapped, the memory the process needs is
YWCEC> equal to the size of the file. For example, when the file size is 1G, the
YWCEC> process needs 1G of memory for the mapping. This is intolerable, so we added a
YWCEC> condition: when the file is smaller than 10M we use mmap, otherwise we use
YWCEC> fopen as taglib does now.
YWCEC>
YWCEC> So, what about this method of using mmap, is it acceptable? We have made
YWCEC> this change and it passes the tests that taglib provides.
YWCEC>
YWCEC> Waiting for your news, thanks very much.
YWCEC>
YWCEC> B&R
YWCEC> yaowj
YWCEC>
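
For reference, the size-threshold approach described in the quoted message (mmap for small files, fopen/fread for everything else) might look roughly like the following on POSIX systems. This is only a sketch: the function name, the parameters and the default 10 MB limit are placeholders of mine, not code from the original patch or from TagLib.

```cpp
#include <algorithm>
#include <cstdio>
#include <string>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

// Hypothetical helper: copies the first `count` bytes of `path` into `out`.
// Files of at most `limit` bytes are read through mmap(); larger ones (and
// empty ones, which mmap() rejects) go through fopen()/fread().
// `usedMmap` reports which path was taken.
static bool readPrefix(const char *path, size_t count, std::string &out,
                       bool &usedMmap, off_t limit = 10 * 1024 * 1024)
{
    out.clear();
    usedMmap = false;

    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return false;

    struct stat sb;
    if (fstat(fd, &sb) != 0) {
        close(fd);
        return false;
    }

    if (sb.st_size > 0 && sb.st_size <= limit) {
        size_t mapLen = static_cast<size_t>(sb.st_size);
        void *p = mmap(NULL, mapLen, PROT_READ, MAP_PRIVATE, fd, 0);
        if (p != MAP_FAILED) {
            out.assign(static_cast<const char *>(p), std::min(count, mapLen));
            munmap(p, mapLen);
            close(fd);
            usedMmap = true;
            return true;
        }
        // mmap failed: fall back to buffered reads below.
    }
    close(fd);

    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return false;
    char buf[4096];
    while (out.size() < count) {
        size_t want = std::min(sizeof buf, count - out.size());
        size_t got = fread(buf, 1, want, f);
        if (got == 0)
            break;
        out.append(buf, got);
    }
    bool ok = (ferror(f) == 0);
    fclose(f);
    return ok;
}
```

The attached class below uses a 15 MB cutoff and has no fallback path; isOpen() simply returns false for larger files.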


Best regards,
 Indy Sams
 mailto:indy@driftsolutions.com


["taglib.mmap.cpp" (text/plain)]

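// TagLib headers; adjust the paths to your install (e.g. <taglib/tiostream.h>).
#include <tiostream.h>
#include <mpegfile.h>
#include <id3v2framefactory.h>

#if defined(_WIN32)
#include <windows.h>
#else
#include <fcntl.h>     // open
#include <stdlib.h>    // free
#include <string.h>    // strdup
#include <sys/mman.h>  // mmap, munmap
#include <sys/stat.h>  // fstat
#include <unistd.h>    // close
#endif

namespace TagLib {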
class MP3_TagIO: public IOStream {
	private:
#if defined(_WIN32)
		FileName * fn;
#else
		FileName fn;
#endif
		char * mapptr;
		long offset, filelen;
	public:
		MP3_TagIO(const char * ffn) {
			fn = NULL;
			mapptr = NULL;
			offset = filelen = 0;

#if defined(_WIN32)	
			HANDLE hFile = CreateFile(ffn, GENERIC_READ, FILE_SHARE_READ, NULL, OPEN_EXISTING, 0, NULL);
			if (hFile != INVALID_HANDLE_VALUE) {
				DWORD dwSize = GetFileSize(hFile, NULL);
				if (dwSize <= 15728640) {
					filelen = dwSize;
					// note: in production code you would need to randomize the mapping name and
					// check for a pre-existing mapping to make sure you don't conflict with
					// something else
					HANDLE hMap = CreateFileMapping(hFile, NULL, PAGE_READONLY, 0, 0, "taglib_test");
					if (hMap != NULL) {
						mapptr = (char *)MapViewOfFile(hMap, FILE_MAP_READ, 0, 0, 0);
						if (mapptr != NULL) {
							fn = new FileName(ffn);
						}
						CloseHandle(hMap);				
					}
				}
				CloseHandle(hFile);
			}
#else
			// Open the path passed in; fn is still NULL at this point.
			int fd = open(ffn, O_RDONLY);
			if (fd != -1) {
				struct stat sb;
				if (fstat(fd, &sb) == 0) {
					filelen = sb.st_size;
					if (filelen <= 15728640) {
						mapptr = (char *)mmap(NULL, filelen, PROT_READ, MAP_FILE|MAP_PRIVATE, fd, 0);
						if (mapptr != MAP_FAILED) {
							fn = strdup(ffn);
						} else {
							mapptr = NULL;
						}
					}
				}
				close(fd);
			}
#endif
		}

    /*!
     * Destroys this IOStream instance.
     */
    virtual ~MP3_TagIO() {
#if defined(_WIN32)	
			if (mapptr) {
				UnmapViewOfFile(mapptr);
			}
			if (fn) {
				delete fn;
			}
#else
			if (mapptr) {
				munmap(mapptr, filelen);
			}
			if (fn) {
				free((void *)fn);
			}
#endif
		}

    /*!
     * Returns the stream name in the local file system encoding.
     */
    virtual FileName name() const {
#if defined(_WIN32)
			return *fn;
#else
			return fn;
#endif
    }

    /*!
     * Reads a block of size \a length at the current get pointer.
     */
    virtual ByteVector readBlock(ulong len) {
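			// Nothing to read if the mapping failed or the get pointer is at or past the end.
			if (mapptr == NULL || offset >= filelen) {
				return ByteVector::null;
			}
			if ((ulong)(filelen - offset) < len) {
				len = filelen - offset;
			}
			ByteVector block = ByteVector::fromCString(mapptr + offset, len);
			// Advance the get pointer, as a file-backed stream does after a read.
			offset += len;
			return block;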
		}

    /*!
     * Attempts to write the block \a data at the current get pointer.  If the
     * file is currently only opened read only -- i.e. readOnly() returns true --
     * this attempts to reopen the file in read/write mode.
     *
     * \note This should be used instead of using the streaming output operator
     * for a ByteVector.  And even this function is significantly slower than
     * doing output with a char[].
     */
		virtual void writeBlock(const ByteVector &data) {}

    /*!
     * Insert \a data at position \a start in the file overwriting \a replace
     * bytes of the original content.
     *
     * \note This method is slow since it requires rewriting all of the file
     * after the insertion point.
     */
		virtual void insert(const ByteVector &data, ulong start = 0, ulong replace = 0) {}

    /*!
     * Removes a block of the file starting a \a start and continuing for
     * \a length bytes.
     *
     * \note This method is slow since it involves rewriting all of the file
     * after the removed portion.
     */
		virtual void removeBlock(ulong start = 0, ulong length = 0) {}

    /*!
     * Returns true if the file is read only (or if the file can not be opened).
     */
		virtual bool readOnly() const { return true; }

    /*!
     * Since the file can currently only be opened as an argument to the
     * constructor (sort-of by design), this returns if that open succeeded.
     */
    virtual bool isOpen() const {
			return mapptr != NULL;
		}

    /*!
     * Move the I/O pointer to \a offset in the stream from position \a p.  This
     * defaults to seeking from the beginning of the stream.
     *
     * \see Position
     */
    virtual void seek(long off, Position p = Beginning) {
			switch (p) {
				case IOStream::Beginning:
					offset = off;
					break;
				case IOStream::Current:
					offset = offset + off;
					break;
				case IOStream::End:
					offset = filelen + off;
					break;
			}
			if (offset < 0) {
				offset = 0;
			}
		}

    /*!
     * Reset the end-of-stream and error flags on the stream.
     */
    virtual void clear() {
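			// No end-of-stream or error flags are tracked for the mapping, so there is
			// nothing to reset; the get pointer is left where it is.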
		}

    /*!
     * Returns the current offset within the stream.
     */
    virtual long tell() const {
			return offset;
		}

    /*!
     * Returns the length of the stream.
     */
    virtual long length() {
			return filelen;
		}

    /*!
     * Truncates the stream to a \a length.
     */
		virtual void truncate(long length) {}
 };
 };

/**
	Usage of this class for reading MP3 tags:
	
	TagLib::MP3_TagIO io(fn);
	if (io.isOpen()) {
		TagLib::MPEG::File iTag(&io, TagLib::ID3v2::FrameFactory::instance());
		// whatever you want to do with tag data
	}
	
**/



_______________________________________________
taglib-devel mailing list
taglib-devel@kde.org
https://mail.kde.org/mailman/listinfo/taglib-devel

