
List:       activemq-dev
Subject:    [jira] [Comment Edited] (AMQ-5300) Inifinite loop when attempting to replay levelDB logs to rebuild 
From:       "Pablo Lozano (JIRA)" <jira () apache ! org>
Date:       2015-01-30 4:54:35
Message-ID: JIRA.12730988.1406778460000.212518.1422593675312 () Atlassian ! JIRA


    [ https://issues.apache.org/jira/browse/AMQ-5300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14298185#comment-14298185 ]

Pablo Lozano edited comment on AMQ-5300 at 1/30/15 4:53 AM:
------------------------------------------------------------

Hi, good day,

It seems this issue is not fixed, as I have been able to reproduce it on Replicated
LevelDB with the latest 5.11 snapshot. The LevelDB store becomes corrupted to the
point that every time I kill the current master, the next slave to take over starts
the infinite loop. The only way to fix this is to delete the LevelDB data from all
ActiveMQ instances, which obviously leaves me without messages. I find this issue to
be quite critical, as it occurs even on graceful shutdowns of ActiveMQ.

I have attached a copy of the logs and my LevelDB directory. (If all messages on
the queue look the same, it is because they are; for testing purposes I send the same
message over and over.)
https://drive.google.com/file/d/0B6ANh1aTzRg3S2Q2SVRUY0ZhT0k/view?usp=sharing


My settings are:
Ubuntu 14.04 64bit
leveldb jni linux-x64

            <replicatedLevelDB
                    directory="${mailSystem.activeMQ.rebDB}"
                    replicas="3"
                    sync="local_mem"
                    logSize="25413000"
                    indexCompression="none"
                    zkAddress="lstkmy90430:2181,lstkmy36606:2181,lstkmy52108:2181"
                    zkPath="/activemq/leveldb-stores"
                    />




was (Author: altaflux):
Hi, good day,

It seems this issue is not fixed, as I have been able to reproduce it on Replicated
LevelDB with the latest 5.11 snapshot. The LevelDB store becomes corrupted to the
point that every time I kill the current master, the next slave to take over starts
the infinite loop. The only way to fix this is to delete the LevelDB data from all
ActiveMQ instances, which obviously leaves me without messages. I find this issue to
be quite critical, as it occurs even on graceful shutdowns of ActiveMQ.

I have attached a copy of the logs and my LevelDB directory. (If all messages on
the queue look the same, it is because they are; for testing purposes I send the same
message over and over.)
https://drive.google.com/file/d/0B6ANh1aTzRg3S2Q2SVRUY0ZhT0k/view?usp=sharing


My settings are:


            <replicatedLevelDB
                    directory="${mailSystem.activeMQ.rebDB}"
                    replicas="3"
                    sync="local_mem"
                    logSize="25413000"
                    indexCompression="none"
                    zkAddress="lstkmy90430:2181,lstkmy36606:2181,lstkmy52108:2181"
                    zkPath="/activemq/leveldb-stores"
                    />



> Inifinite loop when attempting to replay levelDB logs to rebuild index
> ----------------------------------------------------------------------
> 
> Key: AMQ-5300
> URL: https://issues.apache.org/jira/browse/AMQ-5300
> Project: ActiveMQ
> Issue Type: Bug
> Components: activemq-leveldb-store
> Affects Versions: 5.9.1, 5.10.0
> Environment: Linux
> Reporter: Vu Le
> Assignee: Gary Tully
> Fix For: 5.11.0
> 
> 
> While searching for a workaround for issue AMQ-5284, I came across this issue.
> To work around the serialization issue (AMQ-5284), I deleted the index snapshots
> from the LevelDB datastore. This will replay the logs to regenerate the index.
> However, if a log rotation has already occurred, you will get an infinite loop upon
> restart. Here are the steps to reproduce what I am seeing:
> Configure ActiveMQ 5.10.0 to use a LevelDB data store with the log size of about
> 1MB.
> {code}
> <persistenceAdapter>
> <levelDB directory="${activemq.data}/leveldb" logSize="1000000" />
> </persistenceAdapter>
> {code}
> Then I started up the broker and published 10,000 persistent messages to a queue,
> causing the log files to rotate (twice in my case). I see the following files in
> the data store folder:
> {code}
> -rw-rw-r--. 1 user users 1000071 Jul 30 11:15 0000000000000000.log
> -rw-rw-r--. 1 user users 1000009 Jul 30 11:16 00000000000f4287.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:16 00000000001e84d0.index
> -rw-rw-r--. 1 user users 1000000 Jul 30 11:17 00000000001e84d0.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 dirty.index
> -rw-rw-r--. 1 user users       0 Jul 30 11:11 lock
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 plist.index
> -rw-rw-r--. 1 user users      24 Jul 30 11:11 store-version.txt
> {code}
> I then consume 5,000 messages, which causes the first log to be deleted since it is
> no longer being referenced. I see the following log statements:
> {code}
> 2014-07-30 11:29:14,960 | DEBUG | Log no longer referenced: 0 | org.apache.activemq.leveldb.LevelDBClient | Thread-2
> 2014-07-30 11:29:14,967 | DEBUG | Deleting log at 0 | org.apache.activemq.leveldb.LevelDBClient | Thread-2
> {code}
> And I see the remaining files in the data store folder (notice the
> 0000000000000000.log is gone):
> {code}
> -rw-rw-r--. 1 user users 1000009 Jul 30 11:16 00000000000f4287.log
> -rw-rw-r--. 1 user users 1000011 Jul 30 11:29 00000000001e84d0.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:29 00000000002dc71b.index
> -rw-rw-r--. 1 user users 1000000 Jul 30 11:29 00000000002dc71b.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 dirty.index
> -rw-rw-r--. 1 user users       0 Jul 30 11:11 lock
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 plist.index
> -rw-rw-r--. 1 user users      24 Jul 30 11:11 store-version.txt
> {code}
> At this point, I shut down the broker and here is the listing of what's left in the
> data store:
> {code}
> -rw-rw-r--. 1 user users 1000009 Jul 30 11:16 00000000000f4287.log
> -rw-rw-r--. 1 user users 1000011 Jul 30 11:29 00000000001e84d0.log
> -rw-rw-r--. 1 user users 1000000 Jul 30 11:29 00000000002dc71b.log
> drwxrwxr-x. 2 user users    4096 Jul 30 11:36 0000000000301737.index
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 dirty.index
> drwxrwxr-x. 2 user users    4096 Jul 30 11:11 plist.index
> -rw-rw-r--. 1 user users      24 Jul 30 11:11 store-version.txt
> {code}
> I then delete the index folder within the data store (in my case
> "0000000000301737.index"). I am doing this to force a replay of the logs to
> regenerate the index (due to the serialization issue I ran into). And finally, this
> is the message I am getting once I start the broker back up (infinite loop of this
> same message, and I have to shut down the broker):
> {code}
> 2014-07-30 11:40:27,415 | WARN  | No reader available for position: 0, log_infos: {1000071=LogInfo(/home/user/apache-activemq-5.10.0/data/leveldb/00000000000f4287.log,1000071,1000009), 2000080=LogInfo(/home/user/apache-activemq-5.10.0/data/leveldb/00000000001e84d0.log,2000080,1000011), 3000091=LogInfo(/home/user/apache-activemq-5.10.0/data/leveldb/00000000002dc71b.log,3000091,0)} | org.apache.activemq.leveldb.RecordLog | main
> {code}
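
A note on reading the listings and the warning quoted above: the log file names appear to be the starting position of each log written as 16 hexadecimal digits, since they match the keys printed in log_infos (0xf4287 is 1000071, and so on). The Python sketch below only illustrates that relationship. The helper names start_position and find_log are hypothetical, and the "greatest start at or below the position" lookup is an assumption made for illustration, not the actual RecordLog code. It shows why a replay that begins at position 0 finds no log once 0000000000000000.log has been deleted.

```python
import re

# File names copied from the directory listing in the report.
LOG_FILES = [
    "00000000000f4287.log",
    "00000000001e84d0.log",
    "00000000002dc71b.log",
]


def start_position(file_name):
    """Interpret the 16-digit hexadecimal stem as the log's starting position."""
    match = re.fullmatch(r"([0-9a-f]{16})\.log", file_name)
    if match is None:
        raise ValueError(f"not a log file name: {file_name}")
    return int(match.group(1), 16)


def find_log(position, log_files):
    """Return the log with the greatest start <= position, or None.

    Assumed lookup rule for illustration only; it ignores log lengths.
    """
    eligible = [(start_position(f), f) for f in log_files
                if start_position(f) <= position]
    return max(eligible)[1] if eligible else None


positions = [start_position(f) for f in LOG_FILES]
print(positions)                     # [1000071, 2000080, 3000091]
print(find_log(0, LOG_FILES))        # None: nothing covers position 0
print(find_log(1000071, LOG_FILES))  # 00000000000f4287.log
```

Under these assumptions, the lowest position still covered by a log file is 1000071, which is consistent with the "No reader available for position: 0" warning.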



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

