List: activemq-dev
Subject: [jira] [Commented] (AMQ-3725) Kahadb error during SAN failover delayed write - Allow kahaDB to recov
From: "Dejan Bosanac (JIRA)" <jira () apache ! org>
Date: 2013-10-31 17:11:29
Message-ID: JIRA.12543079.1329505041202.1683.1383239489630 () arcas
[ https://issues.apache.org/jira/browse/AMQ-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810442#comment-13810442 ]
Dejan Bosanac commented on AMQ-3725:
------------------------------------
Hi,
I just pushed a change that should help with this scenario. I also tested it with the
USB drive, and it seems that KahaDB now recovers properly. I also used a somewhat
longer resume period (30 seconds):
<ioExceptionHandler>
    <defaultIOExceptionHandler stopStartConnectors="true" resumeCheckSleepPeriod="30000" />
</ioExceptionHandler>
so I have a bit more time to unplug and plug the drive back in before the broker
tries to recreate the folder.
I started a new snapshot build, but I'm not sure it will be published soon. You can
build it locally anyway. Test it out and let me know if the problem is still there.
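For context, here is a minimal sketch of where that handler sits in a full broker configuration. This is an illustration only: the broker name, data paths, and the kahaDB directory are placeholder values, not taken from this issue.

```xml
<!-- Hypothetical broker config; only the ioExceptionHandler element is from this issue. -->
<broker xmlns="http://activemq.apache.org/schema/core"
        brokerName="Broker1"
        dataDirectory="${activemq.data}">

    <!-- KahaDB on the SAN-backed volume (placeholder path) -->
    <persistenceAdapter>
        <kahaDB directory="${activemq.data}/kahadb"/>
    </persistenceAdapter>

    <!-- Stop connectors on an IO error and retry the store every 30 seconds
         instead of shutting the broker down. -->
    <ioExceptionHandler>
        <defaultIOExceptionHandler stopStartConnectors="true"
                                   resumeCheckSleepPeriod="30000"/>
    </ioExceptionHandler>
</broker>
```

A longer resumeCheckSleepPeriod simply widens the window during which the store can come back before the broker probes the data directory again.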
> Kahadb error during SAN failover delayed write - Allow kahaDB to recover in a similar manner as the JDBC store using the IOExceptionHandler
> -------------------------------------------------------------------------------------------------------------------------------------------
>
> Key: AMQ-3725
> URL: https://issues.apache.org/jira/browse/AMQ-3725
> Project: ActiveMQ
> Issue Type: Bug
> Components: Message Store
> Affects Versions: 5.5.1
> Reporter: Jason Sherman
> Fix For: 5.10.0
>
> Attachments: AMQ-3725-10112013.txt
>
>
> An issue can arise that causes the broker to terminate when using KahaDB with a
> SAN, when the SAN fails over. In this case the failover process is seamless;
> however, on fail back there is a 2-3 second delay during which writes are blocked
> and the broker terminates. With the JDBC data store a similar situation can be
> handled by using the IOExceptionHandler. However, with KahaDB, when this same
> IOExceptionHandler is added it prevents the broker from terminating, but KahaDB
> retains an invalid index.
> {code}
> INFO | ActiveMQ JMS Message Broker (Broker1, ID:macbookpro-251a.home-56915-1328715089252-0:1) started
> INFO | jetty-7.1.6.v20100715
> INFO | ActiveMQ WebConsole initialized.
> INFO | Initializing Spring FrameworkServlet 'dispatcher'
> INFO | ActiveMQ Console at http://0.0.0.0:8161/admin
> INFO | ActiveMQ Web Demos at http://0.0.0.0:8161/demo
> INFO | RESTful file access application at http://0.0.0.0:8161/fileserver
> INFO | FUSE Web Console at http://0.0.0.0:8161/console
> INFO | Started SelectChannelConnector@0.0.0.0:8161
> ERROR | KahaDB failed to store to Journal
> java.io.SyncFailedException: sync failed
> at java.io.FileDescriptor.sync(Native Method)
> at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
> at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> INFO | Ignoring IO exception, java.io.SyncFailedException: sync failed
> java.io.SyncFailedException: sync failed
> at java.io.FileDescriptor.sync(Native Method)
> at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
> at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | Checkpoint failed
> java.io.SyncFailedException: sync failed
> at java.io.FileDescriptor.sync(Native Method)
> at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
> at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> INFO | Ignoring IO exception, java.io.SyncFailedException: sync failed
> java.io.SyncFailedException: sync failed
> at java.io.FileDescriptor.sync(Native Method)
> at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
> at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | KahaDB failed to store to Journal
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> at java.io.RandomAccessFile.open(Native Method)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
> at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> INFO | Ignoring IO exception, java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> at java.io.RandomAccessFile.open(Native Method)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
> at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | KahaDB failed to store to Journal
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> at java.io.RandomAccessFile.open(Native Method)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
> at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> INFO | Ignoring IO exception, java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> at java.io.RandomAccessFile.open(Native Method)
> at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
> at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> WARN | Transport failed: java.io.EOFException
> WARN | Transport failed: java.io.EOFException
> INFO | KahaDB: Recovering checkpoint thread after death
> ERROR | Checkpoint failed
> java.io.IOException: Input/output error
> at java.io.RandomAccessFile.write(Native Method)
> at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
> at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
> at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
> at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
> at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
> at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
> at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
> at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
> INFO | Ignoring IO exception, java.io.IOException: Input/output error
> java.io.IOException: Input/output error
> at java.io.RandomAccessFile.write(Native Method)
> at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
> at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
> at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
> at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
> at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
> at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
> at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
> at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
> INFO | KahaDB: Recovering checkpoint thread after death
> ERROR | Checkpoint failed
> java.io.IOException: Input/output error
> at java.io.RandomAccessFile.write(Native Method)
> at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
> at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
> at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
> at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
> at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
> at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
> at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
> at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
> INFO | Ignoring IO exception, java.io.IOException: Input/output error
> java.io.IOException: Input/output error
> at java.io.RandomAccessFile.write(Native Method)
> at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
> at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
> at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
> at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
> at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
> at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
> at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
> at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
> WARN | Transport failed: java.io.EOFException
> {code}
--
This message was sent by Atlassian JIRA
(v6.1#6144)