List:       activemq-dev
Subject:    [jira] [Commented] (AMQ-3725) Kahadb error during SAN failover delayed write - Allow kahaDB to recov
From:       "Dejan Bosanac (JIRA)" <jira () apache ! org>
Date:       2013-10-31 17:11:29
Message-ID: JIRA.12543079.1329505041202.1683.1383239489630 () arcas


    [ https://issues.apache.org/jira/browse/AMQ-3725?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13810442#comment-13810442 ]

Dejan Bosanac commented on AMQ-3725:
------------------------------------

Hi,

I just pushed a change that should help with this scenario. I also tested it with the
USB drive, and it seems that KahaDB now recovers properly. I also used a somewhat
longer resume period (30 sec):

        <ioExceptionHandler>
            <defaultIOExceptionHandler stopStartConnectors="true" resumeCheckSleepPeriod="30000" />
        </ioExceptionHandler>

This gives me a bit more time to unplug the drive and plug it back in before the
broker tries to recreate the folder.
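
For anyone assembling the broker in Java rather than via activemq.xml, the same handler
can be wired programmatically. This is a minimal sketch, assuming the 5.x BrokerService
and DefaultIOExceptionHandler APIs; the data directory path is just a placeholder for
whatever SAN/NAS mount you are testing against:

{code}
import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.util.DefaultIOExceptionHandler;

public class ResumingBroker {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("Broker1");
        // Placeholder path: point this at the SAN/NAS mount under test.
        broker.setDataDirectory("/Volumes/NAS-01/data");

        // Same settings as the XML snippet above: stop the transport
        // connectors on an IO error, poll the store every 30 seconds,
        // and restart the connectors once writes succeed again.
        DefaultIOExceptionHandler ioHandler = new DefaultIOExceptionHandler();
        ioHandler.setStopStartConnectors(true);
        ioHandler.setResumeCheckSleepPeriod(30000L);
        broker.setIoExceptionHandler(ioHandler);

        broker.start();
        broker.waitUntilStopped();
    }
}
{code}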

I started a new snapshot build, but I'm not sure when it will be built. You can build
it locally in the meantime. Test it out and let me know if the problem is still there.

> Kahadb error during SAN failover delayed write - Allow kahaDB to recover in a similar manner as the JDBC store using the IOExceptionHandler
> -------------------------------------------------------------------------------------------------------------------------------------------
>  
> Key: AMQ-3725
> URL: https://issues.apache.org/jira/browse/AMQ-3725
> Project: ActiveMQ
> Issue Type: Bug
> Components: Message Store
> Affects Versions: 5.5.1
> Reporter: Jason Sherman
> Fix For: 5.10.0
> 
> Attachments: AMQ-3725-10112013.txt
> 
> 
> An issue can arise that causes the broker to terminate when using kahaDB with a SAN,
> when the SAN fails over. In this case the failover process is seamless; however, on
> fail back there is a 2-3 second delay during which writes are blocked, and the broker
> terminates. With the JDBC datastore a similar situation can be handled by using the
> IOExceptionHandler (a sketch of that setup follows the log below). However, with
> kahaDB, when this same IOExceptionHandler is added, it prevents the broker from
> terminating but kahaDB retains an invalid index.
> {code}
> INFO | ActiveMQ JMS Message Broker (Broker1, ID:macbookpro-251a.home-56915-1328715089252-0:1) started
> INFO | jetty-7.1.6.v20100715
> INFO | ActiveMQ WebConsole initialized.
> INFO | Initializing Spring FrameworkServlet 'dispatcher'
> INFO | ActiveMQ Console at http://0.0.0.0:8161/admin
> INFO | ActiveMQ Web Demos at http://0.0.0.0:8161/demo
> INFO | RESTful file access application at http://0.0.0.0:8161/fileserver
> INFO | FUSE Web Console at http://0.0.0.0:8161/console
> INFO | Started SelectChannelConnector@0.0.0.0:8161
> ERROR | KahaDB failed to store to Journal
> java.io.SyncFailedException: sync failed
> 	at java.io.FileDescriptor.sync(Native Method)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
> 	at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> INFO | Ignoring IO exception, java.io.SyncFailedException: sync failed
> java.io.SyncFailedException: sync failed
> 	at java.io.FileDescriptor.sync(Native Method)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
> 	at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | Checkpoint failed
> java.io.SyncFailedException: sync failed
> 	at java.io.FileDescriptor.sync(Native Method)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
> 	at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> INFO | Ignoring IO exception, java.io.SyncFailedException: sync failed
> java.io.SyncFailedException: sync failed
> 	at java.io.FileDescriptor.sync(Native Method)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:382)
> 	at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | KahaDB failed to store to Journal
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> 	at java.io.RandomAccessFile.open(Native Method)
> 	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> 	at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
> 	at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> INFO | Ignoring IO exception, java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> 	at java.io.RandomAccessFile.open(Native Method)
> 	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> 	at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
> 	at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> ERROR | KahaDB failed to store to Journal
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> 	at java.io.RandomAccessFile.open(Native Method)
> 	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> 	at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
> 	at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> INFO | Ignoring IO exception, java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> java.io.FileNotFoundException: /Volumes/NAS-01/data/kahadb/db-1.log (No such file or directory)
> 	at java.io.RandomAccessFile.open(Native Method)
> 	at java.io.RandomAccessFile.<init>(RandomAccessFile.java:216)
> 	at org.apache.kahadb.journal.DataFile.openRandomAccessFile(DataFile.java:70)
> 	at org.apache.kahadb.journal.DataFileAppender.processQueue(DataFileAppender.java:324)
> 	at org.apache.kahadb.journal.DataFileAppender$2.run(DataFileAppender.java:203)
> WARN | Transport failed: java.io.EOFException
> WARN | Transport failed: java.io.EOFException
> INFO | KahaDB: Recovering checkpoint thread after death
> ERROR | Checkpoint failed
> java.io.IOException: Input/output error
> 	at java.io.RandomAccessFile.write(Native Method)
> 	at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
> 	at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
> 	at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
> 	at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
> 	at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
> 	at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
> 	at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
> 	at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
> INFO | Ignoring IO exception, java.io.IOException: Input/output error
> java.io.IOException: Input/output error
> 	at java.io.RandomAccessFile.write(Native Method)
> 	at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
> 	at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
> 	at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
> 	at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
> 	at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
> 	at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
> 	at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
> 	at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
> INFO | KahaDB: Recovering checkpoint thread after death
> ERROR | Checkpoint failed
> java.io.IOException: Input/output error
> 	at java.io.RandomAccessFile.write(Native Method)
> 	at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
> 	at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
> 	at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
> 	at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
> 	at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
> 	at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
> 	at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
> 	at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
> INFO | Ignoring IO exception, java.io.IOException: Input/output error
> java.io.IOException: Input/output error
> 	at java.io.RandomAccessFile.write(Native Method)
> 	at java.io.RandomAccessFile.writeLong(RandomAccessFile.java:1001)
> 	at org.apache.kahadb.page.PageFile.writeBatch(PageFile.java:1006)
> 	at org.apache.kahadb.page.PageFile.flush(PageFile.java:484)
> 	at org.apache.activemq.store.kahadb.MessageDatabase.checkpointUpdate(MessageDatabase.java:1290)
> 	at org.apache.activemq.store.kahadb.MessageDatabase$10.execute(MessageDatabase.java:768)
> 	at org.apache.kahadb.page.Transaction.execute(Transaction.java:760)
> 	at org.apache.activemq.store.kahadb.MessageDatabase.checkpointCleanup(MessageDatabase.java:766)
> 	at org.apache.activemq.store.kahadb.MessageDatabase$3.run(MessageDatabase.java:315)
> WARN | Transport failed: java.io.EOFException
> {code}
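
For comparison, the JDBC-side arrangement the description refers to can be wired the
same way. This is a rough sketch, not the reporter's actual configuration: it assumes
JDBCPersistenceAdapter's fallback to an embedded Derby data source when none is
supplied, which is enough to show where the handler plugs in.

{code}
import org.apache.activemq.broker.BrokerService;
import org.apache.activemq.store.jdbc.JDBCPersistenceAdapter;
import org.apache.activemq.util.DefaultIOExceptionHandler;

public class JdbcStoreBroker {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();

        // JDBC message store; with no explicit dataSource it defaults
        // to embedded Derby under the broker's data directory.
        broker.setPersistenceAdapter(new JDBCPersistenceAdapter());

        // The handler that lets the broker ride out a short write
        // stall (e.g. the 2-3 second SAN fail-back) instead of dying.
        DefaultIOExceptionHandler ioHandler = new DefaultIOExceptionHandler();
        ioHandler.setStopStartConnectors(true);
        broker.setIoExceptionHandler(ioHandler);

        broker.start();
        broker.waitUntilStopped();
    }
}
{code}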



--
This message was sent by Atlassian JIRA
(v6.1#6144)

