List: ceph-users
Subject: [ceph-users] RBD on ec pool with compression.
From: sweil () redhat ! com (Sage Weil)
Date: 2017-10-31 0:23:31
Message-ID: alpine.DEB.2.11.1710310022500.10762 () piezo ! novalocal
This looks like http://tracker.ceph.com/issues/21766, which is fixed in the
latest luminous branch. The fix will be in 12.2.2. In the meantime, you can
run 'ceph-deploy install --dev luminous $hostname' (or equivalent).
sage
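
To check whether a given OSD log comes from a build that predates the fix, a minimal sketch follows. It assumes only what the message above states (the fix is expected in 12.2.2) and parses the version banner as it appears in the quoted log. The helper names are hypothetical and not part of any Ceph tooling. A dev build installed from the luminous branch may still report a 12.2.1-based version string while already containing the fix, so treat the result as a hint, not proof.

```python
import re

# Assumption taken from the reply above: the fix is expected in 12.2.2.
FIXED_IN = (12, 2, 2)


def parse_ceph_version(banner):
    """Return (major, minor, patch) from a 'ceph version X.Y.Z (...)' line."""
    match = re.search(r"ceph version (\d+)\.(\d+)\.(\d+)", banner)
    if match is None:
        raise ValueError("no ceph version found in: %r" % (banner,))
    return tuple(int(part) for part in match.groups())


def predates_fix(banner, fixed_in=FIXED_IN):
    """True if the reported release is older than the release carrying the fix."""
    return parse_ceph_version(banner) < fixed_in


if __name__ == "__main__":
    line = ("ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) "
            "luminous (stable)")
    print(parse_ceph_version(line), predates_fix(line))
```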
On Mon, 30 Oct 2017, Gregory Farnum wrote:
> This is "supposed" to work, but the compression in Bluestore has less
> testing than most things there and is pretty invasive, so when I discussed
> this with Radoslaw (added) last week there were some obvious places to look.
> Hopefully it's not too hard to identify the problem from these backtraces
> and get a fix in? :)
> -Greg
>
> On Fri, Oct 20, 2017 at 12:16 AM Cassiano Pilipavicius
> <cassiano at tips.com.br> wrote:
> Hello, is it possible to use compression on an EC pool? I am trying to
> enable this to use as a huge backup/archive disk. The data is almost
> static and access to it is very sporadic, so bad performance is not a
> concern here.
>
> I've created the RBD storing data to the EC pool (--data-pool option)
> and enabled compression on the EC pool, but as soon as I start to write
> data to the volume, I have a crash on 12 OSDs (my EC ruleset is k=9 m=3).
>
> If I use only RBD+EC it works fine; if I use RBD+Compression (no erasure
> coding) it also works fine.
>
> This is what appears in the log of the crashed OSDs:
>
>
>  ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f019377af20]
>  2: (bluestore_blob_t::map(unsigned long, unsigned long, std::function<int (unsigned long, unsigned long)>) const+0xfe) [0x7f019365f14e]
>  3: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0x1151) [0x7f019363d191]
>  4: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x94) [0x7f019363da64]
>  5: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x15af) [0x7f019365047f]
>  6: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x7f01936513a0]
>  7: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x65) [0x7f01933c1f35]
>  8: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f01934e1cc1]
>  9: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327) [0x7f01934f2867]
>  10: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x7f01933f73d0]
>  11: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5ae) [0x7f0193362fbe]
>  12: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x7f01931f33a9]
>  13: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x7f0193465797]
>  14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x7f019321e9ee]
>  15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x7f0193780a39]
>  16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f01937829d0]
>  17: (()+0x7dc5) [0x7f019013adc5]
>  18: (clone()+0x6d) [0x7f018f22e28d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
> --- logging levels ---
>    0/ 5 none
>    0/ 0 lockdep
>    0/ 0 context
>    0/ 0 crush
>    1/ 5 mds
>    1/ 5 mds_balancer
>    1/ 5 mds_locker
>    1/ 5 mds_log
>    1/ 5 mds_log_expire
>    1/ 5 mds_migrator
>    0/ 0 buffer
>    0/ 0 timer
>    0/ 1 filer
>    0/ 1 striper
>    0/ 1 objecter
>    0/ 5 rados
>    0/ 5 rbd
>    0/ 5 rbd_mirror
>    0/ 5 rbd_replay
>    0/ 0 journaler
>    0/ 5 objectcacher
>    0/ 5 client
>    0/ 0 osd
>    0/ 0 optracker
>    0/ 0 objclass
>    0/ 0 filestore
>    0/ 0 journal
>    0/ 0 ms
>    1/ 5 mon
>    0/ 0 monc
>    1/ 5 paxos
>    0/ 0 tp
>    0/ 0 auth
>    1/ 5 crypto
>    0/ 0 finisher
>    0/ 0 heartbeatmap
>    0/ 0 perfcounter
>    1/ 5 rgw
>    1/10 civetweb
>    1/ 5 javaclient
>    0/ 0 asok
>    0/ 0 throttle
>    0/ 0 refs
>    1/ 5 xio
>    1/ 5 compressor
>    1/ 5 bluestore
>    1/ 5 bluefs
>    1/ 3 bdev
>    1/ 5 kstore
>    4/ 5 rocksdb
>    4/ 5 leveldb
>    4/ 5 memdb
>    1/ 5 kinetic
>    1/ 5 fuse
>    1/ 5 mgr
>    1/ 5 mgrc
>    1/ 5 dpdk
>    1/ 5 eventtrace
>   -2/-2 (syslog threshold)
>   -1/-1 (stderr threshold)
>   max_recent     10000
>   max_new         1000
>   log_file /var/log/ceph/ceph-osd.8.log
> --- end dump of recent events ---
> 2017-10-20 04:28:51.564373 7f0176684700 -1 *** Caught signal (Aborted) **
>  in thread 7f0176684700 thread_name:tp_osd_tp
>
>  ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>  1: (()+0xa29511) [0x7f019373c511]
>  2: (()+0xf100) [0x7f0190142100]
>  3: (gsignal()+0x37) [0x7f018f16d5f7]
>  4: (abort()+0x148) [0x7f018f16ece8]
>  5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x7f019377b094]
>  6: (bluestore_blob_t::map(unsigned long, unsigned long, std::function<int (unsigned long, unsigned long)>) const+0xfe) [0x7f019365f14e]
>  7: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0x1151) [0x7f019363d191]
>  8: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x94) [0x7f019363da64]
>  9: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x15af) [0x7f019365047f]
>  10: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x7f01936513a0]
>  11: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x65) [0x7f01933c1f35]
>  12: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f01934e1cc1]
>  13: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327) [0x7f01934f2867]
>  14: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x7f01933f73d0]
>  15: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5ae) [0x7f0193362fbe]
>  16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x7f01931f33a9]
>  17: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x7f0193465797]
>  18: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x7f019321e9ee]
>  19: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x7f0193780a39]
>  20: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f01937829d0]
>  21: (()+0x7dc5) [0x7f019013adc5]
>  22: (clone()+0x6d) [0x7f018f22e28d]
>  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
>
>
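
The configuration described in the original report (an RBD image whose data objects live on a k=9 m=3 erasure-coded pool with compression enabled) can be sketched as the command sequence below. This is an illustration under assumptions, not the reporter's actual commands: the pool, profile, and image names, the PG count, the image size, and the choice of `aggressive` compression mode are all hypothetical. The function only builds the argument lists and prints them, it does not run anything. With k=9 and m=3 each placement group spans 12 OSDs, which is consistent with the 12 crashed OSDs mentioned in the report.

```python
import shlex


def ec_compressed_rbd_commands(ec_pool, meta_pool, image,
                               k=9, m=3, pg_num=128, size_mb=10240):
    """Build (but do not run) CLI commands for an RBD image on a compressed EC pool.

    All names and sizes are illustrative placeholders.
    """
    profile = "%s-profile" % ec_pool
    return [
        # Erasure-code profile with k data chunks and m coding chunks.
        ["ceph", "osd", "erasure-code-profile", "set", profile,
         "k=%d" % k, "m=%d" % m],
        # EC pool using that profile.
        ["ceph", "osd", "pool", "create", ec_pool,
         str(pg_num), str(pg_num), "erasure", profile],
        # RBD on an EC pool needs partial overwrites (BlueStore OSDs only).
        ["ceph", "osd", "pool", "set", ec_pool, "allow_ec_overwrites", "true"],
        # Per-pool BlueStore compression; the mode here is an assumption.
        ["ceph", "osd", "pool", "set", ec_pool, "compression_mode", "aggressive"],
        # Image metadata in the replicated pool, data objects in the EC pool.
        ["rbd", "create", "--size", str(size_mb), "--data-pool", ec_pool,
         "%s/%s" % (meta_pool, image)],
    ]


def shards_per_pg(k, m):
    """Number of OSDs each placement group of a k+m EC pool is spread over."""
    return k + m


if __name__ == "__main__":
    for argv in ec_compressed_rbd_commands("ecpool", "rbd", "backup"):
        print(" ".join(shlex.quote(arg) for arg in argv))
    print("OSDs per PG:", shards_per_pg(9, 3))
```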