
List:       ceph-users
Subject:    [ceph-users] RBD on ec pool with compression.
From:       sweil@redhat.com (Sage Weil)
Date:       2017-10-31 0:23:31
Message-ID: alpine.DEB.2.11.1710310022500.10762@piezo.novalocal

This looks like http://tracker.ceph.com/issues/21766, which is fixed in the 
latest luminous branch.  The fix will be in 12.2.2.  In the meantime, you can 
run 'ceph-deploy install --dev luminous $hostname' (or equivalent).
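
For reference, a minimal sketch of checking what each daemon is running and
pulling in the dev build (the ceph-deploy command is the one above; the
restart step assumes a typical systemd-based deployment):

    # what each OSD is currently running
    ceph tell osd.* version

    # install the latest luminous development build on a given host
    ceph-deploy install --dev luminous $hostname

    # restart the OSDs on that host, then confirm the versions cluster-wide
    systemctl restart ceph-osd.target
    ceph versions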

sage

On Mon, 30 Oct 2017, Gregory Farnum wrote:

> This is "supposed" to work, but the compression code in BlueStore has had
> less testing than most things there and is pretty invasive, so when I
> discussed this with Radoslaw (added) last week there were some obvious
> places to look. Hopefully it's not too hard to identify the problem from
> these backtraces and get a fix in? :)
> -Greg
> 
> On Fri, Oct 20, 2017 at 12:16 AM Cassiano Pilipavicius
> <cassiano@tips.com.br> wrote:
>       Hello, is it possible to use compression on an EC pool? I am trying
>       to enable this to use the pool as a huge backup/archive disk. The
>       data is almost static and access to it is very sporadic, so bad
>       performance is not a concern here.
> 
>       I've created the RBD image storing its data in the EC pool
>       (--data-pool option) and enabled compression on the EC pool, but as
>       soon as I start writing data to the volume, 12 OSDs crash (my EC
>       ruleset is k=9 m=3).
> 
>       If I use only RBD+EC it works fine; if I use RBD+compression (no
>       erasure coding) it also works fine.
> 
>       This is what appears in the log of the crashed OSDs:
> 
> 
>       ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>       1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x7f019377af20]
>       2: (bluestore_blob_t::map(unsigned long, unsigned long, std::function<int (unsigned long, unsigned long)>) const+0xfe) [0x7f019365f14e]
>       3: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0x1151) [0x7f019363d191]
>       4: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x94) [0x7f019363da64]
>       5: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x15af) [0x7f019365047f]
>       6: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x7f01936513a0]
>       7: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x65) [0x7f01933c1f35]
>       8: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f01934e1cc1]
>       9: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327) [0x7f01934f2867]
>       10: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x7f01933f73d0]
>       11: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5ae) [0x7f0193362fbe]
>       12: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x7f01931f33a9]
>       13: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x7f0193465797]
>       14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x7f019321e9ee]
>       15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x7f0193780a39]
>       16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f01937829d0]
>       17: (()+0x7dc5) [0x7f019013adc5]
>       18: (clone()+0x6d) [0x7f018f22e28d]
>       NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> 
>       --- logging levels ---
>          0/ 5 none
>          0/ 0 lockdep
>          0/ 0 context
>          0/ 0 crush
>          1/ 5 mds
>          1/ 5 mds_balancer
>          1/ 5 mds_locker
>          1/ 5 mds_log
>          1/ 5 mds_log_expire
>          1/ 5 mds_migrator
>          0/ 0 buffer
>          0/ 0 timer
>          0/ 1 filer
>          0/ 1 striper
>          0/ 1 objecter
>          0/ 5 rados
>          0/ 5 rbd
>          0/ 5 rbd_mirror
>          0/ 5 rbd_replay
>          0/ 0 journaler
>          0/ 5 objectcacher
>          0/ 5 client
>          0/ 0 osd
>          0/ 0 optracker
>          0/ 0 objclass
>          0/ 0 filestore
>          0/ 0 journal
>          0/ 0 ms
>          1/ 5 mon
>          0/ 0 monc
>          1/ 5 paxos
>          0/ 0 tp
>          0/ 0 auth
>          1/ 5 crypto
>          0/ 0 finisher
>          0/ 0 heartbeatmap
>          0/ 0 perfcounter
>          1/ 5 rgw
>          1/10 civetweb
>          1/ 5 javaclient
>          0/ 0 asok
>          0/ 0 throttle
>          0/ 0 refs
>          1/ 5 xio
>          1/ 5 compressor
>          1/ 5 bluestore
>          1/ 5 bluefs
>          1/ 3 bdev
>          1/ 5 kstore
>          4/ 5 rocksdb
>          4/ 5 leveldb
>          4/ 5 memdb
>          1/ 5 kinetic
>          1/ 5 fuse
>          1/ 5 mgr
>          1/ 5 mgrc
>          1/ 5 dpdk
>          1/ 5 eventtrace
>         -2/-2 (syslog threshold)
>         -1/-1 (stderr threshold)
>         max_recent     10000
>         max_new         1000
>         log_file /var/log/ceph/ceph-osd.8.log
>       --- end dump of recent events ---
>       2017-10-20 04:28:51.564373 7f0176684700 -1 *** Caught signal (Aborted) **
>       in thread 7f0176684700 thread_name:tp_osd_tp
> 
>       ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>       1: (()+0xa29511) [0x7f019373c511]
>       2: (()+0xf100) [0x7f0190142100]
>       3: (gsignal()+0x37) [0x7f018f16d5f7]
>       4: (abort()+0x148) [0x7f018f16ece8]
>       5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x7f019377b094]
>       6: (bluestore_blob_t::map(unsigned long, unsigned long, std::function<int (unsigned long, unsigned long)>) const+0xfe) [0x7f019365f14e]
>       7: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0x1151) [0x7f019363d191]
>       8: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x94) [0x7f019363da64]
>       9: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x15af) [0x7f019365047f]
>       10: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x7f01936513a0]
>       11: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x65) [0x7f01933c1f35]
>       12: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f01934e1cc1]
>       13: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327) [0x7f01934f2867]
>       14: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x7f01933f73d0]
>       15: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5ae) [0x7f0193362fbe]
>       16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x7f01931f33a9]
>       17: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x7f0193465797]
>       18: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x7f019321e9ee]
>       19: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x7f0193780a39]
>       20: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f01937829d0]
>       21: (()+0x7dc5) [0x7f019013adc5]
>       22: (clone()+0x6d) [0x7f018f22e28d]
>       NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> 
>       --- begin dump of recent events ---
>            0> 2017-10-20 04:28:51.564373 7f0176684700 -1 *** Caught signal (Aborted) **
>       in thread 7f0176684700 thread_name:tp_osd_tp
> 
>       ceph version 12.2.1 (3e7492b9ada8bdc9a5cd0feafd42fbca27f9c38e) luminous (stable)
>       1: (()+0xa29511) [0x7f019373c511]
>       2: (()+0xf100) [0x7f0190142100]
>       3: (gsignal()+0x37) [0x7f018f16d5f7]
>       4: (abort()+0x148) [0x7f018f16ece8]
>       5: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x284) [0x7f019377b094]
>       6: (bluestore_blob_t::map(unsigned long, unsigned long, std::function<int (unsigned long, unsigned long)>) const+0xfe) [0x7f019365f14e]
>       7: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0x1151) [0x7f019363d191]
>       8: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x94) [0x7f019363da64]
>       9: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x15af) [0x7f019365047f]
>       10: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x3a0) [0x7f01936513a0]
>       11: (PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x65) [0x7f01933c1f35]
>       12: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&, Context*)+0x631) [0x7f01934e1cc1]
>       13: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x327) [0x7f01934f2867]
>       14: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x7f01933f73d0]
>       15: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5ae) [0x7f0193362fbe]
>       16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x7f01931f33a9]
>       17: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x7f0193465797]
>       18: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfce) [0x7f019321e9ee]
>       19: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x7f0193780a39]
>       20: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x7f01937829d0]
>       21: (()+0x7dc5) [0x7f019013adc5]
>       22: (clone()+0x6d) [0x7f018f22e28d]
>       NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
> 
>       --- logging levels ---
>          0/ 5 none
>          0/ 0 lockdep
>          0/ 0 context
>          0/ 0 crush
>          1/ 5 mds
>          1/ 5 mds_balancer
>          1/ 5 mds_locker
>          1/ 5 mds_log
>          1/ 5 mds_log_expire
>          1/ 5 mds_migrator
>          0/ 0 buffer
>          0/ 0 timer
>          0/ 1 filer
>          0/ 1 striper
>          0/ 1 objecter
>          0/ 5 rados
>          0/ 5 rbd
>          0/ 5 rbd_mirror
>          0/ 5 rbd_replay
>          0/ 0 journaler
>          0/ 5 objectcacher
>          0/ 5 client
>          0/ 0 osd
>          0/ 0 optracker
>          0/ 0 objclass
>          0/ 0 filestore
>          0/ 0 journal
>          0/ 0 ms
>          1/ 5 mon
>          0/ 0 monc
>          1/ 5 paxos
>          0/ 0 tp
>          0/ 0 auth
>          1/ 5 crypto
>          0/ 0 finisher
>          0/ 0 heartbeatmap
>          0/ 0 perfcounter
>          1/ 5 rgw
>          1/10 civetweb
>          1/ 5 javaclient
>          0/ 0 asok
>          0/ 0 throttle
>          0/ 0 refs
>          1/ 5 xio
>          1/ 5 compressor
>          1/ 5 bluestore
>          1/ 5 bluefs
>          1/ 3 bdev
>          1/ 5 kstore
>          4/ 5 rocksdb
>          4/ 5 leveldb
>          4/ 5 memdb
>          1/ 5 kinetic
>          1/ 5 fuse
>          1/ 5 mgr
>          1/ 5 mgrc
>          1/ 5 dpdk
>          1/ 5 eventtrace
>         -2/-2 (syslog threshold)
>         -1/-1 (stderr threshold)
>         max_recent     10000
>         max_new         1000
>         log_file /var/log/ceph/ceph-osd.8.log
>       --- end dump of recent events ---
> 
>       _______________________________________________
>       ceph-users mailing list
>       ceph-users@lists.ceph.com
>       http://lists.ceph.com/listinfo.cgi/ceph-users-ceph.com
> 
> 
> 
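
For reference, a minimal sketch of the configuration Cassiano describes above
(k=9/m=3 and the use of --data-pool come from his message; pool names, PG
counts, the compression settings, and the image size are illustrative):

    # EC profile matching the reported layout
    ceph osd erasure-code-profile set ec93 k=9 m=3

    # erasure-coded data pool with overwrites (needed for RBD) and compression
    ceph osd pool create ecdata 128 128 erasure ec93
    ceph osd pool set ecdata allow_ec_overwrites true
    ceph osd pool set ecdata compression_algorithm snappy
    ceph osd pool set ecdata compression_mode aggressive

    # replicated pool for the RBD metadata; image data goes to the EC pool
    ceph osd pool create rbdmeta 64 64 replicated
    rbd create rbdmeta/archive --size 4T --data-pool ecdata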

