
List:       ceph-users
Subject:    [ceph-users] Recovery pg from  backup
From:       刘亮 <liangliu () linkdoc ! com>
Date:       2021-12-28 11:55:15
Message-ID: 4D6ED59A-6620-4C00-A212-2DA65050C873 () linkdoc ! com


Dear all:
    I have a Ceph cluster with 150 OSDs, running version 12.2.10 (Luminous). The pool in question is called vms.


A few days ago, while pool vms still had size 3, I made a backup of pg 10.1a4 (which belongs to pool vms) using the ceph-objectstore-tool export command.
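
For reference, the export was done roughly like this (the OSD id, data path and file name below are placeholders, not the exact values I used; as far as I know --journal-path is also needed on FileStore OSDs):

    # stop the OSD that holds the pg copy, export the pg to a file, then restart the OSD
    systemctl stop ceph-osd@<osd-id>
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<osd-id> \
        --pgid 10.1a4 --op export --file /backup/pg10.1a4.export
    systemctl start ceph-osd@<osd-id>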

Now I have set pool vms to size 1. After rebalancing, pg 10.1a4 stays in stale+incomplete.
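
For completeness, the size change itself was the usual pool setting command, roughly:

    ceph osd pool set vms size 1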

I then imported the old pg data that I had backed up, using ceph-objectstore-tool import with --pgid 10.1a4, but now the OSD crashes.
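
The import attempt looked roughly like this (again, OSD id and file path are placeholders; as far as I know the pgid is also recorded inside the export file):

    # stop the target OSD, import the saved pg, then restart it
    systemctl stop ceph-osd@<osd-id>
    ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<osd-id> \
        --op import --file /backup/pg10.1a4.export
    systemctl start ceph-osd@<osd-id>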

Here is the log:

    -6> 2021-12-28 18:06:30.915130 7f4f517f6700  5 -- 10.222.3.24:6813/1339275 >> 10.222.3.22:0/3043732 conn(0x56431088b800 :6813 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=454 cs=1 l=1). rx osd.69 seq 1 0x564310831400 osd_ping(ping e117671 stamp 2021-12-28 18:06:30.914946) v4
    -5> 2021-12-28 18:06:30.915141 7f4f51ff7700  5 -- 10.222.3.24:6815/1339275 >> 10.222.3.22:0/3043732 conn(0x56431088d000 :6815 s=STATE_OPEN_MESSAGE_READ_FOOTER_AND_DISPATCH pgs=454 cs=1 l=1). rx osd.69 seq 1 0x56431085ba00 osd_ping(ping e117671 stamp 2021-12-28 18:06:30.914946) v4
    -4> 2021-12-28 18:06:30.915164 7f4f517f6700  1 -- 10.222.3.24:6813/1339275 <== osd.69 10.222.3.22:0/3043732 1 ==== osd_ping(ping e117671 stamp 2021-12-28 18:06:30.914946) v4 ==== 2004+0+0 (2371603292 0 0) 0x564310831400 con 0x56431088b800
    -3> 2021-12-28 18:06:30.915187 7f4f51ff7700  1 -- 10.222.3.24:6815/1339275 <== osd.69 10.222.3.22:0/3043732 1 ==== osd_ping(ping e117671 stamp 2021-12-28 18:06:30.914946) v4 ==== 2004+0+0 (2371603292 0 0) 0x56431085ba00 con 0x56431088d000
    -2> 2021-12-28 18:06:30.915196 7f4f517f6700  1 -- 10.222.3.24:6813/1339275 --> 10.222.3.22:0/3043732 -- osd_ping(ping_reply e117671 stamp 2021-12-28 18:06:30.914946) v4 -- 0x564310803a00 con 0
    -1> 2021-12-28 18:06:30.915236 7f4f51ff7700  1 -- 10.222.3.24:6815/1339275 --> 10.222.3.22:0/3043732 -- osd_ping(ping_reply e117671 stamp 2021-12-28 18:06:30.914946) v4 -- 0x564310956000 con 0
     0> 2021-12-28 18:06:31.003766 7f4f39ffd700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.11/rpm/el7/BUILD/ceph-12.2.11/src/common/Throttle.cc: In function 'int64_t Throttle::take(int64_t)' thread 7f4f39ffd700 time 2021-12-28 18:06:30.938147
/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/12.2.11/rpm/el7/BUILD/ceph-12.2.11/src/common/Throttle.cc: 148: FAILED assert(c >= 0)

ceph version 12.2.11 (26dc3775efc7bb286a1d6d66faee0ba30ea23eee) luminous (stable)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x5642fb767120]
2: (Throttle::take(long)+0x2a9) [0x5642fb75cc49]
3: (Objecter::_op_submit_with_budget(Objecter::Op*, ceph::shunique_lock<boost::shared_mutex>&, unsigned long*, int*)+0x2a2) [0x5642fb471b12]
4: (Objecter::op_submit(Objecter::Op*, unsigned long*, int*)+0x7a) [0x5642fb471d2a]
5: (PrimaryLogPG::_copy_some(std::shared_ptr<ObjectContext>, std::shared_ptr<PrimaryLogPG::CopyOp>)+0xf75) [0x5642fb325625]
6: (PrimaryLogPG::start_copy(PrimaryLogPG::CopyCallback*, std::shared_ptr<ObjectContext>, hobject_t, object_locator_t, unsigned long, unsigned int, bool, unsigned int, unsigned int)+0x6e4) [0x5642fb3265f4]
7: (PrimaryLogPG::do_osd_ops(PrimaryLogPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0x7e8e) [0x5642fb36fa1e]
8: (PrimaryLogPG::prepare_transaction(PrimaryLogPG::OpContext*)+0xbf) [0x5642fb379a9f]
9: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x753) [0x5642fb37a823]
10: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x3147) [0x5642fb37f207]
11: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xebb) [0x5642fb33c1db]
12: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3f9) [0x5642fb1b84f9]
13: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x5642fb446957]
14: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xfbe) [0x5642fb1e778e]
15: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x839) [0x5642fb76cc39]
16: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5642fb76ebd0]
17: (()+0x7dd5) [0x7f4f550d9dd5]
18: (clone()+0x6d) [0x7f4f541c9ead]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.


How can I import the old data?
And if I accept losing the pg's data, how can I recreate pg 10.1a4 so that it becomes active+clean?

Looking forward to your reply.
_______________________________________________
ceph-users mailing list -- ceph-users@ceph.io
To unsubscribe send an email to ceph-users-leave@ceph.io

