
How to Remove and Add Ceph Nodes (mon/mgr/osd)

Somaz 2022. 9. 21. 19:50

Overview

Today I want to look at how to remove and add Ceph nodes.

For the node-addition part, though, you can simply reuse whatever method you used when you first installed Ceph and added nodes.

In my case, I added the node with ansible.

 


 

Things to check before starting

  • Before removing anything, always check the cluster's free space (see the commands below).
  • Make sure the remaining nodes can absorb the capacity of the node being removed.
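
For reference, ceph df shows per-pool usage and ceph osd df tree breaks usage down per host and OSD, which makes it easy to judge whether the data on the node to be removed will fit on the remaining nodes:

$ sudo ceph df
$ sudo ceph osd df tree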

 

1. Removing a Ceph Node

 

1.) Check the cluster status and capacity

 
$ sudo ceph -s
$ sudo ceph osd df

 

2.) Disable scrubbing (to avoid I/O load)

$ sudo ceph osd set noscrub
$ sudo ceph osd set nodeep-scrub
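
If you want to double-check that the flags were applied, ceph osd dump prints the cluster flags line, which should now include noscrub and nodeep-scrub:

$ sudo ceph osd dump | grep flags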

 

3.) Remove the Ceph OSDs

Remove the OSDs of the target Ceph node (e.g. removing the OSDs on the taco2-ceph2 node).

$ sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
-1       0.78119 root default
-3       0.39059     host taco2-ceph1
 1   hdd 0.19530         osd.1            up  1.00000 1.00000
 2   hdd 0.19530         osd.2            up  1.00000 1.00000
-5       0.39059     host taco2-ceph2
 0   hdd 0.19530         osd.0            up  1.00000 1.00000
 3   hdd 0.19530         osd.3            up  1.00000 1.00000

$ sudo ceph osd out osd.0
marked out osd.0.
$ sudo ceph osd down osd.0
marked down osd.0.
$ sudo ceph osd rm osd.0
removed osd.0
$ sudo ceph osd crush remove osd.0
removed item id 0 name 'osd.0' from crush map


$ sudo ceph osd out osd.3
marked out osd.3.
$ sudo ceph osd down osd.3
marked down osd.3.
$ sudo ceph osd rm osd.3
removed osd.3
$ sudo ceph osd crush remove osd.3
removed item id 3 name 'osd.3' from crush map

$ sudo ceph osd crush remove taco2-ceph2
removed item id -5 name 'taco2-ceph2' from crush map

$ sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
-1       0.39059 root default
-3       0.39059     host taco2-ceph1
 1   hdd 0.19530         osd.1            up  1.00000 1.00000
 2   hdd 0.19530         osd.2            up  1.00000 1.00000
  • When removing an OSD, run rm immediately after down; otherwise the OSD comes right back up. (A one-step purge alternative is sketched below.)
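
As a reference, recent Ceph releases also have a single purge command that combines crush remove, auth del, and osd rm. The sketch below assumes a package-based deployment (like this one) where each OSD has a ceph-osd@<id> systemd unit that can be stopped first so the OSD cannot come back up:

$ ssh [target Ceph node]
$ sudo systemctl stop ceph-osd@0
$ exit
$ sudo ceph osd purge osd.0 --yes-i-really-mean-it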

 

Delete the Ceph OSD auth entries

$ sudo ceph auth list
installed auth entries:

osd.0
        key: AQDkfipjW6P1ERAAcCdTZJ6lATN7i8wxwh7j3Q==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.1
        key: AQDkfipjud7XFhAAqEEuJJtSofEOnHH5isz63w==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.2
        key: AQDxfipjjTSxARAAk0TLZHmJNMpjba5cpCgoNQ==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.3
        key: AQDxfipjEzNFCBAAqyOYJpLfbJQwAHr7kQg8hA==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
...
mgr.ceph1
        key: AQCIfipjAAAAABAAaZo/gviQ+VODxFJfPDrpkQ==
        caps: [mds] allow *
        caps: [mon] allow profile mgr
        caps: [osd] allow *
mgr.ceph2
        key: AQCKfipjAAAAABAAlxEFB8o0btVev+Rcky0FOw==
        caps: [mds] allow *
        caps: [mon] allow profile mgr
        caps: [osd] allow *

$ sudo ceph auth del osd.0
updated

$ sudo ceph auth del osd.2
updated

$ sudo ceph auth del mgr.ceph2
updated

$ sudo ceph auth list
installed auth entries:

osd.1
        key: AQDkfipjud7XFhAAqEEuJJtSofEOnHH5isz63w==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
osd.3
        key: AQDxfipjEzNFCBAAqyOYJpLfbJQwAHr7kQg8hA==
        caps: [mgr] allow profile osd
        caps: [mon] allow profile osd
        caps: [osd] allow *
...
mgr.ceph1
        key: AQCIfipjAAAAABAAaZo/gviQ+VODxFJfPDrpkQ==
        caps: [mds] allow *
        caps: [mon] allow profile mgr
        caps: [osd] allow *
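
If you only want to look at a single entry instead of the whole list, ceph auth get prints just that one:

$ sudo ceph auth get osd.1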

 

4.) Remove the Ceph MON

Remove the MON of the target Ceph node.

$ sudo ceph mon stat
e1: 2 mons at {taco2-ceph1=[v2:10.3.2.206:3300/0,v1:10.3.2.206:6789/0],taco2-ceph2=[v2:10.3.2.207:3300/0,v1:10.3.2.207:6789/0]}, election epoch 4, leader 0 taco2-ceph1, quorum 0,1 taco2-ceph1,taco2-ceph2

$ sudo ceph mon remove taco2-ceph2
removing mon.taco2-ceph2 at [v2:10.3.2.207:3300/0,v1:10.3.2.207:6789/0], there will be 1 monitors

$ sudo ceph -s
  cluster:
    id:     14675ee4-b9dd-440b-9e73-e4c00a62eab1
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set

  services:
    mon: 1 daemons, quorum taco2-ceph1 (age 4s)
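
If the removed node is still powered on, it is also worth stopping and disabling the monitor daemon there so it does not try to rejoin. The unit name below assumes a package-based deployment where the MON unit is named after the host (taco2-ceph2 in this example):

$ ssh [target Ceph node]
$ sudo systemctl stop ceph-mon@taco2-ceph2
$ sudo systemctl disable ceph-mon@taco2-ceph2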

 

5.) Remove the Ceph MGR

First fail the MGR over so that the target node becomes a standby.

$ sudo ceph -s
  cluster:
    id:     14675ee4-b9dd-440b-9e73-e4c00a62eab1
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set

  services:
    mon: 1 daemons, quorum taco2-ceph1 (age 6m)
    mgr: ceph2(active, since 4w), standbys: ceph1
     
$ sudo ceph mgr fail ceph2

$ sudo ceph -s
  cluster:
    id:     14675ee4-b9dd-440b-9e73-e4c00a62eab1
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set

  services:
    mon: 1 daemons, quorum taco2-ceph1 (age 7m)
    mgr: ceph1(active, since 3s), standbys: ceph2

 

Then go to the target Ceph node and remove the MGR.

$ ssh [target Ceph node]

$ sudo systemctl status ceph-mgr@ceph2
● ceph-mgr@ceph2.service - Ceph cluster manager daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: disabled)
   Active: active (running) since 화 2022-08-16 17:34:45 KST; 1 months 4 days ago
...

$ sudo systemctl stop ceph-mgr@ceph2

$ sudo systemctl status ceph-mgr@ceph2
● ceph-mgr@ceph2.service - Ceph cluster manager daemon
   Loaded: loaded (/usr/lib/systemd/system/ceph-mgr@.service; enabled; vendor preset: disabled)
   Active: inactive (dead) since 화 2022-09-20 16:43:21 KST; 11s ago
...
$ sudo ceph -s
  cluster:
    id:     14675ee4-b9dd-440b-9e73-e4c00a62eab1
    health: HEALTH_WARN
            noscrub,nodeep-scrub flag(s) set

  services:
    mon: 1 daemons, quorum ceph1 (age 99m)
    mgr: ceph1(active, since 91m)
    osd: 2 osds: 2 up (since 2h), 2 in (since 2h)
    rgw: 3 daemons active (master1.rgw0, master2.rgw0, master3.rgw0)
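
The unit is still enabled, so it would start again after a reboot. If the node is being removed for good, disable it as well (same ceph-mgr@<name> unit as above):

$ sudo systemctl disable ceph-mgr@ceph2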

 

6.) Re-enable scrubbing

$ sudo ceph osd unset nodeep-scrub
nodeep-scrub is unset

$ sudo ceph osd unset noscrub
noscrub is unset
 

7.) Check the Ceph status

$ sudo ceph -s
  cluster:
    id:     14675ee4-b9dd-440b-9e73-e4c00a62eab1
    health: HEALTH_WARN

  services:
    mon: 1 daemons, quorum ceph1 (age 99m)
    mgr: ceph1(active, since 91m)
    osd: 2 osds: 2 up (since 2h), 2 in (since 2h)
    rgw: 3 daemons active (master1.rgw0, master2.rgw0, master3.rgw0)

  task status:

  data:
    pools:   11 pools, 228 pgs
    objects: 4.41k objects, 15 GiB
    usage:   32 GiB used, 368 GiB / 400 GiB avail
    pgs:     228 active+clean

  io:
    client:   2.7 KiB/s wr, 0 op/s rd, 0 op/s wr
  • Because a Ceph node was removed, HEALTH_WARN will show up here.
  • It will return to normal once a node is added back (see the check below for what exactly is being warned about).
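
ceph health detail lists the individual warnings behind HEALTH_WARN, so you can confirm the warning is only about the shrunken cluster and not something else:

$ sudo ceph health detail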

2. Adding a Ceph Node

 

1.) Prepare the Ceph OSD node to add

Install the same OS as the existing Ceph nodes and assign an IP address.

 

2.) Exchange SSH keys

Exchange SSH public keys so that the new node can be accessed.

 
$ ssh-copy-id [target Ceph node]
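
If the deploy node does not have an SSH key pair yet, generate one first and then copy it over (standard OpenSSH commands; the key type here is just an example):

$ ssh-keygen -t ed25519
$ ssh-copy-id [target Ceph node]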

 

3.) Time synchronization

$ ssh [target Ceph node]
$ sudo vi /etc/chrony.conf
server [control node] iburst

$ sudo systemctl restart chronyd
$ chronyc sources
210 Number of sources = 1
MS Name/IP address                   Stratum Poll Reach LastRx Last sample
===========================================================================================
^* [control node]                        3   6   377    36   +489us[+1186us] +/-   40ms
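
chronyc tracking is another quick way to confirm the new node is actually synchronized to the control node (the ^* marker in the sources output above indicates the currently selected source):

$ chronyc tracking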

 

 

4.) Edit the inventory (hosts.ini, extra-vars)

 

Edit hosts.ini

$ ZONE_NAME=[target ZONE]

$ cd ~/taco

$ cp inventory/$ZONE_NAME/hosts.ini ~/taco/inventory/$ZONE_NAME/hosts.ini.ceph-add

$ vi inventory/$ZONE_NAME/hosts.ini.ceph-add
...
+ [new node name] ip=[new node ip]
...
# Ceph cluster
# we need empty mons group or clients role fails
[mons]
# [existing ceph node]    # commented out
+ [new node name]

[mgrs]
# [existing ceph node]    # commented out
+ [new node name]

[osds]
# [existing ceph node]  # commented out
+ [new node name]
...
  • Add the new node to each of the [mons], [mgrs], and [osds] groups.

 

Edit extra-vars

$ cp inventory/$ZONE_NAME/extra-vars.yml inventory/$ZONE_NAME/extra-vars.yml.ceph-add

$ vi inventory/$ZONE_NAME/extra-vars.yml.ceph-add
...
## ceph osd
osd_objectstore: bluestore
lvm_volumes:
  - data: /dev/sdb
  - data: /dev/sdc
...
 
  • Update extra-vars only if the OSDs change; if the new node has the same OSD layout, you can proceed as-is (a quick way to confirm the device names is shown below).
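
Before editing lvm_volumes, it helps to confirm what the data disks are actually called on the new node. lsblk is a simple way to check; the /dev/sdb and /dev/sdc devices above are only examples and may differ on your hardware:

$ ssh [new ceph node] lsblk -d -o NAME,SIZE,TYPE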

 

5.) Run the ansible playbook site.yml (setup-os, ceph tags)

$ ansible-playbook -b -u clex \
-i inventory/$ZONE_NAME/hosts.ini.ceph-add \
--extra-vars=@inventory/$ZONE_NAME/extra-vars.yml.ceph-add \
site.yml --tags=setup-os,ceph
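
Before (or after) the run, a quick connectivity check against the new inventory file can catch SSH or inventory mistakes early; the user clex is the same one passed to the playbook above:

$ ansible all -i inventory/$ZONE_NAME/hosts.ini.ceph-add -u clex -m ping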

 

6.) Check that the OSDs were added

$ sudo ceph -s
  cluster:
    id:     9893a83c-63e2-41b6-a538-f72008e15a01
    health: HEALTH_OK

  services:
    mon: 2 daemons, quorum ceph1,ceph2 (age 32m)
    mgr: ceph1(active, since 3h), standbys: ceph2
    osd: 4 osds: 4 up (since 6m), 4 in (since 6m)
    rgw: 3 daemons active (master1.rgw0, master2.rgw0, master3.rgw0)

  task status:

  data:
    pools:   11 pools, 228 pgs
    objects: 200 objects, 4.7 KiB
    usage:   4.1 GiB used, 796 GiB / 800 GiB avail
    pgs:     228 active+clean

$ sudo ceph osd tree
ID CLASS WEIGHT  TYPE NAME            STATUS REWEIGHT PRI-AFF
-1       0.78119 root default
-3       0.39059     host ceph1
 1   hdd 0.19530         osd.1            up  1.00000 1.00000
 3   hdd 0.19530         osd.3            up  1.00000 1.00000
-5       0.39059     host ceph2
 0   hdd 0.19530         osd.0            up  1.00000 1.00000
 2   hdd 0.19530         osd.2            up  1.00000 1.00000
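
Beyond ceph -s, ceph osd df tree shows whether data is actually being rebalanced onto the new OSDs, and ceph mon stat confirms the new monitor has joined the quorum:

$ sudo ceph osd df tree
$ sudo ceph mon stat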

 

 


 

Reference

https://docs.ceph.com/en/quincy/

