Graph Convolutional Networks for Hyperspectral Image Classification :: Mini-GCN

<aside> 💡 This post summarizes what I learned about 'miniGCN' while studying it for a project. It follows the paper that first introduced **miniGCN**, 'Graph Convolutional Networks for Hyperspectral Image Classification', and draws on supplementary references.

</aside>

0. Preliminaries

GCN paper review

1. Introduction

[ Figure 1 ] Summary of the miniGCN paper

In HS (hyperspectral) image classification, CNNs have the advantage of learning spatial-spectral feature representations, but they cannot model the relationships between samples.
GCNs address this limitation, but at a high computational cost, because they can only be trained in full-batch mode.
GCNs also cannot infer out-of-sample (unseen) data without retraining.
This paper proposes a method for large-scale training with mini-batches (miniGCN).
- miniGCN uses batch-wise network training that combines a CNN and a GCN
  
  (The two are combined because the two models extract HS features from different perspectives.)
- fusion strategies

2. Methodology

2-1. CNNs Versus GCNs : Qualitative Comparison

Data Preparation

Unlike the CNN input, the GCN input also includes the relationships between samples (the adjacency matrix).

(The network is fed pixel-wise samples.)
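
To make the adjacency-matrix input concrete, here is a minimal sketch of building one from pixel-wise spectra. The construction (an RBF similarity restricted to each pixel's `k` nearest neighbours) and the names `rbf_knn_adjacency`, `k`, and `sigma` are assumptions for illustration, not something this post specifies.

```python
import numpy as np

def rbf_knn_adjacency(X, k=3, sigma=1.0):
    """Build a symmetric adjacency matrix from pixel spectra.

    X: (N, B) array, one row per pixel, B spectral bands.
    Assumption: similarity is an RBF kernel kept only for k nearest neighbours.
    """
    # Pairwise squared Euclidean distances between spectra.
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (X @ X.T), 0.0)
    W = np.exp(-d2 / sigma ** 2)
    np.fill_diagonal(W, 0.0)  # no self-loops here; they are added at normalization

    # Keep the k most similar neighbours of every pixel.
    N = X.shape[0]
    idx = np.argsort(-W, axis=1)[:, :k]
    mask = np.zeros((N, N), dtype=bool)
    mask[np.arange(N)[:, None], idx] = True
    mask = mask | mask.T  # symmetrize: keep an edge if either endpoint selected it
    return np.where(mask, W, 0.0)

rng = np.random.default_rng(0)
X = rng.random((6, 10))          # 6 toy pixels with 10 bands each
A = rbf_knn_adjacency(X, k=2)    # (6, 6) adjacency matrix
```

A CNN would consume image patches around each pixel, whereas the GCN consumes the feature matrix `X` together with `A`.
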
Feature Representation

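CNNs extract rich spatial information and short-range regional spectral information. In other words, CNN features are information-dense but local.

GCNs, in contrast, can extract middle-range spatial information and long-range spectral relationships between samples.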
[ Figure 2 ] Differences between CNNs and GCNs (Feature Representation)
Network Training

Training a GCN requires feeding all samples into the network at once (full-batch).
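
A minimal NumPy sketch of one full-batch GCN layer shows why every sample is needed at once: the normalized adjacency matrix spans all N nodes. The ReLU activation and the names `normalize_adjacency` and `gcn_layer` are assumptions for illustration.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2}, the renormalized adjacency."""
    A_tilde = A + np.eye(A.shape[0])      # add self-loops
    d = A_tilde.sum(axis=1)               # degrees, always >= 1
    d_inv_sqrt = 1.0 / np.sqrt(d)
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def gcn_layer(A, H, W, b):
    """One full-batch layer: ReLU(A_hat @ H @ W + b) over all N nodes."""
    return np.maximum(normalize_adjacency(A) @ H @ W + b, 0.0)

rng = np.random.default_rng(0)
N, F_in, F_out = 5, 4, 3
A = np.triu((rng.random((N, N)) > 0.5).astype(float), 1)
A = A + A.T                               # symmetric toy graph
H = rng.random((N, F_in))                 # input features
W = rng.standard_normal((F_in, F_out))    # layer weights
b = np.zeros(F_out)
H_next = gcn_layer(A, H, W, b)            # (N, F_out)
```

Because the adjacency matrix is N x N, memory and compute grow quickly with the number of pixels, which is the cost that mini-batch training targets.
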

2-2. Proposed MiniGCNs

Mini-batch processing is used to address the computational cost, which grows with the size of the graph.
Motivated by inductive learning, the method makes it possible to train a GCN in a mini-batch fashion.

(** Inductive learning: after training on the training dataset, the model predicts the labels of out-of-sample data)

→ Regions are extracted by random sampling and used as sub-graphs
The $(l+1)$-th layer is then reformulated by collecting the results of all batches:

$$ Z^{(l+1)}_{u}\,=\,\sum_{u\in \mathcal{V}_s}\frac{(\tilde D^{-\frac{1}{2}}\tilde A \tilde D^{-\frac{1}{2}})_{uv}}{e_{uv}}Z^{(l)}_uW^{(l)}\,+\,b^{(l)}_u $$

$$ \tilde H_{s}^{(l+1)}\,=\,h(\tilde D_s^{-\frac{1}{2}}\tilde A_s \tilde D_s^{-\frac{1}{2}} \tilde H^{(l)}_s W^{(l)}\,+\,b_s^{(l)}) $$

$$ H^{(l+1)}\,=\,[\tilde H^{(l+1)}_1,\,\dots\,,\tilde H^{(l+1)}_s,\,\dots\,,\tilde H^{(l+1)}_{\lceil\frac{N}{M}\rceil}] $$

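
The batch-wise update can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions: nodes are split into random batches of size `M`, each batch uses only the adjacency sub-block among its own nodes, the bias is shared across batches, and the activation is ReLU. The function names are hypothetical.

```python
import numpy as np

def normalize_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2} for a (sub-)graph."""
    A_tilde = A + np.eye(A.shape[0])
    d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def minigcn_layer(A, H, W, b, batch_size, rng):
    """Apply one GCN layer sub-graph by sub-graph and collect the outputs."""
    N = A.shape[0]
    perm = rng.permutation(N)                  # random node sampling
    H_next = np.empty((N, W.shape[1]))
    for start in range(0, N, batch_size):
        idx = perm[start:start + batch_size]   # nodes of sub-graph s
        A_s = A[np.ix_(idx, idx)]              # edges inside the batch only
        H_s = H[idx]
        # Same propagation rule as full-batch, restricted to the sub-graph.
        H_next[idx] = np.maximum(normalize_adjacency(A_s) @ H_s @ W + b, 0.0)
    return H_next                              # all batch outputs, collected

rng = np.random.default_rng(0)
N, M, F_in, F_out = 10, 4, 6, 3
A = np.triu((rng.random((N, N)) > 0.6).astype(float), 1)
A = A + A.T
H = rng.random((N, F_in))
W = rng.standard_normal((F_in, F_out))
b = np.zeros(F_out)
n_batches = int(np.ceil(N / M))                # number of sub-graphs
H_next = minigcn_layer(A, H, W, b, M, rng)
```

Each step only touches an M x M adjacency block instead of the full N x N matrix, and edges between nodes in different batches are ignored. Setting `batch_size` to `N` recovers the full-batch layer.
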
MiniGCNs Meet CNNs : End-to-End Fusion Networks (FuNet)

Using two different networks extracts distinctive representations from HS images (spatial-spectral features from the CNN, topological relations from the GCN).
- The paper proposes using miniGCN in combination with a standard CNN model
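
As a rough sketch of how features from the two branches might be fused, the snippet below shows three common options: additive, element-wise multiplicative, and concatenation. The post does not enumerate its fusion strategies, so these three, along with the names `fuse`, `h_cnn`, and `h_gcn`, are assumptions for illustration.

```python
import numpy as np

def fuse(h_cnn, h_gcn, mode="concat"):
    """Fuse per-sample CNN and GCN features of shape (batch, features)."""
    if mode == "add":       # additive fusion, requires equal feature sizes
        return h_cnn + h_gcn
    if mode == "mul":       # element-wise multiplicative fusion
        return h_cnn * h_gcn
    if mode == "concat":    # concatenation along the feature axis
        return np.concatenate([h_cnn, h_gcn], axis=1)
    raise ValueError(f"unknown fusion mode: {mode}")

h_cnn = np.ones((4, 8))         # toy CNN features for a batch of 4 samples
h_gcn = np.full((4, 8), 2.0)    # toy GCN features for the same samples
fused_add = fuse(h_cnn, h_gcn, "add")        # shape (4, 8)
fused_mul = fuse(h_cnn, h_gcn, "mul")        # shape (4, 8)
fused_cat = fuse(h_cnn, h_gcn, "concat")     # shape (4, 16)
```

Because both branches process the same mini-batch, their outputs line up sample by sample, so the fused features can feed a shared classifier and the whole network can be trained end to end.
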
[ Figure 4 ] miniGCNs architecture

3. Experiments