PyTorch Tutorial 9 - Multi-GPU Examples

import torch
import torch.nn as nn


class DataParallelModel(nn.Module):

    def __init__(self):
        super().__init__()
        self.block1 = nn.Linear(10, 20)

        # wrap block2 in DataParallel
        self.block2 = nn.Linear(20, 20)
        self.block2 = nn.DataParallel(self.block2)

        self.block3 = nn.Linear(20, 20)

    def forward(self, x):
        x = self.block1(x)
        x = self.block2(x)
        x = self.block3(x)
        return x

The code above needs no changes at all to run on the CPU.
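As a quick check of that claim, the sketch below rebuilds the same three-block model and runs a single forward pass on whatever device is present. The device selection and the batch size of 8 are assumptions added for illustration. On a machine without CUDA, nn.DataParallel should simply call the wrapped module directly.

```python
import torch
import torch.nn as nn


class DataParallelModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.block1 = nn.Linear(10, 20)
        # Only block2 is wrapped; with no GPU the wrapper just calls the module
        self.block2 = nn.DataParallel(nn.Linear(20, 20))
        self.block3 = nn.Linear(20, 20)

    def forward(self, x):
        x = self.block1(x)
        x = self.block2(x)
        return self.block3(x)


# Assumption: use the first GPU when available, otherwise the CPU
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")
model = DataParallelModel().to(device)

batch = torch.randn(8, 10, device=device)  # 8 samples with 10 features each
output = model(batch)
print(output.shape, output.device)
```

Moving the whole model to the first GPU matters on multi-GPU machines, because DataParallel expects the wrapped module's parameters to live on its first device.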

The documentation for DataParallel can be found at the following link.

https://pytorch.org/docs/stable/nn.html#dataparallel-layers-multi-gpu-distributed

DataParallel is built on the following primitives.

In general, PyTorch's nn.parallel primitives can be used independently. The PyTorch developers have implemented simple MPI-like primitives:

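replicate : replicates a module on multiple devices
scatter : distributes the input along the first dimension
gather : gathers and concatenates the input along the first dimension
parallel_apply : applies a set of already-distributed inputs to a set of already-distributed models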
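To get a feel for what each primitive does without needing several GPUs, here is a single-device analogy. It does not call the real nn.parallel functions: copy.deepcopy stands in for replicate, torch.chunk for scatter, a plain loop for parallel_apply, and torch.cat for gather. The function name single_device_data_parallel and the num_replicas parameter are made up for this sketch.

```python
import copy

import torch
import torch.nn as nn


def single_device_data_parallel(module, batch, num_replicas):
    # "replicate": independent copies of the module (all on one device here)
    replicas = [copy.deepcopy(module) for _ in range(num_replicas)]
    # "scatter": split the batch along the first dimension
    chunks = torch.chunk(batch, num_replicas, dim=0)
    # A small batch can yield fewer chunks than replicas
    replicas = replicas[:len(chunks)]
    # "parallel_apply": each replica handles its own chunk (sequentially here)
    outputs = [replica(chunk) for replica, chunk in zip(replicas, chunks)]
    # "gather": concatenate the partial results along the first dimension
    return torch.cat(outputs, dim=0)


torch.manual_seed(0)
layer = nn.Linear(10, 5)
batch = torch.randn(8, 10)

result = single_device_data_parallel(layer, batch, num_replicas=3)
print(result.shape)
```

Because every replica carries the same weights, the combined result should match running the original layer on the full batch, up to floating-point rounding.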

To make this clearer, here is a data_parallel function composed from these primitives.

- Code

def data_parallel(module, input, device_ids, output_device=None):
    if not device_ids:
        return module(input)

    if output_device is None:
        output_device = device_ids[0]

    replicas = nn.parallel.replicate(module, device_ids)
    inputs = nn.parallel.scatter(input, device_ids)
    replicas = replicas[:len(inputs)]
    outputs = nn.parallel.parallel_apply(replicas, inputs)
    return nn.parallel.gather(outputs, output_device)
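A possible way to call this function is sketched below. The function is repeated so the snippet runs on its own. The device list, the layer sizes, and the batch size are assumptions for illustration. With no GPU visible, the device list is empty and the call falls through to a plain module(input).

```python
import torch
import torch.nn as nn


def data_parallel(module, input, device_ids, output_device=None):
    # Same function as above, repeated so this snippet is self-contained
    if not device_ids:
        return module(input)
    if output_device is None:
        output_device = device_ids[0]
    replicas = nn.parallel.replicate(module, device_ids)
    inputs = nn.parallel.scatter(input, device_ids)
    replicas = replicas[:len(inputs)]
    outputs = nn.parallel.parallel_apply(replicas, inputs)
    return nn.parallel.gather(outputs, output_device)


# Assumption: use every visible GPU; the list is empty on a CPU-only machine
device_ids = list(range(torch.cuda.device_count()))
device = torch.device(f"cuda:{device_ids[0]}" if device_ids else "cpu")

# replicate expects the module's parameters on the first device in device_ids
net = nn.Linear(10, 5).to(device)
batch = torch.randn(6, 10, device=device)

out = data_parallel(net, batch, device_ids)
print(out.shape)
```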

Using the CPU and GPU together (part of the model on the CPU and part on the GPU)

The following code runs a small network in which part of the model runs on the CPU and the rest runs on the GPU.

- Code

device = torch.device("cuda:0")

class DistributedModel(nn.Module):

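    def __init__(self):
        # nn.Module.__init__ takes no keyword arguments, so the submodules
        # are assigned as attributes after calling it
        super().__init__()
        self.embedding = nn.Embedding(1000, 10)  # stays on the CPU
        self.rnn = nn.Linear(10, 10).to(device)  # lives on the GPU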

    def forward(self, x):
        # Compute embedding on CPU
        x = self.embedding(x)

        # Transfer to GPU
        x = x.to(device)

        # Compute RNN on GPU
        x = self.rnn(x)
        return x
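To try the idea out, the sketch below builds a model of the same shape and pushes a batch of integer indices through it. The class name SplitDeviceModel, the CPU fallback, and the input shape (4, 7) are assumptions added for illustration, so the snippet also runs on a machine without CUDA.

```python
import torch
import torch.nn as nn

# Assumption: fall back to the CPU so the sketch also runs without CUDA
device = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")


class SplitDeviceModel(nn.Module):  # hypothetical name for this sketch
    def __init__(self):
        super().__init__()
        self.embedding = nn.Embedding(1000, 10)     # left on the CPU
        self.linear = nn.Linear(10, 10).to(device)  # placed on `device`

    def forward(self, x):
        x = self.embedding(x)  # lookup happens on the CPU
        x = x.to(device)       # move the activations to the second device
        return self.linear(x)


model = SplitDeviceModel()
token_ids = torch.randint(0, 1000, (4, 7))  # integer indices, kept on the CPU
output = model(token_ids)
print(output.shape, output.device)
```

Note that calling model.to(device) on the whole model would move the embedding as well and undo the split, so only the submodule meant for the GPU is moved.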

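This post was a short introduction to PyTorch for people who want to build something real with it. In practice there is much more to learn.

For a more comprehensive introduction that covers the optim package, data loaders, and more, see the following.

- English: http://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

- Korean: PyTorch tutorial posts 1-5 on this blog

The following resources are also worth a look.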

Train neural networks to play video games

http://pytorch.org/tutorials/intermediate/reinforcement_q_learning.html

Train a state-of-the-art ResNet network on ImageNet

https://github.com/pytorch/examples/tree/master/imagenet

Train a face generator using Generative Adversarial Networks

https://github.com/pytorch/examples/tree/master/dcgan

Train a word-level language model using recurrent LSTM networks

https://github.com/pytorch/examples/tree/master/word_language_model

More examples

https://github.com/pytorch/examples/tree/master/word_language_model

More tutorials

https://github.com/pytorch/examples

PyTorch discussion forum

https://discuss.pytorch.org/

Chat with other users on Slack

https://pytorch.slack.com/?redir=%2Fmessages%2Fbeginner

License: Attribution-NonCommercial-ShareAlike