Introduction to Building and Training PyTorch Models – PyTorch Training Steps & Tips¶
from google.colab import drive
drive.mount('/content/drive')
Mounted at /content/drive
import os
os.chdir("/content/drive/MyDrive/0. codepool_python/deep_learning_hylee2022/ch0_pytorch_tutorial")
#@markdown **Download the necessary files here!**
%%bash
COLAB_ICON="${COLAB_ICON}https://miro.medium.com/max/200/"
COLAB_ICON="${COLAB_ICON}1*i_ncmAcN81MRMNRDcenKiw.png"
wget -q -nc -O Colab_icon.png $COLAB_ICON
echo "Hello! I am the data~. :P" > filename.txt
echo "Col0,Col1,Col2,Col3" > data.csv
echo "Row1,data11,data12,data13" >> data.csv
echo "Row2,data21,data22,data23" >> data.csv
echo "Row3,data31,data32,data33" >> data.csv
echo "Row4,data41,data42,data43" >> data.csv
echo "Row5,data51,data52,data53" >> data.csv
echo "Row6,data61,data62,data63" >> data.csv
echo "Row7,data71,data72,data73" >> data.csv
printf "%s" "Row8,data81,data82,data83" >> data.csv
gdown --id '19CzXudqN58R3D-1G8KeFWk8UDQwlb8is' \
    --output food-11.zip  # download the dataset
unzip food-11.zip > unziplog  # unzip it
rm -f unziplog
wget -q -N https://download.pytorch.org/tutorial/faces.zip
if [ ! -d data ]; then mkdir data; fi
unzip -q -o faces.zip -d data > unziplog
rm -f faces.zip
rm -f unziplog
/usr/local/lib/python3.7/dist-packages/gdown/cli.py:131: FutureWarning: Option `--id` was deprecated in version 4.3.1 and will be removed in 5.0. You don't need to pass it anymore to use a file ID.
category=FutureWarning,
Downloading...
From: https://drive.google.com/uc?id=19CzXudqN58R3D-1G8KeFWk8UDQwlb8is
To: /content/drive/MyDrive/0. codepool_python/deep_learning_hylee2022/ch0_pytorch_tutorial/food-11.zip
100%|██████████| 1.16G/1.16G [00:07<00:00, 148MB/s]
Loading the Required Packages and Modules – Libraries
import os, sys
import time, json, csv
from glob import glob
import numpy as np
import pandas as pd
# %% Deep learning related
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset
# %% Visualization / plotting
# macOS needs some extra care...
from platform import system
if system() == "Darwin":
    import matplotlib
    matplotlib.use('TkAgg')
import matplotlib.pyplot as plt
# %% Image processing / CV related
import cv2
from PIL import Image
import torchvision
import torchvision.transforms as transforms
# %% Text processing / NLP related
from gensim.models import word2vec
# %% Audio processing / Speech related
# import torchaudio
# import librosa
# %% Handy progress bar and pretty-print tools
from tqdm import tqdm
from pprint import pprint
Data Preprocessing¶
First, we need to organize our data into a form that the model can process.
Reading Files¶
Text Files¶
This is the simplest case: read the file just as you would in any other Python program.
with open("filename.txt", 'r') as f:
    data = f.read()
print(data)
Hello! I am the data~. :P
CSV Files¶
with open("data.csv") as f:
    csv_data = f.read()
print("Here comes a csv data:",
      '=' * 60, csv_data, '=' * 60, sep='\n')
Here comes a csv data:
============================================================
Col0,Col1,Col2,Col3
Row1,data11,data12,data13
Row2,data21,data22,data23
Row3,data31,data32,data33
Row4,data41,data42,data43
Row5,data51,data52,data53
Row6,data61,data62,data63
Row7,data71,data72,data73
Row8,data81,data82,data83
============================================================
Pure Python¶
import csv
with open("data.csv", 'r') as f:
    csv_reader = csv.reader(f, delimiter=',')
    # If you have a "tsv", do this:
    ## `csv_reader = csv.reader(f, delimiter='\t')`
    csv_data1 = [row for row in csv_reader]
csv_data1
[['Col0', 'Col1', 'Col2', 'Col3'],
['Row1', 'data11', 'data12', 'data13'],
['Row2', 'data21', 'data22', 'data23'],
['Row3', 'data31', 'data32', 'data33'],
['Row4', 'data41', 'data42', 'data43'],
['Row5', 'data51', 'data52', 'data53'],
['Row6', 'data61', 'data62', 'data63'],
['Row7', 'data71', 'data72', 'data73'],
['Row8', 'data81', 'data82', 'data83']]
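When the first row holds column names, the standard library's `csv.DictReader` can map each row to a dictionary keyed by those names. The sketch below is only an illustration: it reads from an in-memory string (`sample_csv`, a hypothetical stand-in for the first rows of data.csv) so that it runs on its own, and `read_rows_as_dicts` is a helper name invented for this example.

```python
import csv
import io

# Hypothetical in-memory stand-in for the first rows of data.csv.
sample_csv = (
    "Col0,Col1,Col2,Col3\n"
    "Row1,data11,data12,data13\n"
    "Row2,data21,data22,data23\n"
)

def read_rows_as_dicts(text):
    """Parse CSV text into a list of dicts keyed by the header row."""
    reader = csv.DictReader(io.StringIO(text))
    return [dict(row) for row in reader]

rows = read_rows_as_dicts(sample_csv)
print(rows[0]["Col1"])  # value of column "Col1" in the first data row
print(len(rows))        # number of data rows (the header is not counted)
```

Compared with `csv.reader`, this avoids having to remember column positions, at the cost of building one dictionary per row.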
Pandas Library (Faster!)¶
csv_data2 = pd.read_csv("data.csv")
# Saved as a Pandas dataframe
csv_data2
| | Col0 | Col1 | Col2 | Col3 |
|---|---|---|---|---|
| 0 | Row1 | data11 | data12 | data13 |
| 1 | Row2 | data21 | data22 | data23 |
| 2 | Row3 | data31 | data32 | data33 |
| 3 | Row4 | data41 | data42 | data43 |
| 4 | Row5 | data51 | data52 | data53 |
| 5 | Row6 | data61 | data62 | data63 |
| 6 | Row7 | data71 | data72 | data73 |
| 7 | Row8 | data81 | data82 | data83 |
data_columns = csv_data2.columns
data_columns.values # `.values` to numpy arrays
array(['Col0', 'Col1', 'Col2', 'Col3'], dtype=object)
try:                    # pandas >= 0.24.0
    data_content = csv_data2.to_numpy()
except AttributeError:  # pandas < 0.24.0 has no `to_numpy`
    data_content = csv_data2.values
data_content
array([['Row1', 'data11', 'data12', 'data13'],
['Row2', 'data21', 'data22', 'data23'],
['Row3', 'data31', 'data32', 'data33'],
['Row4', 'data41', 'data42', 'data43'],
['Row5', 'data51', 'data52', 'data53'],
['Row6', 'data61', 'data62', 'data63'],
['Row7', 'data71', 'data72', 'data73'],
['Row8', 'data81', 'data82', 'data83']], dtype=object)
Image Files¶
#@title **Run me to view the image!**
from IPython.display import Image as ImageColab
image = cv2.imread("Colab_icon.png")
im = Image.fromarray(image[..., ::-1])
im.save("Colab_icon.png")
ImageColab('Colab_icon.png')
# image = cv2.imread("image1.png")
image = cv2.imread("Colab_icon.png")
image.shape
(200, 200, 3)
Data Manipulation¶
Text Data¶
text = "Hello, world!\nI want to try tabs.\tLike this!"
text_splitted = text.split('\n')
text_splitted = text.splitlines()
# List comprehension
text_splitted = [line.split('\t')
                 for line in text.split('\n')]
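Since the cell above prints nothing, it may help to see what the splits produce. The following sketch reuses the same `text`; the names `lines` and `cells` are introduced only for this illustration.

```python
text = "Hello, world!\nI want to try tabs.\tLike this!"

lines = text.splitlines()                     # split on line breaks
cells = [line.split('\t') for line in lines]  # then split each line on tabs

print(lines)
print(cells)
```

The first line contains no tab, so it becomes a one-element list, while the second line is split into two fields.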
NumPy Array Data¶
arr1 = np.array([[1, 2], [3, 4], [5, 6]])
arr2 = np.array([[9, 8], [7, 6], [5, 4]])
arr6 = np.random.randint(0, 100, (3, 5, 4))
print(">>> arr1 is ..."); print(arr1)
print(">>> arr2 is ..."); print(arr2)
arr3 = np.concatenate((arr1, arr2), axis=0)
arr4 = np.concatenate((arr1, arr2), axis=1)
arr5 = np.transpose(arr1)
arr7 = np.transpose(arr6, axes=(0, 2, 1))
lst = [3, 76, 4, -45, 0, 6]
arr8 = np.array(lst)
print(f">>> The shape of arr8 is {arr8.shape}.")
arr9 = arr8.astype(np.float64)  # `np.float` is deprecated since NumPy 1.20 (see the warning in the output) and removed in 1.24
print(">>> arr9 is ..."); print(arr9)
lst = [[9, 8], [7, 6], [5, 4]]
arr10 = np.array(lst, dtype="uint8")
arr11 = arr10.astype(np.float32)
print(f">>> The type of `lst` is {type(lst)},")
print(f" but the type of `arr10` is {type(arr10)}.")
new_shape = (1, -1, 3)
arr10_reshaped = arr10.reshape(new_shape)
print(">>> arr10_reshaped is ...")
print(arr10_reshaped)
arr10_transposed = arr10.T
print(">>> arr10_transposed is ...")
print(arr10_transposed)
>>> arr1 is ...
[[1 2]
[3 4]
[5 6]]
>>> arr2 is ...
[[9 8]
[7 6]
[5 4]]
>>> The shape of arr8 is (6,).
>>> arr9 is ...
[ 3. 76. 4. -45. 0. 6.]
>>> The type of `lst` is <class 'list'>,
but the type of `arr10` is <class 'numpy.ndarray'>.
>>> arr10_reshaped is ...
[[[9 8 7]
[6 5 4]]]
>>> arr10_transposed is ...
[[9 7 5]
[8 6 4]]
/usr/local/lib/python3.7/dist-packages/ipykernel_launcher.py:17: DeprecationWarning: `np.float` is a deprecated alias for the builtin `float`. To silence this warning, use `float` by itself. Doing this will not modify any behavior and is safe. If you specifically wanted the numpy scalar type, use `np.float64` here.
Deprecated in NumPy 1.20; for more details and guidance: https://numpy.org/devdocs/release/1.20.0-notes.html#deprecations
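The cell above builds `arr3`, `arr4`, `arr5`, and `arr7` without printing them. As a rough check of what `np.concatenate` and `np.transpose` do to the shapes, the sketch below recomputes them; the `shapes` dictionary is an illustrative helper, and `arr6` is filled with zeros here because only its shape matters.

```python
import numpy as np

arr1 = np.array([[1, 2], [3, 4], [5, 6]])
arr2 = np.array([[9, 8], [7, 6], [5, 4]])
arr6 = np.zeros((3, 5, 4))

shapes = {
    "concat axis=0": np.concatenate((arr1, arr2), axis=0).shape,  # rows are stacked
    "concat axis=1": np.concatenate((arr1, arr2), axis=1).shape,  # columns are stacked
    "transpose 2-D": np.transpose(arr1).shape,                    # axes reversed
    "transpose axes=(0, 2, 1)": np.transpose(arr6, axes=(0, 2, 1)).shape,  # last two axes swapped
}
for name, shape in shapes.items():
    print(f"{name}: {shape}")
```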
Image Data¶
Resize the original image.
img_shape = (100, 100)
images = [cv2.resize(image, img_shape)]
zeros_reserved = \
np.zeros((len(images), *img_shape),
dtype=np.uint8)
print(images[0].shape)
(100, 100, 3)
#@markdown View the resized image!
im = Image.fromarray(images[0][..., ::-1])
im.save("your_file.png")
ImageColab("your_file.png")
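Common Image Transforms – Data Augmentations¶
import torchvision.transforms as transforms
# Basic transformations
train_transform = transforms.Compose([
    transforms.ToPILImage(),  # np.array --> PIL_image
    transforms.ToTensor()     # PIL_image --> Tensor
])
Some lines below are duplicated and commented out because of differences between torchvision versions. Note that only one variant should work in any given version.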
# For data augmentation
DEGREE = 15  # maximum rotation angle in degrees; an illustrative value, since the original leaves DEGREE undefined
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.RandomHorizontalFlip(),
    transforms.RandomRotation(DEGREE),
    # transforms.RandomRotation(DEGREE, fill=(0,)),
    # transforms.RandomRotation(DEGREE,
    #     resample=False, expand=False, center=None),
    transforms.ToTensor()
])
# Feature scaling;
# `mean` and `std` are assumed to be provided as np.array
train_transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.ToTensor(),
    transforms.Normalize(
        [mean], [std], inplace=False)
    # transforms.Normalize([mean], [std])
])
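`transforms.Normalize` subtracts a per-channel mean and divides by a per-channel standard deviation. The sketch below mimics that arithmetic in plain NumPy so that the origin of `mean` and `std` is visible; `fake_images`, `channel_mean`, and `channel_std` are hypothetical names, and in practice the statistics would be computed on the training set only.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical batch of 8 RGB images in (N, C, H, W) layout, values in [0, 1).
fake_images = rng.random((8, 3, 16, 16)).astype(np.float32)

# Per-channel statistics over the batch axis and both spatial axes.
channel_mean = fake_images.mean(axis=(0, 2, 3))
channel_std = fake_images.std(axis=(0, 2, 3))

# Broadcast the (C,) statistics against the (N, C, H, W) images.
normalized = (fake_images - channel_mean[None, :, None, None]) \
    / channel_std[None, :, None, None]

print(normalized.mean(axis=(0, 2, 3)))  # close to 0 for every channel
print(normalized.std(axis=(0, 2, 3)))   # close to 1 for every channel
```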
Word Vectors – Word2Vec¶
# https://en.wikipedia.org/wiki/Machine_learning
article = '''Machine learning (ML) is the scientific
study of algorithms and statistical models that
computer systems use to perform a specific task
without using explicit instructions, relying on
patterns and inference instead. It is seen as a
subset of artificial intelligence. Machine learning
algorithms build a mathematical model based on sample
data, known as "training data", in order to make
predictions or decisions without being explicitly
programmed to perform the task. Machine learning
algorithms are used in a wide variety of applications,
such as email filtering and computer vision, where it
is difficult or infeasible to develop a conventional
algorithm for effectively performing the task.
'''[:-1].replace('\n', ' ')  # join lines with a space so that words at line ends do not fuse
for punctuation in ",()\"":
    article = article.replace(punctuation, '')
tokenized_sentences = []
for sentence in article.split('.'):
    sentence = sentence.strip()
    if sentence == '': continue
    sentence = sentence[0].lower() + sentence[1:]
    tokenized_sentences.append(sentence.split())
# This cell may take a while to run!
w2vmodel = word2vec.Word2Vec(
    tokenized_sentences,
    size=100,  # dimensionality of the word embeddings
    window=5, min_count=1,
    workers=12, iter=5)
# Note: `size` and `iter` are the gensim 3.x names; gensim >= 4.0 renamed them to `vector_size` and `epochs`.
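Dataset / DataLoader Preparation¶
When handling training data, preprocessing the data types and splitting the data into batches are tedious tasks.
PyTorch provides Dataset and DataLoader utilities that package the data for training, and a Dataset can be customized as needed.
In short, a Dataset handles packaging and preprocessing (for example, reading files automatically from a given data path);
a DataLoader iterates over the whole dataset batch by batch, optionally shuffling it (it gives an iterator that a for loop can read from).
A Dataset must define __len__ (the size of the dataset) and __getitem__ (how to fetch the item at a given index)
(otherwise a NotImplementedError is raised).
In addition, a DataLoader accepts a custom collate_fn that decides how samples are assembled into a batch.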
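To make the role of `collate_fn` more concrete, here is a minimal sketch that pads variable-length sequences so they can be stacked into one batch. `ToySequenceDataset` and `pad_collate` are hypothetical names made up for this example, and padding with zeros via `torch.nn.utils.rnn.pad_sequence` is just one possible batching strategy.

```python
import torch
from torch.utils.data import DataLoader, Dataset
from torch.nn.utils.rnn import pad_sequence

class ToySequenceDataset(Dataset):
    """Hypothetical dataset of variable-length integer sequences."""
    def __init__(self):
        self.samples = [torch.arange(n) for n in (2, 5, 3, 4)]

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        return self.samples[idx]

def pad_collate(batch):
    """Pad every sequence in the batch to the longest one and keep the lengths."""
    lengths = torch.tensor([len(seq) for seq in batch])
    padded = pad_sequence(batch, batch_first=True, padding_value=0)
    return padded, lengths

loader = DataLoader(ToySequenceDataset(), batch_size=2,
                    shuffle=False, collate_fn=pad_collate)
batches = list(loader)
for padded, lengths in batches:
    print(padded.shape, lengths.tolist())
```

Without a custom `collate_fn`, the default collation would try to stack tensors of different lengths and fail, which is why sequence data usually needs something along these lines.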
Writing Your Own Dataset Class¶
X = np.random.rand(1000, 100, 100, 1)    # 1000 made-up 100 x 100 single-channel images
Y = np.random.randint(0, 7, [1000, 10])  # 1000 made-up labels
class RandomDataset(Dataset):
    def __init__(self, data, target):  # store the data in the class object
        self.data = data
        self.target = target
    def __len__(self):
        assert len(self.data) == len(self.target)  # make sure data and targets correspond
        return len(self.data)
    def __getitem__(self, idx):  # define how a single item is fetched
        return self.data[idx], self.target[idx]
randomdataset = RandomDataset(X.astype(np.float32), Y.astype(np.float32))
taken_x, taken_y = randomdataset[0]  # this should return the first item
taken_x.shape, taken_y.shape
((100, 100, 1), (10,))
# Wrap the dataset in a DataLoader
randomdataloader = DataLoader(
randomdataset, batch_size=4,
shuffle=True, num_workers=4)
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:566: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
# Run one loop iteration to check that the batch is correct
for batch_x, batch_y in randomdataloader:
    print((batch_x.shape, batch_y.shape))
    break
(torch.Size([4, 100, 100, 1]), torch.Size([4, 10]))
Using TensorDataset Directly¶
from torch.utils.data import TensorDataset
# Convert the data to tensors
tsrX, tsrY = torch.tensor(X), torch.tensor(Y)
# Then it takes just one line!
tsrdataset = TensorDataset(tsrX, tsrY)
# The DataLoader part is as simple as before
tsrdataloader = DataLoader(
    tsrdataset, batch_size=4,
    shuffle=True, num_workers=4)
# Run one loop iteration to check that the batch is correct
for batch_x, batch_y in tsrdataloader:
    print((batch_x.shape, batch_y.shape))
    break
(torch.Size([4, 100, 100, 1]), torch.Size([4, 10]))
In practice, data processing can get far more involved; data preprocessing in machine learning is a discipline in its own right!
Loading a Built-in Dataset – MNIST¶
from torchvision import datasets, transforms
train_set = datasets.MNIST('../data', train=True, download=True,
                           transform=transforms.Compose([
                               transforms.ToTensor(),
                               transforms.Normalize((0.1307,), (0.3081,))
                           ]))
train_loader = DataLoader(train_set, batch_size=32, shuffle=True)
test_set = datasets.MNIST('../data', train=False,
                          transform=transforms.Compose([
                              transforms.ToTensor(),
                              transforms.Normalize((0.1307,), (0.3081,))
                          ]))
test_loader = DataLoader(test_set, batch_size=32, shuffle=False)  # test data does not need shuffling
for batch_x, batch_y in train_loader:
    print((batch_x.shape, batch_y.shape))
    break
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-images-idx3-ubyte.gz to ../data/MNIST/raw/train-images-idx3-ubyte.gz
Extracting ../data/MNIST/raw/train-images-idx3-ubyte.gz to ../data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/train-labels-idx1-ubyte.gz to ../data/MNIST/raw/train-labels-idx1-ubyte.gz
Extracting ../data/MNIST/raw/train-labels-idx1-ubyte.gz to ../data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw/t10k-images-idx3-ubyte.gz
Extracting ../data/MNIST/raw/t10k-images-idx3-ubyte.gz to ../data/MNIST/raw
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz
Downloading http://yann.lecun.com/exdb/mnist/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz
Extracting ../data/MNIST/raw/t10k-labels-idx1-ubyte.gz to ../data/MNIST/raw
(torch.Size([32, 1, 28, 28]), torch.Size([32]))
Model Construction – torch.nn¶
The usual starting point is these two lines:
import torch.nn as nn
import torch.nn.functional as F
Model – nn.Module¶
This is the basic module for PyTorch Neural network models. To build an NN model, inherit from it.
class MyNNModel(nn.Module):
    def __init__(self):
        super(MyNNModel, self).__init__()
        # define layers and other components here...
Wrapper – nn.Sequential¶
PyTorch provides a convenient layer wrapper nn.Sequential for us.
We can wrap several layers together and reuse the resulting block many times.
nn.Sequential(*layers)
# Let us have 3 layers
layer1 = nn.Linear(100, 20)
layer2 = nn.Linear(20, 16)
layer3 = nn.Linear(16, 7)
# Data format:
# - Input: 100 x 100
# - Output: 100 x 7
input_data = torch.randn(100, 100)
output_data = torch.randn(100, 7)
print("Before using `nn.Sequential`...")
# Originally, we need to write this.
print(f"The input tensor shape: {input_data.shape}")
out = layer1(input_data)
out = layer2(out)
result = layer3(out)
print(f"The output tensor shape: {result.shape}\n")
Before using `nn.Sequential`...
The input tensor shape: torch.Size([100, 100])
The output tensor shape: torch.Size([100, 7])
# If we wrap them together,
## we can just view the layers as a block.
print("After using `nn.Sequential`...")
print(f"The input tensor shape: {input_data.shape}")
layer_block = nn.Sequential(
    layer1, layer2, layer3
)
result = layer_block(input_data)
print(f"The output tensor shape: {result.shape}")
After using `nn.Sequential`...
The input tensor shape: torch.Size([100, 100])
The output tensor shape: torch.Size([100, 7])
Model Layers¶
NN¶
nn.Linear – the commonly used fully connected layer
nn.Linear(in_dim, out_dim)
batch_size = 32
"""nn.Linear(in_dim, out_dim)"""
fake_data = torch.randn(batch_size, 128)
print(f"The input data shape: {fake_data.shape}")
Linear_layer = nn.Linear(128, 32)
print( "The output data shape: "\
f"{Linear_layer(fake_data).shape}")
The input data shape: torch.Size([32, 128])
The output data shape: torch.Size([32, 32])
CNN¶
Convolution layers¶
nn.Conv2d – Basic 2D Convolutional Layer
nn.Conv2d(in_channels, out_channels, kernel_size, stride=1, padding=0)
Input shape: \((N, C_{in}, H_{in}, W_{in})\)
Output shape: \((N, C_{out}, H_{out}, W_{out})\) where
$$H_{out}=\left\lfloor \cfrac{H_{in}+ 2 \times \text{padding}[0]-\text{dilation}[0]\times(\text{kernel\_size}[0]-1)-1}{\text{stride}[0]} \right\rfloor+1$$
$$W_{out}=\left\lfloor \cfrac{W_{in}+ 2 \times \text{padding}[1]-\text{dilation}[1]\times(\text{kernel\_size}[1]-1)-1}{\text{stride}[1]} \right\rfloor+1$$
Each of padding, dilation, kernel_size, and stride may be an integer or a tuple. If an integer is passed, the [0] and [1] entries in the formula both take that same value.
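The output-size formula can be checked without building any layer. The helper below, `conv2d_output_size` (a name made up for this sketch), evaluates the formula for one spatial axis using integer floor division; the four settings are the ones tried on a 100 x 100 input in the cells that follow.

```python
def conv2d_output_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial axis, following the nn.Conv2d shape formula."""
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

# (kernel_size, stride, padding) settings applied to a 100 x 100 input
settings = [(7, 1, 0), (9, 3, 0), (3, 1, 1), (6, 2, 3)]
sizes = [conv2d_output_size(100, k, stride=s, padding=p) for k, s, p in settings]
for (k, s, p), size in zip(settings, sizes):
    print(f"kernel_size={k}, stride={s}, padding={p} -> {size}")
```

The same arithmetic applies to `nn.MaxPool2d`, except that its `stride` defaults to `kernel_size` rather than 1.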
fake_data = torch.randn(batch_size, 3, 100, 100)
print(f"The input data shape: {fake_data.shape}")
The input data shape: torch.Size([32, 3, 100, 100])
#@title Deciding channels
input_channels = 3#@param {type:"integer"}
output_channels = 128#@param {type:"integer"}
#@title Only `kernel_size`
kernel_size = 7#@param {type:"integer"}
Conv_layer1 = nn.Conv2d(input_channels,
output_channels,
kernel_size)
output_res1 = Conv_layer1(fake_data)
print("############### Try the following code... ###############\n")
print(f"Conv_layer1 = nn.Conv2d({input_channels}, "\
f"{output_channels}, "\
f"{kernel_size})")
print("output_res1 = Conv_layer1(fake_data)")
print("print(f\"The output data shape: {output_res1.shape}\")")
print("\n#########################################################\n")
print("Output `H_out` =",
      f"⌊(100 + 2 × 0 - 1 × ({kernel_size} - 1) - 1) / 1⌋ + 1 =",
      (100 + 2 * 0 - 1 * (kernel_size - 1) - 1) // 1 + 1)
print(f"The output data shape: {output_res1.shape}")
############### Try the following code... ###############
Conv_layer1 = nn.Conv2d(3, 128, 7)
output_res1 = Conv_layer1(fake_data)
print(f"The output data shape: {output_res1.shape}")
#########################################################
Output `H_out` = ⌊(100 + 2 × 0 - 1 × (7 - 1) - 1) / 1⌋ + 1 = 94
The output data shape: torch.Size([32, 128, 94, 94])
Conv_layer1 = nn.Conv2d(3, 128, 7)
output_res1 = Conv_layer1(fake_data)
print(f"The output data shape: {output_res1.shape}")
The output data shape: torch.Size([32, 128, 94, 94])
#@title `kernel_size` and `stride`
kernel_size = 9#@param {type:"integer"}
stride = 3#@param {type:"integer"}
Conv_layer2 = nn.Conv2d(input_channels,
output_channels,
kernel_size,
stride)
output_res2 = Conv_layer2(fake_data)
print("############### Try the following code... ###############\n")
print(f"Conv_layer2 = nn.Conv2d({input_channels}, "\
f"{output_channels}, "\
f"{kernel_size}, {stride})")
print("output_res2 = Conv_layer2(fake_data)")
print("print(f\"The output data shape: {output_res2.shape}\")")
print("\n#########################################################\n")
print("Output `H_out` =",
      f"⌊(100 + 2 × 0 - 1 × ({kernel_size} - 1) - 1) / {stride}⌋ + 1 =",
      (100 + 2 * 0 - 1 * (kernel_size - 1) - 1) // stride + 1)
print(f"The output data shape: {output_res2.shape}")
############### Try the following code... ###############
Conv_layer2 = nn.Conv2d(3, 128, 9, 3)
output_res2 = Conv_layer2(fake_data)
print(f"The output data shape: {output_res2.shape}")
#########################################################
Output `H_out` = ⌊(100 + 2 × 0 - 1 × (9 - 1) - 1) / 3⌋ + 1 = 31
The output data shape: torch.Size([32, 128, 31, 31])
Conv_layer2 = nn.Conv2d(3, 128, 9, 3)
output_res2 = Conv_layer2(fake_data)
print(f"The output data shape: {output_res2.shape}")
The output data shape: torch.Size([32, 128, 31, 31])
#@title `kernel_size` and `padding`
kernel_size = 3#@param {type:"integer"}
padding = 1#@param {type:"integer"}
Conv_layer3 = nn.Conv2d(input_channels,
output_channels,
kernel_size,
padding=padding)
output_res3 = Conv_layer3(fake_data)
print("############### Try the following code... ###############\n")
print(f"Conv_layer3 = nn.Conv2d({input_channels}, "\
f"{output_channels}, "\
f"{kernel_size}, padding={padding})")
print("output_res3 = Conv_layer3(fake_data)")
print("print(f\"The output data shape: {output_res3.shape}\")")
print("\n#########################################################\n")
print("Output `H_out` =",
      f"⌊(100 + 2 × {padding} - 1 × ({kernel_size} - 1) - 1) / 1⌋ + 1 =",
      (100 + 2 * padding - 1 * (kernel_size - 1) - 1) // 1 + 1)
print(f"The output data shape: {output_res3.shape}")
############### Try the following code... ###############
Conv_layer3 = nn.Conv2d(3, 128, 3, padding=1)
output_res3 = Conv_layer3(fake_data)
print(f"The output data shape: {output_res3.shape}")
#########################################################
Output `H_out` = ⌊(100 + 2 × 1 - 1 × (3 - 1) - 1) / 1⌋ + 1 = 100
The output data shape: torch.Size([32, 128, 100, 100])
Conv_layer3 = nn.Conv2d(3, 128, 3, padding=1)
output_res3 = Conv_layer3(fake_data)
print(f"The output data shape: {output_res3.shape}")
The output data shape: torch.Size([32, 128, 100, 100])
#@title `kernel_size`, `stride` and `padding`
kernel_size = 6#@param {type:"integer"}
stride = 2#@param {type:"integer"}
padding = 3#@param {type:"integer"}
Conv_layer4 = nn.Conv2d(input_channels,
output_channels,
kernel_size,
stride,
padding)
output_res4 = Conv_layer4(fake_data)
print("############### Try the following code... ###############\n")
print(f"Conv_layer4 = nn.Conv2d({input_channels}, "\
f"{output_channels}, "\
f"{kernel_size}, {stride}, {padding})")
print("output_res4 = Conv_layer4(fake_data)")
print("print(f\"The output data shape: {output_res4.shape}\")")
print("\n#########################################################\n")
print("Output `H_out` =",
      f"⌊(100 + 2 × {padding} - 1 × ({kernel_size} - 1) - 1) / {stride}⌋ + 1 =",
      (100 + 2 * padding - 1 * (kernel_size - 1) - 1) // stride + 1)
print(f"The output data shape: {output_res4.shape}")
############### Try the following code... ###############
Conv_layer4 = nn.Conv2d(3, 128, 6, 2, 3)
output_res4 = Conv_layer4(fake_data)
print(f"The output data shape: {output_res4.shape}")
#########################################################
Output `H_out` = ⌊(100 + 2 × 3 - 1 × (6 - 1) - 1) / 2⌋ + 1 = 51
The output data shape: torch.Size([32, 128, 51, 51])
Conv_layer4 = nn.Conv2d(3, 128, 6, 2, 3)
output_res4 = Conv_layer4(fake_data)
print(f"The output data shape: {output_res4.shape}")
The output data shape: torch.Size([32, 128, 51, 51])
Pooling layers¶
nn.MaxPool2d – Basic 2D Max Pooling Layer
nn.MaxPool2d(kernel_size, stride=None, padding=0, dilation=1)
# stride default: kernel_size
Input shape: \((N, C, H_{in}, W_{in})\)
Output shape: \((N, C, H_{out}, W_{out})\) where
$$H_{out}=\left\lfloor \cfrac{H_{in}+ 2 \times \text{padding}[0]-\text{dilation}[0]\times(\text{kernel\_size}[0]-1)-1}{\text{stride}[0]} \right\rfloor+1$$
$$W_{out}=\left\lfloor \cfrac{W_{in}+ 2 \times \text{padding}[1]-\text{dilation}[1]\times(\text{kernel\_size}[1]-1)-1}{\text{stride}[1]} \right\rfloor+1$$
fake_data = torch.randn(batch_size, 3, 100, 100)
print(f"The input data shape: {fake_data.shape}")
The input data shape: torch.Size([32, 3, 100, 100])
#@title Only `kernel_size` {display-mode:"form"}
kernel_size = 6#@param {type:"integer"}
MaxPool_layer1 = nn.MaxPool2d(kernel_size)
output_res1 = MaxPool_layer1(fake_data)
print("############### Try the following code... ###############\n")
print(f"MaxPool_layer1 = nn.MaxPool2d({kernel_size})")
print("output_res1 = MaxPool_layer1(fake_data)")
print("print(f\"The output data shape: {output_res1.shape}\")")
print("\n#########################################################\n")
print("Output `H_out` =",
      f"⌊(100 + 2 × 0 - 1 × ({kernel_size} - 1) - 1) / {kernel_size}⌋ + 1 =",
      (100 + 2 * 0 - 1 * (kernel_size - 1) - 1) // kernel_size + 1)
print(f"The output data shape: {output_res1.shape}")
############### Try the following code... ###############
MaxPool_layer1 = nn.MaxPool2d(6)
output_res1 = MaxPool_layer1(fake_data)
print(f"The output data shape: {output_res1.shape}")
#########################################################
Output `H_out` = ⌊(100 + 2 × 0 - 1 × (6 - 1) - 1) / 6⌋ + 1 = 16
The output data shape: torch.Size([32, 3, 16, 16])
MaxPool_layer1 = nn.MaxPool2d(6)
output_res1 = MaxPool_layer1(fake_data)
print(f"The output data shape: {output_res1.shape}")
The output data shape: torch.Size([32, 3, 16, 16])
# @title `kernel_size` ≠ `stride` {display-mode:"form"}
kernel_size = 7#@param {type:"integer"}
stride = 9#@param {type:"integer"}
MaxPool_layer2 = nn.MaxPool2d(kernel_size, stride)
output_res2 = MaxPool_layer2(fake_data)
print("############### Try the following code... ###############\n")
print(f"MaxPool_layer2 = nn.MaxPool2d({kernel_size}, {stride})")
print("output_res2 = MaxPool_layer2(fake_data)")
print("print(f\"The output data shape: {output_res2.shape}\")")
print("\n#########################################################\n")
print("Output `H_out` =",
      f"⌊(100 + 2 × 0 - 1 × ({kernel_size} - 1) - 1) / {stride}⌋ + 1 =",
      (100 + 2 * 0 - 1 * (kernel_size - 1) - 1) // stride + 1)
print(f"The output data shape: {output_res2.shape}")
############### Try the following code... ###############
MaxPool_layer2 = nn.MaxPool2d(7, 9)
output_res2 = MaxPool_layer2(fake_data)
print(f"The output data shape: {output_res2.shape}")
#########################################################
Output `H_out` = ⌊(100 + 2 × 0 - 1 × (7 - 1) - 1) / 9⌋ + 1 = 11
The output data shape: torch.Size([32, 3, 11, 11])
MaxPool_layer2 = nn.MaxPool2d(7, 9)
output_res2 = MaxPool_layer2(fake_data)
print(f"The output data shape: {output_res2.shape}")
The output data shape: torch.Size([32, 3, 11, 11])
#@title `kernel_size`, `stride` and `padding`
kernel_size = 5#@param {type:"integer"}
stride = 3#@param {type:"integer"}
padding = 2#@param {type:"integer"}
MaxPool_layer3 = nn.MaxPool2d(kernel_size, stride, padding)
output_res3 = MaxPool_layer3(fake_data)
print("############### Try the following code... ###############\n")
print(f"MaxPool_layer3 = nn.MaxPool2d({kernel_size}, {stride}, {padding})")
print("output_res3 = MaxPool_layer3(fake_data)")
print("print(f\"The output data shape: {output_res3.shape}\")")
print("\n#########################################################\n")
print("Output `H_out` =",
      f"⌊(100 + 2 × {padding} - 1 × ({kernel_size} - 1) - 1) / {stride}⌋ + 1 =",
      (100 + 2 * padding - 1 * (kernel_size - 1) - 1) // stride + 1)
print(f"The output data shape: {output_res3.shape}")
############### Try the following code... ###############
MaxPool_layer3 = nn.MaxPool2d(5, 3, 2)
output_res3 = MaxPool_layer3(fake_data)
print(f"The output data shape: {output_res3.shape}")
#########################################################
Output `H_out` = ⌊(100 + 2 × 2 - 1 × (5 - 1) - 1) / 3⌋ + 1 = 34
The output data shape: torch.Size([32, 3, 34, 34])
MaxPool_layer3 = nn.MaxPool2d(5, 3, 2)
output_res3 = MaxPool_layer3(fake_data)
print(f"The output data shape: {output_res3.shape}")
The output data shape: torch.Size([32, 3, 34, 34])
RNN¶
Embedding layers¶
Embedding_layer = nn.Embedding(10, 3)
fake_data = torch.LongTensor([[1, 2, 4, 5], [4, 3, 2, 9]])
print("""
The input data shape: {}
The output data shape: {}""".format
(fake_data.shape, Embedding_layer(fake_data).shape)[1:])
The input data shape: torch.Size([2, 4])
The output data shape: torch.Size([2, 4, 3])
Loading Word2Vec models¶
Embedding_layer = nn.Embedding(*(w2vmodel.wv.vectors.shape))
Embedding_layer.weight = nn.Parameter(
torch.FloatTensor(w2vmodel.wv.vectors))
fix_embedding = True
Embedding_layer.weight.requires_grad = not fix_embedding
word2index = {word: ind
              for ind, word in enumerate(w2vmodel.wv.index2word)}  # gensim >= 4.0 calls this `index_to_key`
print("Our embedding dimension is {}.".format(w2vmodel.wv.vector_size))
Our embedding dimension is 100.
sent_ori = "The model is used for training."
sent = sent_ori.replace('.', '').lower()
list_of_indices = [word2index[w] for w in sent.split()]
tensor_of_indices = torch.LongTensor(list_of_indices)
print(
"The sentence is:\n \"", sent_ori + '"',
"""\nWe pass {} tokens into the model,
which is treated as a LongTensor with shape "{}".
The embedding layer transformed it to shape "{}".
""".format(len(list_of_indices),
tensor_of_indices.shape,
Embedding_layer(tensor_of_indices).shape))
The sentence is:
" The model is used for training."
We pass 6 tokens into the model,
which is treated as a LongTensor with shape "torch.Size([6])".
The embedding layer transformed it to shape "torch.Size([6, 100])".
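As an alternative to assigning `weight` by hand, PyTorch also offers `nn.Embedding.from_pretrained`, which builds the layer from an existing matrix and freezes it by default. The sketch below is an illustration under the assumption that the pretrained vectors are available as a float32 NumPy array; `pretrained_vectors` is a random, hypothetical stand-in for `w2vmodel.wv.vectors`.

```python
import numpy as np
import torch
import torch.nn as nn

# Hypothetical stand-in for `w2vmodel.wv.vectors`: 12 words, 100 dimensions.
pretrained_vectors = np.random.rand(12, 100).astype(np.float32)

embedding = nn.Embedding.from_pretrained(
    torch.from_numpy(pretrained_vectors),
    freeze=True)  # freeze=True keeps the weights fixed during training

token_ids = torch.LongTensor([0, 3, 7])
embedded = embedding(token_ids)
print(embedded.shape)                  # one 100-dimensional vector per token
print(embedding.weight.requires_grad)  # False, because freeze=True
```

Passing `freeze=False` instead would play the same role as setting `fix_embedding = False` in the cell above.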
RNN layers (LSTM, GRU)¶
LSTM_layer = nn.LSTM(100, 80, 2)
fake_data = torch.randn(5, 3, 100)
h0 = torch.randn(2, 3, 80)
c0 = torch.randn(2, 3, 80)
output, (hn, cn) = LSTM_layer(fake_data, (h0, c0))
print("""Input shape: {}
Output shape: {}""".format(
fake_data.shape, (output.shape, (hn.shape, cn.shape))))
Input shape: torch.Size([5, 3, 100])
Output shape: (torch.Size([5, 3, 80]), (torch.Size([2, 3, 80]), torch.Size([2, 3, 80])))
GRU_layer = nn.GRU(100, 80, 2)
fake_data = torch.randn(5, 3, 100)
h0 = torch.randn(2, 3, 80)
output, hn = GRU_layer(fake_data, h0)
print("""Input shape: {}
Output shape: {}""".format(
fake_data.shape, (output.shape, hn.shape)))
Input shape: torch.Size([5, 3, 100])
Output shape: (torch.Size([5, 3, 80]), torch.Size([2, 3, 80]))
Activation functions¶
You have two choices for your activation functions:
In torch.nn, we have model layer modules.
In torch.nn.functional, we have function implementations of activation functions, loss functions, and so on.
Just to list a few…
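| Activation function | Module (torch.nn) | Function (torch.nn.functional) |
|---|---|---|
| Sigmoid | nn.Sigmoid() | F.sigmoid(x) |
| Softmax | nn.Softmax() | F.softmax(x) |
| ReLU | nn.ReLU() | F.relu(x) |
| LeakyReLU | nn.LeakyReLU() | F.leaky_relu(x) |
| Tanh | nn.Tanh() | F.tanh(x) |
| GELU | nn.GELU() | F.gelu(x) |
| ReLU6 | nn.ReLU6() | F.relu6(x) |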
Python Tips
Functions vs. Objects
What you get from calling nn.Sigmoid() (and others…) is an object initialized from the module.
Hence, if you want to pass a tensor to that “layer object”, you should write this:
# `x` is a tensor.
activation = nn.Sigmoid() # Note that this is a "constructor"!
out = activation(x) # i.e. `out = nn.Sigmoid()(x)` is valid,
# but the object is discarded if you do that.
On the other hand, if you simply want to use functions, do this:
# `x` is a tensor.
out = F.sigmoid(x) # Since `F.sigmoid` is already a "function"!
Most of the time, both are valid; they are simply two coding styles.
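A quick way to convince yourself that the two styles agree is to apply both to the same tensor. The sketch below uses ReLU purely as an example; the variable names are made up for this illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.randn(4, 3)

relu_layer = nn.ReLU()      # module style: build the object once, then call it
out_module = relu_layer(x)
out_function = F.relu(x)    # functional style: call the function directly

print(torch.equal(out_module, out_function))
```

One practical difference is that module objects can be placed inside `nn.Sequential`, whereas the functional form is typically called inside `forward`.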
Loss functions¶
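There are two ways to write a loss: as a class or as a function.
| Loss function | Class (torch.nn) | Function (torch.nn.functional) |
|---|---|---|
| Mean Square Error | nn.MSELoss() | F.mse_loss |
| Cross Entropy (Multi-class) | nn.CrossEntropyLoss() | F.cross_entropy |
| Binary Cross Entropy | nn.BCELoss() | F.binary_cross_entropy |
| Negative Log Likelihood | nn.NLLLoss() | F.nll_loss |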
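The class and function forms of a loss compute the same value. The sketch below compares them for cross entropy and also shows its relation to the negative log likelihood loss; the tensor sizes (5 samples, 7 classes) are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(5, 7)           # 5 samples, 7 classes (illustrative sizes)
targets = torch.randint(0, 7, (5,))  # class indices

loss_class = nn.CrossEntropyLoss()(logits, targets)  # class style
loss_function = F.cross_entropy(logits, targets)     # function style
# Cross entropy on raw logits equals NLL applied to log-softmax outputs.
loss_nll = F.nll_loss(F.log_softmax(logits, dim=1), targets)

print(loss_class.item(), loss_function.item(), loss_nll.item())
```

This is also why a model trained with `nn.CrossEntropyLoss` should output raw logits rather than softmax probabilities.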
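Optimizers¶
An optimizer is the method used to update the parameters (SGD, Adagrad, Adam, and so on). In PyTorch, the gradients are computed by calling backward(); before that, call optim.zero_grad() to clear the old gradients, otherwise PyTorch accumulates them. (In the example below, note the direction in which the model parameters are updated.)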
small_model = nn.Linear(3, 7)
print("Take a look at model params:")
print(small_model.weight)
X, Y = torch.rand(3,), torch.rand(7,)
print("\nGiven input X:")
print(X)
print("\nAnd target Y:")
print(Y)
optim = torch.optim.SGD(
small_model.parameters(), lr=1e-2)
mse_loss = nn.MSELoss()
print("\nThe output Y:")
temp_Y = small_model(X)
print(temp_Y)
print("\nCalculate their MSE Loss = ", end='')
loss = mse_loss(temp_Y, Y)
print(loss.item())
print("\n##### Update a step! #####")
optim.zero_grad()
loss.backward()
optim.step()
print("\nTake a look at \"updated\"model params:")
print(small_model.weight)
updated_params = small_model.weight.data
Take a look at model params:
Parameter containing:
tensor([[ 0.0458, 0.5581, 0.2630],
[-0.0966, -0.5707, -0.4858],
[-0.4740, -0.5006, -0.0467],
[ 0.1894, 0.4758, -0.2796],
[ 0.3748, -0.0029, -0.3990],
[-0.3965, -0.2291, -0.5162],
[ 0.2167, 0.0915, -0.3886]], requires_grad=True)
Given input X:
tensor([0.8153, 0.2770, 0.2055])
And target Y:
tensor([0.2356, 0.1756, 0.9553, 0.8135, 0.6706, 0.7289, 0.2415])
The output Y:
tensor([ 0.3008, -0.7839, 0.0222, 0.2534, 0.7063, -0.1808, -0.2983],
grad_fn=<AddBackward0>)
Calculate their MSE Loss = 0.461369127035141
##### Update a step! #####
Take a look at "updated"model params:
Parameter containing:
tensor([[ 0.0456, 0.5580, 0.2630],
[-0.0944, -0.5700, -0.4852],
[-0.4718, -0.4999, -0.0462],
[ 0.1907, 0.4762, -0.2792],
[ 0.3747, -0.0030, -0.3990],
[-0.3943, -0.2284, -0.5157],
[ 0.2179, 0.0919, -0.3883]], requires_grad=True)
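The accumulation behavior that makes `zero_grad()` necessary can be observed directly. In this sketch (the small `model`, inputs, and variable names are made up for illustration), calling `backward()` twice without clearing doubles the stored gradient, while clearing first restores the single-pass value.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(3, 1)
x, y = torch.rand(3), torch.rand(1)
loss_fn = nn.MSELoss()

loss_fn(model(x), y).backward()
grad_once = model.weight.grad.clone()

# Second backward pass WITHOUT clearing: the new gradient is added to the old one.
loss_fn(model(x), y).backward()
grad_twice = model.weight.grad.clone()

# After clearing, a backward pass yields the single-pass gradient again.
model.zero_grad()
loss_fn(model(x), y).backward()
grad_after_zero = model.weight.grad.clone()

print(torch.allclose(grad_twice, 2 * grad_once))
print(torch.allclose(grad_after_zero, grad_once))
```

Here `model.zero_grad()` is used because no optimizer is created; with an optimizer, `optim.zero_grad()` serves the same purpose for the parameters it manages.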
Normalization¶
PyTorch offers quite a few normalization methods; the one you will mostly need early on is batch normalization for CNNs.
nn.BatchNorm2d(num_features)
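As a rough illustration of what `nn.BatchNorm2d` does in training mode, the sketch below feeds it a batch whose values are deliberately shifted and scaled, then inspects the per-channel statistics of the output. The tensor sizes and the name `fake_feature_maps` are arbitrary choices for this example.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
# (N, C, H, W) feature maps with mean around 5 and standard deviation around 3.
fake_feature_maps = torch.randn(32, 16, 10, 10) * 3 + 5

bn = nn.BatchNorm2d(16)  # num_features must equal the channel count C
bn.train()               # training mode: normalize with the batch statistics
out = bn(fake_feature_maps)

channel_mean = out.mean(dim=(0, 2, 3))
channel_var = out.var(dim=(0, 2, 3), unbiased=False)
print(out.shape)
print(channel_mean)  # close to 0 for every channel
print(channel_var)   # close to 1 for every channel
```

In evaluation mode (`bn.eval()`), the layer uses its running statistics instead of the statistics of the current batch.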
Training (Validation) / Fine-Tuning¶
First, prepare random data as we did above.
from torch.utils.data import TensorDataset
# Training
X = np.random.rand(1000, 100, 100, 1)    # 1000 made-up 100 x 100 single-channel images
Y = np.random.randint(0, 7, [1000, 10])  # 1000 made-up labels
X, Y = X.astype(np.float32), Y.astype(np.float32)
tsrX, tsrY = torch.tensor(X), torch.tensor(Y)
tsrdataset = TensorDataset(tsrX, tsrY)
tsrdataloader = DataLoader(
    tsrdataset, batch_size=4,
    shuffle=True, num_workers=4)
# Validation
vX = np.random.rand(100, 100, 100, 1)    # 100 made-up 100 x 100 single-channel images
vY = np.random.randint(0, 7, [100, 10])  # 100 made-up labels
vX, vY = vX.astype(np.float32), vY.astype(np.float32)
vtsrX, vtsrY = torch.tensor(vX), torch.tensor(vY)
vtsrdataset = TensorDataset(vtsrX, vtsrY)  # use the validation tensors, not the training ones
vtsrdataloader = DataLoader(
    vtsrdataset, batch_size=4,
    shuffle=False, num_workers=4)  # validation does not need shuffling
# Testing
tX = np.random.rand(100, 100, 100, 1)    # 100 made-up 100 x 100 single-channel images
tY = np.random.randint(0, 7, [100, 10])  # 100 made-up labels
tX, tY = tX.astype(np.float32), tY.astype(np.float32)
ttsrX, ttsrY = torch.tensor(tX), torch.tensor(tY)
ttsrdataset = TensorDataset(ttsrX, ttsrY)  # use the test tensors, not the training ones
ttsrdataloader = DataLoader(
    ttsrdataset, batch_size=4,
    shuffle=False, num_workers=4)  # testing does not need shuffling
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:566: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
Before training, let's build a simple model out of everything covered so far
class SimpleNN(nn.Module):
    def __init__(self):
        super(SimpleNN, self).__init__()
        self.fc = nn.Sequential(
            nn.Linear(10000, 500),
            nn.Linear(500, 10))
    def forward(self, x):
        # Inputs passed to the model go through forward() to produce the output
        x = x.view(x.size(0), -1) # flatten: x.size goes from (batch_size, 100, 100, 1) to (batch_size, 100*100*1)
        return self.fc(x)
simpleNN = SimpleNN()
Next, prepare the optimizer and the loss function
optim = torch.optim.Adam(simpleNN.parameters(),
lr=1e-4)
criterion = nn.MSELoss()
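Next comes training
Training is essentially a loop. Each pass over the training data (called an epoch) repeats the following for every batch:
Load the data
Run it through the model once
Compare the output with the targets and compute the error (loss)
Clear the gradients, then compute new gradients from this loss
Update the parameters with the optimizer
To make training easier to monitor, report how each epoch went, including
A progress bar (to see how fast training runs)
Batch loss (this can sometimes print too much)
Epoch loss (remember to accumulate it and divide by the number of samples)
Recording to other variables (handy for plotting)
Recording to TensorBoard (SummaryWriter)
To guard against overfitting, we also run validation once per epoch. It involves fewer steps:
Load the data
Run it through the model once
Compare the output with the targets and compute the error (loss)
To make validation easier to monitor, report the results of each epoch, including
A progress bar (to see how fast validation runs)
Batch loss (this can sometimes print too much)
Epoch loss (remember to accumulate it and divide by the number of samples)
Recording to other variables (handy for plotting)
Recording to TensorBoard (SummaryWriter)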
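The "accumulate and divide by the number of samples" step above has one subtlety worth a sketch. With its default reduction, nn.MSELoss() returns the mean over a batch, so each batch loss has to be weighted by the batch size before summing. The tiny dataset and the names `toy_inputs`, `toy_targets`, `toy_loader`, and `toy_model` below are invented for this illustration.

```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)
toy_inputs, toy_targets = torch.rand(10, 4), torch.rand(10, 2)
toy_loader = DataLoader(TensorDataset(toy_inputs, toy_targets), batch_size=4)  # batches of 4, 4, 2
toy_model = nn.Linear(4, 2)
criterion = nn.MSELoss()          # default reduction="mean": one mean per batch

weighted_sum = 0.0                # sum of (batch mean * batch size)
unweighted_sum = 0.0              # plain sum of batch means
with torch.no_grad():
    for x, y in toy_loader:
        batch_loss = criterion(toy_model(x), y).item()
        weighted_sum += batch_loss * x.size(0)
        unweighted_sum += batch_loss
    # Reference value: the loss over the whole dataset in one go.
    full_loss = criterion(toy_model(toy_inputs), toy_targets).item()

epoch_loss = weighted_sum / len(toy_loader.dataset)
naive_loss = unweighted_sum / len(toy_loader.dataset)

print(epoch_loss, full_loss, naive_loss)
```

The weighted average should match the full-dataset loss, while dividing the plain sum of batch means by the sample count understates it by roughly the batch size. Dividing the plain sum by the number of batches is another common choice; it is exact only when all batches have the same size.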
EPOCHS = 10
for epoch in range(EPOCHS):
    simpleNN.train()
    epoch_loss = 0.0
    for x, y in tsrdataloader:
        y_hat = simpleNN(x)
        loss = criterion(y_hat, y) # (prediction, target)
        optim.zero_grad()
        loss.backward()
        optim.step()
        # loss is the batch mean, so weight it by the batch size before summing
        epoch_loss += loss.item() * x.size(0)
    average_epoch_loss = epoch_loss / len(tsrdataset)
    print(f"Training Epoch {epoch + 1:2d}: Loss = {average_epoch_loss:.4f}")
    simpleNN.eval()
    vepoch_loss = 0.0
    with torch.no_grad(): # validation needs no gradients
        for x, y in vtsrdataloader:
            y_hat = simpleNN(x)
            loss = criterion(y_hat, y)
            vepoch_loss += loss.item() * x.size(0)
    vaverage_epoch_loss = vepoch_loss / len(vtsrdataset)
    print(f"Validation Epoch {epoch + 1:2d}: Loss = {vaverage_epoch_loss:.4f}")
Training Epoch 1: Loss = 1.1901
Validation Epoch 1: Loss = 0.9613
Training Epoch 2: Loss = 0.9284
Validation Epoch 2: Loss = 0.7341
Training Epoch 3: Loss = 0.7754
Validation Epoch 3: Loss = 0.7614
Training Epoch 4: Loss = 0.6181
Validation Epoch 4: Loss = 0.4514
Training Epoch 5: Loss = 0.4494
Validation Epoch 5: Loss = 0.3682
Training Epoch 6: Loss = 0.3269
Validation Epoch 6: Loss = 0.2154
Training Epoch 7: Loss = 0.2071
Validation Epoch 7: Loss = 0.1433
Training Epoch 8: Loss = 0.1292
Validation Epoch 8: Loss = 0.1190
Training Epoch 9: Loss = 0.0766
Validation Epoch 9: Loss = 0.0466
Training Epoch 10: Loss = 0.0421
Validation Epoch 10: Loss = 0.0397
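Well, the data is randomly generated, so overfitting should come as no surprise, yet sometimes it seems to actually train?! (A validation loss that follows the training loss this closely on random labels is what you would see if the validation loader were serving the training tensors, so treat the logged numbers with caution.)
Does it feel slow? We have a GPU, so why not use it?! Let's see how!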
device = "cuda" if torch.cuda.is_available() else "cpu"
device # Check if GPU available
# Besides x.to(device), you can also write x.cuda(),
# but the former picks the device based on the environment, so it is more flexible
'cuda'
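A small sketch of the device-agnostic pattern described in the comments above; `x_cpu` and `x_dev` are illustrative names. It runs on a CPU-only machine as well, where "cpu" is selected.

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

x_cpu = torch.rand(2, 3)      # tensors are created on the CPU by default
x_dev = x_cpu.to(device)      # returns a tensor on `device`; x_cpu itself is not moved

print(x_cpu.device, x_dev.device)
```

One thing to keep in mind: for tensors, .to(device) returns a result and leaves the original where it was, so the result has to be used or assigned. For an nn.Module, .to(device) moves the module's parameters in place.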
simpleNN = SimpleNN()
simpleNN.to(device) # move the model to the GPU
optim = torch.optim.Adam(
    simpleNN.parameters(), lr=1e-4)
criterion = nn.MSELoss()
EPOCHS = 10
for epoch in range(EPOCHS):
    simpleNN.train()
    epoch_loss = 0.0
    for x, y in tsrdataloader:
        y_hat = simpleNN(x.to(device)) # move the x tensor to the GPU
        loss = criterion(y_hat, y.to(device)) # move the y tensor to the GPU;
        ## y_hat comes out of a GPU model that was fed a GPU tensor,
        ## so it needs no extra .to(device) (calling it anyway does no harm =_=|||)
        optim.zero_grad()
        loss.backward()
        optim.step()
        # loss is the batch mean, so weight it by the batch size before summing
        epoch_loss += loss.item() * x.size(0)
    average_epoch_loss = epoch_loss / len(tsrdataset)
    print(f"Training Epoch {epoch + 1:2d}: Loss = {average_epoch_loss:.4f}")
    simpleNN.eval()
    vepoch_loss = 0.0
    with torch.no_grad(): # validation needs no gradients
        for x, y in vtsrdataloader:
            y_hat = simpleNN(x.to(device))
            loss = criterion(y_hat, y.to(device))
            vepoch_loss += loss.item() * x.size(0)
    vaverage_epoch_loss = vepoch_loss / len(vtsrdataset)
    print(f"Validation Epoch {epoch + 1:2d}: Loss = {vaverage_epoch_loss:.4f}")
Training Epoch 1: Loss = 1.2024
Validation Epoch 1: Loss = 1.1204
Training Epoch 2: Loss = 0.9753
Validation Epoch 2: Loss = 0.8663
Training Epoch 3: Loss = 0.7765
Validation Epoch 3: Loss = 0.5975
Training Epoch 4: Loss = 0.6070
Validation Epoch 4: Loss = 0.5318
Training Epoch 5: Loss = 0.4657
Validation Epoch 5: Loss = 0.4495
Training Epoch 6: Loss = 0.3318
Validation Epoch 6: Loss = 0.2707
Training Epoch 7: Loss = 0.2139
Validation Epoch 7: Loss = 0.2161
Training Epoch 8: Loss = 0.1349
Validation Epoch 8: Loss = 0.0770
Training Epoch 9: Loss = 0.0797
Validation Epoch 9: Loss = 0.0439
Training Epoch 10: Loss = 0.0440
Validation Epoch 10: Loss = 0.0437
– Model saving / loading (checkpoints)¶
# Save model
simpleNN.cpu() # move the model back to the CPU first
torch.save(simpleNN.state_dict(), "randmodel.model")
# Load model
model2 = SimpleNN()
model2.load_state_dict(torch.load("randmodel.model"))
<All keys matched successfully>
# Confirm that both models produce the same output
torch.equal(model2(x), simpleNN(x))
True
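The cell above saves only the model weights. To resume training later, a checkpoint usually also carries the optimizer state and the epoch counter. The following is a sketch under assumptions: the small nn.Linear stands in for SimpleNN, and the file name, the dictionary keys, and the epoch value are all made up for the example.

```python
import os
import tempfile

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Linear(4, 2)                                   # stand-in for SimpleNN
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

# Take one step so the optimizer has internal state worth saving.
loss = model(torch.rand(8, 4)).pow(2).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()

ckpt_path = os.path.join(tempfile.mkdtemp(), "checkpoint.pt")  # hypothetical file name
torch.save({
    "epoch": 3,                                           # made-up epoch counter
    "model_state_dict": model.state_dict(),
    "optimizer_state_dict": optimizer.state_dict(),
}, ckpt_path)

# Restore into freshly constructed objects of the same shape.
restored_model = nn.Linear(4, 2)
restored_optimizer = torch.optim.Adam(restored_model.parameters(), lr=1e-4)
checkpoint = torch.load(ckpt_path, map_location="cpu")
restored_model.load_state_dict(checkpoint["model_state_dict"])
restored_optimizer.load_state_dict(checkpoint["optimizer_state_dict"])
start_epoch = checkpoint["epoch"] + 1                     # continue with the next epoch

print(start_epoch, torch.equal(model.weight, restored_model.weight))
```

Passing map_location="cpu" lets a checkpoint written on a GPU machine load on a CPU-only one; the model can be moved to the device afterwards with .to(device).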
4. Testing / Evaluation¶
The GPU can of course be used here too; the usage is the same
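simpleNN.eval()
tepoch_loss = 0.0
with torch.no_grad(): # testing needs no gradients
    for x, y in ttsrdataloader:
        y_hat = simpleNN(x)
        loss = criterion(y_hat, y)
        # loss is the batch mean, so weight it by the batch size before summing
        tepoch_loss += loss.item() * x.size(0)
taverage_epoch_loss = tepoch_loss / len(ttsrdataset)
print(f"Testing Loss = {taverage_epoch_loss:.4f}")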
Testing Loss = 0.0437
5. Result Post-processing¶
1 – Saving files¶
with open("loss.txt", 'w') as f: f.write(str(taverage_epoch_loss))
If the answers are already stored in an np.array or a list named answer
with open("result.csv", 'w') as f:
    f.write("index,ans\n")
    for idx, i in enumerate(answer):
        f.write(f"{idx},{i}\n")
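For a classification task, `answer` would typically come from the model outputs. Here is one possible sketch using the standard csv module. The `logits` tensor (5 samples, 11 classes) is random stand-in data, and the file is written to a temporary directory so nothing in the working folder is overwritten.

```python
import csv
import os
import tempfile

import torch

torch.manual_seed(0)
logits = torch.rand(5, 11)                 # hypothetical model outputs: 5 samples, 11 classes
answer = logits.argmax(dim=1).tolist()     # one predicted class index per sample

out_path = os.path.join(tempfile.mkdtemp(), "result.csv")
with open(out_path, "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["index", "ans"])      # same header as the cell above
    writer.writerows(enumerate(answer))    # rows of (index, prediction)

# Read the file back to check what was written.
with open(out_path, newline="") as f:
    rows = list(csv.reader(f))
print(rows[0], len(rows))
```

csv.writer takes care of quoting if a value ever contains a comma, which the manual f.write approach does not.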
2 – Kaggle Upload¶
Kaggle provides a convenient API for submitting directly from Colab or a terminal. Once your token is set up, the command
kaggle competitions submit -c <competition_name> -f <filename> -m <message>
uploads a submission. For example, for Homework 3:
kaggle competitions submit -c ml2020spring-hw3 -f result.csv -m "The first try!"
To look at your own submissions:
# kaggle competitions submissions -c <competition_name>
kaggle competitions submissions -c ml2020spring-hw3
Or the leaderboard:
# kaggle competitions leaderboard -c <competition_name> --show
kaggle competitions leaderboard -c ml2020spring-hw3 --show
(Remember to change the competition name accordingly.)
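"Setting up the token" usually means placing the kaggle.json file downloaded from your Kaggle account page where the CLI looks for it, normally ~/.kaggle/kaggle.json, and making it readable only by you. The helper below is a hypothetical convenience function, not part of the Kaggle package. Its name and parameters are assumptions made for this sketch.

```python
import os
import shutil
import stat


def install_kaggle_token(token_path, config_dir=os.path.expanduser("~/.kaggle")):
    """Copy a downloaded kaggle.json into config_dir and restrict its permissions.

    Hypothetical helper for illustration; returns the path of the installed token.
    """
    os.makedirs(config_dir, exist_ok=True)
    target = os.path.join(config_dir, "kaggle.json")
    shutil.copyfile(token_path, target)
    # Owner read/write only, the equivalent of `chmod 600`.
    os.chmod(target, stat.S_IRUSR | stat.S_IWUSR)
    return target
```

With the token file uploaded to the working directory, calling `install_kaggle_token("kaggle.json")` once per session should be enough before running the kaggle commands above.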
3 – Colaboratory / Kaggle Notebook¶
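The environment you are using right now is essentially a Jupyter Notebook, and Kaggle offers the same kind of environment.
(During a competition it is better not to use it, though; everyone wants to keep their tricks hidden for now! B-) )

Jupyter Notebook can also be run locally (for example, the compute resources sponsored by DeepQ in the previous two years were accessed through this interface).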
6. Visualization¶
TensorBoard¶
%load_ext tensorboard
import os
logs_base_dir = "runs"
os.makedirs(logs_base_dir, exist_ok=True)
%tensorboard --logdir {logs_base_dir}
from torch.utils.tensorboard import SummaryWriter
tb = SummaryWriter()
simpleNN = SimpleNN()
simpleNN.to(device) # move the model to the GPU
optim = torch.optim.Adam(
    simpleNN.parameters(), lr=1e-4)
criterion = nn.MSELoss()
EPOCHS = 100
for epoch in range(EPOCHS):
    simpleNN.train()
    epoch_loss = 0.0
    for x, y in tsrdataloader:
        y_hat = simpleNN(x.to(device)) # move the x tensor to the GPU
        loss = criterion(y_hat, y.to(device)) # move the y tensor to the GPU;
        ## y_hat comes out of a GPU model that was fed a GPU tensor,
        ## so it needs no extra .to(device) (calling it anyway does no harm =_=|||)
        optim.zero_grad()
        loss.backward()
        optim.step()
        # loss is the batch mean, so weight it by the batch size before summing
        epoch_loss += loss.item() * x.size(0)
    average_epoch_loss = epoch_loss / len(tsrdataset)
    print(f"Training Epoch {epoch + 1:2d}: Loss = {average_epoch_loss:.4f}")
    tb.add_scalar("Loss/train", average_epoch_loss, epoch + 1) # add this
    simpleNN.eval()
    vepoch_loss = 0.0
    with torch.no_grad(): # validation needs no gradients
        for x, y in vtsrdataloader:
            y_hat = simpleNN(x.to(device))
            loss = criterion(y_hat, y.to(device))
            vepoch_loss += loss.item() * x.size(0)
    vaverage_epoch_loss = vepoch_loss / len(vtsrdataset)
    print(f"Validation Epoch {epoch + 1:2d}: Loss = {vaverage_epoch_loss:.4f}")
    tb.add_scalar("Loss/val", vaverage_epoch_loss, epoch + 1) # add this
tb.close() # add this
/usr/local/lib/python3.7/dist-packages/torch/utils/data/dataloader.py:566: UserWarning: This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current system is 2, which is smaller than what this DataLoader is going to create. Please be aware that excessive worker creation might get DataLoader running slow or even freeze, lower the worker number to avoid potential slowness/freeze if necessary.
cpuset_checked))
Training Epoch 1: Loss = 1.1834
Validation Epoch 1: Loss = 0.9316
Training Epoch 2: Loss = 0.9446
Validation Epoch 2: Loss = 0.7346
Training Epoch 3: Loss = 0.7585
Validation Epoch 3: Loss = 0.5875
Training Epoch 4: Loss = 0.6048
Validation Epoch 4: Loss = 0.4823
Training Epoch 5: Loss = 0.4672
Validation Epoch 5: Loss = 0.3465
Training Epoch 6: Loss = 0.3253
Validation Epoch 6: Loss = 0.2362
Training Epoch 7: Loss = 0.2127
Validation Epoch 7: Loss = 0.1778
Training Epoch 8: Loss = 0.1236
Validation Epoch 8: Loss = 0.0781
Training Epoch 9: Loss = 0.0720
Validation Epoch 9: Loss = 0.0429
Training Epoch 10: Loss = 0.0419
Validation Epoch 10: Loss = 0.0446
Training Epoch 11: Loss = 0.0238
Validation Epoch 11: Loss = 0.0203
Training Epoch 12: Loss = 0.0134
Validation Epoch 12: Loss = 0.0114
Training Epoch 13: Loss = 0.0080
Validation Epoch 13: Loss = 0.0060
Training Epoch 14: Loss = 0.0057
Validation Epoch 14: Loss = 0.0042
Training Epoch 15: Loss = 0.0042
Validation Epoch 15: Loss = 0.0046
Training Epoch 16: Loss = 0.0039
Validation Epoch 16: Loss = 0.0037
Training Epoch 17: Loss = 0.0051
Validation Epoch 17: Loss = 0.0044
Training Epoch 18: Loss = 0.0083
Validation Epoch 18: Loss = 0.0175
Training Epoch 19: Loss = 0.0155
Validation Epoch 19: Loss = 0.0206
Training Epoch 20: Loss = 0.0254
Validation Epoch 20: Loss = 0.0372
Training Epoch 21: Loss = 0.0360
Validation Epoch 21: Loss = 0.0306
Training Epoch 22: Loss = 0.0401
Validation Epoch 22: Loss = 0.0264
Training Epoch 23: Loss = 0.0352
Validation Epoch 23: Loss = 0.0388
Training Epoch 24: Loss = 0.0277
Validation Epoch 24: Loss = 0.0459
Training Epoch 25: Loss = 0.0207
Validation Epoch 25: Loss = 0.0216
Training Epoch 26: Loss = 0.0155
Validation Epoch 26: Loss = 0.0207
Training Epoch 27: Loss = 0.0126
Validation Epoch 27: Loss = 0.0141
Training Epoch 28: Loss = 0.0142
Validation Epoch 28: Loss = 0.0088
Training Epoch 29: Loss = 0.0180
Validation Epoch 29: Loss = 0.0158
Training Epoch 30: Loss = 0.0196
Validation Epoch 30: Loss = 0.0444
Training Epoch 31: Loss = 0.0198
Validation Epoch 31: Loss = 0.0188
Training Epoch 32: Loss = 0.0196
Validation Epoch 32: Loss = 0.0106
Training Epoch 33: Loss = 0.0201
Validation Epoch 33: Loss = 0.0154
Training Epoch 34: Loss = 0.0218
Validation Epoch 34: Loss = 0.0219
Training Epoch 35: Loss = 0.0198
Validation Epoch 35: Loss = 0.0109
Training Epoch 36: Loss = 0.0178
Validation Epoch 36: Loss = 0.0177
Training Epoch 37: Loss = 0.0143
Validation Epoch 37: Loss = 0.0103
Training Epoch 38: Loss = 0.0115
Validation Epoch 38: Loss = 0.0174
Training Epoch 39: Loss = 0.0124
Validation Epoch 39: Loss = 0.0113
Training Epoch 40: Loss = 0.0152
Validation Epoch 40: Loss = 0.0164
Training Epoch 41: Loss = 0.0164
Validation Epoch 41: Loss = 0.0099
Training Epoch 42: Loss = 0.0166
Validation Epoch 42: Loss = 0.0201
Training Epoch 43: Loss = 0.0181
Validation Epoch 43: Loss = 0.0160
Training Epoch 44: Loss = 0.0176
Validation Epoch 44: Loss = 0.0139
Training Epoch 45: Loss = 0.0156
Validation Epoch 45: Loss = 0.0113
Training Epoch 46: Loss = 0.0148
Validation Epoch 46: Loss = 0.0147
Training Epoch 47: Loss = 0.0152
Validation Epoch 47: Loss = 0.0109
Training Epoch 48: Loss = 0.0141
Validation Epoch 48: Loss = 0.0201
Training Epoch 49: Loss = 0.0140
Validation Epoch 49: Loss = 0.0077
Training Epoch 50: Loss = 0.0133
Validation Epoch 50: Loss = 0.0078
Training Epoch 51: Loss = 0.0129
Validation Epoch 51: Loss = 0.0124
Training Epoch 52: Loss = 0.0136
Validation Epoch 52: Loss = 0.0131
Training Epoch 53: Loss = 0.0139
Validation Epoch 53: Loss = 0.0105
Training Epoch 54: Loss = 0.0125
Validation Epoch 54: Loss = 0.0179
Training Epoch 55: Loss = 0.0131
Validation Epoch 55: Loss = 0.0170
Training Epoch 56: Loss = 0.0134
Validation Epoch 56: Loss = 0.0118
Training Epoch 57: Loss = 0.0133
Validation Epoch 57: Loss = 0.0114
Training Epoch 58: Loss = 0.0144
Validation Epoch 58: Loss = 0.0232
Training Epoch 59: Loss = 0.0137
Validation Epoch 59: Loss = 0.0176
Training Epoch 60: Loss = 0.0131
Validation Epoch 60: Loss = 0.0144
Training Epoch 61: Loss = 0.0130
Validation Epoch 61: Loss = 0.0108
Training Epoch 62: Loss = 0.0133
Validation Epoch 62: Loss = 0.0094
Training Epoch 63: Loss = 0.0119
Validation Epoch 63: Loss = 0.0103
Training Epoch 64: Loss = 0.0116
Validation Epoch 64: Loss = 0.0119
Training Epoch 65: Loss = 0.0120
Validation Epoch 65: Loss = 0.0076
Training Epoch 66: Loss = 0.0116
Validation Epoch 66: Loss = 0.0143
Training Epoch 67: Loss = 0.0111
Validation Epoch 67: Loss = 0.0100
Training Epoch 68: Loss = 0.0101
Validation Epoch 68: Loss = 0.0084
Training Epoch 69: Loss = 0.0114
Validation Epoch 69: Loss = 0.0094
Training Epoch 70: Loss = 0.0121
Validation Epoch 70: Loss = 0.0145
Training Epoch 71: Loss = 0.0118
Validation Epoch 71: Loss = 0.0092
Training Epoch 72: Loss = 0.0113
Validation Epoch 72: Loss = 0.0101
Training Epoch 73: Loss = 0.0106
Validation Epoch 73: Loss = 0.0184
Training Epoch 74: Loss = 0.0100
Validation Epoch 74: Loss = 0.0068
Training Epoch 75: Loss = 0.0092
Validation Epoch 75: Loss = 0.0163
Training Epoch 76: Loss = 0.0099
Validation Epoch 76: Loss = 0.0082
Training Epoch 77: Loss = 0.0111
Validation Epoch 77: Loss = 0.0103
Training Epoch 78: Loss = 0.0116
Validation Epoch 78: Loss = 0.0123
Training Epoch 79: Loss = 0.0120
Validation Epoch 79: Loss = 0.0136
Training Epoch 80: Loss = 0.0117
Validation Epoch 80: Loss = 0.0143
Training Epoch 81: Loss = 0.0113
Validation Epoch 81: Loss = 0.0116
Training Epoch 82: Loss = 0.0105
Validation Epoch 82: Loss = 0.0066
Training Epoch 83: Loss = 0.0097
Validation Epoch 83: Loss = 0.0125
Training Epoch 84: Loss = 0.0093
Validation Epoch 84: Loss = 0.0115
Training Epoch 85: Loss = 0.0092
Validation Epoch 85: Loss = 0.0080
Training Epoch 86: Loss = 0.0099
Validation Epoch 86: Loss = 0.0089
Training Epoch 87: Loss = 0.0099
Validation Epoch 87: Loss = 0.0088
Training Epoch 88: Loss = 0.0104
Validation Epoch 88: Loss = 0.0086
Training Epoch 89: Loss = 0.0101
Validation Epoch 89: Loss = 0.0074
Training Epoch 90: Loss = 0.0092
Validation Epoch 90: Loss = 0.0128
Training Epoch 91: Loss = 0.0092
Validation Epoch 91: Loss = 0.0076
Training Epoch 92: Loss = 0.0081
Validation Epoch 92: Loss = 0.0071
Training Epoch 93: Loss = 0.0076
Validation Epoch 93: Loss = 0.0083
Training Epoch 94: Loss = 0.0078
Validation Epoch 94: Loss = 0.0072
Training Epoch 95: Loss = 0.0108
Validation Epoch 95: Loss = 0.0097
Training Epoch 96: Loss = 0.0129
Validation Epoch 96: Loss = 0.0126
Training Epoch 97: Loss = 0.0123
Validation Epoch 97: Loss = 0.0091
Training Epoch 98: Loss = 0.0093
Validation Epoch 98: Loss = 0.0105
Training Epoch 99: Loss = 0.0075
Validation Epoch 99: Loss = 0.0088
Training Epoch 100: Loss = 0.0072
Validation Epoch 100: Loss = 0.0115
It should look something like this (the lower-left corner should show only one color; please ignore that here)

7. GPU Usage¶
%%bash
nvidia-smi
echo
echo "If nothing showed up, see below"
Wed Mar 25 22:22:23 2020
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.64.00 Driver Version: 418.67 CUDA Version: 10.1 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
|===============================+======================+======================|
| 0 Tesla P100-PCIE... Off | 00000000:00:04.0 Off | 0 |
| N/A 39C P0 32W / 250W | 877MiB / 16280MiB | 0% Default |
+-------------------------------+----------------------+----------------------+
+-----------------------------------------------------------------------------+
| Processes: GPU Memory |
| GPU PID Type Process name Usage |
|=============================================================================|
+-----------------------------------------------------------------------------+
If nothing showed up, see below
1 – Colaboratory¶
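If you don't know how to switch to a GPU, look at the menu bar in the upper-left corner: open Runtime > Change runtime type,
then
select GPU as the hardware accelerator.
TPUs are also available; feel free to explore those on your own!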
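After switching the runtime, one way to confirm from Python that PyTorch actually sees the GPU is sketched below. `describe_gpu` is a helper name invented for this example. It returns a small dictionary and also works on a machine without a GPU.

```python
import torch


def describe_gpu():
    """Return a small summary of the CUDA device PyTorch sees, if any."""
    if not torch.cuda.is_available():
        return {"available": False}
    index = torch.cuda.current_device()
    return {
        "available": True,
        "name": torch.cuda.get_device_name(index),
        # Memory currently held by tensors of this process, in MiB.
        "allocated_mib": torch.cuda.memory_allocated(index) / 2**20,
    }


gpu_info = describe_gpu()
print(gpu_info)
```

Unlike nvidia-smi, torch.cuda.memory_allocated only counts memory occupied by this process's PyTorch tensors, so the two numbers are not expected to agree.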