9. 三種搭建神經網路的方式

  • 分為三種:

    • sequential: 寫法簡單明瞭適合初學者快速地搭建神經網路

    • funcitional: 有技巧性的將網路包成一個函式且支援多個輸入與輸出,並且在神經網路傳遞過程中能夠動態的掌握資料流

    • sub-class of Model: 更進階,如果要包裝成產品或是自定義神經網路演算法時就建議使用這種方法

  • 以下,以 iris 資料集為例,示範三種寫法

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

9.1. Data

df_train = pd.read_csv('Iris.csv')
df_train = df_train.drop(labels=['Id'],axis=1) # 移除Id
df_train
SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species
0 5.1 3.5 1.4 0.2 Iris-setosa
1 4.9 3.0 1.4 0.2 Iris-setosa
2 4.7 3.2 1.3 0.2 Iris-setosa
3 4.6 3.1 1.5 0.2 Iris-setosa
4 5.0 3.6 1.4 0.2 Iris-setosa
... ... ... ... ... ...
145 6.7 3.0 5.2 2.3 Iris-virginica
146 6.3 2.5 5.0 1.9 Iris-virginica
147 6.5 3.0 5.2 2.0 Iris-virginica
148 6.2 3.4 5.4 2.3 Iris-virginica
149 5.9 3.0 5.1 1.8 Iris-virginica

150 rows × 5 columns

9.1.1. 檢查 & 處理 na

numeric_cols = ["SepalLengthCm", "SepalWidthCm", "PetalLengthCm", "PetalWidthCm"]
number_of_na = len(np.where(np.isnan(df_train[numeric_cols]))[0])

# checked missing data
print("checked missing data(NAN mount):",number_of_na)

# 移除缺失值
# train=train.dropna()
checked missing data(NAN mount): 0
  • 沒有 na ,不用處理

9.1.2. 做出 y 和 X

  • 現在的 y 是 Species,裡面有三種類別

  • 但使用 tensorflow 時,只吃數值資料

    • 所以現在把這三種類別編碼為 0, 1, 2 三種

    • 再把 integer 的 0, 1, 2 轉成 one-hot

label_map = {'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2}
df_train['Class'] = df_train['Species'].map(label_map)
df_train
SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm Species Class
0 5.1 3.5 1.4 0.2 Iris-setosa 0
1 4.9 3.0 1.4 0.2 Iris-setosa 0
2 4.7 3.2 1.3 0.2 Iris-setosa 0
3 4.6 3.1 1.5 0.2 Iris-setosa 0
4 5.0 3.6 1.4 0.2 Iris-setosa 0
... ... ... ... ... ... ...
145 6.7 3.0 5.2 2.3 Iris-virginica 2
146 6.3 2.5 5.0 1.9 Iris-virginica 2
147 6.5 3.0 5.2 2.0 Iris-virginica 2
148 6.2 3.4 5.4 2.3 Iris-virginica 2
149 5.9 3.0 5.1 1.8 Iris-virginica 2

150 rows × 6 columns

  • 此時, Class 已經是 integer,再做 one-hot

from tensorflow.keras.utils import to_categorical
y = to_categorical(df_train['Class'])
y[0:5]
array([[1., 0., 0.],
       [1., 0., 0.],
       [1., 0., 0.],
       [1., 0., 0.],
       [1., 0., 0.]], dtype=float32)
  • 最後,做出乾淨的 X

X = df_train.drop(labels=['Species', 'Class'],axis=1)
X
SepalLengthCm SepalWidthCm PetalLengthCm PetalWidthCm
0 5.1 3.5 1.4 0.2
1 4.9 3.0 1.4 0.2
2 4.7 3.2 1.3 0.2
3 4.6 3.1 1.5 0.2
4 5.0 3.6 1.4 0.2
... ... ... ... ...
145 6.7 3.0 5.2 2.3
146 6.3 2.5 5.0 1.9
147 6.5 3.0 5.2 2.0
148 6.2 3.4 5.4 2.3
149 5.9 3.0 5.1 1.8

150 rows × 4 columns

9.1.3. 切 training & testing

from sklearn.model_selection import train_test_split
X_train , X_test , y_train , y_test = train_test_split(X,y , test_size=.3 , random_state=40)
print('訓練資料: ',X_train.shape)
print('測試資料: ',X_test.shape)
訓練資料:  (105, 4)
測試資料:  (45, 4)

9.2. 建立模型

  • 先來定義一些 畫圖 和算 performance 的 function

def process_plot(hist_obj):
    history_dict = hist_obj.history
    history_dict.keys()
    acc = history_dict['acc']
    val_acc = history_dict['val_acc']
    loss = history_dict['loss']
    val_loss = history_dict['val_loss']
    epochs_ = range(1,len(acc)+1)
    
    # loss plot
    plt.plot(epochs_ , loss , label = 'training loss')
    plt.plot(epochs_ , val_loss , label = 'val los')
    plt.title('training and val loss')
    plt.xlabel('epochs')
    plt.ylabel('loss')
    plt.legend()
    plt.show()
    
    # acc plot
    plt.clf()
    plt.plot(epochs_ , acc , label='train accuracy')
    plt.plot(epochs_ , val_acc , label = 'val accuracy')
    plt.title('train and val acc')
    plt.xlabel('epochs')
    plt.ylabel('acc')
    plt.legend()
    plt.show()

    
from sklearn.metrics import accuracy_score
def acc_score(X, y, model):    
    label=np.argmax(y,axis=1)
    pred =  np.argmax(model.predict(X), axis=1)
    acc = accuracy_score(label, pred)
    return acc

9.2.1. Sequential

import tensorflow as tf
from tensorflow.keras.layers import Dense, Activation
from tensorflow.keras import Sequential
from tensorflow.keras.optimizers import Adam


# sequential
model_seq = Sequential()
model_seq.add(Dense(8, Activation('relu'), input_dim=X.shape[-1]))
model_seq.add(Dense(16, Activation('relu')))
model_seq.add(Dense(3, Activation('softmax')))

# structure
model_seq.summary()   # Weights = (4+1)*8+(8+1)*16+(16+1)*3

# 編譯模型
optim = Adam(lr=0.001)
model_seq.compile(loss='categorical_crossentropy',
              optimizer=optim,
              metrics=['acc'])


# 參數
batch_size= 7
epochs = 50

# 訓練模型
history_sequential = model_seq.fit(X_train, 
                    y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=False,
                    shuffle=True,
                    validation_split=0.1)
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense (Dense)                (None, 8)                 40        
_________________________________________________________________
dense_1 (Dense)              (None, 16)                144       
_________________________________________________________________
dense_2 (Dense)              (None, 3)                 51        
=================================================================
Total params: 235
Trainable params: 235
Non-trainable params: 0
_________________________________________________________________
  • 看一下 training process

process_plot(history_sequential)
_images/tf_create_model_24_0.png _images/tf_create_model_24_1.png
  • 看一下 accuracy

acc_score(X_test, y_test, model_seq)
0.9333333333333333

9.2.2. Functional

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
# model definition
model_input = Input(shape=X.shape[-1])
x = Dense(8,activation='relu')(model_input)
x = Dense(16,activation='relu')(x)
model_output = Dense(3,activation='softmax')(x)
model_func = Model(model_input, model_output)

# model structure
model_func.summary()   # Weights = (4+1)*8+(8+1)*16+(16+1)*3

# 編譯模型
optim = Adam(lr=0.001)
model_func.compile(loss='categorical_crossentropy',
              optimizer=optim,
              metrics=['acc'])

# 參數
batch_size=7
epochs = 50

# 訓練模型
history_func = model_func.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=False,
                    shuffle=True,
                    validation_split=0.1)
Model: "functional_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_1 (InputLayer)         [(None, 4)]               0         
_________________________________________________________________
dense_3 (Dense)              (None, 8)                 40        
_________________________________________________________________
dense_4 (Dense)              (None, 16)                144       
_________________________________________________________________
dense_5 (Dense)              (None, 3)                 51        
=================================================================
Total params: 235
Trainable params: 235
Non-trainable params: 0
_________________________________________________________________
process_plot(history_func)
_images/tf_create_model_30_0.png _images/tf_create_model_30_1.png
acc_score(X_test, y_test, model_func)
0.9555555555555556

9.2.3. Model

import tensorflow as tf
from tensorflow.keras.layers import Input, Dense
from tensorflow.keras.models import Model
from tensorflow.keras.optimizers import Adam
# 使用 class 客製化 Models
class MyClassifier(Model):
    def __init__(self):
        super(MyClassifier,self).__init__()
        
        # 定義各種會用到的 layer
        self.layer_1 = Dense(8 , activation='relu')
        self.layer_2 = Dense(16 , activation='relu')
        self.output_layer = Dense(3 , activation='softmax')
        
    def call(self,x):
        # 這邊用 functional api 的寫法,定義出 forward propagation 的結果
        x_1 = self.layer_1(x)
        x_2 = self.layer_2(x_1)
        predict = self.output_layer(x_2)
        return predict
# model definition
model_class = MyClassifier()

# 編譯模型
optim = Adam(lr=0.001)
model_class.compile(loss='categorical_crossentropy',
              optimizer=optim,
              metrics=['acc'])

# 參數
batch_size=7
epochs = 50

# 訓練模型
history_class = model_class.fit(X_train, y_train,
                    batch_size=batch_size,
                    epochs=epochs,
                    verbose=False,
                    shuffle=True,
                    validation_split=0.1)

# model structure
model_class.summary()
WARNING:tensorflow:Layer my_classifier is casting an input tensor from dtype float64 to the layer's dtype of float32, which is new behavior in TensorFlow 2.  The layer has dtype float32 because its dtype defaults to floatx.

If you intended to run this layer in float32, you can safely ignore this warning. If in doubt, this warning is likely only an issue if you are porting a TensorFlow 1.X model to TensorFlow 2.

To change all layers to have dtype float64 by default, call `tf.keras.backend.set_floatx('float64')`. To change just this layer, pass dtype='float64' to the layer constructor. If you are the author of this layer, you can disable autocasting by passing autocast=False to the base Layer constructor.

Model: "my_classifier"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
dense_6 (Dense)              multiple                  40        
_________________________________________________________________
dense_7 (Dense)              multiple                  144       
_________________________________________________________________
dense_8 (Dense)              multiple                  51        
=================================================================
Total params: 235
Trainable params: 235
Non-trainable params: 0
_________________________________________________________________
process_plot(history_class)
_images/tf_create_model_36_0.png _images/tf_create_model_36_1.png
acc_score(X_test, y_test, model_class)
0.9555555555555556