PyTorch makes it easy to build custom models. With just a few lines of code, you can spin up and train a deep learning model in minutes. This is a quick guide to creating the typical kinds of deep learning models in PyTorch.
A Simple Custom Model
To build a custom model, simply subclass nn.Module and define the forward function.
import torch
import torch.nn as nn

class FCN(nn.Module):
def __init__(self, input_dims, output_dims):
super(FCN, self).__init__()
self.model = nn.Sequential(
nn.Linear(input_dims, 5),
nn.LeakyReLU(),
nn.Linear(5, output_dims),
nn.Sigmoid()
)
def forward(self, X):
return self.model(X)
When you instantiate it:
fcn = FCN(10, 1)
fcn
FCN(
  (model): Sequential(
    (0): Linear(in_features=10, out_features=5, bias=True)
    (1): LeakyReLU(negative_slope=0.01)
    (2): Linear(in_features=5, out_features=1, bias=True)
    (3): Sigmoid()
  )
)
Simple as that, you now have a 2-layer neural network! Take note of this pattern, because it carries over to more complex models.
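As a quick sanity check, we can push a batch of random inputs through the model (the batch size of 8 here is arbitrary):

X = torch.randn(8, 10)   # batch of 8 samples, 10 features each
out = fcn(X)
print(out.shape)         # torch.Size([8, 1]); values are squashed into (0, 1) by the Sigmoid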
A Slightly More Complex Model
Sometimes models have repeating layers, which are called blocks. These are commonly found in convolutional neural networks, where a block usually consists of a convolutional layer followed by a max-pooling layer. We create these blocks by writing our own custom function! This time we will create a linear block with batch normalization and ReLU. (Don't worry, we will get to CNNs later in this guide.)
Simply create a function that returns a sequential model wrapping the layers.
def net_block(input_dim, output_dim):
return nn.Sequential(
nn.Linear(input_dim, output_dim),
nn.BatchNorm1d(output_dim),
nn.ReLU()
)
Let's add a few of these blocks to our model!
class Network(nn.Module):
def __init__(self, input_dim, hidden_dim_1, hidden_dim_2, output_dim):
super(Network, self).__init__()
self.model = nn.Sequential(
net_block(input_dim, hidden_dim_1),
net_block(hidden_dim_1, hidden_dim_2),
net_block(hidden_dim_2, output_dim)
)
def forward(self, X):
return self.model(X)
net = Network(10, 4, 5, 1)
net
Network(
  (model): Sequential(
    (0): Sequential(
      (0): Linear(in_features=10, out_features=4, bias=True)
      (1): BatchNorm1d(4, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
    (1): Sequential(
      (0): Linear(in_features=4, out_features=5, bias=True)
      (1): BatchNorm1d(5, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
    (2): Sequential(
      (0): Linear(in_features=5, out_features=1, bias=True)
      (1): BatchNorm1d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
  )
)
We can clearly see the repeating blocks. Phew! That saved us from defining those 9 layers one by one.
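A forward pass works just like before. One thing to keep in mind (not specific to this example) is that BatchNorm1d needs more than one sample per batch while the model is in training mode:

X = torch.randn(8, 10)    # a batch of 8 samples keeps BatchNorm1d happy in training mode
print(net(X).shape)       # torch.Size([8, 1])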
The VGG Network
The VGG network is one of the earliest deep CNNs ever built! Let me try to recreate it. First, we define the blocks. VGG has two types of blocks, defined in this paper.
Figure 1. The two types of VGG blocks: two-layer (blue, orange) and three-layer (purple, green, red). Image taken from this Data Science Central article.
Let's define the two blocks.
def vgg_block(in_channels, out_channels):
return nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size=(3,3), stride=1, padding = (1,1)),
nn.ReLU(inplace =True),
nn.MaxPool2d(kernel_size=(2,2),stride=2)
)
def vgg_block2(in_channels, out_channels):
return nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size=(3,3), stride=1, padding = (1,1)),
nn.ReLU(inplace =True),
nn.Conv2d(out_channels, out_channels, kernel_size=(3,3), stride=1, padding = (1,1)),
nn.ReLU(inplace =True),
nn.MaxPool2d(kernel_size=(2,2),stride=2)
)
Let's also define a flatten layer. A flatten layer simply reshapes each sample in the batch into a single vector.
class Flatten(nn.Module):
def forward(self, input):
return input.view(input.size(0), -1)
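For example, flattening a convolutional activation keeps the batch dimension and collapses everything else (recent PyTorch versions also ship a built-in nn.Flatten that does the same thing):

x = torch.randn(2, 512, 7, 7)
print(Flatten()(x).shape)    # torch.Size([2, 25088])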
Let's define the classifier block.
def classifier_block(input_dim, hidden_dim, num_classes):
return nn.Sequential(
Flatten(),
nn.Linear(input_dim, hidden_dim),
nn.ReLU(),
nn.Dropout(0.5),
nn.Linear(hidden_dim, hidden_dim),
nn.ReLU(),
nn.Dropout(0.5),
nn.Linear(hidden_dim , num_classes),
nn.Softmax()
)
Now let's define the model itself.
class VGG(nn.Module):
def __init__(self):
super(VGG, self).__init__()
self.model = nn.Sequential(
vgg_block(3, 64 ),
vgg_block(64, 128),
vgg_block2(128, 256),
vgg_block2(256, 512),
vgg_block2(512, 512),
nn.AdaptiveAvgPool2d(output_size=(7, 7)),
classifier_block(512*7*7, 4096, 1000)
)
def forward(self, X):
return self.model(X)
vgg = VGG()
vgg
VGG(
  (model): Sequential(
    (0): Sequential(
      (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (1): Sequential(
      (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (2): Sequential(
      (0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU(inplace=True)
      (4): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (3): Sequential(
      (0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU(inplace=True)
      (4): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (4): Sequential(
      (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU(inplace=True)
      (4): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (5): AdaptiveAvgPool2d(output_size=(7, 7))
    (6): Sequential(
      (0): Flatten()
      (1): Linear(in_features=25088, out_features=4096, bias=True)
      (2): ReLU()
      (3): Dropout(p=0.5, inplace=False)
      (4): Linear(in_features=4096, out_features=4096, bias=True)
      (5): ReLU()
      (6): Dropout(p=0.5, inplace=False)
      (7): Linear(in_features=4096, out_features=1000, bias=True)
      (8): Softmax(dim=None)
    )
  )
)
We have now recreated the VGG model! How cool is that?
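To verify the architecture end to end, we can push a single random 224x224 RGB image through it; eval mode is used here so Dropout is disabled. (Note that nn.Softmax() without an explicit dim will warn about the implicit dimension choice; passing dim=1 silences it.)

vgg.eval()
x = torch.randn(1, 3, 224, 224)
print(vgg(x).shape)    # torch.Size([1, 1000]), one probability for each of the 1000 classes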
A Complex Model
So far, the networks we have come across are relatively simple: each layer takes the output of the previous layer and transforms it. An improvement on this are residual networks, which pass the output of an earlier layer to a layer several steps ahead, forming skip connections. This has been shown to help deep neural networks learn better, because gradients can flow through these connections more easily. Some examples of residual networks are ResNets and UNet.
Residual Networks (ResNets)
Figure 2. A residual connection. Image taken from the deeplearning.ai Convolutional Neural Networks course.
Residual networks are designed a little differently to accommodate this feature. In these kinds of networks, we need to keep references to the layers involved in the skip connections (you will see this below).
This time, using a function to create the blocks won't cut it anymore. That's because the forward function needs to be customized to accommodate the skip connection!
The Residual Block
class ResidualBlock(nn.Module):
def __init__(self, input_dims, output_dims):
super(ResidualBlock, self).__init__()
# Defining the blocks
self.conv1 = nn.Sequential(
nn.Conv2d(input_dims, output_dims, kernel_size = (3,3), padding = (1,1)),
nn.ReLU()
)
self.conv2 = nn.Conv2d(output_dims, output_dims, kernel_size =(1,1))
self.act = nn.ReLU()
def forward(self, X):
# Clone X as this will be passed in a later layer
X2 = X.clone()
X = self.conv1(X)
X = self.conv2(X)
# Add the output of the previous layer to X2
X = self.act(X + X2)
return X
In the code above, we have separate attributes for the different layers, and the forward function is customized a bit. Since the initial input X needs to be passed to a later layer, we store a copy of it in X2. We then pass X through the layers, and finally add X2 back in before the last activation.
Question: How is this even possible? Don't convolutions downsample the image, so that X + X2 would fail because the shapes differ?
Great question! I actually asked myself the same thing while reviewing ResNets. It turns out this works simply because the tensors have exactly the same shape: the first convolution in this block uses a 3x3 kernel with 1x1 padding and stride 1, so its output shape is identical to its input shape. Recall that each spatial dimension of a convolution's output is
$\left\lfloor \frac{\text{dim} + 2p - k}{s} \right\rfloor + 1$
where dim is the input dimension, p is the padding, k is the kernel size, and s is the stride along that dimension. With p = 1, k = 3, and s = 1 this evaluates back to dim, and the 1x1 convolution that follows preserves the spatial dimensions as well.
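A quick check of this shape-preserving behaviour (the 56x56 feature map size here is arbitrary):

block = ResidualBlock(64, 64)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)    # torch.Size([1, 64, 56, 56]) -- same shape, so X + X2 is valid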
class ResNet(nn.Module):
def __init__(self):
super(ResNet, self).__init__()
self.model = nn.Sequential(
nn.Conv2d(3, 64, stride = (2,2), kernel_size=(7,7)),
nn.MaxPool2d(stride = (2,2), kernel_size = (7,7)),
ResidualBlock(64,64),
ResidualBlock(64,64),
ResidualBlock(64,64),
ResidualBlock(64,64),
ResidualBlock(64,64),
ResidualBlock(64,64),
nn.AdaptiveAvgPool2d(output_size=(7, 7)),
classifier_block(64*7*7, 4096, 1000)
)
def forward(self,X):
return self.model(X)
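As before, we instantiate the model and print it:

resnet = ResNet()
resnet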
ResNet(
  (model): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2))
    (1): MaxPool2d(kernel_size=(7, 7), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (2): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (3): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (4): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (5): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (6): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (7): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (8): AdaptiveAvgPool2d(output_size=(7, 7))
    (9): Sequential(
      (0): Flatten()
      (1): Linear(in_features=3136, out_features=4096, bias=True)
      (2): ReLU()
      (3): Dropout(p=0.5, inplace=False)
      (4): Linear(in_features=4096, out_features=4096, bias=True)
      (5): ReLU()
      (6): Dropout(p=0.5, inplace=False)
      (7): Linear(in_features=4096, out_features=1000, bias=True)
      (8): Softmax(dim=None)
    )
  )
)
An Even More Complex Model
Recurrent neural networks are neural networks that perform better on sequential data. To understand RNNs, it is essential that we first understand the RNN cell.
The RNN Cell
An RNN cell is a compact linear network that takes two inputs. It is governed by the following equations.
$a^{\langle t \rangle} = f(W_{aa} a^{\langle t-1 \rangle} + W_{ax} x^{\langle t \rangle} + b_a)$
$\hat{y}^{\langle t \rangle} = g(W_{ya} a^{\langle t \rangle} + b_y)$
Figure 3. An RNN cell. Image taken from the deeplearning.ai Sequence Models course.
The cell takes two inputs: the activation from the previous cell, $a^{\langle t-1 \rangle}$, and the output of the previous layer, $x^{\langle t \rangle}$. These are concatenated and passed through a linear model parameterized by $W_a$ and $b_a$ to obtain the current cell activation $a^{\langle t \rangle}$. This activation is passed on to the next cell, and is also (optionally) linearly transformed with $W_{ya}$ and $b_y$ to form $\hat{y}^{\langle t \rangle}$.
class RNNCell(nn.Module):
def __init__(self, embed_length,act_dim, output_dim, **kwargs):
super(RNNCell, self).__init__()
act = kwargs.get("act", "relu")
acty = kwargs.get("acty", "relu")
activation_map = {"relu": nn.ReLU(), "l-relu": nn.LeakyReLU(), "sig": nn.Sigmoid(), "tanh": nn.Tanh()}
self.act_dim = act_dim
self.linear = nn.Linear(act_dim+embed_length,act_dim)
self.activation = activation_map.get(act)
self.linear_y = nn.Linear(act_dim, output_dim)
self.activation_y = activation_map.get(acty)
def forward(self, a, X):
# X is n x embed_length
# a is n x act_dim
assert(a.shape[1] == self.act_dim)
# concatenate a and X since they will be transformed by the same Linear.
X = torch.cat((a,X), axis = 1)
# transform the inputs
X = self.linear(X)
a = self.activation(X)
X = self.linear_y(a)
Y = self.activation_y(X)
return a,Y
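A single cell can be exercised on its own; the sizes below are arbitrary, with the embedding length set to 12:

cell = RNNCell(embed_length=12, act_dim=10, output_dim=12)
a0 = torch.zeros(4, 10)       # initial activation for a batch of 4
x1 = torch.randn(4, 12)       # one time step of embeddings
a1, y1 = cell(a0, x1)
print(a1.shape, y1.shape)     # torch.Size([4, 10]) torch.Size([4, 12])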
The RNN Layer
An RNN layer is simply a series of RNN cells, where the activation produced by one cell is fed into the next. Because the activations have to flow from cell to cell, its forward function is a bit more involved.
Figure 4. An RNN layer. Image taken from colah's blog.
The forward function computes the activation and output of every cell, working from left to right, since each cell needs the activation of the cell before it.
class RNN(nn.Module):
def __init__(self, **kwargs):
super(RNN, self).__init__()
# output shape is 3d. n x output_dim x embed_length
# input_dim is the embedding length(ie. word embedding length)
self.input_dim = kwargs.get("input_dim", 0)
self.act_dim = kwargs.get("act_dim", self.input_dim)
self.output_dim = kwargs.get("output_dim", 1)
self.time_steps = kwargs.get("time_steps", self.output_dim)
self.unit_output_dim = kwargs.get("unit_output_dim" , 1)
assert(self.output_dim <= self.time_steps)
        # Populate the layer with one cell per time step. Note that list
        # multiplication repeats the *same* RNNCell object, so its parameters
        # are shared across all time steps (standard RNN weight sharing).
        self.models = nn.ModuleList([
            RNNCell(self.input_dim, self.act_dim, self.unit_output_dim)
        ] * self.time_steps)
def forward(self, X):
# x is n x time_steps x embed_length
n = X.shape[0]
        # The sequence length (axis 1) must not exceed time_steps.
assert(X.shape[1] <= self.time_steps)
        # Sometimes the input has fewer time steps than the layer expects,
        # so we pad it with zeros along the time axis to match the size.
X = torch.cat((X, torch.zeros(n, self.time_steps - X.shape[1], self.input_dim)), axis = 1)
# Initialize the first activation.
a = torch.zeros(n, self.act_dim)
# Create the Y array, the individual y-predictions will be stored here
Y = torch.zeros(n, self.output_dim, self.unit_output_dim)
# Create iterator for y_i for locating which timestep we are in.
y_i = 0
for i,cell in enumerate(self.models):
# Get the input for the current timestep
x = X[:, i, :]
# Forward the x and activations to the current cell.
a,y = cell(a, x)
# Only add the last predictions for the output of size output_dim
if i >= self.time_steps - self.output_dim:
Y[:, y_i,:] = y
y_i+=1
return Y
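The layer can be tried out on its own as well (again with arbitrary sizes, keeping the embedding length at 12):

rnn = RNN(input_dim=12, act_dim=10, time_steps=5, output_dim=4, unit_output_dim=12)
X = torch.randn(8, 5, 12)     # batch of 8 sequences, 5 time steps, embedding length 12
print(rnn(X).shape)           # torch.Size([8, 4, 12])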
A Basic RNN
That was hectic! Now that a Sequential object can hide all of that complexity for us, let's put everything together and stack these RNN layers.
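The snippet below uses an embed_length variable that is defined outside of it. Judging from the printed layer sizes (the first cell's Linear has in_features = act_dim + embed_length = 22), it is 12 in this example:

embed_length = 12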
model = nn.Sequential(
RNN(input_dim = embed_length, act_dim = 10, time_steps = 5, output_dim = 4, unit_output_dim = embed_length),
RNN(input_dim = embed_length, act_dim = 7, time_steps = 4, output_dim = 2, unit_output_dim = embed_length),
RNN(input_dim = embed_length, act_dim = 3, time_steps = 2, output_dim = 1, unit_output_dim = 1)
)
Sequential(
  (0): RNN(
    (models): ModuleList(
      (0): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (1): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (2): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (3): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (4): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
    )
  )
  (1): RNN(
    (models): ModuleList(
      (0): RNNCell(
        (linear): Linear(in_features=19, out_features=7, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=7, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (1): RNNCell(
        (linear): Linear(in_features=19, out_features=7, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=7, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (2): RNNCell(
        (linear): Linear(in_features=19, out_features=7, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=7, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (3): RNNCell(
        (linear): Linear(in_features=19, out_features=7, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=7, out_features=12, bias=True)
        (activation_y): ReLU()
      )
    )
  )
  (2): RNN(
    (models): ModuleList(
      (0): RNNCell(
        (linear): Linear(in_features=15, out_features=3, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=3, out_features=1, bias=True)
        (activation_y): ReLU()
      )
      (1): RNNCell(
        (linear): Linear(in_features=15, out_features=3, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=3, out_features=1, bias=True)
        (activation_y): ReLU()
      )
    )
  )
)
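A forward pass through the stack shows how each RNN shortens the sequence until a single prediction per sample remains:

X = torch.randn(8, 5, embed_length)   # batch of 8 sequences with 5 time steps
print(model(X).shape)                 # torch.Size([8, 1, 1])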
And that's it for now! We could go on to all the different kinds of RNNs, but this post is getting long. If it gets enough traction, I may follow up with more complex models, so be sure to share it with your friends!
Until next time!
This article was also published on Medium.