PyTorch makes it easy to build custom models. With just a few lines of code, you can spin up and train a deep learning model in minutes. This is a quick guide to creating the typical kinds of deep learning models in PyTorch.
A Simple Custom Model
To build a custom model, simply subclass nn.Module and define the forward function.
import torch
import torch.nn as nn

class FCN(nn.Module):
def __init__(self, input_dims, output_dims):
super(FCN, self).__init__()
self.model = nn.Sequential(
nn.Linear(input_dims, 5),
nn.LeakyReLU(),
nn.Linear(5, output_dims),
nn.Sigmoid()
)
def forward(self, X):
return self.model(X)
When you instantiate it:
fcn = FCN(10, 1)
fcn
FCN(
  (model): Sequential(
    (0): Linear(in_features=10, out_features=5, bias=True)
    (1): LeakyReLU(negative_slope=0.01)
    (2): Linear(in_features=5, out_features=1, bias=True)
    (3): Sigmoid()
  )
)
Simple as that, you now have a 2-layer neural network! Take note of this pattern, because it carries over to more complex models.
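As a quick sanity check, we can push a batch of random inputs through the model (the batch size of 8 here is arbitrary):

X = torch.randn(8, 10)   # batch of 8 samples, 10 features each
out = fcn(X)
print(out.shape)         # torch.Size([8, 1]); values are squashed into (0, 1) by the Sigmoid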
A Slightly More Complex Model
Sometimes models have repeating layers, which are called blocks. These are commonly found in convolutional neural networks, where a block usually consists of a convolutional layer followed by a max-pooling layer. We create these blocks by writing our own custom function! This time we will create a linear block with batch normalization and ReLU. (Don't worry, we will get to CNNs later in this guide.)
Simply create a function that returns a sequential model wrapping the layers.
def net_block(input_dim, output_dim):
return nn.Sequential(
nn.Linear(input_dim, output_dim),
nn.BatchNorm1d(output_dim),
nn.ReLU()
)
Let's add a few of these blocks to our model!
class Network(nn.Module):
def __init__(self, input_dim, hidden_dim_1, hidden_dim_2, output_dim):
super(Network, self).__init__()
self.model = nn.Sequential(
net_block(input_dim, hidden_dim_1),
net_block(hidden_dim_1, hidden_dim_2),
net_block(hidden_dim_2, output_dim)
)
def forward(self, X):
return self.model(X)
net = Network(10, 4, 5, 1)
net
Network(
  (model): Sequential(
    (0): Sequential(
      (0): Linear(in_features=10, out_features=4, bias=True)
      (1): BatchNorm1d(4, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
    (1): Sequential(
      (0): Linear(in_features=4, out_features=5, bias=True)
      (1): BatchNorm1d(5, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
    (2): Sequential(
      (0): Linear(in_features=5, out_features=1, bias=True)
      (1): BatchNorm1d(1, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (2): ReLU()
    )
  )
)
We can clearly see the repeating blocks. Phew! That saved us from defining those 9 layers one by one.
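A forward pass works just like before. One thing to keep in mind (not specific to this example) is that BatchNorm1d needs more than one sample per batch while the model is in training mode:

X = torch.randn(8, 10)    # a batch of 8 samples keeps BatchNorm1d happy in training mode
print(net(X).shape)       # torch.Size([8, 1])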
The VGG Network
The VGG network is one of the earliest deep CNNs ever built! Let me try to recreate it. First, we define the blocks. VGG has two types of blocks, defined in this paper.
Figure 1. The two types of VGG blocks: two-layer (blue, orange) and three-layer (purple, green, red). Image taken from this Data Science Central article.
Let's define the two blocks.
def vgg_block(in_channels, out_channels):
return nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size=(3,3), stride=1, padding = (1,1)),
nn.ReLU(inplace =True),
nn.MaxPool2d(kernel_size=(2,2),stride=2)
)
def vgg_block2(in_channels, out_channels):
return nn.Sequential(
nn.Conv2d(in_channels, out_channels, kernel_size=(3,3), stride=1, padding = (1,1)),
nn.ReLU(inplace =True),
nn.Conv2d(out_channels, out_channels, kernel_size=(3,3), stride=1, padding = (1,1)),
nn.ReLU(inplace =True),
nn.MaxPool2d(kernel_size=(2,2),stride=2)
)
Let's also define a flatten layer. A flatten layer simply reshapes each sample in the batch into a single vector.
class Flatten(nn.Module):
def forward(self, input):
return input.view(input.size(0), -1)
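For example, flattening a convolutional activation keeps the batch dimension and collapses everything else (recent PyTorch versions also ship a built-in nn.Flatten that does the same thing):

x = torch.randn(2, 512, 7, 7)
print(Flatten()(x).shape)    # torch.Size([2, 25088])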
Let's define the classifier block.
def classifier_block(input_dim, hidden_dim, num_classes):
return nn.Sequential(
Flatten(),
nn.Linear(input_dim, hidden_dim),
nn.ReLU(),
nn.Dropout(0.5),
nn.Linear(hidden_dim, hidden_dim),
nn.ReLU(),
nn.Dropout(0.5),
nn.Linear(hidden_dim , num_classes),
nn.Softmax()
)
Now let's define the model itself.
class VGG(nn.Module):
def __init__(self):
super(VGG, self).__init__()
self.model = nn.Sequential(
vgg_block(3, 64 ),
vgg_block(64, 128),
vgg_block2(128, 256),
vgg_block2(256, 512),
vgg_block2(512, 512),
nn.AdaptiveAvgPool2d(output_size=(7, 7)),
classifier_block(512*7*7, 4096, 1000)
)
def forward(self, X):
return self.model(X)
vgg = VGG()
vgg
VGG(
  (model): Sequential(
    (0): Sequential(
      (0): Conv2d(3, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (1): Sequential(
      (0): Conv2d(64, 128, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (2): Sequential(
      (0): Conv2d(128, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU(inplace=True)
      (4): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (3): Sequential(
      (0): Conv2d(256, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU(inplace=True)
      (4): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (4): Sequential(
      (0): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (1): ReLU(inplace=True)
      (2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (3): ReLU(inplace=True)
      (4): MaxPool2d(kernel_size=(2, 2), stride=2, padding=0, dilation=1, ceil_mode=False)
    )
    (5): AdaptiveAvgPool2d(output_size=(7, 7))
    (6): Sequential(
      (0): Flatten()
      (1): Linear(in_features=25088, out_features=4096, bias=True)
      (2): ReLU()
      (3): Dropout(p=0.5, inplace=False)
      (4): Linear(in_features=4096, out_features=4096, bias=True)
      (5): ReLU()
      (6): Dropout(p=0.5, inplace=False)
      (7): Linear(in_features=4096, out_features=1000, bias=True)
      (8): Softmax(dim=None)
    )
  )
)
We have now recreated the VGG model! How cool is that?
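To verify the architecture end to end, we can push a single random 224x224 RGB image through it; eval mode is used here so Dropout is disabled. (Note that nn.Softmax() without an explicit dim will warn about the implicit dimension choice; passing dim=1 silences it.)

vgg.eval()
x = torch.randn(1, 3, 224, 224)
print(vgg(x).shape)    # torch.Size([1, 1000]), one probability for each of the 1000 classes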
A Complex Model
So far, the networks we have come across are relatively simple: each layer takes the output of the previous layer and transforms it. An improvement on this are residual networks, which pass the output of an earlier layer to a layer several steps ahead, forming skip connections. This has been shown to help deep neural networks learn better, because gradients can flow through these connections more easily. Some examples of residual networks are ResNets and UNet.
Residual Networks (ResNets)
Figure 2. A residual connection. Image taken from the deeplearning.ai Convolutional Neural Networks course.
Residual networks are designed a little differently to accommodate this feature. In these kinds of networks, we need to keep references to the layers involved in the skip connections (you will see this below).
This time, using a function to create the blocks won't cut it anymore. That's because the forward function needs to be customized to accommodate the skip connection!
The Residual Block
class ResidualBlock(nn.Module):
def __init__(self, input_dims, output_dims):
super(ResidualBlock, self).__init__()
# Defining the blocks
self.conv1 = nn.Sequential(
nn.Conv2d(input_dims, output_dims, kernel_size = (3,3), padding = (1,1)),
nn.ReLU()
)
self.conv2 = nn.Conv2d(output_dims, output_dims, kernel_size =(1,1))
self.act = nn.ReLU()
def forward(self, X):
# Clone X as this will be passed in a later layer
X2 = X.clone()
X = self.conv1(X)
X = self.conv2(X)
# Add the output of the previous layer to X2
X = self.act(X + X2)
return X
In the code above, we have separate attributes for the different layers, and the forward function is customized a bit. Since the initial input X needs to be passed to a later layer, we store a copy of it in X2. We then pass X through the layers, and finally add X2 back in before the last activation.
Question: How is this even possible? Don't convolutions downsample the image, so that X + X2 would fail because the shapes differ?
Great question! I actually asked myself the same thing while reviewing ResNets. It turns out this works simply because the tensors have exactly the same shape: the first convolution in this block uses a 3x3 kernel with 1x1 padding and stride 1, so its output shape is identical to its input shape. Recall that each spatial dimension of a convolution's output is
$\left\lfloor \frac{\text{dim} + 2p - k}{s} \right\rfloor + 1$
where dim is the input dimension, p is the padding, k is the kernel size, and s is the stride along that dimension. With p = 1, k = 3, and s = 1 this evaluates back to dim, and the 1x1 convolution that follows preserves the spatial dimensions as well.
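A quick check of this shape-preserving behaviour (the 56x56 feature map size here is arbitrary):

block = ResidualBlock(64, 64)
x = torch.randn(1, 64, 56, 56)
print(block(x).shape)    # torch.Size([1, 64, 56, 56]) -- same shape, so X + X2 is valid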
class ResNet(nn.Module):
def __init__(self):
super(ResNet, self).__init__()
self.model = nn.Sequential(
nn.Conv2d(3, 64, stride = (2,2), kernel_size=(7,7)),
nn.MaxPool2d(stride = (2,2), kernel_size = (7,7)),
ResidualBlock(64,64),
ResidualBlock(64,64),
ResidualBlock(64,64),
ResidualBlock(64,64),
ResidualBlock(64,64),
ResidualBlock(64,64),
nn.AdaptiveAvgPool2d(output_size=(7, 7)),
classifier_block(64*7*7, 4096, 1000)
)
def forward(self,X):
return self.model(X)
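As before, we instantiate the model and print it:

resnet = ResNet()
resnet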
ResNet(
  (model): Sequential(
    (0): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2))
    (1): MaxPool2d(kernel_size=(7, 7), stride=(2, 2), padding=0, dilation=1, ceil_mode=False)
    (2): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (3): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (4): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (5): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (6): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (7): ResidualBlock(
      (conv1): Sequential(
        (0): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
        (1): ReLU()
      )
      (conv2): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1))
      (act): ReLU()
    )
    (8): AdaptiveAvgPool2d(output_size=(7, 7))
    (9): Sequential(
      (0): Flatten()
      (1): Linear(in_features=3136, out_features=4096, bias=True)
      (2): ReLU()
      (3): Dropout(p=0.5, inplace=False)
      (4): Linear(in_features=4096, out_features=4096, bias=True)
      (5): ReLU()
      (6): Dropout(p=0.5, inplace=False)
      (7): Linear(in_features=4096, out_features=1000, bias=True)
      (8): Softmax(dim=None)
    )
  )
)
An Even More Complex Model
Recurrent neural networks are neural networks that perform better on sequential data. To understand RNNs, it is essential that we first understand the RNN cell.
The RNN Cell
An RNN cell is a compact linear network that takes two inputs. It is governed by the following equations.
$a^{\langle t \rangle} = f(W_{aa} a^{\langle t-1 \rangle} + W_{ax} x^{\langle t \rangle} + b_a)$
$\hat{y}^{\langle t \rangle} = g(W_{ya} a^{\langle t \rangle} + b_y)$
Figure 3. An RNN cell. Image taken from the deeplearning.ai Sequence Models course.
The cell takes two inputs: the activation from the previous cell, $a^{\langle t-1 \rangle}$, and the output of the previous layer, $x^{\langle t \rangle}$. These are concatenated and passed through a linear model parameterized by $W_a$ and $b_a$ to obtain the current cell activation $a^{\langle t \rangle}$. This activation is passed on to the next cell, and is also (optionally) linearly transformed with $W_{ya}$ and $b_y$ to form $\hat{y}^{\langle t \rangle}$.
class RNNCell(nn.Module):
def __init__(self, embed_length,act_dim, output_dim, **kwargs):
super(RNNCell, self).__init__()
act = kwargs.get("act", "relu")
acty = kwargs.get("acty", "relu")
activation_map = {"relu": nn.ReLU(), "l-relu": nn.LeakyReLU(), "sig": nn.Sigmoid(), "tanh": nn.Tanh()}
self.act_dim = act_dim
self.linear = nn.Linear(act_dim+embed_length,act_dim)
self.activation = activation_map.get(act)
self.linear_y = nn.Linear(act_dim, output_dim)
self.activation_y = activation_map.get(acty)
def forward(self, a, X):
# X is n x embed_length
# a is n x act_dim
assert(a.shape[1] == self.act_dim)
# concatenate a and X since they will be transformed by the same Linear.
X = torch.cat((a,X), axis = 1)
# transform the inputs
X = self.linear(X)
a = self.activation(X)
X = self.linear_y(a)
Y = self.activation_y(X)
return a,Y
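A single cell can be exercised on its own; the sizes below are arbitrary, with the embedding length set to 12:

cell = RNNCell(embed_length=12, act_dim=10, output_dim=12)
a0 = torch.zeros(4, 10)       # initial activation for a batch of 4
x1 = torch.randn(4, 12)       # one time step of embeddings
a1, y1 = cell(a0, x1)
print(a1.shape, y1.shape)     # torch.Size([4, 10]) torch.Size([4, 12])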
The RNN Layer
An RNN layer is simply a series of RNN cells, where the activation produced by one cell is fed into the next. Because the activations have to flow from cell to cell, its forward function is a bit more involved.
Figure 4. An RNN layer. Image taken from colah's blog.
The forward function computes the activation and output of every cell, working from left to right, since each cell needs the activation of the cell before it.
class RNN(nn.Module):
def __init__(self, **kwargs):
super(RNN, self).__init__()
# output shape is 3d. n x output_dim x embed_length
# input_dim is the embedding length(ie. word embedding length)
self.input_dim = kwargs.get("input_dim", 0)
self.act_dim = kwargs.get("act_dim", self.input_dim)
self.output_dim = kwargs.get("output_dim", 1)
self.time_steps = kwargs.get("time_steps", self.output_dim)
self.unit_output_dim = kwargs.get("unit_output_dim" , 1)
assert(self.output_dim <= self.time_steps)
        # Populate the layer with one cell per time step. Note that list
        # multiplication repeats the *same* RNNCell object, so its parameters
        # are shared across all time steps (standard RNN weight sharing).
        self.models = nn.ModuleList([
            RNNCell(self.input_dim, self.act_dim, self.unit_output_dim)
        ] * self.time_steps)
def forward(self, X):
# x is n x time_steps x embed_length
n = X.shape[0]
        # The sequence length (axis 1) must not exceed time_steps.
assert(X.shape[1] <= self.time_steps)
        # Sometimes the input has fewer time steps than the layer expects,
        # so we pad it with zeros along the time axis to match the size.
X = torch.cat((X, torch.zeros(n, self.time_steps - X.shape[1], self.input_dim)), axis = 1)
# Initialize the first activation.
a = torch.zeros(n, self.act_dim)
# Create the Y array, the individual y-predictions will be stored here
Y = torch.zeros(n, self.output_dim, self.unit_output_dim)
# Create iterator for y_i for locating which timestep we are in.
y_i = 0
for i,cell in enumerate(self.models):
# Get the input for the current timestep
x = X[:, i, :]
# Forward the x and activations to the current cell.
a,y = cell(a, x)
# Only add the last predictions for the output of size output_dim
if i >= self.time_steps - self.output_dim:
Y[:, y_i,:] = y
y_i+=1
return Y
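The layer can be tried out on its own as well (again with arbitrary sizes, keeping the embedding length at 12):

rnn = RNN(input_dim=12, act_dim=10, time_steps=5, output_dim=4, unit_output_dim=12)
X = torch.randn(8, 5, 12)     # batch of 8 sequences, 5 time steps, embedding length 12
print(rnn(X).shape)           # torch.Size([8, 4, 12])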
A Basic RNN
That was hectic! Now that a Sequential object can hide all of that complexity for us, let's put everything together and stack these RNN layers.
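The snippet below uses an embed_length variable that is defined outside of it. Judging from the printed layer sizes (the first cell's Linear has in_features = act_dim + embed_length = 22), it is 12 in this example:

embed_length = 12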
model = nn.Sequential(
RNN(input_dim = embed_length, act_dim = 10, time_steps = 5, output_dim = 4, unit_output_dim = embed_length),
RNN(input_dim = embed_length, act_dim = 7, time_steps = 4, output_dim = 2, unit_output_dim = embed_length),
RNN(input_dim = embed_length, act_dim = 3, time_steps = 2, output_dim = 1, unit_output_dim = 1)
)
Sequential(
  (0): RNN(
    (models): ModuleList(
      (0): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (1): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (2): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (3): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (4): RNNCell(
        (linear): Linear(in_features=22, out_features=10, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=10, out_features=12, bias=True)
        (activation_y): ReLU()
      )
    )
  )
  (1): RNN(
    (models): ModuleList(
      (0): RNNCell(
        (linear): Linear(in_features=19, out_features=7, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=7, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (1): RNNCell(
        (linear): Linear(in_features=19, out_features=7, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=7, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (2): RNNCell(
        (linear): Linear(in_features=19, out_features=7, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=7, out_features=12, bias=True)
        (activation_y): ReLU()
      )
      (3): RNNCell(
        (linear): Linear(in_features=19, out_features=7, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=7, out_features=12, bias=True)
        (activation_y): ReLU()
      )
    )
  )
  (2): RNN(
    (models): ModuleList(
      (0): RNNCell(
        (linear): Linear(in_features=15, out_features=3, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=3, out_features=1, bias=True)
        (activation_y): ReLU()
      )
      (1): RNNCell(
        (linear): Linear(in_features=15, out_features=3, bias=True)
        (activation): ReLU()
        (linear_y): Linear(in_features=3, out_features=1, bias=True)
        (activation_y): ReLU()
      )
    )
  )
)
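A forward pass through the stack shows how each RNN shortens the sequence until a single prediction per sample remains:

X = torch.randn(8, 5, embed_length)   # batch of 8 sequences with 5 time steps
print(model(X).shape)                 # torch.Size([8, 1, 1])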
And that's it for now! We could go on to all the different kinds of RNNs, but this post is getting long. If it gets enough traction, I may follow up with more complex models, so be sure to share it with your friends!
Until next time!
This article was also published on Medium.