How do I build a TensorFlow model with multiple inputs?

html5 • September 13, 2022, 4:05 pm • Q&A

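I want to create a TensorFlow neural network model with the Functional API, but I'm not sure how to split the input in two. I want something like this: given an input, its first half goes into the first part of the network and its second half goes into the second part; each half passes through layers until the two are concatenated, go through another layer, and finally reach the output. I came up with something like the snippet below, along with a quick sketch.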

import tensorflow as tf
from tensorflow.keras.layers import Dense, concatenate

def define_model(self):
    # The full input is a 1D vector of 7 elements, split into 4 and 3.
    input1 = tf.keras.Input(shape=(4,))
    input2 = tf.keras.Input(shape=(3,))

    layer1_1 = Dense(4, activation=tf.nn.leaky_relu)(input1)
    layer2_1 = Dense(4, activation=tf.nn.leaky_relu)(layer1_1)

    layer1_2 = Dense(4, activation=tf.nn.leaky_relu)(input2)
    layer2_2 = Dense(3, activation=tf.nn.leaky_relu)(layer1_2)

    # Join the branches along the feature axis; axis=0 is the batch axis.
    concat_layer = concatenate([layer2_1, layer2_2], axis=-1)
    layer3 = Dense(6, activation=tf.nn.leaky_relu)(concat_layer)

    output = Dense(4)(layer3)  # no activation

    self.model = tf.keras.Model(inputs=[input1, input2], outputs=output)
    self.model.compile(loss='mean_squared_error', optimizer='rmsprop')
    return self.model
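
If the data arrives as a single 7-element vector rather than as two separate arrays, the split described above could also happen inside the model. The following is only a minimal sketch under that assumption: `build_split_model`, the `Lambda` slicing, and the reduced layer stack are illustrative choices, not part of the original code.

```python
import numpy as np
import tensorflow as tf
from tensorflow.keras.layers import Dense, Lambda, concatenate

def build_split_model():
    # Hypothetical variant: one 7-element input per sample (4 + 3 values).
    full_input = tf.keras.Input(shape=(7,))

    # Slice along the feature axis; the batch axis is left untouched.
    first_half = Lambda(lambda t: t[:, :4])(full_input)
    second_half = Lambda(lambda t: t[:, 4:])(full_input)

    branch1 = Dense(4, activation=tf.nn.leaky_relu)(first_half)
    branch2 = Dense(3, activation=tf.nn.leaky_relu)(second_half)

    # Merge the two branches along the feature axis.
    merged = concatenate([branch1, branch2], axis=-1)
    hidden = Dense(6, activation=tf.nn.leaky_relu)(merged)
    output = Dense(4)(hidden)  # no activation
    return tf.keras.Model(inputs=full_input, outputs=output)

split_model = build_split_model()
sample = np.random.rand(2, 7).astype("float32")  # two made-up samples
prediction = split_model.predict(sample, verbose=0)
```

With this layout the model takes one array of shape (n_samples, 7), so no list of inputs is needed when calling `fit` or `predict`.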

First, should I add any Dropout or BatchNormalization layers to this model?

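Also, the first 4 elements of the input array are binary (such as [1,0,0,1] or [0,1,1,1]), while the other 3 can be any real number. Given that the first branch works with inputs confined to the range 0 to 1 and the second does not, should I treat the first "column" of the network differently from the second?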
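
One common way to treat the two halves differently, offered here only as a sketch and not as a confirmed answer, is to leave the binary half as it is and standardize the real-valued half before training. The data below is made up for illustration; `binary_part`, `real_part`, and the chosen mean and spread are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 100 samples, 4 binary features and 3 real-valued features.
binary_part = rng.integers(0, 2, size=(100, 4)).astype("float32")  # already 0 or 1
real_part = rng.normal(loc=50.0, scale=20.0, size=(100, 3)).astype("float32")

# Standardize each real-valued feature to zero mean and unit variance.
mean = real_part.mean(axis=0)
std = real_part.std(axis=0)
real_scaled = (real_part - mean) / std
```

The same `mean` and `std` would have to be reused on any data passed to the model later, so that training and prediction inputs are on the same scale.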

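This sounds right, but I can't really test whether it works, because I would have to rewrite a lot of code to generate enough data to train it. Am I heading in the right direction, or should I do something different? Will this code work?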

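Edit: I'm running into a problem during training. Suppose I want to train the model like this (the values don't matter; what matters is the data types):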

# This snippet generates training data: nothing real, just test examples.
# I also changed the output layer from 4 elements to 1 to test it.
import numpy as np

A1 = [np.array([[1., 0, 0, 1]]), np.array([[0, 1., 0]])]
B1 = np.array([7])

c = np.array([[5, -4, 1, -1], [2, 3, -1]], dtype=object)
A2 = [[np.random.randint(2, size=[1, 4]), np.random.randint(2, size=[1, 3])] for i in range(1000)]
B2 = np.array([np.sum(A2[i][0] * c[0]) + np.sum(A2[i][1] * c[1]) for i in range(1000)])

model.fit(A1, B1, epochs=50, verbose=False)  # this works!
model.fit(A2, B2, epochs=50, verbose=False)  # but this doesn't.
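
A likely reason the second call fails is the layout of A2: it is a list of 1000 [x1, x2] pairs, whereas a two-input Keras model generally expects a list of two arrays, one per input, each with the samples along the first axis. The sketch below builds the data in that layout with NumPy only; the names `X1`, `X2`, `y`, `c1`, and `c2` are introduced here for illustration.

```python
import numpy as np

n_samples = 1000
c1 = np.array([5, -4, 1, -1])  # weights for the 4-element input
c2 = np.array([2, 3, -1])      # weights for the 3-element input

# One array per model input, with samples along axis 0.
X1 = np.random.randint(2, size=(n_samples, 4)).astype("float32")
X2 = np.random.randint(2, size=(n_samples, 3)).astype("float32")

# One target per sample: the same weighted sum as in the snippet above.
y = X1 @ c1 + X2 @ c2

# If the data already exists as per-sample pairs (like A2), it can be regrouped:
pairs = [[X1[i:i + 1], X2[i:i + 1]] for i in range(n_samples)]
X1_again = np.concatenate([p[0] for p in pairs], axis=0)  # shape (1000, 4)
X2_again = np.concatenate([p[1] for p in pairs], axis=0)  # shape (1000, 3)
```

Under these assumptions, a call such as `model.fit([X1, X2], y, epochs=50, verbose=False)` passes the inputs in the same form as the A1 example, just with 1000 rows instead of one.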

Final edit: here are the predict() and predict_on_batch() functions.

def predict(a,b):
    pred = m.predict([a,b])
    return pred

def predict_b(c,d):
    preds = m.predict_on_batch([c,d])
    return preds

#a, b, c and d must look like this:
a = [np.array([0,1,0,1])]
b = [np.array([0,0,1])]

c =        [np.array([1, 0, 0, 1]), 
            np.array([0, 1, 1, 1]), 
            np.array([0, 1, 0, 0]), 
            np.array([1, 0, 0, 0]), 
            np.array([0, 0, 1, 0])] 

d =        [np.array([1, 0, 1]),
            np.array([0, 0, 1]),
            np.array([0, 1, 1]),
            np.array([1, 1, 1]),
            np.array([0, 0, 0])]
#notice that all of those should follow the same pattern, which is a list of arrays.
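
As an alternative to passing lists of 1D arrays, each list could be stacked into a single 2D array per input, which makes the batch dimension explicit. This is a sketch using shortened copies of `c` and `d`; the names `c_batch` and `d_batch` are introduced here and are not part of the original code.

```python
import numpy as np

# Shortened copies of the lists above: one 1D array per sample.
c = [np.array([1, 0, 0, 1]), np.array([0, 1, 1, 1]), np.array([0, 1, 0, 0])]
d = [np.array([1, 0, 1]), np.array([0, 0, 1]), np.array([0, 1, 1])]

# Stack the samples along a new first axis so each input is one 2D array.
c_batch = np.stack(c).astype("float32")  # shape (3, 4)
d_batch = np.stack(d).astype("float32")  # shape (3, 3)
```

The stacked arrays could then be passed as `m.predict_on_batch([c_batch, d_batch])`, assuming `m` is the two-input model defined earlier.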

The rest of the code is under M. Innat's answer.
