Good question. But this is not about how the LSTM works; it is about the task itself. The task is prediction: what is the next character? Next-character prediction has two aspects: classification and approximation. If we were dealing with approximation alone, we could get by with a one-dimensional array. But because we are dealing with both approximation and classification, we cannot simply feed normalized ASCII codes to the neural network. We need to convert each character into a vector (a one-hot encoding).
For example, 'a' (lowercase a) would be represented this way:
1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
'b' (lowercase) would be represented as:
0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
'c' would be represented as:
0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
And 'Z' (capital Z!) would be represented as:
0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1
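To make the idea concrete, here is a minimal sketch of such an encoder in Java. It assumes, purely for illustration, a vocabulary consisting of 'a'-'z' followed by 'A'-'Z'; the class name OneHotEncoder and this particular vocabulary are my own, and the real example builds its character set differently:

```java
import java.util.Arrays;

// Hypothetical one-hot encoder, for illustration only.
// Assumes a vocabulary of 'a'..'z' followed by 'A'..'Z' (52 characters).
public class OneHotEncoder {
    private static final String VOCAB =
            "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";

    // Returns a vector of length VOCAB.length() with a single 1.0
    // at the index of the given character.
    public static double[] encode(char c) {
        int idx = VOCAB.indexOf(c);
        if (idx < 0) {
            throw new IllegalArgumentException("Character not in vocabulary: " + c);
        }
        double[] vector = new double[VOCAB.length()];
        vector[idx] = 1.0;
        return vector;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(encode('a'))); // 1.0 at index 0
        System.out.println(Arrays.toString(encode('Z'))); // 1.0 at the last index
    }
}
```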
So each character becomes a vector; a whole sequence of characters is then a two-dimensional array, and a minibatch of such sequences adds a third dimension. How are all these dimensions laid out? The code comments give the following explanation:
// dimension 0 = number of examples in minibatch
// dimension 1 = size of each vector (i.e., number of characters)
// dimension 2 = length of each time series/example
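To see how those three dimensions come together in practice, here is a sketch of how such a 3-D input tensor could be filled, assuming ND4J (which the quoted example is built on); buildInput and charToIdx are hypothetical names of my own:

```java
import org.nd4j.linalg.api.ndarray.INDArray;
import org.nd4j.linalg.factory.Nd4j;

import java.util.List;
import java.util.Map;

public class InputTensorSketch {

    // Builds an input tensor of shape [miniBatchSize, vocabSize, sequenceLength]:
    //   dimension 0 = number of examples in minibatch
    //   dimension 1 = size of each vector (i.e., number of characters)
    //   dimension 2 = length of each time series/example
    // charToIdx is a hypothetical map from character to vocabulary index.
    public static INDArray buildInput(List<String> sequences,
                                      Map<Character, Integer> charToIdx,
                                      int sequenceLength) {
        int miniBatchSize = sequences.size();
        int vocabSize = charToIdx.size();
        INDArray input = Nd4j.zeros(miniBatchSize, vocabSize, sequenceLength);

        for (int example = 0; example < miniBatchSize; example++) {
            String seq = sequences.get(example);
            for (int t = 0; t < sequenceLength && t < seq.length(); t++) {
                int charIdx = charToIdx.get(seq.charAt(t));
                // One-hot: exactly one 1.0 per example per time step.
                input.putScalar(new int[]{example, charIdx, t}, 1.0);
            }
        }
        return input;
    }
}
```

In other words, for every time step of every example, exactly one position along dimension 1 is set to 1.0; this is just the one-hot vectors above, stacked through time and across the minibatch.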
I sincerely applaud the effort you are putting into understanding how an LSTM works, but the code you pointed to gives examples that apply to all kinds of neural networks: it explains how to feed text data into a network, not how the LSTM itself works. For that, you need to look at a different part of the source code.