ML Engine on Google Cloud Platform: parsing features from strings in a deployed model

<div class="post-text" itemprop="text">
  <p>
    The general pattern to follow (see
    <a href="https://www.tensorflow.org/get_started/feature_columns#feature_columns_2" rel="nofollow noreferrer">this document</a>)
    is:
  </p>
  <OL>
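    <li>
      Create an
      <code>input_fn</code>
      for training, typically using
      <code>tf.data.Dataset</code>.
      The
      <code>input_fn</code>
      should call a helper function to do the data transformations, as your code does. The output will be a dictionary that maps feature names to batches of values.
    </li>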
    <li>
      Define FeatureColumns for the items in the output of your
      <code>input_fn</code>.
      If necessary, apply feature crosses, bucketization, and so on.
    </li>
    <li>
      Instantiate an estimator (e.g.,
      <code>DNNRegressor</code>),
      passing the FeatureColumns to the constructor.
    </li>
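    <li>
      Create an
      <code>input_fn</code>
      specifically for serving that has one or more
      <code>tf.placeholder</code>s
      with
      <code>None</code>
      (variable batch size) as the outer dimension. Call the same helper function from (1) to do the transformations. Return a
      <code>tf.estimator.export.ServingInputReceiver</code>
      with the placeholder as the input and a dict that looks identical to the dict from (1).
    </li>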
  </ol>
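  <p>
    Steps (1) and (4) only work together if both input functions emit feature dicts with the same keys, produced by the same helper. The following is a minimal, framework-free sketch of that contract in plain Python, not TensorFlow code. The names <code>parse_dates_py</code>, <code>training_features</code>, <code>serving_features</code>, and <code>EXPECTED_FEATURES</code> are hypothetical and exist only for this illustration.
  </p>

```python
# Hypothetical, framework-free illustration of the shared-helper contract.
# Nothing here is a TensorFlow API.
EXPECTED_FEATURES = {"year", "month", "day", "hours", "minutes"}


def parse_dates_py(batch):
    """Shared helper: turn a batch of 'YYYY-MM-DD H:MM' strings into a dict of columns."""
    features = {name: [] for name in ("year", "month", "day", "hours", "minutes")}
    for text in batch:
        date_part, time_part = text.split(" ")
        year, month, day = date_part.split("-")
        hours, minutes = time_part.split(":")
        features["year"].append(int(year))
        features["month"].append(int(month))
        features["day"].append(int(day))
        features["hours"].append(int(hours))
        features["minutes"].append(int(minutes))
    return features


def training_features(lines):
    """Stands in for the training input_fn: any batch size, shared helper."""
    return parse_dates_py(lines)


def serving_features(request_batch):
    """Stands in for the serving input_fn: any batch size, same helper."""
    return parse_dates_py(request_batch)


train = training_features(["2018-12-31 22:59", "2018-01-23 2:09"])
serve = serving_features(["2019-03-04 7:05"])

# Both paths must expose exactly the feature names the model was built on.
assert set(train) == set(serve) == EXPECTED_FEATURES
```

  <p>
    Because both paths go through one helper, a change to the parsing logic cannot make training and serving drift apart.
  </p>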
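  <p>
    Your specific case needs a few extra details. First, you have hard-coded a batch size of 1 into the placeholder, and the rest of the code carries that assumption forward. Your placeholder must have
    <code>shape=[None]</code>.
  </p>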
  <p>
    Unfortunately, your code was written assuming a shape of
    <code>[1]</code>,
    so expressions such as
    <code>split_date_time.values[0]</code>
    are no longer valid. The code below adds a helper function to address this.
  </p>
  <p>
    Here is some code that should work for you:
  </p>
  <pre><code>import tensorflow as tf

# Assumed value for illustration; choose a batch size that suits your data.
BATCH_SIZE = 32

# tf.string_split returns a SparseTensor. When using a variable batch size,
# this can be difficult to further manipulate. In our case, we don't need
# a SparseTensor, because we have a fixed number of elements each split.
# So we do the split and convert the SparseTensor to a dense tensor.
def fixed_split(batched_string_tensor, delimiter, num_cols):
    # When splitting a batch of elements, the values array is row-major, e.g.
    # ["2018-01-02", "2019-03-04"] becomes ["2018", "01", "02", "2019", "03", "04"].
    # So we simply split the string then reshape the array to create a dense
    # matrix with the same rows as the input, but split into columns, e.g.,
    # [["2018", "01", "02"], ["2019", "03", "04"]]
    split = tf.string_split(batched_string_tensor, delimiter)
    return tf.reshape(split.values, [-1, num_cols])

def parse_dates(dates):
    # Each input looks like "2018-12-31 22:59": a date and a time separated by a space.
    split_date_time = fixed_split(dates, ' ', 2)

    date = split_date_time[:, 0]
    time = split_date_time[:, 1]

    # Each row of split_date is [year, month, day];
    # each row of split_time is [hours, minutes].
    split_date = fixed_split(date, '-', 3)
    split_time = fixed_split(time, ':', 2)

    year = split_date[:, 0]
    month = split_date[:, 1]
    day = split_date[:, 2]
    hours = split_time[:, 0]
    minutes = split_time[:, 1]

    year = tf.string_to_number(year, out_type=tf.int32, name="year_temp")
    month = tf.string_to_number(month, out_type=tf.int32, name="month_temp")
    day = tf.string_to_number(day, out_type=tf.int32, name="day_temp")
    hours = tf.string_to_number(hours, out_type=tf.int32, name="hour_temp")
    minutes = tf.string_to_number(minutes, out_type=tf.int32, name="minute_temp")

    return {"year": year, "month": month, "day": day, "hours": hours, "minutes": minutes}

def training_input_fn():
    filenames = ["/var/data/file1.txt", "/var/data/file2.txt"]
    dataset = tf.data.TextLineDataset(filenames)
    # batch() returns a new dataset rather than modifying this one in place.
    dataset = dataset.batch(BATCH_SIZE)
    iterator = dataset.make_one_shot_iterator()
    return parse_dates(iterator.get_next())

def serving_input_fn():
    date_strings = tf.placeholder(dtype=tf.string, shape=[None], name="date_strings")
    features = parse_dates(date_strings)
    return tf.estimator.export.ServingInputReceiver(features, date_strings)

# Quick check that the parsing works on a batch of more than one element.
with tf.Session() as sess:
    date_time_list = ["2018-12-31 22:59", "2018-01-23 2:09"]

    date_strings = tf.placeholder(dtype=tf.string, shape=[None], name="date_strings")
    features = parse_dates(date_strings)

    fetches = [features[k] for k in ["year", "month", "day", "hours", "minutes"]]
    year, month, day, hours, minutes = sess.run(fetches, feed_dict={date_strings: date_time_list})
    print("Year =", year)
    print("Month =", month)
    print("Day =", day)
    print("Hours =", hours)
    print("Minutes =", minutes)
</code></pre>
</DIV>