PROSAGA码农传奇 - Caffe - Test labels for regression in Caffe: are floats not allowed?

0# 薄情 | 2019-08-31 10-32

<div class="post-text" itemprop="text">
  
    When using the image dataset input layer (with either the
     <code>lmdb</code>
     or the
     <code>leveldb</code>
     backend), caffe supports only one
    <strong>integer</strong>
     label per input image.
  
  
    If you want to do regression and use floating-point labels, you should try the HDF5 data layer instead. See, for example,
    <a href="https://stackoverflow.com/q/31617486/1714410">
      this question
    </a>
    .
  
  
    In Python you can use the
     <code>h5py</code>
     package to create HDF5 files.
  
   <pre class="lang-py prettyprint-override">
 <code>
import h5py
import caffe
import numpy as np

SIZE = 224 # fixed size to all images
with open( 'train.txt', 'r' ) as T :
    lines = T.readlines()
# If you do not have enough memory split data into
# multiple batches and generate multiple separate h5 files
X = np.zeros( (len(lines), 3, SIZE, SIZE), dtype='f4' ) 
y = np.zeros( (len(lines),1), dtype='f4' )
for i,l in enumerate(lines):
    sp = l.split(' ')
    img = caffe.io.load_image( sp[0] )
    img = caffe.io.resize( img, (SIZE, SIZE, 3) ) # resize to fixed size
    # you may apply other input transformations here...
    # Note that the transformation should take img from size-by-size-by-3
    # and transpose it to 3-by-size-by-size, for example:
    transposed_img = img.transpose((2,0,1))[::-1,:,:] # HWC -> CHW, then RGB->BGR
    X[i] = transposed_img
    y[i] = float(sp[1])
with h5py.File('train.h5','w') as H:
    H.create_dataset( 'X', data=X ) # note the name X given to the dataset!
    H.create_dataset( 'y', data=y ) # note the name y given to the dataset!
with open('train_h5_list.txt','w') as L:
    L.write( 'train.h5' ) # list all h5 files you are going to use

</code>
 </pre>
  
    Once you have all the
     <code>h5</code>
     files and the corresponding text file listing them, you can add an HDF5 input layer to your
     <code>train_val.prototxt</code>
    :
  
   <pre>
 <code>
 layer {
 type: "HDF5Data"
 top: "X" # same name as given in create_dataset!
 top: "y"
 hdf5_data_param {
 source: "train_h5_list.txt" # do not give the h5 files directly, but the list.
 batch_size: 32
 }
 include { phase:TRAIN }
 }

</code>
 </pre>
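    The comment in the script above suggests splitting the data into several HDF5 files when memory is tight. The following is only a minimal sketch of that bookkeeping: the helper names <code>chunk_ranges</code> and <code>write_h5_list</code> and the file names <code>train_0.h5</code>, <code>train_1.h5</code>, ... are assumptions made for illustration, and the arrays themselves would still be written with <code>h5py</code> as in the script above. The list file holds one HDF5 path per line.

```python
import os
import tempfile

def chunk_ranges(n_samples, chunk_size):
    """Yield (start, stop) index pairs that together cover range(n_samples)."""
    for start in range(0, n_samples, chunk_size):
        yield start, min(start + chunk_size, n_samples)

def write_h5_list(h5_paths, list_path):
    """Write one HDF5 file path per line (the file given as `source`)."""
    with open(list_path, 'w') as f:
        for p in h5_paths:
            f.write(p + '\n')

# Example: 10 samples in chunks of 4 gives three files (names are hypothetical).
ranges = list(chunk_ranges(10, 4))
h5_paths = ['train_%d.h5' % i for i in range(len(ranges))]

# For each (start, stop) you would write X[start:stop] and y[start:stop]
# into the matching file, using the same dataset names 'X' and 'y'.
list_path = os.path.join(tempfile.mkdtemp(), 'train_h5_list.txt')
write_h5_list(h5_paths, list_path)
```

    Every file should use the same dataset names, since the <code>top</code> entries of the layer refer to them.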
  <HR />
  
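    <strong>Clarification</strong>:
    When I say "caffe supports only one integer label per input image", I do not mean that the leveldb/lmdb containers are limited. I mean the tools that ship with caffe, specifically the
    <a href="https://stackoverflow.com/a/31431716/1714410">
       <code>convert_imageset</code>
    </a>
     tool.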
     
    
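    On closer inspection, it seems that caffe stores data of type
     <code>Datum</code>
     in leveldb/lmdb, and the "label" property of this type is defined as an integer (see
    <a href="https://github.com/BVLC/caffe/blob/master/src/caffe/proto/caffe.proto#L30" rel="nofollow noreferrer">
      caffe.proto
    </a>
    ). So when you use the caffe interface to leveldb/lmdb, you are restricted to a single int32 label per image.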
  
</div>

1# 大黑骡子王 | 2019-08-31 10-32

<div class="post-text" itemprop="text">
  
    In addition to
    <a href="https://stackoverflow.com/a/31808324/6281477">
      @Shai's answer
    </a>
     above, I wrote a
    <strong>
      <a href="https://github.com/DaleSong89/caffe-batch-normalization" rel="nofollow noreferrer">
        MultiTaskData
      </a>
    </strong>
     layer that supports
     <code>float</code>
     typed labels.
  
  
    The main idea is to store the labels in the
     <code>float_data</code>
     field of
     <code>Datum</code>
    . The
     <code>MultiTaskDataLayer</code>
     then parses them as labels for any number of tasks, according to the values of
     <code>task_num</code>
     and
     <code>label_dimension</code>
     set in
     <code>net.prototxt</code>
    . The related files are
     <code>caffe.proto</code>
    ,
     <code>multitask_data_layer.hpp/cpp</code>
    , and
     <code>io.hpp/cpp</code>
    .
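    To illustrate the parsing idea only, here is a rough Python sketch. It is not the layer's actual C++ code: the function name <code>split_labels</code> is hypothetical, and it assumes that every task has the same <code>label_dimension</code> and that the labels of all tasks are stored back to back in one flat float list.

```python
def split_labels(float_data, task_num, label_dimension):
    """Split a flat list of floats into task_num vectors of label_dimension each.

    Assumption for this sketch: labels are laid out contiguously, task by task.
    """
    expected = task_num * label_dimension
    if len(float_data) != expected:
        raise ValueError('expected %d values, got %d' % (expected, len(float_data)))
    return [float_data[t * label_dimension:(t + 1) * label_dimension]
            for t in range(task_num)]

# Two tasks, each with a 5-dimensional float label vector.
flat = [0.1, 0.1, 0.5, 0.2, 0.1, 0.9, 0.0, 0.0, 0.1, 0.0]
labels = split_labels(flat, task_num=2, label_dimension=5)
```

    With <code>task_num: 1</code>, as in the configuration below, the whole float vector becomes a single label.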
  
  
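    You can easily add this layer to your own caffe and use it as follows. This example is for a facial expression label distribution learning task, where "exp_label" can be a float vector such as [0.1, 0.1, 0.5, 0.2, 0.1], representing a probability distribution over facial expressions (5 classes):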
  
   <pre>
 <code>
 name: "xxxNet"
 layer {
 name: "xxx"
 type: "MultiTaskData"
 top: "data"
 top: "exp_label"
 data_param { 
 source: "expression_ld_train_leveldb" 
 batch_size: 60 
 task_num: 1
 label_dimension: 8
 }
 transform_param {
 scale: 0.00390625
 crop_size: 60
 mirror: true
 }
 include:{ phase: TRAIN }
 }
 layer { 
 name: "exp_prob" 
 type: "InnerProduct"
 bottom: "data" 
 top: "exp_prob" 
 param {
 lr_mult: 1
 decay_mult: 1
 }
 param {
 lr_mult: 2
 decay_mult: 0
 }
 inner_product_param {
 num_output: 8
 weight_filler {
 type: "xavier"
 } 
 bias_filler { 
 type: "constant"
 } 
 }
 }
 layer { 
 name: "exp_loss" 
 type: "EuclideanLoss" 
 bottom: "exp_prob" 
 bottom: "exp_label"
 top: "exp_loss"
 include:{ phase: TRAIN }
 }

</code>
 </pre>
</div>

2# 满目山河 | 2019-08-31 10-32

<div class="post-text" itemprop="text">
  
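    I ended up transposing the image, switching the channel order, and using unsigned ints rather than floats to get results. I suggest reading an image back from your HDF5 file to make sure it displays correctly.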
  
  
    First, read the image as unsigned ints:
  
  
     <code>
 img = np.array(Image.open('images/' + image_name))
 </code>
  
  
    Then change the channel order from RGB to BGR:
  
  
     <code>
 img = img[:, :, ::-1]
 </code>
  
  
    Finally, switch from Height x Width x Channels to Channels x Height x Width:
  
  
     <code>
 img = img.transpose((2, 0, 1))
 </code>
  
  
    Merely changing the shape (reshaping) will scramble your image and ruin your data!
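    The difference between transposing and reshaping can be checked on a tiny array. This is a small NumPy sketch with a made-up 2x2 image, not part of the original script:

```python
import numpy as np

# A tiny 2x2 RGB image (H x W x C) in which every value is distinct.
img = np.arange(2 * 2 * 3, dtype=np.uint8).reshape(2, 2, 3)

bgr = img[:, :, ::-1]            # RGB -> BGR
chw = bgr.transpose((2, 0, 1))   # H x W x C -> C x H x W

# Round trip: undo the transpose, then undo the channel flip.
restored = chw.transpose((1, 2, 0))[:, :, ::-1]

# reshape only reinterprets the flat buffer; it does not move the axes.
reshaped = img.reshape(3, 2, 2)

same_after_round_trip = np.array_equal(restored, img)
reshape_matches_transpose = np.array_equal(reshaped, img.transpose((2, 0, 1)))
```

    Here <code>same_after_round_trip</code> is true, while <code>reshape_matches_transpose</code> is false: the reshaped array has the right shape but mixes pixels from different channels.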
  
  
    To read the image back:
  
   <pre>
 <code>
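import h5py
from skimage.viewer import ImageViewer

with h5py.File(h5_filename, 'r') as hf:
    images_test = hf.get('images')
    targets_test = hf.get('targets')
    for i, img in enumerate(images_test):
        print(targets_test[i])
        # Undo the storage layout: C x H x W (BGR) -> H x W x C (RGB).
        # A plain reshape here would scramble the image, as noted above.
        viewer = ImageViewer(img.transpose((1, 2, 0))[:, :, ::-1])
        viewer.show()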

</code>
 </pre>
  
    Here is a script I wrote that handles two labels (steering and speed) for a self-driving car task:
    <a href="https://gist.github.com/crizCraig/aa46105d34349543582b177ae79f32f0" rel="nofollow">
      https://gist.github.com/crizCraig/aa46105d34349543582b177ae79f32f0
    </a>
  
</div>