Project author: yhenon

Project description:
Spatial pyramid pooling layers for Keras
Language: Python
Repository: git://github.com/yhenon/keras-spp.git
Created: 2016-11-15T18:51:15Z
Project page: https://github.com/yhenon/keras-spp

License: MIT License


keras-spp

Spatial pyramid pooling layers for Keras, based on https://arxiv.org/abs/1406.4729. This code requires Keras version 2.0 or greater.

[Figure: spp]

(Image credit: Spatial Pyramid Pooling in Deep Convolutional Networks for Visual Recognition, K. He, X. Zhang, S. Ren, J. Sun)

Three types of pooling layers are currently available:

  • SpatialPyramidPooling: applies the pooling procedure to the entire image, for each image in a batch. This is especially useful when the
    input images can vary in size but the result must be fed to a fully connected layer, which needs a fixed-length input.

For example, this trains a network on images of both 32x32 and 64x64 size:

    import numpy as np
    from keras.models import Sequential
    from keras.layers import Conv2D, Activation, MaxPooling2D, Dense
    from spp.SpatialPyramidPooling import SpatialPyramidPooling

    batch_size = 64
    num_channels = 3
    num_classes = 10

    model = Sequential()

    # Uses Theano (channels-first) ordering, so this assumes the Keras backend is
    # configured for channels-first images. The image size is left as None to
    # allow multiple image sizes.
    model.add(Conv2D(32, (3, 3), padding='same', input_shape=(num_channels, None, None)))
    model.add(Activation('relu'))
    model.add(Conv2D(32, (3, 3)))
    model.add(Activation('relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))
    model.add(Conv2D(64, (3, 3), padding='same'))
    model.add(Activation('relu'))
    model.add(Conv2D(64, (3, 3)))
    model.add(Activation('relu'))
    # Pool over 1x1, 2x2 and 4x4 grids so the output length does not depend on image size
    model.add(SpatialPyramidPooling([1, 2, 4]))
    model.add(Dense(num_classes))
    model.add(Activation('softmax'))

    model.compile(loss='categorical_crossentropy', optimizer='sgd')

    # train on 64x64x3 images (random inputs and all-zero labels, for demonstration only)
    model.fit(np.random.rand(batch_size, num_channels, 64, 64), np.zeros((batch_size, num_classes)))
    # train on 32x32x3 images
    model.fit(np.random.rand(batch_size, num_channels, 32, 32), np.zeros((batch_size, num_classes)))
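
To see why the layer yields a fixed-length vector for any input size, the pooling step can be sketched in plain NumPy. This is an illustrative sketch, not the library's implementation: the function name spp_max_pool, the channels-last layout, and the way bin boundaries are rounded are all assumptions made for the example.

```python
import numpy as np


def spp_max_pool(feature_map, pool_list):
    """Max-pool a (rows, cols, channels) array over a pyramid of grids.

    For each n in pool_list the map is split into an n x n grid of bins and
    the per-channel maximum of every bin is kept, so the output length is
    channels * sum(n * n for n in pool_list), whatever rows and cols are.
    """
    rows, cols, _ = feature_map.shape
    pooled = []
    for n in pool_list:
        for i in range(n):
            # Floor the start and ceil the end so that no bin is ever empty
            # (one possible rounding choice, assumed for this sketch).
            r0, r1 = (i * rows) // n, ((i + 1) * rows + n - 1) // n
            for j in range(n):
                c0, c1 = (j * cols) // n, ((j + 1) * cols + n - 1) // n
                pooled.append(feature_map[r0:r1, c0:c1, :].max(axis=(0, 1)))
    return np.concatenate(pooled)


# Two feature maps of different spatial size, both with 64 channels
large = spp_max_pool(np.random.rand(29, 29, 64), [1, 2, 4])
small = spp_max_pool(np.random.rand(13, 13, 64), [1, 2, 4])
print(large.shape, small.shape)  # (1344,) (1344,) since 64 * (1 + 4 + 16) = 1344
```

Because both vectors have the same length, a Dense layer placed after the pooling step sees a constant input size even though the image size was left as None.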

  • RoiPooling: extracts multiple ROIs (regions of interest) from a single image. In ROI pooling, spatial pyramid pooling is applied to the specified subregions of the image. This is useful for object detection, and is used in Fast R-CNN and Faster R-CNN. Note that batch_size is currently limited to 1.

    import numpy as np
    from keras.layers import Input
    from keras.models import Model
    # Import path assumed by analogy with spp.SpatialPyramidPooling
    from spp.RoiPooling import RoiPooling

    # Not defined in the original snippet; example values (assumptions)
    dim_ordering = 'tf'  # 'tf' (channels-last) or 'th' (channels-first); should match the backend
    img_size = 32

    pooling_regions = [1, 2, 4]
    num_rois = 2
    num_channels = 3

    if dim_ordering == 'tf':
        in_img = Input(shape=(None, None, num_channels))
    elif dim_ordering == 'th':
        in_img = Input(shape=(num_channels, None, None))
    in_roi = Input(shape=(num_rois, 4))

    out_roi_pool = RoiPooling(pooling_regions, num_rois)([in_img, in_roi])
    model = Model([in_img, in_roi], out_roi_pool)

    # Size of one pooling bin along each axis, per pyramid level
    if dim_ordering == 'th':
        X_img = np.random.rand(1, num_channels, img_size, img_size)
        row_length = [float(X_img.shape[2]) / i for i in pooling_regions]
        col_length = [float(X_img.shape[3]) / i for i in pooling_regions]
    elif dim_ordering == 'tf':
        X_img = np.random.rand(1, img_size, img_size, num_channels)
        row_length = [float(X_img.shape[1]) / i for i in pooling_regions]
        col_length = [float(X_img.shape[2]) / i for i in pooling_regions]

    # Two ROIs: the full image and its top-left quarter
    X_roi = np.array([[0, 0, img_size / 1, img_size / 1],
                      [0, 0, img_size / 2, img_size / 2]])
    X_roi = np.reshape(X_roi, (1, num_rois, 4))

    Y = model.predict([X_img, X_roi])
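
As a rough illustration of what ROI pooling computes, the sketch below crops each region from a single feature map and applies pyramid max pooling to the crop. It is not the library's code: the helper name roi_pyramid_pool, the channels-last layout, and the (x, y, w, h) ordering of each ROI are assumptions made for the example, and ROIs are assumed to overlap the map.

```python
import numpy as np


def roi_pyramid_pool(feature_map, rois, pool_list):
    """Pyramid max pooling inside each ROI of one (rows, cols, channels) map.

    rois: iterable of (x, y, w, h) in pixels (ordering assumed for this sketch).
    Returns shape (num_rois, channels * sum(n * n for n in pool_list)).
    """
    outputs = []
    for x, y, w, h in rois:
        x, y, w, h = int(x), int(y), int(w), int(h)
        crop = feature_map[y:y + h, x:x + w, :]
        # Use the actual crop size in case the ROI was clipped by the border
        h, w = crop.shape[:2]
        pooled = []
        for n in pool_list:
            for i in range(n):
                # Floor the start and ceil the end so that no bin is empty
                r0, r1 = (i * h) // n, ((i + 1) * h + n - 1) // n
                for j in range(n):
                    c0, c1 = (j * w) // n, ((j + 1) * w + n - 1) // n
                    pooled.append(crop[r0:r1, c0:c1, :].max(axis=(0, 1)))
        outputs.append(np.concatenate(pooled))
    return np.stack(outputs)


img_size = 32
image = np.random.rand(img_size, img_size, 3)
# The full image and its top-left quarter
rois = [(0, 0, img_size, img_size), (0, 0, img_size // 2, img_size // 2)]
result = roi_pyramid_pool(image, rois, [1, 2, 4])
print(result.shape)  # (2, 63): 2 ROIs, 3 channels * (1 + 4 + 16) bins
```

Each ROI produces a vector of the same length regardless of its width and height, which is what lets a detection head process regions of different sizes with shared fully connected layers.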
  • RoiPoolingConv: like RoiPooling, but maintains spatial information.

  • Thank you to @jlhbaseball15 for his contribution
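
One way to picture the difference between RoiPooling and RoiPoolingConv described above: instead of flattening the pooled bins into a single vector, a layer that maintains spatial information keeps them arranged as a grid. The NumPy sketch below illustrates only that general idea and is not taken from RoiPoolingConv. The name roi_grid_pool is hypothetical, and the single pool_size x pool_size grid, the channels-last layout, and the (x, y, w, h) ROI ordering are assumptions.

```python
import numpy as np


def roi_grid_pool(feature_map, roi, pool_size):
    """Max-pool one ROI of a (rows, cols, channels) map to a fixed grid.

    Returns shape (pool_size, pool_size, channels), so the relative position
    of each bin inside the ROI is preserved instead of being flattened away.
    """
    x, y, w, h = (int(v) for v in roi)
    crop = feature_map[y:y + h, x:x + w, :]
    h, w, channels = crop.shape
    grid = np.empty((pool_size, pool_size, channels), dtype=crop.dtype)
    for i in range(pool_size):
        # Floor the start and ceil the end so that no bin is empty
        r0, r1 = (i * h) // pool_size, ((i + 1) * h + pool_size - 1) // pool_size
        for j in range(pool_size):
            c0, c1 = (j * w) // pool_size, ((j + 1) * w + pool_size - 1) // pool_size
            grid[i, j] = crop[r0:r1, c0:c1, :].max(axis=(0, 1))
    return grid


feature_map = np.random.rand(32, 32, 3)
grid = roi_grid_pool(feature_map, (0, 0, 16, 16), pool_size=7)
print(grid.shape)  # (7, 7, 3)
```

A grid-shaped output like this can be passed to further convolutional layers, whereas a flattened vector can only feed dense layers.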