Project author: rexwangcc

Project description: My useful resources of Deep Learning in Tensorflow

Primary language: Jupyter Notebook

Project URL: git://github.com/rexwangcc/Joyful-Deep-Leaning-I.git

Created: 2017-04-10T04:50:01Z

Project community: https://github.com/rexwangcc/Joyful-Deep-Leaning-I

License: not specified

Joyful Deep Learning - I

This repository is a collection of my useful resources for Deep Learning in TensorFlow, organized in 5 sections:

  1. Machine Learning and Deep Learning Basics in Math and Numpy
  2. Deep Learning Basics in Math, Numpy and Scikit-Learn
  3. Deep Learning Basics in TensorFlow
  4. Deep Learning Advanced in TensorFlow - CNN and TensorBoard
  5. Deep Learning Advanced in TensorFlow - RNN, LSTM and RBM

I would be really grateful if you contribute to or clone this repository; commercial use is not welcome. Thanks for the help of Prof. Brian Kulis, Prof. Kate Saenko and the TFs of CS591-S2 (Deep Learning) at Boston University. Of course, thanks also to Google's open-source TensorFlow!

All results in the Jupyter notebooks were trained on a GTX 1070; training on CPUs may take considerably longer.

This README file is rendered with readme2tex.








Section 1 Content - Machine Learning and Deep Learning Basics in Math and Numpy (click to view full notebook)

  • Coding requirements:
```python
# Python 3.5+
import numpy as np
import matplotlib.pyplot as plt
from scipy.spatial.distance import cosine
import matplotlib.cm as cm
```
  • Closed-Form Maximum Likelihood mathematical derivation:

    • $P(x \ | \ \theta) = \theta e^{-\theta x}$ for $x \geq 0$

    • $P(x \ | \ \theta) = \frac{1}{\theta}$ for $ 0 \leq x \leq \theta$
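
    • For reference, the standard closed-form solutions (stated here as well-known results, not copied from the notebook) are $\hat\theta_{MLE} = \frac{n}{\sum_{i=1}^{n} x_i}$ for the exponential model and $\hat\theta_{MLE} = \max_i x_i$ for the uniform model, given $n$ i.i.d. observations $x_1, \dots, x_n$.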

  • Gradient for Maximum Likelihood Estimation mathematical derivation:

    • Gradients for log-likelihood of the following model:

      • we have $X \in \mathbf R^{n \times k}$ - constant data matrix, $\mathbf x_i$ - vector corresponding to a single data point

      • $\theta$ is a $k$-dimensional (unknown) weight vector

      • $\varepsilon \sim \text{Student}(v)$ is a $n$-dimensional (unknown) noise vector

      • and we observe vector $\mathbf y = X\theta + \varepsilon$

      • $$ P(y_i \ | \ \mathbf x_i, \theta, v) = \frac{1}{Z(v)} \Big(1 + \frac{(\theta^T \mathbf x_i - y_i) ^2}{v}\Big)^{-\frac{v+1}{2}}$$

    • Stochastic Gradient Descent Implementation
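
A minimal numpy sketch of one stochastic gradient step for this model (my own illustrative code, not the notebook's implementation; the gradient follows from differentiating the log-density above, and the learning rate and data are placeholders):

```python
import numpy as np

def sgd_step(theta, x_i, y_i, v, lr=0.01):
    # One ascent step on log P(y_i | x_i, theta, v) for a single data point.
    # With r = theta . x_i - y_i, the gradient w.r.t. theta is
    # -(v + 1) * r * x_i / (v + r**2).
    r = theta @ x_i - y_i
    grad = -(v + 1.0) * r * x_i / (v + r ** 2)
    return theta + lr * grad

# Illustrative usage on random data
rng = np.random.RandomState(0)
X, y = rng.randn(100, 5), rng.randn(100)
theta, v = np.zeros(5), 3.0
for i in rng.permutation(len(X)):
    theta = sgd_step(theta, X[i], y[i], v)
```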

  • Matrix Derivatives mathematical derivation:

    • Multivariate Gaussian:

      • $ \frac{\partial \mathcal L(\Sigma)}{\partial \Sigma} = -\frac12 \left( \frac{1}{|\Sigma|} |\Sigma| \Sigma^{-T}  - \Sigma^{-T} (x- \bar \mu)(x-\bar \mu)^T\Sigma^{-T} \right) =  -\frac12 \left(\Sigma^{-T}  - \Sigma^{-T} (x- \bar \mu)(x-\bar \mu)^T\Sigma^{-T} \right)$
    • Multi-target Linear Regression model:

      • we have $X \in \mathbf R^{n \times k}$, a constant data matrix

      • $\theta$ is a $k \times m$-dimensional weight matrix

      • $\varepsilon_{ij} \sim \mathcal N(0, \sigma_\epsilon)$ is a normal noise ($i \in [0; n], j \in [0;m]$)

      • and we observe a matrix $Y = X\theta + \varepsilon \in \mathbf R^{n \times m}$

      • $$\varepsilon = Y - X\theta \sim \mathcal N_n(0, \sigma_\epsilon I)$$

      • $$\mathcal L(\theta) = \log P(Y - X\theta \ | \ \theta) = \log \mathcal N_n(Y - X\theta \ | \ 0, \sigma_\epsilon I)$$

      • $$\theta_{MLE} = \arg \max_{\theta} \mathcal L(\theta) = \arg \min_{\theta} \text{loss}(\theta) = \arg \min_{\theta} \big( ||Y-X\theta||^2_F \big)$$

      • Derivation: $\frac{\partial\text{loss}(\theta)}{\partial \theta} = -2X^T (Y-X\theta)$

      • Derivation: $\theta_{MLE} = (X^T X)^{-1} X^T Y$
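
A quick numpy check of the closed-form solution above on synthetic data (an illustrative sketch, not taken from the notebook):

```python
import numpy as np

rng = np.random.RandomState(0)
n, k, m = 200, 5, 3
X = rng.randn(n, k)
theta_true = rng.randn(k, m)
Y = X @ theta_true + 0.1 * rng.randn(n, m)

# theta_MLE = (X^T X)^{-1} X^T Y, computed via the normal equations
theta_mle = np.linalg.solve(X.T @ X, X.T @ Y)
print(np.allclose(theta_mle, np.linalg.lstsq(X, Y)[0]))  # agrees with least squares
```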

  • Logistic Regression mathematical derivation

  • Logistic Regression implementation
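
For the two items above, a minimal numpy sketch of the standard binary logistic regression gradient (the notebook's own implementation may be organized differently):

```python
import numpy as np

def sigmoid(z):
    # numerically stable sigmoid
    return np.exp(-np.logaddexp(0, -z))

def logistic_gradient(theta, X, y):
    # Gradient of the average negative log-likelihood, labels y in {0, 1}:
    #   dL/dtheta = X^T (sigmoid(X theta) - y) / N
    return X.T @ (sigmoid(X @ theta) - y) / len(y)

# one gradient-descent step (illustrative learning rate):
#   theta -= 0.1 * logistic_gradient(theta, X, y)
```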

Section 2 Content - Deep Learning Basics in Math, Numpy and Scikit-Learn (click to view full notebook)

  • Coding requirements:
```python
# Python 3.5+
import numpy as np
import matplotlib.pyplot as plt
from scipy.misc import imread
from sklearn.datasets import fetch_mldata
```
  • Cross-Entropy and Softmax mathematical derivation:

    • Minimizing the multiclass cross-entropy loss function to obtain the maximum likelihood estimate of the parameters $\theta$:

      • $L(\theta)= - \frac{1}{N}\sum_{i=1}^{N} \sum_{k=1}^{K} y_{ik} \log(h_k(x_i,\theta))$ where $N$ is the number of examples $\{x_i,y_i\}$
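
A minimal numpy sketch of this loss with a numerically stabilized softmax (variable names are illustrative, not the notebook's):

```python
import numpy as np

def softmax(Z):
    # subtract the row-wise max before exponentiating, for numerical stability
    E = np.exp(Z - Z.max(axis=1, keepdims=True))
    return E / E.sum(axis=1, keepdims=True)

def cross_entropy(Y_onehot, H):
    # L = -1/N * sum_i sum_k y_ik * log(h_k(x_i, theta))
    return -np.mean(np.sum(Y_onehot * np.log(H + 1e-12), axis=1))

# H = softmax(X @ theta); loss = cross_entropy(Y_onehot, H)
```
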
  • Simple Regularization Methods:

    • L2 regularization

    • L1 regularization
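
A short sketch of how these two penalties modify a loss and its gradient (standard formulations; lam is an illustrative hyperparameter name):

```python
import numpy as np

def l2_penalty(theta, lam):
    # adds lam * ||theta||_2^2 to the loss; gradient contribution is 2 * lam * theta
    return lam * np.sum(theta ** 2), 2.0 * lam * theta

def l1_penalty(theta, lam):
    # adds lam * ||theta||_1 to the loss; (sub)gradient contribution is lam * sign(theta)
    return lam * np.sum(np.abs(theta)), lam * np.sign(theta)
```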

  • Backprop in a simple MLP - Multi-layer perceptron’s mathematical derivation:



  • XOR problem - A Neural network to solve the XOR problem: (This is a really good example to help us understand the essence of neural networks)
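
A minimal numpy sketch of such a network (my own illustrative version: 2 inputs, a small sigmoid hidden layer, 1 output, full-batch gradient descent on squared error; layer size, seed and learning rate are assumptions and may need tweaking to converge):

```python
import numpy as np

def sigmoid(z):
    return np.exp(-np.logaddexp(0, -z))

rng = np.random.RandomState(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])
y = np.array([[0.], [1.], [1.], [0.]])

W1, b1 = rng.randn(2, 4), np.zeros((1, 4))   # hidden layer
W2, b2 = rng.randn(4, 1), np.zeros((1, 1))   # output layer
lr = 1.0

for step in range(5000):
    # forward pass
    h = sigmoid(X @ W1 + b1)
    p = sigmoid(h @ W2 + b2)
    # backward pass for the squared-error loss 0.5 * ||p - y||^2
    d_out = (p - y) * p * (1 - p)
    d_hid = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0, keepdims=True)
    W1 -= lr * (X.T @ d_hid); b1 -= lr * d_hid.sum(axis=0, keepdims=True)

print(np.round(p).ravel())  # should approach [0, 1, 1, 0] once training succeeds
```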



  • Implementing a simple MLP - Implement an MLP by hand in numpy and scipy

    • Implementations of common useful activation functions that avoid numerical accuracy problems:

      • softplus function:

```python
import numpy as np

def softplus(x):
    # softplus(x) = log(1 + exp(x)), computed stably via logaddexp
    return np.logaddexp(0, x)

def derivative_softplus(x):
    # d/dx softplus(x) = sigmoid(x), computed stably
    return np.exp(-np.logaddexp(0, -x))
```
      • sigmoid function:

```python
import numpy as np

def sigmoid(x):
    # 1 / (1 + exp(-x)), computed stably via logaddexp
    return np.exp(-np.logaddexp(0, -x))

def derivative_sigmoid(x):
    # sigmoid(x) * (1 - sigmoid(x))
    return np.multiply(np.exp(-np.logaddexp(0, -x)), 1. - np.exp(-np.logaddexp(0, -x)))
```
      • relu function:

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)

def derivative_relu(x):
    # 1 where x > 0 and 0 elsewhere; vectorized, and returns a new array
    # instead of modifying x in place
    return (x > 0).astype(x.dtype)
```
    • Forward pass implementation

    • Backward pass implementation

    • Test MLP on MNIST dataset and its visualization

Section 3 Content - Deep Learning Basics in TensorFlow (click to view full notebook)

  • Coding requirements:
```python
# Python 3.5+
import numpy as np
# tensorflow-gpu==1.0.1 or tensorflow==1.0.1
import tensorflow as tf
from matplotlib import pyplot as plt
# Scikit-learn's TSNE is relatively slow, use BHTSNE as a faster alternative:
# https://github.com/dominiek/python-bhtsne
from sklearn.manifold import TSNE
```
  • MNIST Softmax Classifier Demo in TensorFlow
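
A minimal TensorFlow 1.x sketch of such a softmax classifier (graph construction only, with illustrative hyperparameters; the notebook's demo may organize this differently):

```python
import tensorflow as tf

x = tf.placeholder(tf.float32, [None, 784])   # flattened 28x28 MNIST images
y = tf.placeholder(tf.float32, [None, 10])    # one-hot labels

W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
logits = tf.matmul(x, W) + b

loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(labels=y, logits=logits))
train_op = tf.train.GradientDescentOptimizer(0.5).minimize(loss)

correct = tf.equal(tf.argmax(logits, 1), tf.argmax(y, 1))
accuracy = tf.reduce_mean(tf.cast(correct, tf.float32))
```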

  • Building Neural Networks with the power of Variable Scope - MLP in TensorFlow:

    With the power of variable scope, we can implement a very flexible MLP in TensorFlow without hard-coding the layers and weights:

```python
def mlp(x, hidden_sizes, activation_fn=tf.nn.relu):
    '''
    Inputs:
        x: an input tensor of the images in the current batch [batch_size, 28x28]
        hidden_sizes: a list of the number of hidden units per layer. For example:
            [5,2] means 5 hidden units in the first layer, and 2 hidden units in the
            second (output) layer. (Note: for MNIST, we need hidden_sizes[-1]==10
            since it has 10 classes.)
        activation_fn: the activation function to be applied
    Output:
        a tensor of shape [batch_size, hidden_sizes[-1]].
    '''
    if not isinstance(hidden_sizes, (list, tuple)):
        raise ValueError("hidden_sizes must be a list or a tuple")
    # Number of layers
    L = len(hidden_sizes)
    for l in range(L):
        with tf.variable_scope("layer" + str(l)):
            # Create variable named "weights".
            if l == 0:
                weights = tf.get_variable("weights", shape=[x.shape[1], hidden_sizes[l]],
                                          dtype=tf.float32, initializer=None)
            else:
                weights = tf.get_variable("weights", shape=[hidden_sizes[l-1], hidden_sizes[l]],
                                          dtype=tf.float32, initializer=None)
            # Create variable named "biases".
            biases = tf.get_variable("biases", shape=[hidden_sizes[l]],
                                     dtype=tf.float32, initializer=None)
            # Pre-activation layer
            if l == 0:
                pre_activation = tf.add(tf.matmul(x, weights), biases)
            else:
                pre_activation = tf.add(tf.matmul(activated_layer, weights), biases)
            # Activated layer (no activation on the final, output layer)
            if l == L-1:
                activated_layer = pre_activation
            else:
                activated_layer = activation_fn(pre_activation)
    return activated_layer
```
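
For example, the helper above could be used like this (a hypothetical call, not taken from the notebook):

```python
x = tf.placeholder(tf.float32, [None, 784])
with tf.variable_scope("mnist_mlp"):
    logits = mlp(x, hidden_sizes=[256, 10], activation_fn=tf.nn.relu)
```
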
  • Siamese Network in TensorFlow
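
A hedged sketch of the two standard ingredients of a Siamese setup, weight sharing via variable-scope reuse and a contrastive loss (Hadsell et al., 2006); it reuses the mlp() helper defined above, and the notebook's exact loss and architecture may differ:

```python
import tensorflow as tf

x1 = tf.placeholder(tf.float32, [None, 784])   # first image of each pair
x2 = tf.placeholder(tf.float32, [None, 784])   # second image of each pair
same = tf.placeholder(tf.float32, [None])      # 1.0 if the pair shares a label, else 0.0

# Shared weights: both branches reuse the same variables.
with tf.variable_scope("siamese") as scope:
    feat1 = mlp(x1, hidden_sizes=[256, 2])
    scope.reuse_variables()
    feat2 = mlp(x2, hidden_sizes=[256, 2])

# Contrastive loss: pull matching pairs together, push non-matching
# pairs at least `margin` apart in feature space.
margin = 1.0
d = tf.sqrt(tf.reduce_sum(tf.square(feat1 - feat2), axis=1) + 1e-8)
loss = tf.reduce_mean(same * tf.square(d) +
                      (1.0 - same) * tf.square(tf.maximum(margin - d, 0.0)))
```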



  • Visualize learned features of Siamese Network with T-SNE



Section 4 Content - Deep Learning Advanced in TensorFlow - CNN and TensorBoard (click to view full notebook)

  • Coding requirements:
```python
# Python 3.5+
import numpy as np
import scipy
import scipy.io
# tensorflow-gpu==1.0.1 or tensorflow==1.0.1
import tensorflow as tf
from matplotlib import pyplot as plt
# Scikit-learn's TSNE is relatively slow, use BHTSNE as a faster alternative:
# https://github.com/dominiek/python-bhtsne
from sklearn.manifold import TSNE
```
  • Building and training a convolutional network in TensorFlow with tf.layers/tf.contrib
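
A minimal tf.layers sketch of such a network for MNIST-sized inputs (TensorFlow 1.x API; filter counts and kernel sizes are my own illustrative choices):

```python
import tensorflow as tf

def conv_net(x, n_classes=10):
    # x: a [batch, 28, 28, 1] image tensor
    conv1 = tf.layers.conv2d(x, filters=32, kernel_size=5, padding='same',
                             activation=tf.nn.relu)
    pool1 = tf.layers.max_pooling2d(conv1, pool_size=2, strides=2)
    conv2 = tf.layers.conv2d(pool1, filters=64, kernel_size=5, padding='same',
                             activation=tf.nn.relu)
    pool2 = tf.layers.max_pooling2d(conv2, pool_size=2, strides=2)
    flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
    dense = tf.layers.dense(flat, units=256, activation=tf.nn.relu)
    return tf.layers.dense(dense, units=n_classes)   # unscaled logits
```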

  • Building and training a convolutional network by hand in TensorFlow with tf.nn

  • Saving and Reloading Model Weights in TensorFlow
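
The standard tf.train.Saver pattern covers this; a short sketch with a stand-in variable and an illustrative checkpoint path:

```python
import tensorflow as tf

w = tf.Variable(tf.zeros([784, 10]), name="w")   # stand-in for real model weights
saver = tf.train.Saver()                         # saves all variables by default

with tf.Session() as sess:
    sess.run(tf.global_variables_initializer())
    # ... train ...
    saver.save(sess, "./model.ckpt")

with tf.Session() as sess:
    saver.restore(sess, "./model.ckpt")   # reload the saved weights
    # ... evaluate, or fine-tune from the restored weights ...
```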

  • Fine-tuning a pre-trained network

  • Visualizations using TensorBoard:

    • Visualize Filters/Kernels

    • Visualize Loss

    • Visualize Accuracy
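
A minimal sketch of the TensorFlow 1.x summary workflow behind these visualizations (the summary names, placeholder stand-ins and log directory are illustrative):

```python
import tensorflow as tf

# Stand-ins for real training tensors, just to make the example self-contained
loss = tf.placeholder(tf.float32, [], name="loss_value")
accuracy = tf.placeholder(tf.float32, [], name="accuracy_value")

tf.summary.scalar("loss", loss)
tf.summary.scalar("accuracy", accuracy)
# filters can be logged similarly, e.g. tf.summary.image("conv1_kernels", kernel_grid)
merged = tf.summary.merge_all()

with tf.Session() as sess:
    writer = tf.summary.FileWriter("./logs", sess.graph)
    # inside a real training loop this would run the merged op alongside train_op
    summary = sess.run(merged, feed_dict={loss: 0.5, accuracy: 0.9})
    writer.add_summary(summary, global_step=0)
    writer.close()
```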

Section 5 Content - Deep Learning Advanced in TensorFlow - RNN, LSTM and RBM (click to view full notebook)