Project author: baggepinnen

Project description: Utilities for clustering of audio samples
Language: Julia
Project URL: git://github.com/baggepinnen/AudioClustering.jl.git
Created: 2019-10-04T09:46:28Z
Project community: https://github.com/baggepinnen/AudioClustering.jl

License:

AudioClustering

This package contains experiments and utilities for unsupervised learning on acoustic recordings. It is a use case of SpectralDistances.jl.

Installation

    using Pkg
    pkg"add https://github.com/baggepinnen/DetectionIoTools.jl"
    pkg"add https://github.com/baggepinnen/AudioClustering.jl"

Examples

Estimating linear models

The following code illustrates how to use SpectralDistances.jl to fit rational spectra to audio samples and extract the poles for use as features:

    using SpectralDistances, Glob
    path = "/home/fredrikb/birds/" # path to a bunch of wav files
    cd(path)
    files = glob("*.wav")
    const fs = 44100
    na = 20
    fitmethod = LS(na=na)
    models = mapsoundfiles(files) do sound
        sound = SpectralDistances.bp_filter(sound, (50/fs, 18000/fs))
        SpectralDistances.fitmodel(fitmethod, sound)
    end

We now have a vector of vectors with linear models fit to the sound files. To make this easier to work with, we flatten this structure into a single long vector and extract the poles (roots) of the linear systems to use as features:

    X = embeddings(models)

We now have some audio data, represented as poles of rational spectra, in a matrix X. See https://baggepinnen.github.io/SpectralDistances.jl/latest/examples/#Examples-1 for examples of how to use this matrix for analysis of the signals, e.g., classification, clustering and detection.
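As a minimal, hedged sketch of such an analysis (assuming Clustering.jl is installed and that X is a real-valued matrix with one column per recording; transpose it if the orientation is the other way), k-means can be run directly on the pole features:

    # Hedged sketch, not from the package docs: cluster the pole features with k-means.
    using Clustering
    result = kmeans(X, 5)                  # look for 5 clusters
    cluster_of_file = assignments(result)  # cluster index for each recording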

Graph-based clustering

Model based

A graph representation of X can be obtained with

    G = audiograph(X, 5; λ=0)

where k=5 is the number of nearest neighbors considered when building the graph. If λ=0, the graph will be weighted by distance, whereas if λ>0, the graph will be weighted according to adjacency under the kernel exp(-λ*d). The metric used is the Euclidean distance. If you want to use a more sophisticated distance, try, e.g.,

    dist = OptimalTransportRootDistance(domain=Continuous(), p=2)
    G = audiograph(X, 5, dist; λ=0)

Here, the Euclidean distance will be used to select neighbors, but the edges will be weighted using the provided distance. This avoids having to calculate a very large number of pairwise distances using the more expensive distance metric.

Any graph-based algorithm may now operate on G, or on the field G.weight. Further examples are available here.
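For instance, a hedged sketch of community detection on G (assuming G is an AbstractGraph compatible with Graphs.jl, e.g., a SimpleWeightedGraph) could look like this:

    # Hedged sketch: community detection by label propagation.
    # Note that this particular algorithm uses connectivity only, not edge weights.
    using Graphs
    labels, convergence = label_propagation(G) # one community label per node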

Spectrogram based

The following snippets show how to preprocess data to a suitable form for clustering using this package:

    using Glob, WAV # Glob for file matching, WAV for wavread
    # melspectrogram, normalize_spectrogram and initialize_clusters below are assumed
    # to come from SpectralDistances/AudioClustering, loaded beforehand.
    files = glob("*.wav") # Vector of file paths
    const fs = Int(wavread(files[1])[2]) # Read the sample rate
    N = length(files)
    using TotalLeastSquares # For lowrankfilter
    function lrfilt(y)
        yf = lowrankfilter(y, min(250, length(y)-1100), lag=10)
    end
    "Perform some simple threshold filtering and calculate a spectrogram"
    function spec(sound)
        @. sound = Float32(100000 * clamp(sound, -0.015f0, 0.015f0)) # the 100000 multiplier is to normalize the Float32 data for better numerical performance. Tune all parameters to your use case.
        # sound = lrfilt(sound) # This is an alternative to the above which is *much* better, but also much slower
        melspectrogram(sound, 100, 70, nmels=30, fs=fs, fmin=5) # Spend some time making sure the spectrogram representation is good.
    end
    using ThreadTools # For tmap
    spectrograms = tmap(files) do file
        sound = spec(vec(wavread(file)[1]))
    end
    matrices = [Float32.(max.(normalize_spectrogram(s), 1e-7)) for s in spectrograms]
    # matrices_masked = mask_filter.(matrices) # This is an alternative if the lowrankfilter is not used https://baggepinnen.github.io/SpectralDistances.jl/latest/distances/#SpectralDistances.mask_filter
    inds, D = initialize_clusters(dist, matrices; init_multiplier = 10, N_seeds = 100) # `dist` is a distance from SpectralDistances, defined by the user
    patterns = matrices[inds] # These should be good cluster seeds
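With the seed patterns extracted, a hedged follow-up sketch (not part of the package API, and assuming dist(a, b) returns a scalar distance between two spectrogram matrices) is to assign every spectrogram to its nearest seed:

    # Hedged sketch: assign each spectrogram to the nearest seed pattern.
    assignment = map(matrices) do m
        argmin([dist(p, m) for p in patterns])
    end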

Distance matrix-based clustering

See docs entry Clustering using a distance matrix
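The idea, in a hedged sketch (assuming D is a dense, symmetric matrix of pairwise distances and Clustering.jl is available), is to feed the precomputed distances to a standard algorithm such as hierarchical clustering:

    # Hedged sketch: hierarchical clustering on a precomputed distance matrix.
    using Clustering
    tree = hclust(D, linkage=:average)
    labels = cutree(tree, k=5) # cut the tree into 5 clusters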

Feature-based clustering

See docs entry Clustering using features

Accelerated k-nearest neighbor

    inds, dists, D = knn_accelerated(dist, X, k, Xe=X; kwargs...)

Find the nearest neighbor under the distance metric dist by first finding the k nearest neighbors using the Euclidean distance on embeddings produced from Xe, and then using dist to find the smallest distance within those k.

X is assumed to be a vector of something dist can operate on, such as a vector of models from SpectralDistances. Xe defaults to X, but may be something else as long as embeddings(Xe) is defined; a vector of models or spectrograms has this function defined.

D is a sparse matrix with all the computed distances from dist. This matrix contains raw distance measurements; to symmetrize it, call SpectralDistances.symmetrize!(D). The returned dists are already symmetrized.
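A hedged usage sketch, reusing the models fitted earlier and the OptimalTransportRootDistance from above:

    # Hedged sketch: nearest neighbors under an expensive metric,
    # pre-filtered by k = 10 Euclidean candidates per query point.
    dist = OptimalTransportRootDistance(domain=Continuous(), p=2)
    inds, dists, D = knn_accelerated(dist, models, 10)
    SpectralDistances.symmetrize!(D) # symmetrize the raw sparse distances if needed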