Project author: TeaPoly

Project description:
Computes the MWER (minimum WER) loss with CTC beam search; knowledge distillation for the CTC loss.
Language: Python
Repository: git://github.com/TeaPoly/CTC-OptimizedLoss.git
Created: 2021-02-19T04:25:03Z
Project page: https://github.com/TeaPoly/CTC-OptimizedLoss

CTC-OptimizedLoss

Loss functions optimized for CTC:

TensorFlow

  • MWER (minimum WER) Loss with CTC beam search.
  • Knowledge distillation for CTC loss (a minimal sketch follows this list).
  • KL divergence loss for label smoothing.
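
As a rough illustration of the distillation item above, the sketch below computes a frame-level KL divergence between teacher and student CTC posteriors in TensorFlow. The function name, the temperature parameter, and the [batch, time, vocab] logit layout are assumptions for this example, not the repository's API.

  import tensorflow as tf

  def ctc_kd_loss(student_logits, teacher_logits, temperature=1.0):
      # Teacher posteriors and student log-posteriors per frame,
      # both softened by the same temperature.
      t_probs = tf.nn.softmax(teacher_logits / temperature, axis=-1)
      s_log_probs = tf.nn.log_softmax(student_logits / temperature, axis=-1)
      # KL(teacher || student), summed over the vocabulary and
      # averaged over batch and time.
      kl = tf.reduce_sum(
          t_probs * (tf.math.log(t_probs + 1e-8) - s_log_probs), axis=-1)
      return tf.reduce_mean(kl)

The KL label-smoothing regularizer in the list can be viewed as the same computation with a uniform distribution in place of the teacher posteriors.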

PyTorch

  • Delay-penalized CTC, implemented with a Finite State Transducer.
  • O-1: Self-training with Oracle and 1-best Hypothesis.
  • MWER (minimum WER) Loss with CTC beam search (a minimal sketch of the objective follows this list).
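
As referenced in the MWER item above, the sketch below shows the expected-error objective over an N-best list in PyTorch. The names and shapes are assumptions for this example: hyp_scores holds per-hypothesis log scores of shape [batch, nbest] (e.g. from CTC beam search) and hyp_wers holds the matching word-error counts as floats; the repository's own classes perform beam search and error counting internally.

  import torch

  def mwer_from_nbest(hyp_scores, hyp_wers):
      # Renormalize per-hypothesis scores over the N-best list
      # to obtain hypothesis posteriors.
      probs = torch.softmax(hyp_scores, dim=-1)
      # Subtract the mean error over the N-best list as a baseline,
      # which reduces gradient variance (Prabhavalkar et al., 2017).
      mean_wer = hyp_wers.mean(dim=-1, keepdim=True)
      # Expected (baseline-subtracted) word errors, averaged over the batch.
      return (probs * (hyp_wers - mean_wer)).sum(dim=-1).mean()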

Example

  weight = 0.01   # interpolation weight
  beam_width = 8  # N-best
  mwer_loss = CTCMWERLoss(beam_width=beam_width)(
      ctc_logits, ctc_labels, ctc_label_length, logit_length)
  ctc_loss = CTCLoss()(
      ctc_logits, ctc_labels, ctc_label_length, logit_length)
  loss = mwer_loss + weight * ctc_loss
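
Interpolating the sequence-level MWER objective with the regular CTC loss through a small weight mirrors the recipe in Prabhavalkar et al. (2017), where an auxiliary maximum-likelihood term is kept in the objective to stabilize minimum-WER training.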

Citations

  @misc{prabhavalkar2017minimum,
    title={Minimum Word Error Rate Training for Attention-based Sequence-to-Sequence Models},
    author={Rohit Prabhavalkar and Tara N. Sainath and Yonghui Wu and Patrick Nguyen and Zhifeng Chen and Chung-Cheng Chiu and Anjuli Kannan},
    year={2017},
    eprint={1712.01818},
    archivePrefix={arXiv},
    primaryClass={cs.CL}
  }

  @misc{gao2021distilling,
    title={Distilling Knowledge from Ensembles of Acoustic Models for Joint CTC-Attention End-to-End Speech Recognition},
    author={Yan Gao and Titouan Parcollet and Nicholas Lane},
    year={2021},
    eprint={2005.09310},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
  }

  @misc{baskar2023o1,
    title={O-1: Self-training with Oracle and 1-best Hypothesis},
    author={Murali Karthick Baskar and Andrew Rosenberg and Bhuvana Ramabhadran and Kartik Audhkhasi},
    year={2023},
    eprint={2308.07486},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
  }