Project author: lddias

Project description: Python3 Alexa AVS Client

Language: Python

Repository: git://github.com/lddias/python-avs.git

Created: 2017-01-31T17:14:17Z

Project home: https://github.com/lddias/python-avs

License: MIT License

python-avs

A Python3 client for AVS API v20160207

All* AVS directives are supported. You can even send speech Recognize requests with streaming microphone input (NEAR/FAR_FIELD profiles).

Usage

  1. Create the AVS client

       import avs
       a = avs.AVS('v20160207', 'access_token', 'refresh_token', 'client_id', 'client_secret', audio_device)

     See the installation notes on how to create audio_device.

  2. Create a thread to process the downchannel stream in parallel

       import threading

       def downchannel_stream_directives():
           for push in a._dc_resp.read_chunked():
               parts = multipart_parse(push, a._dc_resp.headers['content-type'][0].decode())
               a.handle_parts(parts)

       ddt = threading.Thread(target=downchannel_stream_directives, name='Downchannel Directives Thread')
       ddt.setDaemon(False)
       ddt.start()

  3. Run the main loop

       while True:
           a.run()
           # and other requests

Making Requests

Make speech recognize requests with a pre-recorded 16 kHz PCM audio file:

  wav_pcm_s16le_file_like = open('test.wav', 'rb')
  a.recognize_speech(wav_pcm_s16le_file_like)

Make speech recognize requests with a 5-second recording from the microphone:

  # from https://gist.github.com/mabdrabo/8678538
  import pyaudio
  import io

  audio = pyaudio.PyAudio()
  record_seconds = 5
  chunk_size = 1024
  rate = 16000

  # start recording
  stream = audio.open(format=pyaudio.paInt16, channels=1, rate=rate, input=True, frames_per_buffer=chunk_size)
  frames = []
  for i in range(0, int(rate / chunk_size * record_seconds)):
      data = stream.read(chunk_size)
      frames.append(data)

  a.recognize_speech(io.BytesIO(b''.join(frames)))

Make speech recognize requests with streaming microphone audio:

  import logging
  import pyaudio
  import threading

  logger = logging.getLogger(__name__)

  mic_stopped = threading.Event()
  paudio = pyaudio.PyAudio()
  # start recording
  mic_stream = paudio.open(format=pyaudio.paInt16, channels=1, rate=16000, input=True, frames_per_buffer=1024)

  class StoppableAudioStream:
      def __init__(self, audio, stream):
          self._audio = audio
          self._stream = stream
          self._stopped = False

      def read(self, size=-1):
          if mic_stopped.is_set():
              self._stopped = True
              self._stream.stop_stream()
              self._stream.close()
              self._audio.terminate()
              mic_stopped.clear()
          if self._stopped:
              return b''
          # workaround for pyaudio versions without exception_on_overflow=False
          while True:
              try:
                  return self._stream.read(size)
              except Exception:
                  logger.exception("exception while reading from pyaudio stream")

  a.recognize_speech(StoppableAudioStream(paudio, mic_stream), mic_stopped)

Installation

External Dependencies

This package depends on common Python packages as well as my fork of https://github.com/Lukasa/hyper, which includes changes necessary for simultaneous Tx and Rx.

  pip install -r requirements.txt

Add the audio files required for playback of timers and alarms:

  • alarm.wav
  • timer.wav

AudioDevice Setup

The AudioDevice is an abstraction of audio playback capability. The required interface is very simple and can be implemented in many ways.
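
Based on the two implementations below, the interface can be sketched as an abstract base class. The method names are taken from this project's implementations; rendering the base class as an `abc.ABC` here is an assumption for illustration (the real `audio_player.AudioDevice` may be defined differently):

```python
from abc import ABC, abstractmethod

class AudioDevice(ABC):
    """Sketch of the playback interface, inferred from the examples below."""

    @abstractmethod
    def check_exists(self):
        """Return a truthy value (e.g. a path) if the backing player exists."""

    @abstractmethod
    def play_once(self, file):
        """Begin playing `file` once; return a playback handle (e.g. a Popen)."""

    @abstractmethod
    def play_infinite(self, file):
        """Begin playing `file` on a loop; return a playback handle."""

    @abstractmethod
    def stop(self, p):
        """Stop playback for handle `p`."""

    @abstractmethod
    def pause(self, p):
        """Pause playback for handle `p`."""

    @abstractmethod
    def resume(self, p):
        """Resume playback for handle `p`."""

    @abstractmethod
    def ended(self, p):
        """Return True once playback for handle `p` has finished."""
```

Any class providing these seven methods can be passed to the client as audio_device.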

An implementation using mplayer:

  import logging
  import shutil
  import subprocess
  from audio_player import AudioDevice

  logger = logging.getLogger(__name__)

  class MplayerAudioDevice(AudioDevice):
      def __init__(self):
          self._paused = False

      def check_exists(self):
          return shutil.which('mplayer')

      def play_once(self, file):
          try:
              return subprocess.Popen(["mplayer", "-ao", "alsa", "-really-quiet", "-noconsolecontrols", "-slave", file],
                                      stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.STDOUT)
          except Exception:
              logger.exception("Couldn't play audio")

      def play_infinite(self, file):
          try:
              return subprocess.Popen(
                  ["mplayer", "-ao", "alsa", "-really-quiet", "-noconsolecontrols", "-slave", "-loop", "0", file],
                  stdout=subprocess.PIPE, stdin=subprocess.PIPE, stderr=subprocess.STDOUT)
          except Exception:
              logger.exception("Couldn't play audio")

      def stop(self, p):
          p.communicate(input=b'quit 0\n')

      def pause(self, p):
          if not self._paused:
              p.communicate(input=b'pause\n')
              self._paused = True

      def resume(self, p):
          if self._paused:
              p.communicate(input=b'pause\n')
              self._paused = False

      def ended(self, p):
          return p.poll() is not None

An implementation using afplay:

  import logging
  import shutil
  import signal
  import subprocess
  from audio_player import AudioDevice

  logger = logging.getLogger(__name__)

  class AfplayAudioDevice(AudioDevice):
      def check_exists(self):
          return shutil.which('afplay')

      def play_once(self, file):
          try:
              return subprocess.Popen(["afplay", file])
          except Exception:
              logger.exception("Couldn't play audio")

      def play_infinite(self, file):
          try:
              return subprocess.Popen("while :; do afplay {}; done".format(file), shell=True)
          except Exception:
              logger.exception("Couldn't play audio")

      def stop(self, p):
          p.terminate()
          try:
              p.wait(5)
          except subprocess.TimeoutExpired:
              p.kill()

      def pause(self, p):
          p.send_signal(signal.SIGSTOP)

      def resume(self, p):
          p.send_signal(signal.SIGCONT)

      def ended(self, p):
          return p.poll() is not None

Test Client

A test client, test.py, is provided; it has been tested on macOS and Raspbian. It uses https://github.com/Kitt-AI/snowboy to detect the hotword "Alexa" and pyaudio for microphone input.

  1. At least one of mplayer and afplay must be available on the system.

  2. The files tokens.txt and secrets.txt must be present in the working directory. The tokens.txt schema is shown in Notes; the secrets.txt schema is as follows:

       {
           "client_id": "my_client_id",
           "client_secret": "my_client_secret"
       }

  3. After installing the requirements, get https://github.com/Kitt-AI/snowboy and follow its general installation instructions as well as the swig-for-Python-specific ones.

  4. pyaudio should now be installed; if not, please install it. If you installed a system package for pyaudio, are using virtualenv, and are unable to import it, see http://stackoverflow.com/questions/3371136/revert-the-no-site-packages-option-with-virtualenv. Then symbolically link the following into the working directory:

       ln -s /path/to/snowboy/resources/ .
       ln -s /path/to/snowboy/examples/Python/snowboydecoder.py .
       ln -s /path/to/snowboy/examples/Python/snowboydetect.py .
       ln -s /path/to/snowboy/examples/Python/_snowboydetect.so .

  5. Run python test.py.
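
Under the schemas above, loading the two configuration files might look like the following sketch (`load_config` is a hypothetical helper for illustration, not part of test.py):

```python
import json

def load_config(secrets_path='secrets.txt', tokens_path='tokens.txt'):
    """Read client credentials and OAuth tokens from their JSON files."""
    with open(secrets_path) as f:
        secrets = json.load(f)   # {"client_id": ..., "client_secret": ...}
    with open(tokens_path) as f:
        tokens = json.load(f)    # {"access_token": ..., "refresh_token": ...}
    return secrets, tokens

# The values would then feed the client constructor shown in Usage, e.g.:
# a = avs.AVS('v20160207', tokens['access_token'], tokens['refresh_token'],
#             secrets['client_id'], secrets['client_secret'], audio_device)
```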

Notes

  • Tested with Python 3.4.
  • Whenever the AVS instance has to refresh the access token, the new access and refresh tokens are JSON-serialized to a file named tokens.txt with the schema:

      {
          "refresh_token": "new_refresh_token",
          "access_token": "new_access_token"
      }

    This behavior can be changed in the token-refresh write-out method write_tokens_to_file.
  • * Work in progress: error handling, channel interactions, and the interaction model in general still need to be completed to meet AVS guidelines.
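
A replacement write-out in the same spirit might look like this sketch; the exact signature of write_tokens_to_file in this project may differ, so treat the parameters here as assumptions:

```python
import json

def write_tokens_to_file(access_token, refresh_token, path='tokens.txt'):
    # Persist the refreshed tokens in the schema shown above so the next
    # run can pick them up. Parameter names are illustrative assumptions.
    with open(path, 'w') as f:
        json.dump({'access_token': access_token,
                   'refresh_token': refresh_token}, f, indent=4)
```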