This shows you the differences between two versions of the page.
eg-speech [2018/02/06 16:16] waltminer |
eg-speech [2018/07/25 14:28] jsmoeller |
||
---|---|---|---|
Line 1: | Line 1: | ||
- | Goals for Expert Group | + | ===== Speech Expert Group ===== |
+ | |||
+ | ===== Goals for Expert Group ===== | ||
* Create a standardized set of speech recognition APIs that app developers can use regardless of underlying speech engine | * Create a standardized set of speech recognition APIs that app developers can use regardless of underlying speech engine | ||
* Natural language or grammar tree based | * Natural language or grammar tree based | ||
Line 11: | Line 13: | ||
* | * | ||
- | Agenda for kick off meeting | + | ===== Meetings ===== |
- | * Review expert group goals as captured above | + | |
- | * Proposal for speech recognition API or TTS API from Amazon? | + | |
- | * Meeting schedule (biweekly? what time?) | + | ==== July 25, 2018 ==== |
- | * Developer commitment from EG members. | + | Attendees: |
+ | |||
+ | LF: <del>Walt</del>, Jan-Simon \\ | ||
+ | Nuance: Christian, Paul Purcell, Mike C., <del>Vince</del>, Arijit \\ | ||
+ | Amazon: <del>Premal</del>, Ankur, Naveen, Kamal\\ | ||
+ | NTT Data MSE: Imamura \\ | ||
+ | Denso Ten: <del>Kusakabe</del> \\ | ||
+ | Microchip: Michael, Christian \\ | ||
+ | IoT.bzh: <del>Stephane</del>, Fulup \\ | ||
+ | Konsulko: Matt P., <del>Matt R.</del>\\ | ||
+ | Myscript: Olivier, Etienne \\ | ||
+ | |||
+ | |||
+ | Notes: | ||
+ | |||
+ | Nuance is still discussing internally whether to release their API. Christian is working with the AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface) | ||
+ | * Not wired up to a speech or TTS engine, more of a loop back test. | ||
+ | * Will send the link to Konsulko and IoT.bzh to review the API for suggestions. | ||
+ | * Done. | ||
+ | * List of AGL services available can be seen at https://git.automotivelinux.org/ | ||
+ | * Need to figure out consent and privacy issues with AGL Identity Agent. | ||
+ | * No update. | ||
+ | * How to manage grammar and natural language APIs and split between services and apps? | ||
+ | * How to integrate cloud speech applications? | ||
+ | * Example: "Find me the closest pizza place" is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app? | ||
+ | * Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua | ||
+ | * Update 7/25: update for FF, using an 8-channel CSL USB DAC, is about to land in Gerrit. | ||
+ | * Starting an internal demo project led by Paul. By end of August or early September they plan to have an internal design document together and will be ready with any questions or issues. It would be a good idea to target the [[agl-distro:sep2018-f2f|Sep F2F in Santa Clara]] to resolve issues with the design. | ||
+ | * Information on IRC, the mailing list, etc. is available at [[start:getting-started|Getting Started with AGL]] | ||
+ | * Supported hardware can be found at [[agl-distro#supported_hardware|AGL Distribution]] | ||
+ | |||
+ | |||
+ | Amazon looking at releasing a possible API in June and starting to work with the AGL App FW. | ||
+ | * Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together. | ||
+ | * Update 7/25: reviewing the above draft, will share ideas/design to run multiple engines in parallel. No timeline yet. Will review internally and present in the next call. | ||
+ | * Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall. | ||
+ | |||
+ | Microchip - AGL USB microphone front-end. Michael is working with MicroSemi to obtain hardware that is already available for Amazon Alexa. Would like to have a prototype available for the AMM in Dresden. Microchip plans to provide the HAL for the microphone. | ||
+ | * Update 7/25: | ||
+ | * Received hardware from MicroSemi, Alexa stack already working. | ||
+ | * Hardware is a mic + DSP connected to a Raspberry Pi running the stack | ||
+ | * Plan: front-end should be connected over USB, integrated with 4a (HAL) and interacting with the stack | ||
+ | * Stack needs to pick up the conditioned signal (near/far/noise-cancelling) through an ALSA device | ||
+ | * Microchip will provide the HAL for 4a | ||
+ | * Michael: Interest in extending the API for beamforming, multiple seats, "1 channel per seat"? | ||
+ | |||
+ | * Question from Nuance about audio streaming: | ||
+ | * esoundlib - do you need special calls to stream audio? | ||
+ | * Fulup: no, the reply to a 4a role request is the ALSA device to write to | ||
+ | * 4a-play /usr/share/4a/media/Happy_MBB_75.ogg (only script) | ||
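Fulup's answer above can be illustrated with a minimal sketch: the client reads the ALSA device name out of the role reply and then writes audio to that device like any other ALSA output. The reply layout and the field names `response` and `device_uri` are assumptions made for illustration, not the confirmed 4a schema, and `prompt.wav` is a placeholder file name. Only `aplay -D <device>` is standard ALSA tooling.

```python
import json
import shlex
from typing import List

# Hypothetical reply to a 4a role request. The field names ("response",
# "device_uri") are assumptions for illustration, not the confirmed 4a schema.
SAMPLE_REPLY = '{"response": {"device_uri": "hw:0,0"}}'


def alsa_device_from_reply(reply_text: str) -> str:
    """Return the ALSA device named in a (hypothetical) role reply."""
    reply = json.loads(reply_text)
    device = reply.get("response", {}).get("device_uri")
    if not device:
        raise ValueError("reply does not name an ALSA device")
    return device


def playback_command(device: str, wav_path: str) -> List[str]:
    """Build an aplay invocation that writes the file to the given device."""
    return ["aplay", "-D", device, wav_path]


if __name__ == "__main__":
    device = alsa_device_from_reply(SAMPLE_REPLY)
    # Print the command instead of running it, so the sketch needs no audio hardware.
    print(shlex.join(playback_command(device, "prompt.wav")))
```

The point of the sketch is that no special streaming calls are involved: once the device name is known, a speech or TTS engine can hand it to whatever ALSA playback path it already uses.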
+ | |||
+ | Action item: | ||
+ | * Move the github repo into AGL git to foster collaboration | ||
+ | |||
+ | ==== July 11, 2018 ==== | ||
+ | Attendees: | ||
+ | |||
+ | LF: Walt, <del>Jan-Simon</del>\\ | ||
+ | Voicebox: \\ | ||
+ | Nuance: Christian, Paul Purcell, Mike C., <del>Vince</del> \\ | ||
+ | Amazon: <del>Premal, Ankur,</del> Naveen\\ | ||
+ | NTT Data MSE: Imamura \\ | ||
+ | Denso Ten: <del>Kusakabe</del> \\ | ||
+ | Microchip: Michael, Christian \\ | ||
+ | IoT.bzh: <del>Stephane</del>, Fulup \\ | ||
+ | Konsulko: Matt P., <del>Matt R.</del>\\ | ||
+ | |||
+ | |||
+ | Notes: | ||
+ | |||
+ | Nuance is still discussing internally whether to release their API. Christian is working with the AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface) | ||
+ | * Not wired up to a speech or TTS engine, more of a loop back test. | ||
+ | * Will send the link to Konsulko and IoT.bzh to review the API for suggestions. | ||
+ | * List of AGL services available can be seen at https://git.automotivelinux.org/ | ||
+ | * Need to figure out consent and privacy issues with AGL Identity Agent. | ||
+ | * How to manage grammar and natural language APIs and split between services and apps? | ||
+ | * How to integrate cloud speech applications? | ||
+ | * Example: "Find me the closest pizza place" is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app? | ||
+ | * Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua | ||
+ | * Starting an internal demo project led by Paul. By end of August or early September they plan to have an internal design document together and will be ready with any questions or issues. It would be a good idea to target the [[agl-distro:sep2018-f2f|Sep F2F in Santa Clara]] to resolve issues with the design. | ||
+ | * Information on IRC, the mailing list, etc. is available at [[start:getting-started|Getting Started with AGL]] | ||
+ | * Supported hardware can be found at [[agl-distro#supported_hardware|AGL Distribution]] | ||
+ | |||
+ | |||
+ | Amazon looking at releasing a possible API in June and starting to work with the AGL App FW. | ||
+ | * Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together. | ||
+ | * Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall. | ||
+ | |||
+ | Microchip - AGL USB microphone front-end. Michael is working with MicroSemi to obtain hardware that is already available for Amazon Alexa. Would like to have a prototype available for the AMM in Dresden. Microchip plans to provide the HAL for the microphone. | ||
+ | |||
+ | Action item: | ||
+ | * Move the github repo into AGL git to foster collaboration | ||
+ | |||
+ | |||
+ | |||
+ | ==== June 27, 2018 ==== | ||
+ | Attendees: | ||
+ | |||
+ | LF: Walt, <del>Jan-Simon</del>\\ | ||
+ | Voicebox: \\ | ||
+ | Nuance: <del>Christian,</del> Mike C., <del>Vince</del> \\ | ||
+ | Amazon: Premal, Ankur, Naveen\\ | ||
+ | NTT Data MSE: <del>Imamura</del> \\ | ||
+ | Denso Ten: <del>Kusakabe</del> \\ | ||
+ | Microchip: <del>Michael, Christian</del> \\ | ||
+ | IoT.bzh: <del>Stephane</del>, Fulup \\ | ||
+ | Konsulko: Matt P., Matt R.\\ | ||
+ | |||
+ | |||
+ | Notes: | ||
+ | |||
+ | Nuance is still discussing internally whether to release their API. Christian is working with the AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface) | ||
+ | * Not wired up to a speech or TTS engine, more of a loop back test. | ||
+ | * Will send the link to Konsulko and IoT.bzh to review the API for suggestions. | ||
+ | * List of AGL services available can be seen at https://git.automotivelinux.org/ | ||
+ | * Need to figure out consent and privacy issues with AGL Identity Agent. | ||
+ | * How to manage grammar and natural language APIs and split between services and apps? | ||
+ | * How to integrate cloud speech applications? | ||
+ | * Example: "Find me the closest pizza place" is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app? | ||
+ | * Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua | ||
+ | |||
+ | |||
+ | Amazon looking at releasing a possible API in June and starting to work with the AGL App FW. | ||
+ | * Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together. | ||
+ | * Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall. | ||
+ | |||
+ | Voicebox was acquired by Nuance so they will probably not be participating as a separate entity. | ||
+ | |||
+ | |||
+ | ==== June 7, 2018 ==== | ||
+ | Attendees: | ||
+ | |||
+ | LF: Walt, Jan-Simon\\ | ||
+ | Voicebox: \\ | ||
+ | Nuance: Christian, <del>Mike C., Vince</del> \\ | ||
+ | Amazon: \\ | ||
+ | NTT Data MSE: Imamura \\ | ||
+ | Denso Ten: Kusakabe \\ | ||
+ | Microchip: Michael, Christian \\ | ||
+ | IoT.bzh: Stephane, Fulup \\ | ||
+ | Konsulko: Matt P., Matt R.\\ | ||
+ | |||
+ | |||
+ | Notes: | ||
+ | |||
+ | Nuance is still discussing internally whether to release their API. Christian is working with the AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface) | ||
+ | * Not wired up to a speech or TTS engine, more of a loop back test. | ||
+ | * Will send the link to Konsulko and IoT.bzh to review the API for suggestions. | ||
+ | * List of AGL services available can be seen at https://git.automotivelinux.org/ | ||
+ | * Need to figure out consent and privacy issues with AGL Identity Agent. | ||
+ | * How to manage grammar and natural language APIs and split between services and apps? | ||
+ | * How to integrate cloud speech applications? | ||
+ | * Example: "Find me the closest pizza place" is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app? | ||
+ | * Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua | ||
+ | |||
+ | |||
+ | Amazon looking at releasing a possible API in June and starting to work with the AGL App FW. | ||
+ | * No one joined. | ||
+ | |||
+ | |||
+ | Face-to-Face meeting planned for June 19 in Tokyo. | ||
+ | |||
+ | |||
+ | ==== May 30, 2018 ==== | ||
+ | Attendees: | ||
+ | |||
+ | LF: Walt, Jan-Simon\\ | ||
+ | Voicebox: \\ | ||
+ | Nuance: Christian, Mike C., Vince \\ | ||
+ | Amazon: \\ | ||
+ | NTT Data MSE: \\ | ||
+ | Denso Ten: Kusakabe \\ | ||
+ | Microchip: Michael, Christian \\ | ||
+ | IoT.bzh: Stephane \\ | ||
+ | Qt Company: | ||
+ | |||
+ | Notes: | ||
+ | |||
+ | Nuance is still discussing internally whether to release their API. Christian is working with the AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface) | ||
+ | * Not wired up to a speech or TTS engine, more of a loop back test. | ||
+ | * Will send the link to Konsulko and IoT.bzh to review the API for suggestions. | ||
+ | * List of AGL services available can be seen at https://git.automotivelinux.org/ | ||
+ | * Need to figure out consent and privacy issues with AGL Identity Agent. | ||
+ | * How to manage grammar and natural language APIs and split between services and apps? | ||
+ | * How to integrate cloud speech applications? | ||
+ | * Example: "Find me the closest pizza place" is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app? | ||
+ | |||
+ | Amazon looking at releasing a possible API in June and starting to work with the AGL App FW. | ||
+ | * No one joined. | ||
+ | |||
+ | Video conference during the Lorient F2F meeting on June 7 | ||
+ | |||
+ | Face-to-Face meeting planned for June 19 in Tokyo. | ||
+ | |||
+ | |||
+ | -------- | ||
+ | |||
+ | ==== May 16, 2018 ==== | ||
+ | Attendees: | ||
+ | LF: Walt\\ | ||
+ | Voicebox: \\ | ||
+ | Nuance: Christian \\ | ||
+ | Amazon: Premal \\ | ||
+ | NTT Data MSE: Imamura \\ | ||
+ | Denso Ten: Kusakabe \\ | ||
+ | Qt Company: Alistair | ||
+ | |||
+ | Notes: | ||
+ | |||
+ | Nuance is still discussing internally whether to release their API. Christian is working with the AGL App FW and writing an AGL Service layer. | ||
+ | |||
+ | Amazon looking at releasing a possible API in June and starting to work with the AGL App FW. | ||
+ | |||
+ | Video conference during the Lorient F2F meeting on June 7 | ||
+ | |||
+ | Face-to-Face meeting planned for June 19 in Tokyo. | ||
- | Meeting Feb 6 | + | ==== Feb 14, 2018 ==== |
+ | Attendees: \\ | ||
+ | LF: Walt, Dan\\ | ||
+ | Voicebox: Andrew \\ | ||
+ | Nuance: Christian, Mike \\ | ||
+ | Amazon: John | ||
+ | |||
+ | Notes: | ||
+ | |||
+ | * Amazon still internally discussing making their API available for AGL. | ||
+ | * Amazon SDKs that are available publicly require an agreement with Amazon to access | ||
+ | * Alexa Voice Service Device SDK | ||
+ | * Alexa Skills Kit | ||
+ | |||
+ | * Nuance is still discussing their interface internally. Christian made a presentation with Nuance's ideas for the API, which he will send around. | ||
+ | * Walt will use the minutes from these calls to lead an AMM session next week that updates the community on the EG's work. | ||
+ | |||
+ | |||
+ | |||
+ | ==== Feb 6, 2018 ==== | ||
Attendees: \\ | Attendees: \\ | ||
LF: Walt, Jan-Simon\\ | LF: Walt, Jan-Simon\\ | ||
Line 35: | Line 271: | ||
-------- | -------- | ||
- | Kick off Meeting Jan 29 | + | ==== Kick off Meeting Jan 29, 2018 ==== |
Attendees: Walt, Jan-Simon, Mike Chachich, John Scumniotales, Andrew Fairly, Vince Iannotti, Christian Benien (attending AMM) | Attendees: Walt, Jan-Simon, Mike Chachich, John Scumniotales, Andrew Fairly, Vince Iannotti, Christian Benien (attending AMM) | ||
+ | |||
+ | Agenda for kick off meeting | ||
+ | * Review expert group goals as captured above | ||
+ | * Proposal for speech recognition API or TTS API from Amazon? | ||
+ | * Meeting schedule (biweekly? what time?) | ||
+ | * Developer commitment from EG members. | ||
* Reviewed agenda and notes from CES | * Reviewed agenda and notes from CES |