Automotive Linux Wiki

Speech Expert Group

Goals for Expert Group

Create a standardized set of speech recognition APIs that app developers can use regardless of underlying speech engine
Natural language or grammar tree based
On board or cloud based speech
Text to Speech API
Signal processing for noise reduction and echo cancellation (future roadmap)
Grammar development tools
Amazon has an open API that could be used as a starting point
Goal is to have a straw man by the AMM in Tokyo, Feb. 20-21 with work completed by ALS in June.
Next step is to set up a call with the engineers from all three companies who will be working on this.
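
As an illustration of the first goal above, here is a minimal sketch of what an engine-neutral API could look like. All names in it (`SpeechEngine`, `RecognitionResult`, `LoopbackEngine`, `handle_utterance`) are hypothetical and are not taken from any AGL, Amazon, or Nuance interface. The loopback engine only stands in for a real recognizer so the example can run.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass
class RecognitionResult:
    text: str          # recognized utterance
    confidence: float  # 0.0 .. 1.0


class SpeechEngine(ABC):
    """Hypothetical engine-neutral interface; not an actual AGL API."""

    @abstractmethod
    def recognize(self, audio: bytes) -> RecognitionResult:
        """Turn captured audio into text."""

    @abstractmethod
    def speak(self, text: str) -> bytes:
        """Turn text into audio (text to speech)."""


class LoopbackEngine(SpeechEngine):
    """Stand-in engine for testing: treats the audio bytes as UTF-8 text."""

    def recognize(self, audio: bytes) -> RecognitionResult:
        return RecognitionResult(text=audio.decode("utf-8"), confidence=1.0)

    def speak(self, text: str) -> bytes:
        return text.encode("utf-8")


def handle_utterance(engine: SpeechEngine, audio: bytes) -> str:
    """App code depends only on SpeechEngine, not on a vendor SDK."""
    result = engine.recognize(audio)
    return result.text.lower()
```

An on-board or cloud engine would subclass `SpeechEngine` in the same way, so application code such as `handle_utterance` would not need to change when the engine does.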

Architecture and Design Documents

Architecture

Meetings

This EG no longer meets.

May 27, 2020

Attendees: Walt, Jan-Simon, Michael, Scott

* Roadmap review
  * Other engines ?
    * Input from vendors/suppliers required.
  * TTS-API (e.g. for navi)?

Plan to update to the new alexa auto SDK after release of icefish.
- Update 02/05: waiting for next to land in master
- Update 03/04: Waiting. Pending upstream branch support, YP 'dunfell' and SCP
- Update 04/01: No update from amazon so far
- Update 04/15: Amazon is updating the alexa service for the next SDK release.
- Update 4/29: Amazon update released last week was for thud, does not support zeus or dunfell.
- 5/27 - Will use Alexa Auto SDK 2.0 for dunfell/Jellyfish unless Amazon pushes a dunfell version.

Michael: New batch of microphones being produced
- Working on DSP improvements
- Update 02/19: 5-6 weeks for next batch
- Update 04/01: No update
- Update 04/15: No update
- 4/29 - “The only update is that there is no update” Michael has not heard anything about why they are delayed. Will follow up later today.
- 5/27 - Microphones are in assembly and should be delivered end of next week to Microchip. Will take another week to prepare them and they will be back “in-stock” in mid-June

April 29, 2020

Attendees: Walt, Jan-Simon, Michael, Scott, Swapnil

* Roadmap review
  * Other engines ?
    * Input from vendors/suppliers required.
  * TTS-API (e.g. for navi)?

Plan to update to the new alexa auto SDK after release of icefish.
- Update 02/05: waiting for next to land in master
- Update 03/04: Waiting. Pending upstream branch support, YP 'dunfell' and SCP
- Update 04/01: No update from amazon so far
- Update 04/15: Amazon is updating the alexa service for the next SDK release.
- Update 4/29: Amazon update released last week was for thud, does not support zeus or dunfell.

Michael: New batch of microphones being produced
- Working on DSP improvements
- Update 02/19: 5-6 weeks for next batch
- Update 04/01: No update
- Update 04/15: No update
- 4/29 - “The only update is that there is no update” Michael has not heard anything about why they are delayed. Will follow up later today.

April 15, 2020

Attendees: Jan-Simon, Kusakabe-san, Michael, Scott

* Roadmap review
  * Other engines ?
    * Input from vendors/suppliers required.
  * TTS-API (e.g. for navi)?

Plan to update to the new alexa auto SDK after release of icefish.
- Update 02/05: waiting for next to land in master
- Update 03/04: Waiting. Pending upstream branch support, YP 'dunfell' and SCP
- Update 04/01: No update from amazon so far
- Update 04/15: Amazon is updating the alexa service for the next SDK release.

Michael: New batch of microphones being produced
- Working on DSP improvements
- Update 02/19: 5-6 weeks for next batch
- Update 04/01: No update
- Update 04/15: No update

April 1, 2020

Attendees: Walt, Jan-Simon, Scott, Parth, Kusakabe-san

* Roadmap review
  * Other engines ?
    * Input from vendors/suppliers required.
  * TTS-API (e.g. for navi)?

Plan to update to the new alexa auto SDK after release of icefish.
- Update 02/05: waiting for next to land in master
- Update 03/04: Waiting. Pending upstream branch support, YP 'dunfell' and SCP
- Update 04/01: No update from amazon so far

Michael: New batch of microphones being produced
- Working on DSP improvements
- Update 02/19: 5-6 weeks for next batch
- Update 04/01: No update

New:

Possibly flip meeting slot with virt-eg?

March 4, 2020

Attendees: Walt, Jan-Simon, Scott, Li, Kusakabe-san

* Roadmap review
  * Other engines ?
    * Input from vendors/suppliers required.
  * TTS-API (e.g. for navi)?

Plan to update to the new alexa auto SDK after release of icefish.
- Update 02/05: waiting for next to land in master
- Update 03/04: Waiting. Pending upstream branch support, YP 'dunfell' and SCP

Michael: New batch of microphones being produced
- Working on DSP improvements
- Update 02/19: 5-6 weeks for next batch

New:


February 19, 2020

Attendees: Walt, Jan-Simon, Michael, Scott, Li, Kusakabe-san

* Roadmap review
  * Other engines ?
    * Input from vendors/suppliers required.
  * TTS-API (e.g. for navi)?

Plan to update to the new alexa auto SDK after release of icefish.
- Update 02/05: waiting for next to land in master

Scott: Note from devcall: no zeus support in any alexa auto sdk
- Update 02/19: patches in gerrit/next to make zeus work
Michael: New batch of microphones being produced
- Working on DSP improvements
- Update 02/19: 5-6 weeks for next batch

New:

February 5, 2020

Attendees: Jan-Simon, Michael, Scott, Li, Kusakabe-san

Updates since December and CES
- A lot of work during F2F in SFO
  - added wakeword support back (but need closed binaries, push-to-talk works w/o)
  - fixed json-c handling issues in alexa-voiceagent and capabilities
    - ping Philip/Naveen/Ezequiel Cervantes to take patches
  - simple alexa-viewer (to replace parts of ICS app)
  - alexa device auth using QR-code (Raquel)
  - Kudos to Scott Murray
- CES demo using alexa wakeword and ptt over steering wheel
- Update 02/05: Scott will ping amazon wrt patches.

Plan to update to the new alexa auto SDK after release of icefish.
- Update 02/05: waiting for next to land in master

New:

Scott: Note from devcall: no zeus support in any alexa auto sdk
Michael: New batch of microphones being produced
- Working on DSP improvements

January 22, 2020

Attendees: Jan-Simon, Michael, Scott, Dennis, Kusakabe-san

Updates since December and CES
- A lot of work during F2F in SFO
  - added wakeword support back (but need closed binaries, push-to-talk works w/o)
  - fixed json-c handling issues in alexa-voiceagent and capabilities
    - ping Philip/Naveen/Ezequiel Cervantes to take patches
  - simple alexa-viewer (to replace parts of ICS app)
  - alexa device auth using QR-code (Raquel)
  - Kudos to Scott Murray
- CES demo using alexa wakeword and ptt over steering wheel

Updates since November Karlsruhe meeting.
- Raquel finalizing new authorization method. Will be ready prior to SFO F2F meeting.
- Need wake word solution from Amazon. Waiting to see if Philip is back from his vacation.
- No word from ICS for app code.

New:

Plan to update to the new alexa auto SDK after release of icefish.

November 27, 2019

Attendees: Walt, Scott, Jan-Simon, Thierry, Kusakabe

Updates since November Karlsruhe meeting.
- Raquel finalizing new authorization method. Will be ready prior to SFO F2F meeting.
- Need wake word solution from Amazon. Waiting to see if Philip is back from his vacation.
- No word from ICS for app code.

New:

October 16, 2019

Attendees: Walt, Scott, Jan-Simon, Michael, Li

On alexa auto sdk v2.0 using an integrated platform build (kudos to Scott Murray)
Updates since Karlsruhe
- Audio input and output both work on master according to Jan-Simon. Raquel testing on halibut and we think it works there as well.
- Scott pushed rework to agl-service-voicehigh to remove hard dependencies on Alexa.
- Scott reworking home screen changes from Naveen.
- Need wake word solution from Amazon
- Waiting on ICS for app code.

New:

October 02, 2019

Attendees: Walt, Jan-Simon, Kusakabe-san, Kurokawa-san

On alexa auto sdk v2.0 using an integrated platform build (kudos to Scott Murray)
Integration Session in Karlsruhe:
- Mic works, audio out works, but not at same time.
- Fixes WIP for pipewire
- agl-service-voiceagent-alexa needs fixes for smack/multiuser
  - issues are known and resolved, but need to be integrated
  - AI: Jan-Simon/Scott write-up steps in confluence
  - AI: Walt: generic icon for TTS-button
- Steering wheel: we received a first LIN message, but only infrequently. Need to research BCM (in-kernel scheduler for polling) and/or have the can-low-level listen (worst case, configure BCM as well).
ICS-app: waiting for new binaries for AMM. (aarch64 + x86-64)

New:

September 18, 2019

Attendees: Walt, Jan-Simon, Scott, Kusakabe

Thierry still waiting for Naveen's patches that were discussed during the last call.
- 8/21: No update
- 9/4: Thierry not available today. We do not think he has received anything from Naveen.
Thierry needs a HH SDK that is stable and not a moving target. Will contact Jan-Simon to get the Halibut release folder created and the SDK populated.
- 8/21: Done.
Soya-san has the new front end and is testing it with a number of new features and has a Microsemi Field Apps Engineer colleague in the Tokyo office to get some support with tuning the device. This will include features such as barge-in and better responsiveness.
- 8/21: Mic being optimized (far-field/near-field adaptation)
- 9/4: Still in progress. Delayed by vacations and availability of the Microsemi FAE.
Todo: check with new version of the Alexa SDK
- Required: port to pipewire output (https://jira.automotivelinux.org/browse/SPEC-2761)
- Put https://github.com/alexa/alexa-auto-sdk/tree/1.6/platforms/agl/alexa-voiceagent-service build in recipe or improve build situation (currently container + external SDK is necessary)

New:

Planning an integration session the week after the Berlin F2F in Karlsruhe. Need Amazon and possibly ICS support.

September 4, 2019

Attendees: Walt, Jan-Simon, Michael, Fulup, Scott

Thierry still waiting for Naveen's patches that were discussed during the last call.
- 8/21: No update
- 9/4: Thierry not available today. We do not think he has received anything from Naveen.
Thierry needs a HH SDK that is stable and not a moving target. Will contact Jan-Simon to get the Halibut release folder created and the SDK populated.
- 8/21: Done.
Soya-san has the new front end and is testing it with a number of new features and has a Microsemi Field Apps Engineer colleague in the Tokyo office to get some support with tuning the device. This will include features such as barge-in and better responsiveness.
- 8/21: Mic being optimized (far-field/near-field adaptation)
- 9/4: Still in progress. Delayed by vacations and availability of the Microsemi FAE.
Todo: check with new version of the Alexa SDK
- Required: port to pipewire output (https://jira.automotivelinux.org/browse/SPEC-2761)
- Put https://github.com/alexa/alexa-auto-sdk/tree/1.6/platforms/agl/alexa-voiceagent-service build in recipe or improve build situation (currently container + external SDK is necessary)

New:

Planning an integration session the week after the Berlin F2F in Karlsruhe. Need Amazon and possibly ICS support.

August 21, 2019

Attendees: Jan-Simon, Thierry, Michael

Thierry still waiting for Naveen's patches that were discussed during the last call.
- 8/21: No update
Thierry needs a HH SDK that is stable and not a moving target. Will contact Jan-Simon to get the Halibut release folder created and the SDK populated.
- 8/21: Done.
Soya-san has the new front end and is testing it with a number of new features and has a Microsemi Field Apps Engineer colleague in the Tokyo office to get some support with tuning the device. This will include features such as barge-in and better responsiveness.
- 8/21: Mic being optimized (far-field/near-field adaptation)
Todo: check with new version of the Alexa SDK
- Required: port to pipewire output (https://jira.automotivelinux.org/browse/SPEC-2761)
- Put https://github.com/alexa/alexa-auto-sdk/tree/1.6/platforms/agl/alexa-voiceagent-service build in recipe or improve build situation (currently container + external SDK is necessary)

New:

June 12, 2019

Attendees: Walt, Thierry, Michael, Chris, Kusakabe

Thierry still waiting for Naveen's patches that were discussed during the last call.
Thierry needs a HH SDK that is stable and not a moving target. Will contact Jan-Simon to get the Halibut release folder created and the SDK populated.
Soya-san has the new front end and is testing it with a number of new features and has a Microsemi Field Apps Engineer colleague in the Tokyo office to get some support with tuning the device. This will include features such as barge-in and better responsiveness.

May 29, 2019

Attendees: Walt, Jan-Simon, Thierry, Naveen

Naveen has two patches to commit to the IoT.bzh code to enable the open source agent on guppy. Thierry will integrate the patches then Jan-Simon can test on the green machine in Germany prior to ALS (preferably before end of June).

Email between Walt and Naveen
1. Plan to use Alexa demo during Automotive Linux Summit in July. We are thinking we will reuse the CES demo and want to make sure we can do that [Naveen] Yes you can use that. This year we are mostly working on integrating our offline Alexa engine with Auto SDK. So, for next release of Auto SDK, which is tentatively planned for end of July, offline Alexa support will be added to AGL binding. This should add offline car control and local media capabilities to the Alexa binding and speech framework.
2. Plan for CES 2020. Will Amazon be participating in CES 2020 and if so what are your ideas for that? [Naveen] Currently we don't have any plans for CES 2020. However, since we are updating the binding actively with every Auto SDK release, IOT.bzh will be in a position to host the Alexa demo from Linux foundation perspective.
3. Plan for F2F meeting in September. I am planning a September F2F meeting and need to choose a location. If Amazon plans to participate in CES 2020 I assume you will want to participate like you did last year in Santa Clara. The only proposals I have on the table so far are for meetings in Europe. I know you have travel restrictions so if you plan to participate I will search harder for a US site for the meeting. [Naveen] Yes. I do have travel restrictions. However since our CES 2020 plans are not clear, please don't anchor F2F meeting based on Amazon's availability.

May 15, 2019

Attendees: Walt, Fulup, Thierry, Michael

Microphone arrays were distributed last week in Spain. Demo units and Amazon need to be exchanged still.

Thierry has been working on getting the Amazon Open Voice agent working. Should have a separate call with Naveen tomorrow. Issues with authentication and we will probably need a proprietary binary blob from Amazon for the wake word detection.

May 1, 2019

Meeting canceled due to holiday

April 17, 2019

Attendees: Walt, Jan-Simon, Michael, Anantha, Kusakabe

No update from Amazon available.

Michael - microphone arrays will be ready on Tuesday. Should be able to bring one to the F2F meeting in Spain.

April 3, 2019

Attendees: Walt, Michael, George, Thierry, Kusakabe, Anantha

Thierry working with Naveen on getting Alexa Voiceagent integrated. Running into issues.

Michael - microphone arrays will be ready week 16 or 17. Should be able to bring one to the F2F meeting in Spain.

- Phone dial app updates: SPEC-2300, assigned to Konsulko to work on
- HVAC app updates: SPEC-2301, assigned to Konsulko to work on

March 20, 2019

Attendees: Walt, Jan-Simon, Michael, Eric, Anantha (Panasonic), Naveen, Imamura, George

Amazon formally open sourced the Alexa Voiceagent code as well as the speech agent bindings. Working with IoT.bzh to rebuild the CES demo with the open source version. Running into issues with packaging the widgets that IoT.bzh is working on.

Now will work to speech enable some of the demo applications. Walt to create Jira tickets for HVAC and phone dialer app and will ask Konsulko to help with that.

Michael - waiting for production run of the microphone arrays to come back.

Michael brought up questions about how Alexa will interact with AGL services. Referenced Lucas's talk at the AMM. See this presentation.

Introduced George and his pipewire effort. Will have a further update on his plans for the next meeting.

March 6, 2019

Cancelled due to AMM

February 20, 2019

Attendees: Walt, Jan-Simon, Thierry, Fulup, Michael, Imamura, Sebastien

Documents now in Confluence
- Architecture : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG+Architecture
- CES Project : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG%27s+CES+2019+Project

Amazon
- Currently working on officially open sourcing Alexa Voiceagent with their Auto SDK version 1.5 end of February.
- Version 1.6 will include off-line voice control (end of April release).

Embedded World (Feb 26–28)
- Will need to use the CES version of the demo and not the open source version.
- Naveen will need to send a config file that allows the demo to be run anywhere in the world as opposed to just in Las Vegas.
- Thierry has figured out why the Fiberdyne amplifier was not working at CES. Jan-Simon confirmed it works on the green machine.
- Waiting on the Amazon widgets. Expected tomorrow from Naveen.

Roadmap for 2019
- Incorporate Alexa Voiceagent into AGL device profiles
- Create open source Alexa demo
- Demo open source Alexa Voiceagent at ALS (July 17-19)? or wait for CES 2020?
- Refactor speech framework to improve modularization (HH)
- Local voice control that allows control of car functions when off-line

February 6, 2019

Attendees: Walt, Jan-Simon, Michael, Sebastien, Kusakabe, Chaitanya, Imamura, Naveen

Documents now in Confluence
- Architecture : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG+Architecture
- CES Project : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG%27s+CES+2019+Project

Amazon
- Currently working on officially open sourcing Alexa Voiceagent with their Auto SDK version 1.5 end of February.
- Version 1.6 will include off-line voice control (end of April release).

Embedded World (Feb 26–28)
- Will need to use the CES version of the demo and not the open source version.
- Naveen will need to send a config file that allows the demo to be run anywhere in the world as opposed to just in Las Vegas.

Roadmap for 2019
- Incorporate Alexa Voiceagent into AGL device profiles
- Create open source Alexa demo
- Demo open source Alexa Voiceagent at ALS (July 17-19)? or wait for CES 2020?
- Refactor speech framework to improve modularization (HH)
- Local voice control that allows control of car functions when off-line

January 23, 2019

Attendees: Walt, Jan-Simon, Mike C, Michael F

Documents now in Confluence
- Architecture : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG+Architecture
- CES Project : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG%27s+CES+2019+Project

Discussed CES

November 31, 2018

Attendees: Walt, Jan-Simon, Paul, Naveen, Tanikawa, Shotaro, Kusakabe, Kurokawa, Fulup, Christian, Imamura, Supriya, Arijit

Documents now in Confluence
- Architecture : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG+Architecture
- CES Project : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG%27s+CES+2019+Project

October 31, 2018

Attendees: Walt, Jan-Simon, Imamura, Michael F, Mike C, Ricardo, Arijit, Christian B, Thierry, Paul, Christian G, Naveen

Documents now in Confluence
- Architecture : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG+Architecture
- CES Project : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG%27s+CES+2019+Project

Nuance announced that they need to pull people off of the CES demo work to support customer projects so they will not participate in the CES demo.
Reviewed Amazon gerrit submission https://gerrit.automotivelinux.org/gerrit/#/c/17877/

October 31, 2018

Attendees: Walt, Jan-Simon, Fulup, Paul, Adam, Imamura, Arijit, Naveen, Michael F, Mike C, Ricardo, Lily the barking pug

Documents now in Confluence
- Architecture : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG+Architecture
- CES Project : https://confluence.automotivelinux.org/display/SpeechArch/Speech+EG%27s+CES+2019+Project

Prototype microphones from Microchip were distributed to IoT.bzh, Amazon, and Nuance in Dresden

Questions from Naveen's email
1. CES project 2019 overall plan. What will be the setup of the green boxes and how will the box running speech framework fit into the overall demo?
2. Will there be a car mockup? Is it possible to do car control capabilities? Like climate control or locking doors.
  1. Standard green machine set up with fans and HVAC actuators.
3. What is the process for submitting code to AGL? Is there a branch that we should submit code for review to or is it master?
4. Should we host the High level voice service code in a public github and have Amazon, Nuance and IOT.BZH as committers? And submit a recipe to AGL Gerrit?
5. Once the high level voice service code is in github we will not be immediately ready to open source Alexa voice agent code. So we will have to provide binary to Fulup's team for working on app integration. Will that work?
6. Fulup, do you have the tool chain for green box to compile the high level voice service code? What is the hardware spec of the green box?
7. Fulup, which apps are you planning to integrate immediately for CES? Will it include Navigation app?

October 17, 2018

Attendees: Canceled due to AMM in Dresden.

October 3, 2018

Attendees:

LF: Walt, ~~Jan-Simon~~
Nuance: ~~Christian~~, Paul Purcell, ~~Andrew~~, ~~Adam~~, Mike C, ~~Vince~~, Arijit, Matthew Tundo
Amazon: ~~Premal, Ankur,~~ Naveen, ~~Kamal, Alain~~
NTT Data MSE: Imamura
Denso Ten: ~~Kusakabe~~
Microchip: ~~Michael~~, ~~Christian~~
IoT.bzh: Stephane, Fulup, Sebastien
Konsulko: ~~Matt Porter, Scott, M,~~ ~~Matt Ranostay~~.

Documents now in Confluence
- Architecture : https://confluence.automotivelinux.org/display/SPE/Speech+EG+Architecture
- CES Project : https://confluence.automotivelinux.org/display/SPE/Speech+EG%27s+CES+2019+Project

Reviewed questions and comments about latest Amazon proposal (v1.3).
- Wake word engine when multiple agents are present
- PTT versus wake word versus a mixed mode
- Multi-modal interactions when engaged in a voice session

Next Steps (for F2F next week)
- Further review of high level architecture contained in the document
- Review the proposed API in the document
- Start to define support binding APIs both reuse of existing ones and new ones that may be required
- Input audio architecture

September 19, 2018

Attendees:

LF: Walt, ~~Jan-Simon~~
Nuance: Christian, Paul Purcell, ~~Andrew~~, Adam, Mike C, ~~Vince~~, Arijit, Matthew Tundo
Amazon: Premal, Ankur, Naveen, Kamal, Alain
NTT Data MSE: Imamura
Denso Ten: ~~Kusakabe~~
Microchip: ~~Michael~~, ~~Christian~~
IoT.bzh: Stephane, Fulup
Konsulko: Matt Porter, Scott, M, ~~Matt Ranostay~~.

Walt working on getting a Confluence site set up for AGL
F2F meeting outcome from Santa Clara should be sent out publicly sometime this week.

Reviewed questions and comments about latest Amazon proposal (v1.1).
- Wake word engine when multiple agents are present
- PTT versus wake word versus a mixed mode
- Multi-modal interactions when engaged in a voice session

Next Steps (for F2F next week)
- Further review of high level architecture contained in the document
- Review the proposed API in the document
- Start to define support binding APIs both reuse of existing ones and new ones that may be required
- Input audio architecture

September 6, 2018

Attendees: LF: Walt, ~~Jan-Simon~~
Nuance: Christian, Paul Purcell, Andrew, Adam, Mike C, Vince, Arijit, Matthew Tundo
Amazon: Premal, Ankur, Naveen, Kamal, Alain
NTT Data MSE: Imamura
Denso Ten: Kusakabe
Microchip: Michael, Christian
IoT.bzh: Stephane, Fulup
Konsulko: Matt Porter, Scott, M, ~~Matt Ranostay~~.

Reviewed questions and comments about latest Amazon proposal (v1.1).
- Wake word engine when multiple agents are present
- PTT versus wake word versus a mixed mode
- Multi-modal interactions when engaged in a voice session

Next Steps (for F2F next week)
- Further review of high level architecture contained in the document
- Review the proposed API in the document
- Start to define support binding APIs both reuse of existing ones and new ones that may be required
- Input audio architecture

September 5, 2018

Attendees: Upcoming Meeting

LF: Walt, ~~Jan-Simon~~
Nuance: Christian, Paul Purcell, Andrew, Adam, Mike C, Vince, Arijit, Matthew Tundo
~~Amazon: Premal, Ankur, Naveen, Kamal, Alain~~
NTT Data MSE: Imamura
Denso Ten: Kusakabe
Microchip: Michael, Christian
IoT.bzh: Stephane, Fulup
Konsulko: Matt Porter, Scott, M, ~~Matt Ranostay~~.

Notes:

Nuance is still discussing internally whether to release their API. Christian is working with the AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface)

Not wired up to a speech or TTS engine, more of a loop back test.
Will send the link to Konsulko and IoT.bzh to review the API for suggestions.
- Done.
List of AGL services available can be seen at https://git.automotivelinux.org/
Need to figure out consent and privacy issues with AGL Identity Agent.
- No update.
How to manage grammar and natural language APIs and split between services and apps?
How to integrate cloud speech applications?
- Example: “Find me the closest pizza place” is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app?
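
One possible answer to the question above is a small on-board router that hands each cloud result to whichever app registered for its intent. The following is only a sketch under that assumption: `CloudResult`, `IntentRouter`, the `"find_poi"` intent name, and the coordinates are all hypothetical and do not come from any AGL or vendor API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class CloudResult:
    """Hypothetical shape of what the cloud returns to the ECU."""
    intent: str       # e.g. "find_poi"
    name: str         # name of the place that was found
    latitude: float
    longitude: float


class IntentRouter:
    """Maps an intent name to the on-board app handler that receives it."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Callable[[CloudResult], None]] = {}

    def register(self, intent: str, handler: Callable[[CloudResult], None]) -> None:
        self._handlers[intent] = handler

    def dispatch(self, result: CloudResult) -> bool:
        """Deliver the result; return False if no app handles the intent."""
        handler = self._handlers.get(result.intent)
        if handler is None:
            return False
        handler(result)
        return True


# Stand-in for the navi app: records the destination it was asked to set.
destinations: List[str] = []


def navi_set_destination(result: CloudResult) -> None:
    destinations.append(
        f"{result.name} @ {result.latitude:.4f},{result.longitude:.4f}"
    )


router = IntentRouter()
router.register("find_poi", navi_set_destination)
handled = router.dispatch(
    CloudResult("find_poi", "Closest Pizza Place", 35.6586, 139.7454)
)
```

In this arrangement the speech service does not need to know about the navi app; it only needs the intent name, and apps opt in by registering a handler.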
Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua
- Update 7/25: update for FF use 8-channel CSL usb dac, about to land in gerrit.
Starting a demo project internally led by Paul. End of Aug early Sep they plan to have a design document together internally and will be ready with any questions/issue. Would be a good idea to target the Sep F2F in Santa Clara to resolve issues with the design.
Information on IRC, mail list etc is available at Getting Started with AGL
Supported hardware can be found at AGL Distribution
8/8
- Arijit received the M3 hardware and was able to get it running. Building a “Hello World” sample application using Virtual Box and M3 hardware.
8/21
- Nuance email list Automotive-Grade-Linux@nuance.com
9/5
- Matt Tundo working on test audio application. Having trouble getting microphone capture via ALSA and 4a.
- Document for getting AGL working in native Linux http://docs.automotivelinux.org/docs/devguides/en/dev/reference/host-configuration/docs/1_Prerequisites.html

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together.
- Update 7/25: reviewing above draft, will share ideas/design to run multiple engines in parallel. No timeline, yet. Will review internally and present in next call.
Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall.
8/8
- Naveen presented some use cases and an architecture diagram that Amazon has been working on internally. Received good feedback from the team. Naveen and his team will update their internal wiki and present again at the next meeting. Will look into getting the info onto the AGL wiki after that.

Microchip - AGL USB microphone front-end. Michael is working with MicroSemi to obtain hardware that is already available for Amazon Alexa. Would like to have a prototype available for the AMM in Dresden. Microchip plans to provide the HAL for the microphone.

Update 7/25:
- Received hardware from MicroSemi, Alexa stack already working.
- Hardware is mic+dsp connected to rpi running the stack
- Plan: frontend should be connected over USB, integrated with 4a (hal) and interacting with the stack
  - Stack needs to pick-up conditioned signal (near/far/noise-cancelling) through alsa device
  - Microchip will provide the hal for 4a
  - Michael: Interest to extend the API for beamforming, multiple seats, “1 channel per seat” ?
- Update 8/22
  - Received five eval kits from MicroSemi. So far so good. Prototypes will be delivered for AMM.
- Update 9/5
  - In the Dresden time frame, some units will be ready for evaluation. After that they will mass-produce a single-PCB device that will be readily available for purchase.

Action item:

Walt to set up extra call for later this week.

August 22, 2018

Attendees: Upcoming Meeting

LF: Walt, Jan-Simon
Nuance: ~~Christian~~, Paul Purcell, ~~Mike C.~~, ~~Vince~~, Arijit, Matthew
Amazon: ~~Premal~~, Ankur, ~~Naveen~~, Kamal, Alain
NTT Data MSE: Imamura
Denso Ten: Kusakabe
Microchip: Michael, ~~Christian~~
IoT.bzh: ~~Stephane~~, ~~Fulup~~
Konsulko: ~~Matt P.~~, ~~Matt R.~~
Myscript: ~~Olivier, Etienne~~

Notes:

Nuance is still discussing internally whether to release their API. Christian is working with the AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface)

Not wired up to a speech or TTS engine, more of a loop back test.
Will send the link to Konsulko and IoT.bzh to review the API for suggestions.
- Done.
List of AGL services available can be seen at https://git.automotivelinux.org/
Need to figure out consent and privacy issues with AGL Identity Agent.
- No update.
How to manage grammar and natural language APIs and split between services and apps?
How to integrate cloud speech applications?
- Example: “Find me the closest pizza place” is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app?
Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua
- Update 7/25: update for FF use 8-channel CSL usb dac, about to land in gerrit.
Starting a demo project internally led by Paul. End of Aug early Sep they plan to have a design document together internally and will be ready with any questions/issue. Would be a good idea to target the Sep F2F in Santa Clara to resolve issues with the design.
Information on IRC, mail list etc is available at Getting Started with AGL
Supported hardware can be found at AGL Distribution
8/8
- Arijit received the M3 hardware and was able to get it running. Building a “Hello World” sample application using Virtual Box and M3 hardware.
8/21
- Nuance email list Automotive-Grade-Linux@nuance.com

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together.
- Update 7/25: reviewing above draft, will share ideas/design to run multiple engines in parallel. No timeline, yet. Will review internally and present in next call.
Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall.
8/8
- Naveen presented some use cases and an architecture diagram that Amazon has been working on internally. Received good feedback from the team. Naveen and his team will update their internal wiki and present again at the next meeting. Will look into getting the info onto the AGL wiki after that.

Microchip - AGL USB microphone front-end. Michael is working with MicroSemi to obtain hardware that is already available for Amazon Alexa. Would like to have a prototype available for the AMM in Dresden. Microchip plans to provide the HAL for the microphone.

Update 7/25:
- Received hardware from MicroSemi, Alexa stack already working.
- Hardware is mic+dsp connected to rpi running the stack
- Plan: frontend should be connected over USB, integrated with 4a (hal) and interacting with the stack
  - Stack needs to pick-up conditioned signal (near/far/noise-cancelling) through alsa device
  - Microchip will provide the HAL for 4a
  - Michael: Interest to extend the API for beamforming, multiple seats, “1 channel per seat” ?
- Update 8/22
  - Received five eval kits from MicroSemi. So far so good. Prototypes will be delivered for AMM.

Question from Nuance about audio streaming:
- esoundlib - do you need special calls to stream audio?
- Fulup: no, reply of 4a role request is the alsa device to write to
- 4a-play /usr/share/4a/media/Happy_MBB_75.ogg (only script)
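
Fulup's answer can be illustrated with a minimal Python sketch. It assumes that the reply to a 4a role request is a JSON object carrying the ALSA device name; the `device_uri` field name, the sample reply, and both helper functions are hypothetical and not taken from the 4a documentation. The sketch only builds an `aplay` command line for that device and does not run it.

```python
import json
import shlex


def alsa_device_from_reply(reply_text, field="device_uri"):
    """Extract the ALSA device name from a (hypothetical) role-request reply."""
    reply = json.loads(reply_text)
    device = reply.get("response", {}).get(field)
    if not device:
        raise ValueError("reply does not contain an ALSA device")
    return device


def playback_command(device, wav_path):
    """Build an aplay command that writes the file to the given ALSA device."""
    return ["aplay", "-D", device, wav_path]


# Hypothetical reply; the real 4a reply layout may differ.
sample_reply = '{"response": {"device_uri": "hw:0,0"}}'
device = alsa_device_from_reply(sample_reply)
command = playback_command(device, "prompt.wav")
print(shlex.join(command))
```

The point of the sketch is that no dedicated streaming call is needed: once the device name is known, any ALSA client can write to it.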

Action item:

August 8, 2018

Attendees:

LF: Walt, ~~Jan-Simon~~
Nuance: ~~Christian~~, Paul Purcell, Mike C., ~~Vince~~, Arijit
Amazon: Premal, Ankur, Naveen, Kamal, Alain
NTT Data MSE: Imamura
Denso Ten: ~~Kusakabe~~
Microchip: ~~Michael, Christian~~
IoT.bzh: ~~Stephane~~, Fulup
Konsulko: ~~Matt P.~~, ~~Matt R.~~
Myscript: ~~Olivier, Etienne~~

Notes:

Nuance still discussing internally about releasing their API. Christian working with AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface)

Not wired up to a speech or TTS engine, more of a loop back test.
Will send the link to Konsulko and IoT.bzh to review the API for suggestions.
- Done.
List of AGL services available can be seen at https://git.automotivelinux.org/
Need to figure out consent and privacy issues with AGL Identity Agent.
- No update.
How to manage grammar and natural language APIs and split between services and apps?
How to integrate cloud speech applications?
- Example: “Find me the closest pizza place” is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app?
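
One possible shape for an answer to this open question, sketched in Python purely for discussion: the speech service turns the cloud reply into a small result record and dispatches it to every app that registered for that intent. Nothing here is an agreed AGL design; `CloudResult`, `IntentRouter`, the `find_poi` intent name, the handlers, and the sample coordinates are all hypothetical placeholders.

```python
from dataclasses import dataclass


@dataclass
class CloudResult:
    """Hypothetical record built from the cloud reply."""
    intent: str
    name: str
    latitude: float
    longitude: float


class IntentRouter:
    """Forwards a result to every handler registered for its intent."""

    def __init__(self):
        self._handlers = {}

    def register(self, intent, handler):
        self._handlers.setdefault(intent, []).append(handler)

    def dispatch(self, result):
        # Call handlers in registration order and collect their replies.
        return [handler(result) for handler in self._handlers.get(result.intent, [])]


def poi_handler(result):
    return f"poi: show {result.name}"


def navi_handler(result):
    return f"navi: set destination {result.name} ({result.latitude}, {result.longitude})"


router = IntentRouter()
router.register("find_poi", poi_handler)
router.register("find_poi", navi_handler)

# Placeholder values standing in for the location and name returned by the cloud.
result = CloudResult("find_poi", "Pizza Place", 35.0, 139.0)
messages = router.dispatch(result)
print(messages)
```

In this sketch the POI and navi apps stay decoupled from the speech engine: they only see the result record, not how or where the utterance was recognized.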
Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua
- Update 7/25: update for FF use 8-channel CSL usb dac, about to land in gerrit.
Starting a demo project internally led by Paul. By end of August or early September they plan to have an internal design document together and will be ready with any questions or issues. Would be a good idea to target the September F2F in Santa Clara to resolve issues with the design.
Information on IRC, mailing list, etc. is available at Getting Started with AGL
Supported hardware can be found at AGL Distribution
8/8
- Arijit received the M3 hardware and was able to get it running. Building a “Hello World” sample application using Virtual Box and M3 hardware.

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together.
- Update 7/25: reviewing above draft, will share ideas/design to run multiple engines in parallel. No timeline, yet. Will review internally and present in next call.
Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall.
8/8
- Naveen presented some use cases and an architecture diagram that Amazon has been working on internally. Received good feedback from the team. Naveen and his team will update their internal wiki and present again at the next meeting. Will look into getting the info onto the AGL wiki after that.

Microchip - AGL USB microphone front-end. Michael is working with MicroSemi on getting hardware available that is already available for Amazon Alexa. Would like to have a prototype available for the AMM in Dresden. Microchip plans to provide the HAL for the microphone.

Update 7/25:
- Received hardware from MicroSemi, Alexa stack already working.
- Hardware is mic+dsp connected to rpi running the stack
- Plan: frontend should be connected over USB, integrated with 4a (hal) and interacting with the stack
  - Stack needs to pick-up conditioned signal (near/far/noise-cancelling) through alsa device
  - Microchip will provide the HAL for 4a
  - Michael: Interest to extend the API for beamforming, multiple seats, “1 channel per seat” ?

Question from Nuance about audio streaming:
- esoundlib - do you need special calls to stream audio?
- Fulup: no, reply of 4a role request is the alsa device to write to
- 4a-play /usr/share/4a/media/Happy_MBB_75.ogg (only script)

Action item:

Move the github repo into AGL git to foster collaboration - Done

July 25, 2018

Attendees:

LF: ~~Walt~~, Jan-Simon
Nuance: Christian, Paul Purcell, Mike C., ~~Vince~~, Arijit
Amazon: ~~Premal~~, Ankur, Naveen, Kamal
NTT Data MSE: Imamura
Denso Ten: ~~Kusakabe~~
Microchip: Michael, Christian
IoT.bzh: ~~Stephane~~, Fulup
Konsulko: Matt P., ~~Matt R.~~
Myscript: Olivier, Etienne

Notes:

Nuance still discussing internally about releasing their API. Christian working with AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface)

Not wired up to a speech or TTS engine, more of a loop back test.
Will send the link to Konsulko and IoT.bzh to review the API for suggestions.
- Done.
List of AGL services available can be seen at https://git.automotivelinux.org/
Need to figure out consent and privacy issues with AGL Identity Agent.
- No update.
How to manage grammar and natural language APIs and split between services and apps?
How to integrate cloud speech applications?
- Example: “Find me the closest pizza place” is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app?
Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua
- Update 7/25: update for FF use 8-channel CSL usb dac, about to land in gerrit.
Starting a demo project internally led by Paul. By end of August or early September they plan to have an internal design document together and will be ready with any questions or issues. Would be a good idea to target the September F2F in Santa Clara to resolve issues with the design.
Information on IRC, mailing list, etc. is available at Getting Started with AGL
Supported hardware can be found at AGL Distribution

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together.
- Update 7/25: reviewing above draft, will share ideas/design to run multiple engines in parallel. No timeline, yet. Will review internally and present in next call.
Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall.

Microchip - AGL USB microphone front-end. Michael is working with MicroSemi on getting hardware available that is already available for Amazon Alexa. Would like to have a prototype available for the AMM in Dresden. Microchip plans to provide the HAL for the microphone.

Update 7/25:
- Received hardware from MicroSemi, Alexa stack already working.
- Hardware is mic+dsp connected to rpi running the stack
- Plan: frontend should be connected over USB, integrated with 4a (hal) and interacting with the stack
  - Stack needs to pick-up conditioned signal (near/far/noise-cancelling) through alsa device
  - Microchip will provide the HAL for 4a
  - Michael: Interest to extend the API for beamforming, multiple seats, “1 channel per seat” ?

Question from Nuance about audio streaming:
- esoundlib - do you need special calls to stream audio?
- Fulup: no, reply of 4a role request is the alsa device to write to
- 4a-play /usr/share/4a/media/Happy_MBB_75.ogg (only script)

Action item:

Move the github repo into AGL git to foster collaboration

July 11, 2018

Attendees:

LF: Walt, ~~Jan-Simon~~
Voicebox:
Nuance: Christian, Paul Purcell, Mike C., ~~Vince~~
Amazon: ~~Premal, Ankur,~~ Naveen
NTT Data MSE: Imamura
Denso Ten: ~~Kusakabe~~
Microchip: Michael, Christian
IoT.bzh: ~~Stephane~~, Fulup
Konsulko: Matt P., ~~Matt R.~~

Notes:

Nuance still discussing internally about releasing their API. Christian working with AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface)

Not wired up to a speech or TTS engine, more of a loop back test.
Will send the link to Konsulko and IoT.bzh to review the API for suggestions.
List of AGL services available can be seen at https://git.automotivelinux.org/
Need to figure out consent and privacy issues with AGL Identity Agent.
How to manage grammar and natural language APIs and split between services and apps?
How to integrate cloud speech applications?
- Example: “Find me the closest pizza place” is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app?
Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua
Starting a demo project internally led by Paul. By end of August or early September they plan to have an internal design document together and will be ready with any questions or issues. Would be a good idea to target the September F2F in Santa Clara to resolve issues with the design.
Information on IRC, mailing list, etc. is available at Getting Started with AGL
Supported hardware can be found at AGL Distribution

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together.
Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall.

Microchip - AGL USB microphone front-end. Michael is working with MicroSemi on getting hardware available that is already available for Amazon Alexa. Would like to have a prototype available for the AMM in Dresden. Microchip plans to provide the HAL for the microphone.

Action item:

Move the github repo into AGL git to foster collaboration

June 27, 2018

Attendees:

LF: Walt, ~~Jan-Simon~~
Voicebox:
Nuance: ~~Christian,~~ Mike C., ~~Vince~~
Amazon: Premal, Ankur, Naveen
NTT Data MSE: ~~Imamura~~
Denso Ten: ~~Kusakabe~~
Microchip: ~~Michael, Christian~~
IoT.bzh: ~~Stephane~~, Fulup
Konsulko: Matt P., Matt R.

Notes:

Nuance still discussing internally about releasing their API. Christian working with AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface)

Not wired up to a speech or TTS engine, more of a loop back test.
Will send the link to Konsulko and IoT.bzh to review the API for suggestions.
List of AGL services available can be seen at https://git.automotivelinux.org/
Need to figure out consent and privacy issues with AGL Identity Agent.
How to manage grammar and natural language APIs and split between services and apps?
How to integrate cloud speech applications?
- Example: “Find me the closest pizza place” is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app?
Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

Starting to look at AGL App FW binder implementation using audio HAL as a reference. Will work with IoT.bzh on how to put the configuration together.
Would like to put together an architecture picture based on the white board drawings from February AMM to see how the API fits into AGL overall.

Voicebox was acquired by Nuance so they will probably not be participating as a separate entity.

June 7, 2018

Attendees:

LF: Walt, Jan-Simon
Voicebox:
Nuance: Christian, ~~Mike C., Vince~~
Amazon:
NTT Data MSE: Imamura
Denso Ten: Kusakabe
Microchip: Michael, Christian
IoT.bzh: Stephane, Fulup
Konsulko: Matt P., Matt R.

Notes:

Nuance still discussing internally about releasing their API. Christian working with AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface)

Not wired up to a speech or TTS engine, more of a loop back test.
Will send the link to Konsulko and IoT.bzh to review the API for suggestions.
List of AGL services available can be seen at https://git.automotivelinux.org/
Need to figure out consent and privacy issues with AGL Identity Agent.
How to manage grammar and natural language APIs and split between services and apps?
How to integrate cloud speech applications?
- Example: “Find me the closest pizza place” is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app?
Sample config from softmixer https://github.com/iotbzh/4a-softmixer/blob/master/conf.d/project/lua.d/smixer-test-simple.lua

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

No one joined.

Face-to-Face meeting planned for June 19 in Tokyo.

May 30, 2018

Attendees:

LF: Walt, Jan-Simon
Voicebox:
Nuance: Christian, Mike C., Vince
Amazon:
NTT Data MSE:
Denso Ten: Kusakabe
Microchip: Michael, Christian
IoT.bzh: Stephane
Qt Company:

Notes:

Nuance still discussing internally about releasing their API. Christian working with AGL App FW and writing an AGL Service layer on GitHub (https://github.com/Nuance-Mobility/agl-speech-interface)

Not wired up to a speech or TTS engine, more of a loop back test.
Will send the link to Konsulko and IoT.bzh to review the API for suggestions.
List of AGL services available can be seen at https://git.automotivelinux.org/
Need to figure out consent and privacy issues with AGL Identity Agent.
How to manage grammar and natural language APIs and split between services and apps?
How to integrate cloud speech applications?
- Example: “Find me the closest pizza place” is processed in the cloud and the location and name are returned to the ECU. How is this then transmitted to the POI and/or navi app?

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

No one joined.

Video conference during the Lorient F2F meeting on June 7

Face-to-Face meeting planned for June 19 in Tokyo.

May 16, 2018

Attendees:
LF: Walt
Voicebox:
Nuance: Christian
Amazon: Premal
NTT Data MSE: Imamura
Denso Ten: Kusakabe
Qt Company: Alistair

Notes:

Nuance still discussing internally about releasing their API. Christian working with AGL App FW and writing an AGL Service layer.

Amazon looking at releasing a possible API in June and starting to work with the AGL App FW.

Video conference during the Lorient F2F meeting on June 7

Face-to-Face meeting planned for June 19 in Tokyo.

Feb 14, 2018

Attendees:
LF: Walt, Dan
Voicebox: Andrew
Nuance: Christian, Mike
Amazon: John

Notes:

Amazon still internally discussing making their API available for AGL.
Amazon SDKs that are available publicly require an agreement with Amazon to access
- Alexa Voice Service Device SDK
- Alexa Skills Kit

Nuance interface still discussing internally. Christian made a presentation that he will send around with Nuance's ideas for the API.
Walt will use the minutes from these calls to lead an AMM session next week that updates the community on the EG's work.

Feb 6, 2018

Attendees:
LF: Walt, Jan-Simon
Voicebox: Andrew and Adam
Nuance: Christian, Mike
Amazon: None

Notes:

No word from Amazon on the availability of their API or on the link to what is publicly available.
Nuance interface still discussing internally. Still waiting for Amazon proposal.
Walt will follow up with Amazon about their API. Schedule a follow up call for Tuesday, Feb 13
Question about how AGL handles app creation and installation
Developer guide for app developers is available at https://docs.automotivelinux.org/docs/en/master/devguides/

Kick off Meeting Jan 29, 2018

Attendees: Walt, Jan-Simon, Mike Chachich, John Scumniotales, Andrew Fairly, Vince Iannotti, Christian Benien (attending AMM)

Agenda for kick off meeting

Review expert group goals as captured above
Proposal for speech recognition API or TTS API from Amazon?
Meeting schedule (biweekly? what time?)
Developer commitment from EG members.

Reviewed agenda and notes from CES
John said there is ongoing internal discussion at Amazon about releasing their code. There is already an open source version or public API version. Need to find out definitively what they are talking about as the release.
- John will send a link to what is publicly available now
- Should wrap up internal discussions end of this week. (Feb 2)
Michael may have something they can release. Will discuss internally once we see what Amazon has.
Attending AMM: Nuance: Christian, Amazon: Shitaro and Sanjay. VBT: TBD
Reserve time at AMM on Thursday (1 hour)
Follow up Feb 6
