Clarification on how to create an Adapter for AWS Transcribe #108

Closed
Gribbs opened this issue Mar 10, 2019 · 9 comments
Assignees
Labels
Enhancement a request for improvement question Further information is requested Speech To Text Adapters Speech To Text Adapters

Comments

@Gribbs

Gribbs commented Mar 10, 2019

I've forked the repo and I'm trying to create a new Adapter for AWS Transcribe formats but I'm having trouble understanding the recommended steps here https://github.com/bbc/react-transcript-editor/blob/master/docs/guides/adapters.md

The recommended steps are:

  1. Create a folder with the name of the STT service - eg speechmatics
  2. add a adapters/${sttServiceName}/sample folder
  3. add a sample json file from the STT service in this last folder - this will be useful for testing.
  4. Name it ${name of the stt service}.sample.json
  5. add option in adapters/index.js

I can see an adapters folder in the project here src/lib/Util/adapters .

Is this where we should begin with step 1? I'm assuming yes, but just want to make sure this is right.
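
To make step 5 concrete, here is a minimal sketch of what adding an option in adapters/index.js might look like. The names `awsTranscribeToDraft` and `sttJsonAdapter`, the `'amazontranscribe'` key, and the stubbed converter body are all assumptions made for illustration; they are not taken from the project's code.

```javascript
// Hypothetical converter that would live in adapters/${sttServiceName}/index.js.
// The body is a stub for illustration: it only extracts the word text.
const awsTranscribeToDraft = (transcriptJson) => {
  return transcriptJson.results.items.map((item) => item.alternatives[0].content);
};

// Hypothetical dispatcher for adapters/index.js: pick a converter by STT type.
const sttJsonAdapter = (transcriptData, sttJsonType) => {
  switch (sttJsonType) {
    case 'amazontranscribe': // assumed key for the new option
      return awsTranscribeToDraft(transcriptData);
    default:
      throw new Error(`Unsupported STT type: ${sttJsonType}`);
  }
};
```

With this shape, supporting another service would only need a new folder and one more `case`.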

@Gribbs Gribbs added the bug Something isn't working label Mar 10, 2019
@pietrop
Contributor

pietrop commented Mar 10, 2019

Hi @Gribbs,
Thanks for reaching out and the PR.
Yes, that’s the right location. I will add a note to the docs to make it explicit.

@pietrop pietrop added question Further information is requested and removed bug Something isn't working labels Mar 10, 2019
@pietrop pietrop closed this as completed Mar 10, 2019
@pietrop pietrop reopened this Mar 10, 2019
@pietrop pietrop added Enhancement a request for improvement Speech To Text Adapters Speech To Text Adapters labels Mar 10, 2019
@chrishutchinson

Hey @Gribbs, I had started throwing together an AWS adapter too, but I haven't got round to finishing it up yet.

What sort of progress have you made so far? I'm keen to see this, so am happy to help if it's useful.

@Gribbs
Author

Gribbs commented Mar 12, 2019

Hey @chrishutchinson, I've completed the implementation on my fork and it seems to work OK (it might be a bit untidy in there, with some cut-and-pasted files in the sample folder).

I haven't implemented speakers yet, though. I was also a bit confused about how to use the generic generateEntitiesRanges() function described in the Adapters guide:
4. And use the helper function generateEntitiesRanges to add the entityRanges to each block. - see above
The guidance recommends calling it as generateEntitiesRanges(paragraph.words, 'text').

However, for Transcribe the "text" lives on a nested object inside each word, like this:

const generateEntitiesRanges = (words, wordAttributeName) => {
  return words.map((word) => {
    const text = word.alternatives[0].content; // <--- I can't use the wordAttributeName param here
    // rest of code
  });
};

I figured I'd either have to remap each word or use my own implementation of generateEntitiesRanges() instead. I chose the latter, but I'm not sure whether I should be doing it differently.

@chrishutchinson

Sounds good @Gribbs, this is about as far as I got with my prototype. I ended up with similar questions but didn't make any progress resolving them.

Perhaps it'd now be best to file a PR with what you have (or when you feel it's at a good point)? I'm happy to do some testing against it for my use case, and can contribute any additional functionality with new PRs at that point. Having your base to work from would be really helpful.

@pietrop
Contributor

pietrop commented Mar 12, 2019

Thanks @Gribbs and @chrishutchinson

I was looking at your branch and found the docs with example JSON for AWS Transcribe. Using this JSON example:

"items": [
  {
    "start_time": "12.282",
    "end_time": "12.592",
    "alternatives": [
      {
        "confidence": "1.0000",
        "content": "When"
      }
    ],
    "type": "pronunciation"
  },
  {
    "start_time": "12.592",
    "end_time": "12.692",
    "alternatives": [
      {
        "confidence": "0.8787",
        "content": "you"
      }
    ],
    "type": "pronunciation"
  },
  {
    "start_time": "12.702",
    "end_time": "13.252",
    "alternatives": [
      {
        "confidence": "0.8318",
        "content": "try"
      }
    ],
    "type": "pronunciation"
  },

The suggestion is that you could normalise the words/items list like this

const words = items.map((item) => {
  return {
    start: item.start_time,
    end: item.end_time,
    text: item.alternatives[0].content
  };
});

/** 
 console.log(words);
 [ { start: '12.282', end: '12.592', text: 'When' },
  { start: '12.592', end: '12.692', text: 'you' },
  { start: '12.702', end: '13.252', text: 'try' } ]
*/

Note: if there are item types other than pronunciation, you might need to check the type attribute when normalising, and then chain a filter onto the map to remove all the non-matching items (e.g. null). Does that make sense? Happy to flesh out a complete example if it doesn't.
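
As a rough sketch of that type check: the version below filters before mapping, so no null placeholders are produced in the first place (a small variation on map-then-filter). The `normaliseItems` name and the punctuation sample item are assumptions for illustration; punctuation items in Transcribe output carry no start_time or end_time.

```javascript
// Sample items in the AWS Transcribe shape. The punctuation entry is an
// assumed example and has no start_time / end_time.
const items = [
  {
    start_time: '12.282',
    end_time: '12.592',
    alternatives: [{ confidence: '1.0000', content: 'When' }],
    type: 'pronunciation'
  },
  {
    alternatives: [{ confidence: '0.0', content: ',' }],
    type: 'punctuation'
  },
  {
    start_time: '12.592',
    end_time: '12.692',
    alternatives: [{ confidence: '0.8787', content: 'you' }],
    type: 'pronunciation'
  }
];

// Hypothetical helper: keep only spoken words, then map them to the
// { start, end, text } shape used in the normalisation above.
const normaliseItems = (rawItems) =>
  rawItems
    .filter((item) => item.type === 'pronunciation')
    .map((item) => ({
      start: item.start_time,
      end: item.end_time,
      text: item.alternatives[0].content
    }));

const words = normaliseItems(items);
```

Dropping punctuation entirely loses it from the transcript text, so an adapter may instead want to append it to the preceding word; that choice is left open here.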

Another minor note: if the highest-confidence alternative is reliably the first one, you can use item.alternatives[0].content. Otherwise you might need to search that list and pick the alternative whose confidence is closest to 1.0000, just to be sure.
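
If the ordering cannot be relied on, a small helper along these lines could pick the best alternative. This is only a sketch: `pickBestAlternative` is a hypothetical name, the second alternative in the sample is made up, and it assumes each item has at least one alternative with confidence stored as a numeric string, as in the JSON above.

```javascript
// Hypothetical helper: return the alternative with the highest confidence.
// Confidence values are strings in the Transcribe JSON, so parse them first.
// Assumes a non-empty array (reduce without an initial value throws on []).
const pickBestAlternative = (alternatives) =>
  alternatives.reduce((best, current) =>
    parseFloat(current.confidence) > parseFloat(best.confidence) ? current : best
  );

// Made-up sample where the first alternative is not the most confident one.
const sampleAlternatives = [
  { confidence: '0.4100', content: 'yew' },
  { confidence: '0.8787', content: 'you' }
];

const bestText = pickBestAlternative(sampleAlternatives).content;
```

In the normalising map, `item.alternatives[0].content` would then become `pickBestAlternative(item.alternatives).content`.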

Normalising would enable you to reuse generateEntitiesRanges, keeping the code cleaner and DRYer.
There is also less need to test the generateEntitiesRanges function separately, as it's shared with the other modules.

generateEntitiesRanges(paragraph.words, 'text')
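
To show why the normalised shape helps, here is a simplified stand-in for such a helper. It is not the project's implementation: `generateEntitiesRangesSketch` is a hypothetical name, and the sketch assumes the helper reads each word's text through the attribute name it is given and tracks character offsets with one space between words.

```javascript
// Simplified stand-in, NOT the project's generateEntitiesRanges.
// Reads the word text via wordAttributeName and records where each word
// would sit in a space-separated paragraph string.
const generateEntitiesRangesSketch = (words, wordAttributeName) => {
  let position = 0;
  return words.map((word) => {
    const text = word[wordAttributeName];
    const range = {
      start: word.start,
      end: word.end,
      text,
      offset: position,
      length: text.length
    };
    position += text.length + 1; // +1 for the space separating words
    return range;
  });
};

// Normalised words, as produced by the map over items above.
const normalisedWords = [
  { start: '12.282', end: '12.592', text: 'When' },
  { start: '12.592', end: '12.692', text: 'you' },
  { start: '12.702', end: '13.252', text: 'try' }
];

const entityRanges = generateEntitiesRangesSketch(normalisedWords, 'text');
```

Because the words are flat objects with a `text` key, the attribute-name lookup works without any Transcribe-specific code inside the helper.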

@Gribbs
Author

Gribbs commented Mar 12, 2019

Great! That makes sense. Thanks for the suggestions @pietrop . Working on this now

@pietrop
Contributor

pietrop commented Mar 12, 2019

Awesome! Any questions let me know

@Gribbs
Author

Gribbs commented Mar 13, 2019

Pull request #110

@pietrop
Contributor

pietrop commented Mar 18, 2019

Addressed in #120

@pietrop pietrop closed this as completed Mar 18, 2019