When I use word confidence, punctuation stops working. For example, when I say "question mark" it types the words "question mark" instead of "?", and when I say "period" it types "period" instead of ".". I have added a checkbox; when it is checked, punctuation should be turned on.
    SpeechConfig config = SpeechConfig.FromSubscription("key", "region");
    config.OutputFormat = OutputFormat.Detailed;
    if (Properties.Settings.Default.Punctuation)
    {
        config.SetServiceProperty("punctuation", "explicit", ServicePropertyChannel.UriQueryParameter);
    }
    recognizer = new SpeechRecognizer(config);
    recognizer.Recognized += SpeechRecognizer_Recognized;
...
    private void SpeechRecognizer_Recognized(object sender, SpeechRecognitionEventArgs e)
    {
        if (e.Result.Reason == ResultReason.RecognizedSpeech)
        {
            if (e.Result.Text.ToLower().Equals("new line") || e.Result.Text.ToLower().Equals("newline"))
            {
                SendKeys.SendWait(Environment.NewLine);
            }
            else
            {
                var detailedResults = e.Result.Best();
                if (detailedResults != null && detailedResults.Any())
                {
                    var bestResults = detailedResults?.ToList()[0];
                    foreach (var word in bestResults.Words)
                    {
                        double per = word.Confidence * 100;
                        SendKeys.SendWait($"{word.Word} [{per:0.##}] ");
                    }
                }
            }
        }
    }
CodePudding user response:
What you are observing is by design. In most circumstances it is not necessary, or even helpful, to inspect the details of a recognized speech result. It looks like you have misinterpreted how to use the details.
You may not realise it, but your example of detecting "new line" or "newline" as a key phrase, and interpreting that as a request to inject a line feed into the output, is the very same process at work.
For punctuation to be detected in the speech, the first thing that the classifier must do is resolve the words. It is only after a word has been resolved that the service can post-process the results to classify the word as a natural word or as punctuation.
The process is a bit like this:
- Detected the word "comma" with high confidence.
- If the `punctuation` setting is set to `explicit`, then: is the word on its own, or at the end of a recognized sequence that was followed by a pause?
  - If yes, then interpret it as "," and not "comma".
For this reason it is important to understand that when the `punctuation` setting is set to `explicit`, the punctuation must be isolated from the normal sentence cadence of the spoken text.
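The decision described above can be sketched in a few lines of C#. This is only an illustration of the idea, not the service's actual implementation: the class name `ExplicitPunctuationSketch`, the phrase table, and the assumption that the input has already been split into pause-separated runs are all hypothetical.

```csharp
using System;
using System.Collections.Generic;

public static class ExplicitPunctuationSketch
{
    // Hypothetical phrase table; the real service's vocabulary is not shown in this thread.
    private static readonly Dictionary<string, string> Symbols =
        new Dictionary<string, string>(StringComparer.OrdinalIgnoreCase)
        {
            ["comma"] = ",",
            ["period"] = ".",
            ["full stop"] = ".",
            ["question mark"] = "?",
            ["exclamation mark"] = "!",
        };

    // Each input string is one run of words spoken without a pause.
    // Only a run that consists entirely of a punctuation phrase becomes a symbol;
    // the same phrase inside a longer run is kept as ordinary words.
    public static string Apply(IEnumerable<string> pauseSeparatedRuns)
    {
        var output = new List<string>();
        foreach (string run in pauseSeparatedRuns)
        {
            string text = run.Trim();
            output.Add(Symbols.TryGetValue(text, out string symbol) ? symbol : text);
        }
        return string.Join(" ", output);
    }
}
```

Calling `ExplicitPunctuationSketch.Apply(new[] { "this has a comma", "comma", "and ends here", "full stop" })` returns `this has a comma , and ends here .`: the "comma" spoken inside the first run stays a word, while the isolated "comma" and "full stop" become symbols.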
Read this as a sentence with a constant pace without punctuation:
this is a sentence that doesn't have a comma or a full stop but an exclamation mark would look nice
If you read quickly and fluently enough, there should be no punctuation in the output, even if the words were recognized with high confidence. To get punctuation into the same text, you actually need to read this script:
This is a sentence that doesn't have a comma.
Comma.
Or a fullstop.
Comma.
But an exclamation mark would look nice.
exclamation mark.
The output from that script was:
This is a sentence that doesn't have a comma , or a full stop , but an exclamation mark would look nice !
The per-word analysis for my test looks like this:
word | confidence |
---|---|
this | 85.99% |
is | 95.93% |
a | 68.49% |
sentence | 96.99% |
that | 90.03% |
doesn't | 96.75% |
have | 94.57% |
a | 87.88% |
comma | 94.58% |
comma | 94.34% |
or | 67.14% |
a | 64.68% |
fullstop | 77.63% |
comma | 94.90% |
but | 91.17% |
an | 62.65% |
exclamation | 98.44% |
mark | 68.58% |
would | 86.15% |
look | 91.58% |
nice | 97.40% |
exclamation | 97.05% |
mark | 96.61% |
Notice that the words representing the punctuation all have a high confidence rating, but in the output not all of those words were actually interpreted as punctuation. This might be clearer in this screenshot, where I have highlighted two commas that are in the output but are correctly identified as words.
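If you want both the punctuated text and a confidence figure, one option is to type the display text and append a single score derived from the word list, instead of typing the raw words one by one. The helper below is a minimal sketch under that assumption; `ConfidenceFormatting.Annotate` is a hypothetical name, and averaging the word scores is just one possible way to summarise them.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

public static class ConfidenceFormatting
{
    // displayText: the punctuated text, e.g. e.Result.Text in the question's handler.
    // wordConfidences: per-word scores in the range 0..1, e.g. taken from the
    // Words collection of the best detailed result.
    public static string Annotate(string displayText, IEnumerable<double> wordConfidences)
    {
        List<double> scores = wordConfidences.ToList();
        if (scores.Count == 0)
        {
            return displayText; // nothing to annotate
        }

        double percent = scores.Average() * 100;
        return string.Format(CultureInfo.InvariantCulture, "{0} [{1:0.##}]", displayText, percent);
    }
}
```

In the question's handler this could be called as `Annotate(e.Result.Text, bestResults.Words.Select(w => w.Confidence))`, so the punctuation produced by the service is preserved and the word details are used only for the score.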
CodePudding user response:
Using Cognitive Services, I cannot reproduce your issue. Setting `config.OutputFormat = OutputFormat.Detailed` or `config.RequestWordLevelTimestamps();` does not affect the explicit punctuation recognition.
What is not clear from your example is the current state of your setting. When in doubt, if we are toggling logic using settings and the behaviour we observe stays the same even when we change the setting values, then the obvious thing to check is the setting value itself.
Please try to comment out your logic to toggle the punctuation like this:
    //if (Properties.Settings.Default.Punctuation)
    {
        config.SetServiceProperty("punctuation", "explicit", ServicePropertyChannel.UriQueryParameter);
    }
If this solves it then there are two considerations:
- What is the initial state of the `Properties.Settings.Default.Punctuation` setting? Is your application logic not updating the value when you expect it to? Any mutating logic that affects that setting may need to call `Properties.Settings.Default.Save()` to save changes. An extension of this, of course, is that depending on where your mutating logic executes from, you might need to call `Properties.Settings.Default.Reload()` to ensure that the current values are loaded from the store; however, this is not usually required if you are operating in the same thread space, which you most likely will be in WinForms.
- Is the config loaded once, and is that once before the setting value has been toggled? That step in the workflow is unclear from your description and the code example. If you are using continuous recognition, or you are creating a single instance of `SpeechRecognizer` for the lifetime of your Form, then changes to your setting will not be applied to the Speech Configuration. You will need to re-initialize the `SpeechRecognizer` as part of the logic that handles the setting-changed event, or have some other routine in the speech event handlers that detects a change in this setting and restarts the `SpeechRecognizer` connection and process.
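One way to make sure the recognizer always reflects the current setting is to remember which value it was built with and rebuild it when that value changes. The class below is a generic sketch of that pattern; `SettingBoundResource<T>` is a hypothetical helper and is not part of the Speech SDK.

```csharp
using System;

public sealed class SettingBoundResource<T> : IDisposable where T : class, IDisposable
{
    private readonly Func<bool, T> _create;
    private T _current;
    private bool _builtWith;

    // create: factory that builds the resource for a given setting value.
    public SettingBoundResource(Func<bool, T> create)
    {
        _create = create ?? throw new ArgumentNullException(nameof(create));
    }

    // Returns the cached instance, disposing and rebuilding it
    // when the setting differs from the value it was built with.
    public T Get(bool settingValue)
    {
        if (_current == null || _builtWith != settingValue)
        {
            _current?.Dispose();
            _current = _create(settingValue);
            _builtWith = settingValue;
        }
        return _current;
    }

    public void Dispose()
    {
        _current?.Dispose();
        _current = null;
    }
}
```

In the question's application, the factory would presumably build the `SpeechConfig`, call `SetServiceProperty("punctuation", "explicit", ...)` only when the value is true, create the `SpeechRecognizer`, and attach the `Recognized` handler again. If continuous recognition is running, it would also need to be stopped before the old instance is disposed and started again on the new one; that part is left out of the sketch.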