Configure and use multiple language Analyzers on Azure Search

Time:04-22

We are implementing Azure Cognitive Search for our website, which will be available in multiple languages. We have created a separate field in our index for each language and assigned the matching Microsoft language analyzer to each field.

Example: below, we have a description field for each of two languages, English (Description) and French (Description_fr).

{
  "name": "hotels-sample-index",
  "fields": [
    {
      "name": "Description",
      "type": "Edm.String",
      "retrievable": true,
      "searchable": true,
      "analyzer": "en.microsoft"
    },
    {
      "name": "Description_fr",
      "type": "Edm.String",
      "retrievable": true,
      "searchable": true,
      "analyzer": "fr.microsoft"
    }
  ]
}

We already have the data for the Description field in the database. Is there any way for Azure to translate the Description field (English) and load the result into Description_fr (French)?

CodePudding user response:

If you're using an Azure Cognitive Search indexer, you can attach a Cognitive Services resource to a skillset, and the translation will run automatically during indexing:

https://docs.microsoft.com/en-us/azure/search/search-indexer-overview#stage-3-skillset-execution

If you're adding documents through code, you can also call Cognitive Services yourself to get the translations and then set them on the desired field:

using System;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Newtonsoft.Json; // Install Newtonsoft.Json with NuGet

class Program
{
    private static readonly string key = "YOUR-KEY";
    private static readonly string endpoint = "https://api.cognitive.microsofttranslator.com/";

    // Add your location, also known as region. The default is global.
    // This is required if using a Cognitive Services resource.
    private static readonly string location = "YOUR_RESOURCE_LOCATION";
    
    static async Task Main(string[] args)
    {
        // Input and output languages are defined as parameters.
        // For the question's scenario, use "from=en&to=fr" to get French only.
        string route = "/translate?api-version=3.0&from=en&to=de&to=it";
        string textToTranslate = "Hello, world!";
        object[] body = new object[] { new { Text = textToTranslate } };
        var requestBody = JsonConvert.SerializeObject(body);
    
        using (var client = new HttpClient())
        using (var request = new HttpRequestMessage())
        {
            // Build the request.
            request.Method = HttpMethod.Post;
            request.RequestUri = new Uri(endpoint + route);
            request.Content = new StringContent(requestBody, Encoding.UTF8, "application/json");
            request.Headers.Add("Ocp-Apim-Subscription-Key", key);
            request.Headers.Add("Ocp-Apim-Subscription-Region", location);
    
            // Send the request and get response.
            HttpResponseMessage response = await client.SendAsync(request).ConfigureAwait(false);
            // Read response as a string.
            string result = await response.Content.ReadAsStringAsync();
            Console.WriteLine(result);
        }
    }
}

source:

https://docs.microsoft.com/en-us/azure/cognitive-services/translator/quickstart-translator?tabs=csharp#translate-text
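With `to=fr` in the route, the Translator v3 response is a JSON array with one element per input text, and each element holds a `translations` array of objects with `text` and `to` properties. You would take `translations[0].text` and send it to the index's documents endpoint (`POST /indexes/hotels-sample-index/docs/index`). Below is a rough sketch of that request body. `HotelId` is a hypothetical key field, since the question does not show the index key, and the `<...>` values are placeholders rather than real data:

```json
{
  "value": [
    {
      "@search.action": "mergeOrUpload",
      "HotelId": "<your-document-key>",
      "Description_fr": "<translations[0].text from the Translator response>"
    }
  ]
}
```

Using `mergeOrUpload` should leave the existing English Description untouched on documents that are already in the index and only fill in the French field.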

CodePudding user response:

The simplest way to automatically generate the French description is to link your Cognitive Search service to a Cognitive Services account and then add a skillset with a translation skill configured like this:

{
  "@odata.type": "#Microsoft.Skills.Text.TranslationSkill",
  "defaultToLanguageCode": "fr",
  "suggestedFrom": "en",
  "context": "/document",
  "inputs": [
    {
      "name": "text",
      "source": "/document/Description"
    }
  ],
  "outputs": [
    {
      "name": "translatedText",
      "targetName": "Description_fr"
    }
  ]
}

More details in the documentation for the translation skill.
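One detail the skill definition alone does not cover: the skill writes its output to the enrichment tree at `/document/Description_fr`, and as far as I know the indexer needs an output field mapping to copy that node into the index field. Below is a minimal sketch of the indexer definition. The indexer, data source and skillset names are placeholders I made up, and only the index name and field name come from the question:

```json
{
  "name": "<your-indexer-name>",
  "dataSourceName": "<your-datasource-name>",
  "targetIndexName": "hotels-sample-index",
  "skillsetName": "<your-skillset-name>",
  "outputFieldMappings": [
    {
      "sourceFieldName": "/document/Description_fr",
      "targetFieldName": "Description_fr"
    }
  ]
}
```

The English Description field should map by name from the data source as usual, so it needs no extra mapping here.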

If you can't or don't want to use a skillset and are writing directly to your index, you can call the Translator API from your own code instead, as outlined in Thiago's answer.
