Why are some words in the dictionary not replaced?

Time:06-15

translation = {'                                             Cloud AI ': 'ਕਲਾਊਡ AI',
 'Entity Extraction': 'ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
 ' Architecture': 'ਆਰਕੀਟੈਕਚਰ',
 ' Conclusion': 'ਸਿੱਟਾ',
 ' Motivation / Entity Extraction': 'ਪ੍ਰੇਰਣਾ / ਹਸਤੀ ਕੱਢਣ',
 ' Recurrent Deep Neural Networks': 'ਆਵਰਤੀ ਡੂੰਘੇ ਨਿਊਰਲ ਨੈੱਟਵਰਕ',
 ' Results': 'ਨਤੀਜੇ',
 ' Word Embeddings': 'ਸ਼ਬਦ ਏਮਬੈਡਿੰਗਸ',
 'Agenda': 'ਏਜੰਡਾ',
 'Also known as Named-entity recognition (NER), entity chunking and entity identification': 'ਨਾਮ-ਹਸਤੀ ਮਾਨਤਾ (NER), ਇਕਾਈ ਚੰਕਿੰਗ ਅਤੇ ਇਕਾਈ ਪਛਾਣ ਵਜੋਂ ਵੀ ਜਾਣਿਆ ਜਾਂਦਾ ਹੈ',
 'Biomedical Entity Extraction': 'ਬਾਇਓਮੈਡੀਕਲ ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
 'Biomedical named entity recognition': 'ਬਾਇਓਮੈਡੀਕਲ ਨਾਮੀ ਇਕਾਈ ਦੀ ਮਾਨਤਾ',
 'Critical step for complex biomedical NLP tasks:': 'ਗੁੰਝਲਦਾਰ ਬਾਇਓਮੈਡੀਕਲ NLP ਕਾਰਜਾਂ ਲਈ ਮਹੱਤਵਪੂਰਨ ਕਦਮ:',
 'Custom Entity Extraction': 'ਕਸਟਮ ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
 'Custom models': 'ਕਸਟਮ ਮਾਡਲ'}

In the slide from the ppt, the first word, "Custom", is not replaced even though it is present in the translation dictionary.

I would like to know why this happens for some words.

The code for replacing words

from pptx import Presentation

prs = Presentation('/content/drive/MyDrive/presentation2.pptx')


# To get shapes in your slides

slides = [slide for slide in prs.slides]
shapes = []
for slide in slides:
    for shape in slide.shapes:
        shapes.append(shape)


def replace_text(replacements: dict, shapes: list):
    """Takes dict of {match: replacement, ... } and replaces all matches.
    Currently not implemented for charts or graphics.
    """
    for shape in shapes:
        for match, replacement in replacements.items():
            if shape.has_text_frame:
                if (shape.text.find(match)) != -1:
                    text_frame = shape.text_frame
                    for paragraph in text_frame.paragraphs:
                        for run in paragraph.runs:
                            cur_text = run.text
                            new_text = cur_text.replace(str(match), str(replacement))
                            run.text = new_text

            if shape.has_table:
                for row in shape.table.rows:
                    for cell in row.cells:
                        if match in cell.text:
                            new_text = cell.text.replace(match, replacement)
                            cell.text = new_text

replace_text(translation, shapes) 

prs.save('output5.pptx')

output from the function

Custom Entity Extraction - Custom ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ

expected output

Custom Entity Extraction - ਕਸਟਮ ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ

I think I have found the reason for this. The dictionary contains "Entity Extraction", so wherever the function finds that phrase it replaces it first. Once it has been replaced, the remaining word "Custom" has no translation of its own, so it is left as it is. I am not sure how to make the function avoid doing that.
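The behaviour described above can be reproduced on a plain string, without python-pptx. This is only a minimal sketch: the two-entry dictionary is a subset of the one in the question, and `replace_in_order` is a hypothetical helper that mimics the loop order of `replace_text`.

```python
# Two entries taken from the translation dictionary in the question,
# in the same relative order (the shorter key comes first).
translation = {
    'Entity Extraction': 'ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
    'Custom Entity Extraction': 'ਕਸਟਮ ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
}


def replace_in_order(text, replacements):
    """Hypothetical helper: apply each replacement in dict insertion order."""
    for match, replacement in replacements.items():
        text = text.replace(match, replacement)
    return text


result = replace_in_order('Custom Entity Extraction', translation)
print(result)
```

Because "Entity Extraction" is tried first, the longer key "Custom Entity Extraction" no longer occurs in the text by the time the loop reaches it.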

CodePudding user response:

How about directly replacing the text when it finds a match? I.e.:

    for shape in shapes:
        for match, replacement in replacements.items():
            if shape.has_text_frame:
                if (shape.text.find(match)) != -1:
                    text_frame = shape.text_frame
                    for paragraph in text_frame.paragraphs:
                        for run in paragraph.runs:
                            run.text = replacement

            if shape.has_table:
                for row in shape.table.rows:
                    for cell in row.cells:
                        if match in cell.text:
                            cell.text = replacement

This assumes you can replace the entire text of a run or cell, and not just specific sections of it.
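The whole-text idea can be sketched on plain strings as an exact lookup instead of a substring search. `translate_whole` is a hypothetical helper, and stripping surrounding whitespace from both the keys and the text is an assumption made here because some keys in the question's dictionary carry leading spaces.

```python
def translate_whole(text, replacements):
    """Hypothetical helper: translate only when the entire text equals a key.

    Surrounding whitespace is ignored on both sides (an assumption, not
    something the original answer specifies).
    """
    normalized = {key.strip(): value for key, value in replacements.items()}
    return normalized.get(text.strip(), text)


translation = {
    'Entity Extraction': 'ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
    ' Architecture': 'ਆਰਕੀਟੈਕਚਰ',
    'Custom Entity Extraction': 'ਕਸਟਮ ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
}

full = translate_whole('Custom Entity Extraction', translation)
padded = translate_whole('Architecture ', translation)
partial = translate_whole('Custom Entity Extraction overview', translation)
```

With an exact lookup the order of the dictionary no longer matters, but text that only partly matches a key, such as the last call, is returned unchanged.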

CodePudding user response:

The dictionary was not ordered, which is why the words were not replaced correctly: a shorter key such as "Entity Extraction" could be applied before a longer key that contains it. After ordering the keys in the dictionary, the words were replaced correctly.
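The answer does not say how the keys were ordered. One ordering that fits the problem described in the question is longest key first, so that a phrase is always tried before any of its substrings. A minimal sketch on a plain string, where `order_longest_first` is a hypothetical helper name:

```python
def order_longest_first(replacements):
    """Hypothetical helper: return a new dict with keys ordered longest to shortest."""
    return dict(
        sorted(replacements.items(), key=lambda item: len(item[0]), reverse=True)
    )


# Two entries taken from the translation dictionary in the question.
translation = {
    'Entity Extraction': 'ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
    'Custom Entity Extraction': 'ਕਸਟਮ ਇਕਾਈ ਐਕਸਟਰੈਕਸ਼ਨ',
}

ordered = order_longest_first(translation)

text = 'Custom Entity Extraction'
for match, replacement in ordered.items():
    text = text.replace(match, replacement)
print(text)
```

Under the same assumption, the function from the question could be called as `replace_text(order_longest_first(translation), shapes)`. This relies on dictionaries preserving insertion order, which Python guarantees from version 3.7.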
