Regex replace text with html

Time:09-16

I am working on a simple programming project to parse a Java program written in a text editor. The program from the text editor is processed by a Python program that uses the tkinter library.

Here is an example Java program

public class TestMax {
    public static void main(String args[]){
        System.out.println("Hello World");
    }
}

Here is the actual output

public class TestMax {
        public static <span class="keyword">void</span> main(String args[]){
                System.out.println("Hello World");
        }
}

Here is the expected output

<span class="keyword">public</span> <span class="keyword">class</span> TestMax {
        <span class="keyword">public</span> <span class="keyword">static</span> <span class="keyword">void</span> main(String args[]){
                System.out.println("Hello World");
        }
}

Here is the Python program

import tkinter as tk
from tkinter.filedialog import askopenfilename, asksaveasfilename
import re

keywords = ["public","class","static","void","int","if","else","return"]

def open_file():
    """Open a file for editing."""
    filepath = askopenfilename(
        filetypes=[("Text Files", "*.txt"), ("All Files", "*.*")]
    )
    if not filepath:
        return
    txt_edit.delete(1.0, tk.END)
    with open(filepath, "r") as input_file:
        text = input_file.read()
        txt_edit.insert(tk.END, text)
    window.title(f"Simple Text Editor - {filepath}")

def save_file():
    """Save the current file as a new file."""
    filepath = asksaveasfilename(
        defaultextension="txt",
        filetypes=[("Text Files", "*.txt"), ("All Files", "*.*")],
    )
    if not filepath:
        return
    with open(filepath, "w") as output_file:
        text = txt_edit.get(1.0, tk.END)
        output_file.write(text)
    window.title(f"Simple Text Editor - {filepath}")

def print_html():
    """Print html"""
    text = txt_edit.get(1.0, tk.END)
    for line in text.split("\n"):
        for word in line.split(" "):
            if word in keywords:
                text_after = re.sub(word, '<span class="keyword">' + word + '</span>', text)
    print(text_after)

window = tk.Tk()
window.title("Simple Text Editor")
window.rowconfigure(0, minsize=800, weight=1)
window.columnconfigure(1, minsize=800, weight=1)

txt_edit = tk.Text(window)
fr_buttons = tk.Frame(window, relief=tk.RAISED, bd=2)
btn_open = tk.Button(fr_buttons, text="Open", command=open_file)
btn_save = tk.Button(fr_buttons, text="Save As...", command=save_file)
btn_print_html = tk.Button(fr_buttons, text="Print as html", command=print_html)

btn_open.grid(row=0, column=0, sticky="ew", padx=5, pady=5)
btn_save.grid(row=1, column=0, sticky="ew", padx=5)
btn_print_html.grid(row=2, column=0, sticky="ew", padx=5)

fr_buttons.grid(row=0, column=0, sticky="ns")
txt_edit.grid(row=0, column=1, sticky="nsew")

window.mainloop()

The issue is within this method:

def print_html():
    """Print html"""
    text = txt_edit.get(1.0, tk.END)
    for line in text.split("\n"):
        for word in line.split(" "):
            if word in keywords:
                text_after = re.sub(word, '<span class="keyword">' + word + '</span>', text)
    print(text_after)

The issue is that only one keyword is getting replaced. Ideally, I'd like to replace all of the keywords with <span class="keyword">word</span>. Any help with this would be greatly appreciated. Thanks!

CodePudding user response:

You're not saving your changes: every iteration overwrites text_after with a substitution made on the original text, so only the last keyword found stays wrapped.
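A minimal sketch of that overwrite problem, without tkinter. The sample string, the shortened keyword list, and the names `wrap`, `last_only` and `accumulate` are made up for this illustration and are not part of the original program.

```python
import re

# Hypothetical sample data, chosen only for this illustration
keywords = ["public", "static", "void"]
text = "public static void main"

def wrap(word):
    """Return the keyword wrapped in the span markup used in the question."""
    return '<span class="keyword">' + word + '</span>'

def last_only(source):
    """Mirror the question's loop: each substitution restarts from `source`."""
    result = source
    for word in keywords:
        # Earlier results are thrown away because we substitute into `source`
        result = re.sub(word, wrap(word), source)
    return result

def accumulate(source):
    """Feed each result into the next substitution so changes add up."""
    result = source
    for word in keywords:
        result = re.sub(word, wrap(word), result)
    return result

print(last_only(text))   # only "void" is wrapped
print(accumulate(text))  # all three keywords are wrapped
```

Accumulating works for this shortened list, but it would go wrong as soon as `class` is one of the keywords, because a later pass would also match the `class` attribute inside spans that were already inserted.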

Repeated re.sub calls won't do here either, because you'd get loops you definitely don't want: the markup you add contains <span class... and class is one of your keywords, so a later pass would wrap text you inserted yourself. You'd better build a whole new text:

def print_html():
    """Print html"""
    text = txt_edit.get(1.0, tk.END)
    
    new_text=[]
    for line in text.split("\n"):
        line_text = []
        for word in line.split(" "):
            if word in keywords:
                line_text.append('<span class="keyword">' + word + '</span>')
            else:
                line_text.append(word)
        new_text.append(' '.join(line_text))
    new_text = '\n'.join(new_text)
            
    print(new_text)
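Splitting on spaces as in the loop above only catches keywords that are surrounded by spaces, so something like `return;` would be missed. As an alternative, here is a hedged sketch that still uses `re`, but in a single pass: one pattern with all keywords and `\b` word boundaries. A single `sub` call scans the original string only once and never rescans the inserted markup, so the `class` attribute is not wrapped again. The names `KEYWORD_RE` and `highlight` are assumptions for this example, and keywords inside string literals or comments would still be highlighted.

```python
import re

keywords = ["public", "class", "static", "void", "int", "if", "else", "return"]

# One alternation of all keywords; \b keeps "int" from matching inside "println"
KEYWORD_RE = re.compile(r"\b(" + "|".join(re.escape(k) for k in keywords) + r")\b")

def highlight(source):
    """Wrap every whole-word keyword in a span, in one pass over `source`."""
    return KEYWORD_RE.sub(r'<span class="keyword">\1</span>', source)

java_source = (
    "public class TestMax {\n"
    "    public static void main(String args[]){\n"
    '        System.out.println("Hello World");\n'
    "    }\n"
    "}"
)

print(highlight(java_source))
```

Inside print_html, this would replace the nested loops with something like `print(highlight(txt_edit.get(1.0, tk.END)))`.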