there is a big file for translation. It has elements that need to be translated. They are labeled "msgid". the desired text is enclosed between "msgid" and "msgstr". The problem is that I don't know how to parse text where the desired content is on multiple lines.
I need to parse so that all spaces and indents of the text are preserved. Next, I want to enter them into a dictionary for output to a table.
I tried to do it manually - it didn't work. Now I think I need regular expressions, but don't know how to do it Help, please
#: superset-frontend/src/explore/components/controls/DndColumnSelectControl/Option.tsx:68
#: superset-frontend/src/explore/components/controls/OptionControls/index.tsx:323
msgid ""
"\n"
" This filter was inherited from the dashboard's context.\n"
" It won't be saved when saving the chart.\n"
" "
msgstr ""
"\n"
" Фильтр был наследован из контекста дашборда.\n"
" Это не будет сохранено при сохранении графика.\n"
" "
#: superset/tasks/schedules.py:184
#, python-format
msgid ""
"\n"
" <b><a href=\"%(url)s\">Explore in Superset</a></b><p></p>\n"
" <img src=\"cid:%(msgid)s\">\n"
" "
msgstr ""
"\n"
" <b><a href=“%(url)s”>Исследовать в Superset</a></b><p></p>\n"
" <img src=“cid:%(msgid)s”>\n"
" "
#: superset/reports/notifications/email.py:60
here is my python code
def Parse(file : io.TextIOWrapper):
text = ""
for lines in file:
if lines.startswith("msgid"):
text = f" {lines.strip()}"
elif lines.startswith("msgstr"):
text = f" {lines.strip()}"
else:
text = lines.strip()
text.split()
sourcetxt = {}
for index, word in enumerate(text):
if word = "msgid":
CodePudding user response:
Consider:
pairs = []
for line in ......
line = line.strip()
if not line or line.startswith('#'):
continue
if line.startswith('msgid'):
pairs.append([[], None])
elif line.startswith('msgstr'):
pairs[-1][1] = []
elif pairs[-1][1] is None:
pairs[-1][0].append(line)
else:
pairs[-1][1].append(line)
This creates a list of [msgid, msgstr]
pairs, where each part is a list of lines. In this form, it's easy to convert it to whatever is desired.