I have a column in a dataframe with text. The text has many dates, I want to add a newline before and after the date. I have tried with regex, I have found all the dates and replaced them with '\n' date '\n'. The problem is that now, in Excel, when I double-click on the cell containing (to expend it) and then click outside of it - Excel automatically wraps text in that cell.
So i thought that maybe the '\n' would be the issue and that maybe there is another character that Excel would interpret as ALT Enter. (I have seen lots of similar questions but they mostly give either the '\n' or wrap text answer).
df example:
{'type’': {4: 'a', 5: 'a', 6: 'a', 7:'a'},
'development': {4: "28/03. some text goes here. 08/04. some text goes here 06/04. some text goes here - 02/04. some text goes here.",
5: nan,
6: nan
7: "23/06. some text goes here. 08/04. some text goes here 06/04. some text goes here - 02/04. some text goes here."}}
What I have done is this (which I'm sure is quite a slow way of doing it, maybe it would be better with re.sub or with pandas string methods, but the df is small enough and this does the job!)
new_val = []
val = df['development'].tolist()
for s in val:
ms = re.findall('\d /\d ', str(s))
for m in ms:
s = s.replace(m, '\n' m '\n')
new_val.append(s)
df['development'] = new_val
CodePudding user response:
When you export a DataFrame
to Excel, the WrapText
property of the cells is set to False
by default (at least with openpyxl engine). So you see your text in Excel without breaks when open it first time.
But if only you activate edit cell mode in Excel, you force Excel to read the text. During this process, it handles line breaks (ASCII code 10) as a manual wrapping operator. That's why WrapText
property is turned on automatically after that.
As a workaround, you could adapt the SheetChange
event of a Workbook, or Change
event of a sheet. For example:
Sub Workbook_SheetChange(ByVal Sh As Object, ByVal Target As Range)
Target.WrapText = False
End Sub
Another option is to fix the row height. This won't prevent wrapping, but will keep the original layout. You can do it manually or programmatically.
P.S. About string processing. Try this:
df['development'].str.replace(r'\d /\d ', '\n\\g<0>\n', regex=True)
\g<0>
will put in the whole match during the replacement. See the docs for details.