I need to format forecast period columns so that I can later merge with another data frame.
The columns of my data frame are:
current_cols = [
'01 11',
'02 10',
'03 09',
'04 08',
'05 07',
'06 06',
'07 05',
'08 04',
'09 03',
'10 02',
'11 01'
]
desired_out = [
'1 11',
'2 10',
'3 9',
'4 8',
'5 7',
'6 6',
'7 5',
'8 4',
'9 3',
'10 2',
'11 1'
]
Originally, I tried to split each element with split(' ') and use lstrip('0') on each part, then recombine the parts of each tuple with ' ' in between.
Is there a better approach? I'm having trouble combining the elements of the tuples back together with ' ' in between. Help would be much appreciated.
CodePudding user response:
current_cols =['01 11','02 10','03 09','04 08','05 07','06 06','07 05','08 04','09 03','10 02','11 01']
desired_out = []
for item in current_cols:
    if item[0] == "0":
        item = item[1:]
    if " 0" in item:
        item = item.replace(' 0', ' ')
    desired_out.append(item)
CodePudding user response:
You can use the re module for the task:
import re
# Match one or more zeros at the start of each number
pat = re.compile(r"\b0+")
out = [pat.sub(r"", s) for s in current_cols]
print(out)
Prints:
[
"1 11",
"2 10",
"3 9",
"4 8",
"5 7",
"6 6",
"7 5",
"8 4",
"9 3",
"10 2",
"11 1",
]
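To see why the word boundary matters, here is a small sketch. It assumes the pattern is meant to be `\b0+` (one or more zeros at the start of a number), and the sample labels are made up for illustration:

```python
import re

# Assumed pattern: one or more zeros that begin at a word boundary.
pat = re.compile(r"\b0+")

samples = ["01 11", "10 02", "100 007"]  # illustrative labels only
cleaned = [pat.sub("", s) for s in samples]
print(cleaned)  # ['1 11', '10 2', '100 7']

# Variant with a lookahead, so a value made only of zeros keeps one digit.
pat_keep_zero = re.compile(r"\b0+(?=\d)")
print(pat_keep_zero.sub("", "00 05"))  # 0 5
```

Zeros inside or at the end of a number (as in '10' or '100') are left alone because no word boundary precedes them. With the plain `\b0+` pattern, a value consisting only of zeros would be removed entirely, which is what the lookahead variant guards against.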
CodePudding user response:
To format the columns of your data frame, you can use the split and join methods on strings. Here is an example of how you could do this:
# Split each column on ' '
split_columns = [col.split(' ') for col in current_cols]
# Remove the leading zeros from both elements of each split column
split_columns = [(el[0].lstrip('0'), el[1].lstrip('0')) for el in split_columns]
# Join the first and second elements of each split column with ' '
desired_out = [' '.join(el) for el in split_columns]
This code first splits each column on the ' ' character and stores the resulting lists in split_columns. It then strips the leading zeros from both elements of each pair, which is needed because values such as '03 09' have a zero in the second half as well. Finally, it joins the two elements of each pair with ' ' and stores the resulting column names in desired_out.
Recombining the pairs needs no manual unpacking, because ' '.join() accepts each tuple directly. The code stays easy to read, as it uses only simple list comprehensions and string methods.
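One caveat, sketched under the assumption that a period could ever be all zeros (the question's data has no such value): lstrip('0') leaves an empty string in that case, whereas int() keeps a single 0. The helper names below are hypothetical.

```python
def strip_with_lstrip(label):
    # Strip leading zeros from every space-separated part.
    return ' '.join(part.lstrip('0') for part in label.split(' '))

def strip_with_int(label):
    # Round-trip each part through int(), which keeps a lone zero.
    return ' '.join(str(int(part)) for part in label.split(' '))

print(strip_with_lstrip('03 09'))        # 3 9
print(strip_with_int('03 09'))           # 3 9
print(repr(strip_with_lstrip('00 12')))  # ' 12'
print(repr(strip_with_int('00 12')))     # '0 12'
```

For the columns in the question both helpers give the same result, so the choice only matters if a zero period can occur.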
CodePudding user response:
You can do it with nested comprehensions, conversion to int(), and formatting using an f-string:
current_cols = [
'01 11',
'02 10',
'03 09',
'04 08',
'05 07',
'06 06',
'07 05',
'08 04',
'09 03',
'10 02',
'11 01'
]
desired_out = [
    f'{int(a)} {int(b)}' for (a, b) in [
        e.split(' ') for e in current_cols
    ]
]
The code above will set desired_out to:
['1 11', '2 10', '3 9', '4 8', '5 7', '6 6', '7 5', '8 4', '9 3', '10 2', '11 1']
This method implements your original idea: split each element on the ' ' separator, strip the leading zeros from each half of the pair (done by the int() conversion inside the f-string), and combine the halves back with a ' ' in between (also via the f-string).
The inner comprehension walks each element of the list and splits it on ' '. The outer comprehension converts each half of each pair with int() to get rid of the leading zeros.
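Since the goal is to rename data frame columns before a merge, the same idea can be wrapped in a small function and turned into an old-to-new mapping. This is only a sketch: normalize_period and rename_map are hypothetical names, and the mapping is the kind of dict that pandas' DataFrame.rename(columns=...) accepts.

```python
def normalize_period(label):
    """Drop leading zeros from each space-separated number in a label."""
    return ' '.join(str(int(part)) for part in label.split())

current_cols = ['01 11', '02 10', '10 02', '11 01']  # shortened sample

# Map each old column name to its cleaned-up version.
rename_map = {col: normalize_period(col) for col in current_cols}
print(rename_map)
# {'01 11': '1 11', '02 10': '2 10', '10 02': '10 2', '11 01': '11 1'}
```

With a data frame df, something like df = df.rename(columns=rename_map) would apply it; columns that are not in the mapping are left unchanged.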