I have a pandas DataFrame with a column of ranges that I would like to expand into individual rows. Is this possible? Essentially I want to 'unbin' the ranges while keeping the other associated columns, even though they will contain the same data. Example below.
Row 1 - Ranges (Ex. 100-105)
Row 2-3 - Specific data associated with everything in that range.
I would like to turn this into individual rows.
100 - Associated data columns
101 - Associated data columns
102 - Associated data columns
103 - Associated data columns
104 - Associated data columns
105 - Associated data columns
CodePudding user response:
Let's assume this is your dataframe, where the range is a string type:
import pandas as pd

df0 = pd.DataFrame(
    [
        {
            "range": "100-105",
            "value": "Specific data associated with everything in that range.",
        }
    ]
)
Then you can iterate over all rows and construct a new DataFrame, parsing each string into an actual integer range:
result = []
for ix, row in df0.iterrows():
    # Parse "100-105" into its two bounds
    start, stop = row["range"].split("-")
    # + 1 because the upper bound is inclusive
    for range_ in range(int(start), int(stop) + 1):
        result.append({"range": range_, "value": row["value"]})
df = pd.DataFrame(result)
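The same loop can be wrapped in a small reusable function that carries over every other column automatically. This is only a sketch: the name `unbin_ranges` and its `column` parameter are made up for illustration, and it assumes every value in the column is a string of the form `start-stop` with non-negative integers and an inclusive upper bound.

```python
import pandas as pd


def unbin_ranges(df, column="range"):
    """Return a new DataFrame with one row per integer in each 'start-stop' string.

    Hypothetical helper; assumes inclusive bounds and non-negative integers.
    """
    rows = []
    for record in df.to_dict("records"):
        start, stop = record[column].split("-")
        for value in range(int(start), int(stop) + 1):
            # Copy all other columns unchanged and overwrite the range column.
            rows.append({**record, column: value})
    return pd.DataFrame(rows)


df0 = pd.DataFrame([{"range": "100-105", "value": "some data"}])
df = unbin_ranges(df0)
print(df)
```

Because each row is copied as a dict, any number of associated columns is kept without having to name them.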
CodePudding user response:
Split each range into two parts (start, stop), generate the values in between, and finally explode them into individual rows:
expand_range = lambda x: range(int(x[0]), int(x[1]) + 1)
out = df.assign(Range=df['Range'].str.split('-').map(expand_range)).explode('Range')
print(out)
Output:
>>> out
  Range  A
0   100  1
0   101  1
0   102  1
0   103  1
0   104  1
0   105  1
1   106  2
1   107  2
1   108  2
1   109  2
1   110  2
Input:
>>> df
     Range  A
0  100-105  1
1  106-110  2
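For reference, here is a self-contained version of the explode approach that rebuilds the input shown above. The `astype` and `reset_index` steps are additions, not part of the answer: `explode` repeats the original index labels and leaves the column with `object` dtype, so they are included on the assumption that an integer column and a fresh index are wanted. The helper also returns a list rather than a `range`, purely as a conservative choice.

```python
import pandas as pd

df = pd.DataFrame({"Range": ["100-105", "106-110"], "A": [1, 2]})


def expand_range(parts):
    # parts is the ["start", "stop"] list produced by str.split("-")
    start, stop = parts
    return list(range(int(start), int(stop) + 1))  # inclusive upper bound


out = (
    df.assign(Range=df["Range"].str.split("-").map(expand_range))
    .explode("Range")              # one row per value, other columns repeated
    .astype({"Range": int})        # explode leaves the column as object dtype
    .reset_index(drop=True)        # replace the repeated 0/1 index labels
)
print(out)
```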