i have a text file that looks like this:
Tag: liprod_Liosprod8_LIOS_12.3_NIGHT_linux8_hudson
Global path: /net/liosprod8.cvc-global.net/export/viewstore/liprod/liprod_Liosprod8_LIOS_12.3_NIGHT_linux8_hudson.vws
Server host: liosprod8.cvc-global.net
Region: cssall
Active: NO
View tag uuid:ccd335a4.fb8011eb.af37.00:50:56:bf:58:95
View on host: liosprod8.cvc-global.net
View server access path: /export/viewstore/liprod/liprod_Liosprod8_LIOS_12.3_NIGHT_linux8_hudson.vws
View uuid: ccd335a4.fb8011eb.af37.00:50:56:bf:58:95
View attributes: snapshot
View owner: tmn/liprod
Tag: liprod_Liosprod8_LIOS_DF3_NIGHT_linux8_hudson
Global path: /net/liosprod8.cvc-global.net/export/viewstore/liprod/liprod_Liosprod8_LIOS_DF3_NIGHT_linux8_hudson.vws
Server host: liosprod8.cvc-global.net
Region: cssall
Active: NO
View tag uuid:dc2ff6f7.fb8311eb.bb47.00:50:56:bf:58:95
View on host: liosprod8.cvc-global.net
View server access path: /export/viewstore/liprod/liprod_Liosprod8_LIOS_DF3_NIGHT_linux8_hudson.vws
View uuid: dc2ff6f7.fb8311eb.bb47.00:50:56:bf:58:95
View attributes: snapshot
View owner: tmn/liprod
I want to make an excel to have the output like this:
The output should have put the datas from the lines into correct columns, and when he find new "Tag" to put in on a new line.
example:
Tag GlobalPath ...
liprod_Liosprod8_LIOS_12.3_NIGHT_linux8_hudson /net/liosprod8.cvc-global.net/export/viewst...
liprod_Liosprod8_LIOS_DF3_NIGHT_linux8_hudson //net/liosprod8.cvc-global.net/export/v....
I have also made a script, but its not having the expected output: https://paste.pythondiscord.com/obulosuvuy
Maybe if you know another variant using pandas or you know what i did wrong in my script...
Ty!
CodePudding user response:
You made it overly complicated. If your file always have the same structure (records separated by an empty line and the same 11 fields for each record) you can just do something like that:
import pandas as pd
data=[]
with open ("input.txt") as file:
content=file.read()
records=content.split("\n\n")
for r in records:
line_data=[]
for line in r.splitlines():
value=line.split(":")[1].strip()
line_data.append(value)
data.append(line_data)
df=pd.DataFrame(data, columns=[
"Tag",
"Global path",
"Server host",
"Region",
"Active",
"View tag uuid",
"View on host",
"View server access path",
"View uuid",
"View attributes",
"View owner"
])
df.to_excel("output.xlsx", index=False)
Basically what it does is separating the records by using .split("\n\n"), then for each line of them extract what is after ":" and put in in an appropriate data structure (a list of lists) that is used to build the DataFrame.
CodePudding user response:
You can use StringMethods.split
to split into 2 cols then pandas.DataFrame.pivot
to reshape :
Try this :
import pandas as pd
df = pd.read_csv('test.txt', header=None)
df[['col1','col2']] = df[0].str.split(pat=":", n=1, expand=True)
cols = df['col1'].str.strip().drop_duplicates().tolist()
out = (
df.pivot(columns="col1", values="col2")
.assign(Tag = lambda x: x['Tag'].ffill())
.groupby('Tag').sum()
.replace(0, 'N/A')
.reset_index()
.rename_axis(axis=1, mapper=None)
.rename(columns= lambda x: x.strip())
.reindex(columns=cols)
)
out.to_excel('your_excel_name.xlsx', index=False) #to save a spreadsheet
# Output :
print(out)
Tag Global path Server host Region Active View tag uuid View on host View server access path View uuid View attributes View owner
0 liprod_Liosprod8_LIOS_12.3_NIGHT_linux8_hudson /net/liosprod8.cvc-global.net/export/viewstore/liprod/liprod_Liosprod8_LIOS_12.3_NIGHT_linux8_hudson.vws liosprod8.cvc-global.net cssall NO ccd335a4.fb8011eb.af37.00:50:56:bf:58:95 liosprod8.cvc-global.net /export/viewstore/liprod/liprod_Liosprod8_LIOS_12.3_NIGHT_linux8_hudson.vws ccd335a4.fb8011eb.af37.00:50:56:bf:58:95 snapshot tmn/liprod
1 liprod_Liosprod8_LIOS_DF3_NIGHT_linux8_hudson /net/liosprod8.cvc-global.net/export/viewstore/liprod/liprod_Liosprod8_LIOS_DF3_NIGHT_linux8_hudson.vws liosprod8.cvc-global.net cssall NO dc2ff6f7.fb8311eb.bb47.00:50:56:bf:58:95 liosprod8.cvc-global.net /export/viewstore/liprod/liprod_Liosprod8_LIOS_DF3_NIGHT_linux8_hudson.vws dc2ff6f7.fb8311eb.bb47.00:50:56:bf:58:95 snapshot tmn/liprod
2 liprod_simpleSlave_LIOS_86_LATEST_MNT_hudson_dynamic_view /net/rhliprod.cvc-global.net/export/viewstore/liprod_simpleSlave_LIOS_86_LATEST_MNT_hudson_dynamic_view rhliprod.cvc-global.net cssall NO de5ddadb.3e0011ed.91ef.00:0c:29:84:11:a7 rhliprod.cvc-global.net /export/viewstore/liprod_simpleSlave_LIOS_86_LATEST_MNT_hudson_dynamic_view de5ddadb.3e0011ed.91ef.00:0c:29:84:11:a7 N/A tmn/liprod