I have a txt
file that I would like to read and export in csv
format, but i have one problem.
Data in txt file looks like this |
Sales Organization|Distribution Channel|Sold-To #|Sold-To Name |Ship-To #|Ship-To Name |Mark-For #|Mark-For Name|Z1 : Sales Rep|Z1 : Sales Rep (Name)|Order Number|Sales Doc Type|Order Reason|PO Number|PO Type|Header Department|Delivery Block (H)|Billing Block (H)|Doc Date |RDD (H) |Cancel Date (H)|RDD (L) |Cancel date (L)|Division|Plant|Material |Sales Doc Item|Size |Schedule Line|Size Confirm Date|Item Category|Rej.Reason (SL)|Order Qty (SL)|Confirmed Qty (SL)|Unconfirmed Qty (SL)|Cancelled Qty (SL)|Open Qty (SL)|Reserved Qty (SL)|Fixed Qty (SL)|% Allocation (SL)|Delivered Qty (SL)|PGI Qty (SL)|Invoiced Qty (SL)|Net Unit Price|Confirmed Net Value (SL)|Dollars Shipped (SL)|Currency|% Shipped/Allocated (SL)|Delivery Block (SL)|Sales UOM|Credit Limit Status Text |EAN/UPC |Customer Material|
| EU01 |10 |10026276 | EU SARL|20056417 |Fulfillmemt Poland| | | | |1805338693 |ZOR |ZST |86LRD5JM |EDI | | | |14.02.2022|14.02.2022|03.03.2022 |14.02.2022|03.03.2022 |20 |3045 |35524-0004| 410|36 32| 1|14.02.2022 |ZTAN | | 1,000 | 1,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 1,000 | 0,000 | 0,000 | 41,600 | 41,600 | 0,000 |EUR | 100,000 | |EA |Credit check was executed, document OK|5400898540995| |
| EU01 |10 |10026276 | EU SARL|20056417 | Fulfillmemt Poland| | | | |1805338693 |ZOR |ZST |86LRD5JM |EDI | | | |14.02.2022|14.02.2022|03.03.2022 |14.02.2022|03.03.2022 |20 |3045 |35524-0004| 410|33 34| 2|14.02.2022 |ZTAN | | 1,000 | 1,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 1,000 | 0,000 | 0,000 | 41,600 | 41,600 | 0,000 |EUR | 100,000 | |EA |Credit check was executed, document OK|5400898540926| |
| EU01 |10 |10026276 | EU SARL|20056417 | Fulfillmemt Poland| | | | |1805338693 |ZOR |ZST |86LRD5JM |EDI | | | |14.02.2022|14.02.2022|03.03.2022 |14.02.2022|03.03.2022 |20 |3045 |35524-0004| 410|32 32| 3|14.02.2022 |ZTAN |P6 | 2,000 | 0,000 | 0,000 | 2,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 41,600 | 0,000 | 0,000 |EUR | 0,000 | |EA |Credit check was executed, document OK|5400898508124| |
| EU01 |10 |10026276 | EU SARL|20056417 | Fulfillmemt Poland| | | | |1805338693 |ZOR |ZST |86LRD5JM |EDI | | | |14.02.2022|14.02.2022|03.03.2022 |14.02.2022|03.03.2022 |20 |3045 |85862-0041| 530|29 - | 1|14.02.2022 |ZTAN |P6 | 1,000 | 0,000 | 0,000 | 1,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 21,100 | 0,000 | 0,000 |EUR | 0,000 | |EA |Credit check was executed, document OK|5400970111273| |
| EU01 |10 |10026276 | EU SARL|20056417 | Fulfillmemt Poland| | |72646 |John, Smith|1805339436 |ZOR |ZST |4QRNXHPH |EDI | | | |14.02.2022|14.02.2022|04.03.2022 |14.02.2022|04.03.2022 |10 |3045 |00501-3199| 10|36 34| 1|14.02.2022 |ZTAN |X9 | 17,000 | 0,000 | 0,000 | 17,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 47,800 | 0,000 | 0,000 |EUR | 0,000 | |EA |Credit check was executed, document OK|5400970332180| |
| EU01 |10 |10026276 | EU SARL|20056417 | Fulfillmemt Poland| | |72646 |John, Smith |1805339436 |ZOR |ZST |4QRNXHPH |EDI | | | |14.02.2022|14.02.2022|04.03.2022 |14.02.2022|04.03.2022 |10 |3045 |04511-4432| 20|40 32| 1|14.02.2022 |ZTAN |J2 | 2,000 | 0,000 | 0,000 | 2,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 41,300 | 0,000 | 0,000 |EUR | 0,000 | |EA |Credit check was executed, document OK|5400898076951| |
| EU01 |10 |10026276 |EU SARL|20056417 | Fulfillmemt Poland| | |72646 |John, Smith |1805339436 |ZOR |ZST |4QRNXHPH |EDI | | | |14.02.2022|14.02.2022|04.03.2022 |14.02.2022|04.03.2022 |10 |3045 |04511-5115| 30|36 32| 1|14.02.2022 |ZTAN |P6 | 5,000 | 0,000 | 0,000 | 5,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 47,800 | 0,000 | 0,000 |EUR | 0,000 | |EA |Credit check was executed, document OK|5400970262012| |
| EU01 |10 |10026276 | EU SARL|20056417 | Fulfillmemt Poland| | |72646 |John, Smith |1805339436 |ZOR |ZST |4QRNXHPH |EDI | | | |14.02.2022|14.02.2022|04.03.2022 |14.02.2022|04.03.2022 |10 |3045 |04511-5155| 40|28 30| 1|14.02.2022 |ZTAN |X9 | 1,000 | 0,000 | 0,000 | 1,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 0,000 | 56,500 | 0,000 | 0,000 |EUR | 0,000 | |EA |Credit check was executed, document OK|5400970254963| |
| EU01 |10 |10026276 | EU SARL|20056417 | Fulfillmemt Poland| | |72646
and I would like to show each columns separately in csv
file. Now as you see columns are separated by |
. As an example - Sales Organization
should be a header and EU01
should be its value and so on.
df =pd.read_csv('1.txt', sep='delimiter', header= None, engine='python')
df =df.iloc[3:]
df.to_csv(path '123.csv', index=False, header=True)
CodePudding user response:
Well, Heres what i did:
Let's call this file, data.txt
. I add an additional '|' Before Sales Organisation
Now I strip white space off it:
with open('data.txt', 'r') as f:
lines = f.readlines()
Stripped = [line.replace(' ', '') for line in lines]
with open('data.txt', 'w') as f:
f.writelines(Stripped)
we then get a clean-looking data.txt:
|SalesOrganization|DistributionChannel|Sold-To|Sold-ToName|Ship-To|Ship-ToName|Mark-For|Mark-ForName|Z1:SalesRep|Z1:SalesRep(Name)|OrderNumber|SalesDocType|OrderReason|PONumber|POType|HeaderDepartment|DeliveryBlock(H)|BillingBlock(H)|DocDate|RDD(H)|CancelDate(H)|RDD(L)|Canceldate(L)|Division|Plant|Material|SalesDocItem|Size|ScheduleLine|SizeConfirmDate|ItemCategory|Rej.Reason(SL)|OrderQty(SL)|ConfirmedQty(SL)|UnconfirmedQty(SL)|CancelledQty(SL)|OpenQty(SL)|ReservedQty(SL)|FixedQty(SL)|%Allocation(SL)|DeliveredQty(SL)|PGIQty(SL)|InvoicedQty(SL)|NetUnitPrice|ConfirmedNetValue(SL)|DollarsShipped(SL)|Currency|%Shipped/Allocated(SL)|DeliveryBlock(SL)|SalesUOM|CreditLimitStatusText|EAN/UPC|CustomerMaterial|
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||||1805338693|ZOR|ZST|86LRD5JM|EDI||||14.02.2022|14.02.2022|03.03.2022|14.02.2022|03.03.2022|20|3045|35524-0004|410|3632|1|14.02.2022|ZTAN||1,000|1,000|0,000|0,000|0,000|0,000|0,000|0,000|1,000|0,000|0,000|41,600|41,600|0,000|EUR|100,000||EA|Creditcheckwasexecuted,documentOK|5400898540995||
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||||1805338693|ZOR|ZST|86LRD5JM|EDI||||14.02.2022|14.02.2022|03.03.2022|14.02.2022|03.03.2022|20|3045|35524-0004|410|3334|2|14.02.2022|ZTAN||1,000|1,000|0,000|0,000|0,000|0,000|0,000|0,000|1,000|0,000|0,000|41,600|41,600|0,000|EUR|100,000||EA|Creditcheckwasexecuted,documentOK|5400898540926||
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||||1805338693|ZOR|ZST|86LRD5JM|EDI||||14.02.2022|14.02.2022|03.03.2022|14.02.2022|03.03.2022|20|3045|35524-0004|410|3232|3|14.02.2022|ZTAN|P6|2,000|0,000|0,000|2,000|0,000|0,000|0,000|0,000|0,000|0,000|0,000|41,600|0,000|0,000|EUR|0,000||EA|Creditcheckwasexecuted,documentOK|5400898508124||
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||||1805338693|ZOR|ZST|86LRD5JM|EDI||||14.02.2022|14.02.2022|03.03.2022|14.02.2022|03.03.2022|20|3045|85862-0041|530|29-|1|14.02.2022|ZTAN|P6|1,000|0,000|0,000|1,000|0,000|0,000|0,000|0,000|0,000|0,000|0,000|21,100|0,000|0,000|EUR|0,000||EA|Creditcheckwasexecuted,documentOK|5400970111273||
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||72646|John,Smith|1805339436|ZOR|ZST|4QRNXHPH|EDI||||14.02.2022|14.02.2022|04.03.2022|14.02.2022|04.03.2022|10|3045|00501-3199|10|3634|1|14.02.2022|ZTAN|X9|17,000|0,000|0,000|17,000|0,000|0,000|0,000|0,000|0,000|0,000|0,000|47,800|0,000|0,000|EUR|0,000||EA|Creditcheckwasexecuted,documentOK|5400970332180||
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||72646|John,Smith|1805339436|ZOR|ZST|4QRNXHPH|EDI||||14.02.2022|14.02.2022|04.03.2022|14.02.2022|04.03.2022|10|3045|04511-4432|20|4032|1|14.02.2022|ZTAN|J2|2,000|0,000|0,000|2,000|0,000|0,000|0,000|0,000|0,000|0,000|0,000|41,300|0,000|0,000|EUR|0,000||EA|Creditcheckwasexecuted,documentOK|5400898076951||
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||72646|John,Smith|1805339436|ZOR|ZST|4QRNXHPH|EDI||||14.02.2022|14.02.2022|04.03.2022|14.02.2022|04.03.2022|10|3045|04511-5115|30|3632|1|14.02.2022|ZTAN|P6|5,000|0,000|0,000|5,000|0,000|0,000|0,000|0,000|0,000|0,000|0,000|47,800|0,000|0,000|EUR|0,000||EA|Creditcheckwasexecuted,documentOK|5400970262012||
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||72646|John,Smith|1805339436|ZOR|ZST|4QRNXHPH|EDI||||14.02.2022|14.02.2022|04.03.2022|14.02.2022|04.03.2022|10|3045|04511-5155|40|2830|1|14.02.2022|ZTAN|X9|1,000|0,000|0,000|1,000|0,000|0,000|0,000|0,000|0,000|0,000|0,000|56,500|0,000|0,000|EUR|0,000||EA|Creditcheckwasexecuted,documentOK|5400970254963||
|EU01|10|10026276|EUSARL|20056417|FulfillmemtPoland|||72646
Now I simply read it into pandas and drop the first column:
df = pd.read_csv('data.txt', sep='|' , engine='python')
df = df.drop(['Unnamed: 0'], axis = 1)
Heres the output!
SalesOrganization DistributionChannel ... CustomerMaterial Unnamed: 54
0 EU01 10 ... NaN NaN
1 EU01 10 ... NaN NaN
2 EU01 10 ... NaN NaN
3 EU01 10 ... NaN NaN
4 EU01 10 ... NaN NaN
5 EU01 10 ... NaN NaN
6 EU01 10 ... NaN NaN
7 EU01 10 ... NaN NaN
8 EU01 10 ... NaN NaN
You can now convert it to a CSV, if you like:
df.to_csv('data.csv', index=False)
CodePudding user response:
In your provided sample txt file. You have an |
at the start and end of every row. So you need remove that before you read csv. Otherwise, it will give you ParseError. Once this is fixed, you can use sep='|'
Like this:
df =pd.read_csv('1.txt', sep='|', header= None)