I have a CSV file, test.csv, with 4 columns:
A | B | C | D
======================
aed | etge | 3r4 | pu9
frt | eide | 9h4 | sd2
jey | edlr | 8d2 | bu6
Using Python, I would like to append column B under column A and column D under column C, so that I end up with the following:
A | C
===========
aed | 3r4
frt | 9h4
jey | 8d2
etge | pu9
eide | sd2
edlr | bu6
CodePudding user response:
I would recommend using pandas for this.
Try something like this:
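import pandas as pd

# Placeholder data (not the values from the question).
dataFrame = pd.DataFrame({
    "A": ["aed", "etge", "3r4"],
    "B": ["aed", "etawge", "3r4"],
    "C": ["aed", "etgase", "3r4"],
    "D": ["aed", "etgqee", "3r4"],
})
# Stack B under A and D under C. ignore_index=True renumbers the rows 0..5
# so the two stacked columns line up when they are joined side by side.
AB = pd.concat([dataFrame["A"], dataFrame["B"]], ignore_index=True)
CD = pd.concat([dataFrame["C"], dataFrame["D"]], ignore_index=True)
final_dataFrame = pd.concat([AB, CD], axis=1)
final_dataFrame.columns = ["A", "C"]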
I didn't use the exact same data that you have, but this shows the approach. You can use pandas.read_csv to read a CSV file.
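For reference, here is a minimal sketch of the same idea using the sample rows from the question, typed in by hand rather than read from test.csv. The names df and stacked are just illustrative.

```python
import pandas as pd

# The sample rows from the question, entered manually (not read from test.csv).
df = pd.DataFrame({
    "A": ["aed", "frt", "jey"],
    "B": ["etge", "eide", "edlr"],
    "C": ["3r4", "9h4", "8d2"],
    "D": ["pu9", "sd2", "bu6"],
})

# Stack B under A and D under C; ignore_index=True renumbers the rows 0..5.
stacked = pd.DataFrame({
    "A": pd.concat([df["A"], df["B"]], ignore_index=True),
    "C": pd.concat([df["C"], df["D"]], ignore_index=True),
})
print(stacked)
```

If you need the result back on disk, stacked.to_csv("out.csv", index=False) should write it out; the output file name here is only an example.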
Edit: If you want to read from the file directly, you will first have to change the file so that it no longer contains the "=====" line. It should look like this:
A | B | C | D
aed | etge | 3r4 | pu9
frt | eide | 9h4 | sd2
jey | edlr | 8d2 | bu6
Once that is done, do something like this:
# read the file. If test.csv is not in the same folder, then you have to give the complete file path.
dataFrame = pd.read_csv("test.csv", sep="|")
# remove unnecessary white spaces.
dataFrame = dataFrame.apply(lambda x: x.str.strip() if x.dtype == "object" else x)
# create a new column by combining column 0 and 1.
AB = pd.melt(dataFrame.iloc[:, [0, 1]])["value"]
# create a new column by combining column 2 and 3.
CD = pd.melt(dataFrame.iloc[:, [2, 3]])["value"]
# combine the previous two columns
final_dataFrame = pd.concat([AB, CD], axis=1)
# give them names "A" and "C"
final_dataFrame.columns = ["A", "C"]
print(final_dataFrame)
If you are not worried about readability, you can combine the different steps like this:
dataFrame = pd.read_csv("test.csv", sep="|").apply(lambda x: x.str.strip() if x.dtype == "object" else x)
final_dataFrame = pd.concat([pd.melt(dataFrame.iloc[:, [0, 1]])["value"], pd.melt(dataFrame.iloc[:, [2, 3]])["value"]], axis=1)
final_dataFrame.columns = ["A", "C"]
print(final_dataFrame)
This gives the result:
      A    C
0   aed  3r4
1   frt  9h4
2   jey  8d2
3  etge  pu9
4  eide  sd2
5  edlr  bu6
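If editing the file by hand is not convenient, one possible alternative is to let read_csv skip the "=====" line itself. The sketch below assumes the separator is always the second line of the file and that the columns are delimited by "|" with optional spaces around it. It uses io.StringIO as a stand-in for test.csv so it can be run as-is; with a real file you would pass the path instead.

```python
import io
import pandas as pd

# Stand-in for test.csv, including the "=====" separator row from the question.
raw = """A | B | C | D
======================
aed | etge | 3r4 | pu9
frt | eide | 9h4 | sd2
jey | edlr | 8d2 | bu6
"""

# skiprows=[1] drops the separator line (the second line of the file).
# The regex separator also swallows the spaces around each "|",
# so neither the headers nor the values need stripping afterwards.
df = pd.read_csv(io.StringIO(raw), sep=r"\s*\|\s*", engine="python", skiprows=[1])

# Stack B under A and D under C, renumbering the rows 0..5.
result = pd.DataFrame({
    "A": pd.concat([df["A"], df["B"]], ignore_index=True),
    "C": pd.concat([df["C"], df["D"]], ignore_index=True),
})
print(result)
```

Because the headers come out clean here, the columns can be selected by name instead of by position with iloc.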