I'm trying to write the first column of an xlsx file out to a txt file as input for another tool. I have this code that works, but I can't seem to get rid of the leading spaces in front of every entry. Here is my code.
import pandas as pd
import sys
df = pd.read_excel(sys.argv[1], sheet_name='Sheet1', usecols="A", header=0)
my_file = open('hosts.txt', 'w')
for data in df.columns:
    # df[data].str.lstrip()
    my_file.write(df.replace(r'^\s*$', regex=True)[data].to_string(index=False) + '\n')
my_file.close()
CodePudding user response:
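import pandas as pd
# Read only Excel column A; the first row is used as the header
df = pd.read_excel('one_row.xlsx', sheet_name='Sheet1', usecols="A", header=0)
# Method 1
# usecols="A" selects the Excel column by letter, but the DataFrame column is
# named after the header cell, so refer to it by position rather than df['A']
col = df.columns[0]
df[col] = df[col].str.strip()
# Write to a txt file with df.to_csv(); pass header=False to omit the column name
df.to_csv('one_row.txt', sep='\t', index=False)
# Method 2
data_list = df.iloc[:, 0].dropna().to_list()  # Convert to a list, skipping empty cells
data_no_space_list = [str(i).strip() for i in data_list]  # Strip surrounding whitespace
data_str = '\n'.join(data_no_space_list)  # One entry per line
with open('one_row2.txt', 'w') as f:
    f.write(data_str)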
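As far as I can tell, the spaces in the question come from `to_string()`, which pads the values so they line up, and not from the spreadsheet itself. Joining the values yourself avoids that padding. Here is a minimal sketch of method 2 as a reusable function. The name `column_to_text` and the sample host names are made up for illustration; in your script the Series would be `df.iloc[:, 0]`.

```python
import pandas as pd

def column_to_text(series: pd.Series) -> str:
    """Join the non-empty values of a column, one per line, without padding."""
    # Drop empty cells, convert everything to text, trim surrounding whitespace
    values = series.dropna().astype(str).str.strip()
    # Drop cells that contained only whitespace
    values = values[values != ""]
    return "\n".join(values)

# Hypothetical sample data standing in for the first column of the spreadsheet
hosts = pd.Series(["  alpha.example.com", "b.example.com ", None, "   "])
text = column_to_text(hosts)
print(text)
```

You can then write `text` to `hosts.txt` with a normal `with open(...)` block as in method 2.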
CodePudding user response:
Since you didn't really provide any data, I made a sample dataframe to work from. This iterates through your columns and gets rid of the leading space in front of each value, if there is one.
import pandas as pd
df = pd.DataFrame({
    'Column1': [' A', ' B', 'C', ' D'],
    'Column2': ['Word', ' Another Word', 'Third Word', ' Space Word']
})
# Strip leading spaces from every (string) column
for x in df.columns:
    df[x] = df[x].str.lstrip(' ')
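One caveat: the `.str` accessor raises an AttributeError on columns that don't hold strings, so the loop above assumes every column is text. If your sheet mixes text and numbers, a sketch like the one below may be safer. It only touches cells that are actually strings. The helper name `lstrip_strings` and the `Count` column are just assumptions for the example.

```python
import pandas as pd

def lstrip_strings(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with leading whitespace removed from every string cell."""
    out = df.copy()
    for col in out.columns:
        # Leave numbers, NaN and other non-string values untouched
        out[col] = out[col].map(lambda v: v.lstrip() if isinstance(v, str) else v)
    return out

# Hypothetical mixed-type frame: one text column, one numeric column
mixed = pd.DataFrame({
    "Column1": [" A", " B", "C"],
    "Count": [1, 2, 3],
})
cleaned = lstrip_strings(mixed)
print(cleaned)
```

It works on a copy, so the original dataframe keeps its values if you need to compare before and after.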