I have an Excel file with several columns, and each column holds values to be searched for in a database.
I want to read this file (I am using pandas because it's a very simple way to read Excel files) and extract the information into variables:
Desired information extracted from each row:
Company : Ebay (STR format)
company_name_for_search : [EBAY, eBay, Ebay] (list of strings)
company_register: [4722,4721] (list of ints)
With this information, I will run a search script. Some values must be lists because the script will run a search for every item inside the list (for loop).
When I read the Excel file, each column is read with the object dtype in the DataFrame, so I couldn't access the individual values inside each cell.
How can I split the values and convert them to the right types?
CodePudding user response:
Your variables are represented as single strings rather than rows of strings and numbers.
Instead of:
company_name | register |
---|---|
eBay | 4722 |
eBay | 4721 |
Amazon | 9999 |
You have:
company_name | register |
---|---|
ebay,ebay | 4722,4721 |
amazon | 9999 |
You can split each string on the commas and then explode the resulting Series of lists to get a long-form DataFrame with one value per row.
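import pandas as pd

mess = pd.DataFrame(
    {
        "letters": ["A,B", "C,D", "E,F,G,H"],
        "nums": ["100,200", "300,400", "500, 600, 700, 800"],
    }
)
# Split every cell on commas, then explode each column into one value per row.
# This assumes each row has the same number of items in every column.
mess = mess.apply(lambda col: col.str.split(",").explode())
# Strip stray whitespace (e.g. " 600") and convert the numbers to integers.
mess["letters"] = mess["letters"].str.strip()
mess["nums"] = mess["nums"].str.strip().astype(int)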
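If the search script needs one row per company with real Python lists, as described in the question, an alternative is to skip the explode step and convert each cell to a list in place. The following is only a minimal sketch: the DataFrame is a hypothetical stand-in for the result of `pd.read_excel`, the column names are taken from the question, and `to_str_list` and `to_int_list` are illustrative helpers, not pandas functions.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by pd.read_excel(...).
# Column names are assumptions based on the question.
df = pd.DataFrame(
    {
        "company": ["Ebay", "Amazon"],
        "company_name_for_search": ["EBAY, eBay, Ebay", "amazon"],
        "company_register": ["4722,4721", "9999"],
    }
)


def to_str_list(cell):
    """Split a comma-separated cell into a list of stripped strings."""
    return [part.strip() for part in str(cell).split(",") if part.strip()]


def to_int_list(cell):
    """Split a comma-separated cell into a list of ints."""
    return [int(part) for part in to_str_list(cell)]


# Replace each string cell with a list, keeping one row per company.
df["company_name_for_search"] = df["company_name_for_search"].map(to_str_list)
df["company_register"] = df["company_register"].map(to_int_list)

# Loop over rows, then over the items of each list.
for row in df.itertuples(index=False):
    for name in row.company_name_for_search:
        for register in row.company_register:
            # Placeholder for the real search call.
            print(row.company, name, register)
```

Cells that contain a single number may be read from Excel as numeric values rather than strings, so the helpers call `str()` first. Passing `dtype=str` to `pd.read_excel` is probably the simplest way to make every cell arrive as text before splitting.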