Home > front end >  Python split Excel into header and value
Python split Excel into header and value

Time:03-29

Is it possible to split a column into headers and values? They sort of look like Python dictionaries.

Say for example I import an excel sheet into python that looks like this:

Pets Drinks
Cat Water {'Food':'Taco', 'Dance': 'Ballet', 'Fruit':'Apple'}
Dog Soda {'Food':'Ramen', 'Dance': 'Breakdance', 'Fruit':'Orange'}

Could this in turn be made into:

Pets Drinks Food Dance Fruit
Cat Water Taco Ballet Apple
Dog Soda 'Ramen Breakdance Orange

If this isn't possible, what other ways could this unnamed column be organized? I hope this makes sense. Thanks!

CodePudding user response:

I think your best option to get the data in that column is probably pandas. (you have your choice of how to get the data from the column, pandas offers a variety of ways) https://pandas.pydata.org/docs/

once you have the data, you can turn it into a dictionary with json.loads(string) with the string being the value in the column. This will return a python dictionary.

If you end up going the pandas route you'd have to look more into what the best approach of turning the dict into columns would be, as looping a pandas dataframe is an antipattern.

CodePudding user response:

I think two ways you can do this is by using the json package in python, since it seems like json is similar to what you are trying to do in the unlabeled table cell. . Something like the following might work.

text = #your text to manipulate#
json_dictionary = json.loads(text)
food_type = json_dictionary['food']
dance_type = json_dictionary['dance']
fruit_type = json_dictionary['fruit']

Alternatively you can export the excel sheet as a .csv and use string manipulation to get the values you want that way. For this approach, I would ensure that the data would follow the following format for the snippet below to work.

pet:, dog, drinks:, soda, food:, taco etc.

with open ("yourcsvfile.csv",newline="") as f:
  info = f.readlines()
  for line in info:
    data = line.split(',')
    pet = data[1]
    drinks = data[3] 
...
    fruit = data[9]

If you use this second approach, the even indices of data will contain the labels. From here it is just a matter of doing things with your data. Pandas is a good choice for table manipulation.

  • Related