Creating a DataFrame with lists within dictionary as value

Time:05-09

I'm scraping data from a website that has multiple main categories, each containing multiple secondary categories. I got the scraping part done, but I am unsure how to store the data so that it is displayed properly once it's converted to a DataFrame object.

Here's a breakdown of the data that I have:

List of main categories -> List of subcategories -> List of links corresponding to that subcategory

categories = ['Cat1', 'Cat2', ...]
subcat = ['Subcat1', 'Subcat2', ...] etc

This is what the final output looks like once the data is scraped. My question is, how can I build a DataFrame so that it ends up like this:

Category1      Category2
Subcat1 Link1  Subcat1 Link1
Subcat2 Link2  Subcat2 Link2

I have thought of storing the data in a list of dictionaries, and within each dictionary a list of subcategories, but it's not displaying properly.

CodePudding user response:

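I think the best way to accomplish this is to use a hierarchical index (a pandas MultiIndex), for example with the categories as the top level and the subcategories as the second level. Please refer to https://pandas.pydata.org/docs/user_guide/advanced.html#hierarchical-indexing-multiindex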
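As a rough sketch of that idea (not necessarily the exact layout the question asks for), the scraped data could be kept as a nested dictionary and turned into a DataFrame whose columns form a two-level MultiIndex. The `scraped` dictionary, its link values, and the variable names below are made up for illustration; the real scraper output may be shaped differently.

```python
import pandas as pd

# Hypothetical scraped data: category -> subcategory -> list of links.
# All names and links here are placeholders, not real scraper output.
scraped = {
    "Cat1": {"Subcat1": ["link1", "link2"], "Subcat2": ["link3"]},
    "Cat2": {"Subcat1": ["link4"], "Subcat2": ["link5", "link6"]},
}

# Build one Series per (category, subcategory) pair. Tuple keys become
# the two levels of a column MultiIndex, and Series of unequal length
# are aligned on their index, so shorter ones are padded with NaN.
columns = {
    (category, subcategory): pd.Series(links, dtype="object")
    for category, subcategories in scraped.items()
    for subcategory, links in subcategories.items()
}
df = pd.DataFrame(columns)

print(df)

# Selecting a top-level key returns only that category's subcategories.
print(df["Cat2"])
```

Wrapping each list in a `pd.Series` is what lets subcategories hold different numbers of links. Passing plain lists of unequal length to the `DataFrame` constructor would raise an error instead.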
