Get multiple wiki tables and pass them to CSV

Time:11-11

How can I get the tables from this page and write them to a CSV file?

import pandas as pd 

# 'https://en.wikipedia.org/wiki/List_of_Linux_distributions'

url = 'https://en.wikipedia.org/wiki/List_of_Linux_distributions'
tables = pd.read_html(url)

df = tables

df.to_csv("linux_distros.csv", sep=",")

AttributeError: 'list' object has no attribute 'to_csv'

CodePudding user response:

read_html returns a list of DataFrames, one for each table found on the page, and a list has no to_csv method. You need to concatenate the relevant tables into a single DataFrame and then save that to a CSV file.

import pandas as pd

url = 'https://en.wikipedia.org/wiki/List_of_Linux_distributions'
tables = pd.read_html(url)

tables = [table for table in tables if ('Distribution' in table.columns and 'Description' in table.columns)]
# Concatenate all tables
df = pd.concat(tables, ignore_index=True)
# Save to CSV
df.to_csv("linux_distros.csv", sep=",")
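
If the tables don't share the same columns, concatenating them may not make sense, and writing each one to its own file is another option. Here is a minimal sketch of that, assuming `tables` is the list that read_html returned. The two small DataFrames, the temporary output folder and the `linux_distros_{i}.csv` file names are made up for illustration, so the example runs without network access.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Stand-ins for the list returned by pd.read_html(url); the data is made up.
tables = [
    pd.DataFrame({"Distribution": ["Debian"], "Description": ["Community project"]}),
    pd.DataFrame({"Distribution": ["Fedora"], "Description": ["Sponsored by Red Hat"]}),
]

# Hypothetical output folder; replace with any directory you like.
out_dir = Path(tempfile.mkdtemp())

written = []
for i, table in enumerate(tables):
    path = out_dir / f"linux_distros_{i}.csv"
    # index=False keeps the row index out of the file.
    table.to_csv(path, index=False)
    written.append(path)
```

Each element of the list is an ordinary DataFrame, so to_csv works on it directly; only the list itself lacks that method.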