I have data stored in local desktop and I have read the file using Pandas libraries
a = pd.read_csv()
b = pd.read_csv()
c = pd.read_csv()
I want to run SQL query on the above 3 data frame.
Can someone guide me how to run SQL Query on Python when data is stored locally instead of any other Database?
CodePudding user response:
TL;DR: You could use pandasql
.
For simplicity's sake, I am only going to use two data frames rather than three.
Say you a file called person.csv
:
personId,lastName,firstName
1,Apple,Allen
2,Banana,Bob
Aswell as a file called address.csv
:
addressId,personId,city,state
1,2,New York City,New York
2,1,San Francisco,California
And you wanted to write an SQL query to report the first name, last name, city, and state of each person after importing the above CSVs as data frames.
One way you could do this is to use pandasql
:
import pandas as pd
import pandasql as pdsql
def main() -> None:
person = pd.read_csv('person.csv')
address = pd.read_csv('address.csv')
q = """
SELECT
firstname,
lastname,
city,
state
FROM
person
LEFT JOIN
address
ON person.personid = address.personid
"""
print(pdsql.sqldf(q, locals()))
if __name__ == '__main__':
main()
Output:
firstName lastName city state
0 Allen Apple San Francisco California
1 Bob Banana New York City New York
CodePudding user response:
i found this library querycsv which migth do the trick
CodePudding user response:
You can simply read your csv
import csv
with open('file.csv', 'r') as f:
reader = csv.reader(f, delimiter=',') #can be ';'
for row in reader:
print(row[0], row[1]....)