I have a dataset called "Investor History". I want to create a list from it using the csv module directly (without using a pandas DataFrame). The dataset has the following 3 headers:
Stock_Price Exchange_Rate Invest
High Low Y
High High N
Low Low Y
From that dataset, I want to create a list that looks like this as output:
{('Stock_Price', 'High'), ('Exchange_Rate', 'Low'), ('Invest', 'Y')}
{('Stock_Price', 'High'), ('Exchange_Rate', 'High'), ('Invest', 'N')}
{('Stock_Price', 'Low'), ('Exchange_Rate', 'Low'), ('Invest', 'Y')}
It's more like printing each header next to the matching element of every row.
CodePudding user response:
I assume that you have a real CSV file with values separated by commas.
You can use csv.DictReader() to get it as a list of dictionaries:
import csv
# newline='' is what the csv module recommends when opening files
with open(filename, newline='') as f:
    reader = csv.DictReader(f)
    rows = list(reader)
[
{'Stock_Price': 'High', 'Exchange_Rate': 'Low', 'Invest': 'Y'},
{'Stock_Price': 'High', 'Exchange_Rate': 'High', 'Invest': 'N'},
{'Stock_Price': 'Low', 'Exchange_Rate': 'Low', 'Invest': 'Y'}
]
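The sample in the question looks like it is separated by spaces rather than commas. If that is how the real file looks, a minimal sketch would be to pass delimiter=' ' to DictReader(). This assumes exactly one space between values; the text variable and io.StringIO only stand in for the real file.

```python
import csv
import io

# Hypothetical sample: the question's data, separated by single spaces.
text = '''Stock_Price Exchange_Rate Invest
High Low Y
High High N
Low Low Y'''

f = io.StringIO(text)  # stands in for open(filename, newline='')
reader = csv.DictReader(f, delimiter=' ')

# dict(r) gives plain dictionaries regardless of the Python version.
rows = [dict(r) for r in reader]

for r in rows:
    print(r)
```

If the columns are aligned with several spaces or tabs, this will not split them correctly and the delimiter has to be adjusted.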
Later you can use .items() to convert every dictionary into a list of tuples:
rows1 = [list(r.items()) for r in rows]  # rows itself stays a list of dictionaries
[
[('Stock_Price', 'High'), ('Exchange_Rate', 'Low'), ('Invest', 'Y')],
[('Stock_Price', 'High'), ('Exchange_Rate', 'High'), ('Invest', 'N')],
[('Stock_Price', 'Low'), ('Exchange_Rate', 'Low'), ('Invest', 'Y')]
]
And if you really want a list of sets, use set() instead of list(), but keep in mind that a set does not guarantee any particular order:
rows = [set(d.items()) for d in rows]
[
{('Invest', 'Y'), ('Exchange_Rate', 'Low'), ('Stock_Price', 'High')},
{('Exchange_Rate', 'High'), ('Invest', 'N'), ('Stock_Price', 'High')},
{('Invest', 'Y'), ('Stock_Price', 'Low'), ('Exchange_Rate', 'Low')}
]
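If the curly braces are only wanted for display and the column order matters, one option is to build the text by hand instead of using set(). This is only a sketch: format_row is a hypothetical helper name, and it assumes Python 3.7+, where dictionaries keep insertion order.

```python
rows = [
    {'Stock_Price': 'High', 'Exchange_Rate': 'Low', 'Invest': 'Y'},
    {'Stock_Price': 'High', 'Exchange_Rate': 'High', 'Invest': 'N'},
    {'Stock_Price': 'Low', 'Exchange_Rate': 'Low', 'Invest': 'Y'},
]

def format_row(row):
    # Join the (header, value) pairs in column order and wrap them in braces.
    return '{' + ', '.join(repr(pair) for pair in row.items()) + '}'

lines = [format_row(r) for r in rows]
print('\n'.join(lines))
```

The result is a list of strings, not sets, so it is useful for printing only.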
Full working example. It uses io only to simulate a file.
import csv
import io
from pprint import pprint
text = '''Stock_Price,Exchange_Rate,Invest
High,Low,Y
High,High,N
Low,Low,Y'''
f = io.StringIO(text)
#f = open(filename)
reader = csv.DictReader(f)
rows = list(reader)
pprint(rows)
rows1 = [list(d.items()) for d in rows]
pprint(rows1)
rows2 = [set(d.items()) for d in rows]
pprint(rows2)
Result:
[{'Exchange_Rate': 'Low', 'Invest': 'Y', 'Stock_Price': 'High'},
{'Exchange_Rate': 'High', 'Invest': 'N', 'Stock_Price': 'High'},
{'Exchange_Rate': 'Low', 'Invest': 'Y', 'Stock_Price': 'Low'}]
[[('Stock_Price', 'High'), ('Exchange_Rate', 'Low'), ('Invest', 'Y')],
[('Stock_Price', 'High'), ('Exchange_Rate', 'High'), ('Invest', 'N')],
[('Stock_Price', 'Low'), ('Exchange_Rate', 'Low'), ('Invest', 'Y')]]
[{('Invest', 'Y'), ('Exchange_Rate', 'Low'), ('Stock_Price', 'High')},
{('Exchange_Rate', 'High'), ('Invest', 'N'), ('Stock_Price', 'High')},
{('Invest', 'Y'), ('Stock_Price', 'Low'), ('Exchange_Rate', 'Low')}]
CodePudding user response:
Oh, it is simple. Use the csv module from Python's standard library.
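For instance, a minimal sketch with plain csv.reader() and zip(), assuming the file is comma-separated and its first line holds the headers. The text variable and io.StringIO are only stand-ins for the real file.

```python
import csv
import io

# Hypothetical sample standing in for the real file.
text = '''Stock_Price,Exchange_Rate,Invest
High,Low,Y
High,High,N
Low,Low,Y'''

f = io.StringIO(text)  # replace with open(filename, newline='')
reader = csv.reader(f)

header = next(reader)  # the first line holds the column names

# Pair every value with the header of its column, one list per data row.
pairs = [list(zip(header, row)) for row in reader]

for p in pairs:
    print(p)
```

Each element of pairs is a list of (header, value) tuples in column order.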