I would like to build a data viewer, leveraging Excel, for viewing the contents of a pandas DataFrame. A simple working example (as a function) is below. However, rather than calling it as view(df, 10), I would like to call it as a method, df.view(10), similar to how one would use head, e.g. df.head(10).
I am new to Python classes. All the examples I have found online define a new object and develop classes for that new object, but I think I need to add a method to the existing pandas DataFrame class. I would like this new method to be stored in my own private repository that I can then import, e.g. from brb import *, and be able to use it on any arbitrary DataFrame.
Is this possible?
import numpy as np
import pandas as pd
import xlwings as xl

df = pd.DataFrame({
    'id': (1,2,3,4,5,6,7,8,9,10)*10,
    'year': tuple(np.arange(2011,2021))*10,
    'a': np.random.choice(range(2),100),
    'b': np.random.choice(range(100),100),
})

def view(df, NObs=None):
    # Open a new Excel workbook and write the first NObs rows (all rows if None).
    book = xl.Book()
    book.sheets[0].range("A1").value = df[:NObs]

# What I have:
view(df, 10)

# What I want:
df.view(10)
CodePudding user response:
You could define your own class like this:
import numpy as np
import pandas as pd
import xlwings as xl

class MyDataFrame:
    def __init__(self, data):
        # Wrap a regular DataFrame built from the supplied data.
        self.df = pd.DataFrame(data)

    def view(self, NObs=None):
        # Open a new Excel workbook and write the first NObs rows (all rows if None).
        book = xl.Book()
        book.sheets[0].range("A1").value = self.df[:NObs]

if __name__ == "__main__":
    df = MyDataFrame(
        {
            "id": (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) * 10,
            "year": tuple(np.arange(2011, 2021)) * 10,
            "a": np.random.choice(range(2), 100),
            "b": np.random.choice(range(100), 100),
        }
    )
    df.view(10)
And then run the script directly or import MyDataFrame elsewhere.
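One caveat with a wrapper like this is that the usual DataFrame methods (head, groupby and so on) are only reachable through the inner .df attribute. If that matters, a subclass may be closer to what you want. Below is a minimal sketch, not a definitive implementation: the name ViewableDataFrame is made up for illustration, and the conversion back to a plain DataFrame before writing is a precaution based on the assumption that xlwings may pick its converter by exact type. Note that this still only helps for frames you construct as ViewableDataFrame yourself, not for arbitrary DataFrames.

```python
import pandas as pd


class ViewableDataFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass that adds a view() method."""

    @property
    def _constructor(self):
        # Tells pandas to return this subclass from operations such as head().
        return ViewableDataFrame

    def view(self, NObs=None):
        # Imported here so the class can be defined without Excel available.
        import xlwings as xl

        book = xl.Book()
        # Assumption: hand xlwings a plain DataFrame rather than the subclass.
        book.sheets[0].range("A1").value = pd.DataFrame(self[:NObs])
```

With this, df = ViewableDataFrame({...}) behaves like a normal DataFrame, and df.head(10).view() should also work because head() returns the subclass.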
Or you could even add your own method to Pandas objects and run it "natively", like this:
import numpy as np
import pandas as pd
import xlwings as xl
from pandas.core.base import PandasObject

def view(df, NObs=None):
    # Open a new Excel workbook and write the first NObs rows (all rows if None).
    book = xl.Book()
    book.sheets[0].range("A1").value = df[:NObs]

# Attach the function to PandasObject, a base class of DataFrame,
# so that df.view(...) resolves to it on any DataFrame.
PandasObject.view = view

if __name__ == "__main__":
    df = pd.DataFrame(
        {
            "id": (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) * 10,
            "year": tuple(np.arange(2011, 2021)) * 10,
            "a": np.random.choice(range(2), 100),
            "b": np.random.choice(range(100), 100),
        }
    )
    df.view(10)
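If you would rather not patch pandas internals, pandas also documents an extension hook, pd.api.extensions.register_dataframe_accessor, which attaches a namespace to every DataFrame. The call then becomes df.brb.view(10) rather than df.view(10). The following is only a sketch under a few assumptions: the namespace "brb" is borrowed from the module name in the question, the class name BrbAccessor is made up, and the writer parameter is a hypothetical hook added here so the method can be tried without Excel.

```python
import pandas as pd


# "brb" is a hypothetical namespace, chosen to match the module name in the question.
@pd.api.extensions.register_dataframe_accessor("brb")
class BrbAccessor:
    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is being used on.
        self._df = pandas_obj

    def view(self, NObs=None, writer=None):
        """Send the first NObs rows (all rows if None) to Excel, or to writer."""
        subset = self._df[:NObs]
        if writer is None:
            # Imported lazily so importing this module does not require Excel.
            import xlwings as xl

            book = xl.Book()
            book.sheets[0].range("A1").value = subset
        else:
            # Hypothetical hook: redirect the rows somewhere other than Excel.
            writer(subset)
        return subset


if __name__ == "__main__":
    df = pd.DataFrame({"id": range(1, 6), "year": range(2011, 2016)})
    # Prints the first three rows instead of opening Excel.
    df.brb.view(3, writer=print)
```

If this lives in brb.py, the accessor is registered as soon as the module is imported, so import brb (or from brb import *) is enough to make df.brb.view available on any DataFrame.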