Python Classes : How to add a custom method to an existing object (pandas dataframe)

Time:11-14

I would like to build a data viewer that uses Excel to display the contents of a pandas DataFrame. A simple working example (as a function) is below. However, rather than calling it as view(df, 10), I would like to call it as a method, df.view(10), similar to how one would use head, e.g. df.head(10).

I am new to Python classes. All the examples I have found online define a new object and build a class for it, but I think I need to add a method to the existing pandas DataFrame class. I would like this new method to live in my own private repository, which I can then import, e.g. from brb import *, and use on any arbitrary DataFrame.

Is this possible to do?

import numpy as np
import pandas as pd
import xlwings as xl


df = pd.DataFrame({
    'id': (1,2,3,4,5,6,7,8,9,10)*10,
    'year': tuple(np.arange(2011,2021))*10,
    'a': np.random.choice(range(2),100),
    'b': np.random.choice(range(100),100),
})


def view(df, NObs=None):
    # Open a new Excel workbook and write the first NObs rows (all rows if None)
    book = xl.Book()
    book.sheets[0].range("A1").value = df[:NObs]


#  What I have:
view(df,10)   

#  What I want:
df.view(10)

CodePudding user response:

You could define your own class that wraps a DataFrame, like this:

import numpy as np
import pandas as pd
import xlwings as xl


class MyDataFrame:
    def __init__(self, data):
        self.df = pd.DataFrame(data)

    def view(self, NObs=None):
        book = xl.Book()
        book.sheets[0].range("A1").value = self.df[:NObs]


if __name__ == "__main__":
    df = MyDataFrame(
        {
            "id": (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) * 10,
            "year": tuple(np.arange(2011, 2021)) * 10,
            "a": np.random.choice(range(2), 100),
            "b": np.random.choice(range(100), 100),
        }
    )
    df.view(10)

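Then run the script directly, or import MyDataFrame elsewhere. Note that this wrapper only exposes view; every other DataFrame method has to be reached through its df attribute.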
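If the goal is to keep head, filtering, and the rest of the DataFrame API on the same object, one variation is to subclass pd.DataFrame instead of wrapping it. The following is only a minimal sketch: the class name ViewableDataFrame is hypothetical, and xlwings is imported inside view so that the class can be defined on a machine without Excel. Overriding the _constructor property is the hook pandas documents for making operations return the subclass.

```python
import pandas as pd


class ViewableDataFrame(pd.DataFrame):
    """Hypothetical DataFrame subclass that adds a view() method."""

    @property
    def _constructor(self):
        # Ask pandas to build ViewableDataFrame (not a plain DataFrame)
        # for results of operations such as head() or boolean filtering.
        return ViewableDataFrame

    def view(self, NObs=None):
        # Imported lazily: only needed when the viewer is actually opened.
        import xlwings as xl

        book = xl.Book()
        book.sheets[0].range("A1").value = self[:NObs]


vdf = ViewableDataFrame({"id": [1, 2, 3], "year": [2011, 2012, 2013]})

# Ordinary DataFrame operations still work and keep the subclass,
# so view() remains available on the result.
subset = vdf[vdf["year"] > 2011].head(1)
```

The limitation is the same as with the wrapper: a DataFrame created elsewhere (for example by pd.read_csv) is a plain DataFrame and has to be converted with ViewableDataFrame(df) before view can be called.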

Or you could attach your function to the pandas base class, so that it becomes available as a method on any existing DataFrame, like this:

import numpy as np
import pandas as pd
import xlwings as xl
from pandas.core.base import PandasObject


def view(df, NObs=None):
    book = xl.Book()
    book.sheets[0].range("A1").value = df[:NObs]


PandasObject.view = view


if __name__ == "__main__":
    df = pd.DataFrame(
        {
            "id": (1, 2, 3, 4, 5, 6, 7, 8, 9, 10) * 10,
            "year": tuple(np.arange(2011, 2021)) * 10,
            "a": np.random.choice(range(2), 100),
            "b": np.random.choice(range(100), 100),
        }
    )
    df.view(10)
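As an alternative to assigning onto PandasObject, pandas provides a decorator for registering custom accessors, pd.api.extensions.register_dataframe_accessor. This is a different technique from the one in the answer above, and the call becomes df.brb.view(10) rather than df.view(10). In exchange, the custom methods sit in their own namespace and cannot clash with names that pandas objects already define (Index.view, for example). The sketch below assumes the accessor name brb from the question; the class name BrbAccessor and the helper first_rows are hypothetical.

```python
import pandas as pd


@pd.api.extensions.register_dataframe_accessor("brb")
class BrbAccessor:
    """Hypothetical accessor: helpers are reached as df.brb.<method>()."""

    def __init__(self, pandas_obj):
        # pandas passes in the DataFrame the accessor is used on.
        self._df = pandas_obj

    def first_rows(self, NObs=None):
        # Same slicing as the original view(): all rows when NObs is None.
        return self._df[:NObs]

    def view(self, NObs=None):
        # Imported lazily: only needed when the viewer is actually opened.
        import xlwings as xl

        book = xl.Book()
        book.sheets[0].range("A1").value = self.first_rows(NObs)


df = pd.DataFrame({"id": range(1, 11), "year": range(2011, 2021)})

# The accessor is available on any DataFrame, including ones created elsewhere.
preview = df.brb.first_rows(3)
```

If this code lives in a module such as brb.py, the registration runs when the module is imported, so import brb (or from brb import *) is enough to make df.brb available.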