How to test pandas data frames in a unit test

Time:09-04

I wrote two pandas functions:

  • merge_df() merges two data frames.

  • subtract_column() subtracts two columns and saves the value in a new column.
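The question does not show the two functions themselves. For readers who want to run the tests, here is a minimal sketch of what they might look like. The bodies and the parameter names (df, col_1, col_2, new_col) are assumptions inferred from the test code, not the original implementation. In particular, merging on the index is a guess based on the two input frames sharing no columns.

```python
import pandas as pd


def merge_df(df1, df2):
    # Assumption: the frames share no columns, so join them row by row on the index.
    return df1.merge(df2, left_index=True, right_index=True)


def subtract_column(df, col_1, col_2, new_col):
    # Work on a copy so the caller's frame is left unchanged.
    result = df.copy()
    # Store col_1 - col_2 in a new column named new_col.
    result[new_col] = result[col_1] - result[col_2]
    return result
```

Returning a new frame instead of modifying the input keeps the functions easy to test, because the setup data can be compared against the result afterwards.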

I'm trying to unit test the functions using the code below:

import unittest
import pandas as pd
from pandas.testing import assert_frame_equal
from scripts import return_calculations as rt
  
class TestReturnCalculations(unittest.TestCase):

    def test_merge_df(self):
        # Setup
        df1 = pd.DataFrame({
                'col_a': ['a1', 'a2', 'a3'],
                'col_b': ['b1', 'b2', 'b3'],
            })
        df2 = pd.DataFrame({
                'col_c': ['c1', 'c2', 'c3'],
                'col_d': ['d1', 'd2', 'd3'],
            })
        # Expectation
        expected = pd.DataFrame({
                'col_a': ['a1', 'a2', 'a3'],
                'col_b': ['b1', 'b2', 'b3'],
                'col_c': ['c1', 'c2', 'c3'],
                'col_d': ['d1', 'd2', 'd3'],
            })

        # Call function
        actual = rt.merge_df(df1, df2)

        # Test
        assert_frame_equal(actual, expected)

    # Test subtraction of two pd columns
    def subtract_column(self):
        # Setup
        df1 = pd.DataFrame({
                'col_a': [50, 50, 50],
                'col_b': [30, 30, 30],
            })
        # Expectation
        expected = pd.DataFrame({
                'col_a': [50, 50, 50],
                'col_b': [30, 30, 30],
                'col_c': [20, 20, 20],
            })

        # Call function
        actual = rt.subtract_column(df1, 'col_a', 'col_b', 'col_c')

        # Test
        assert_frame_equal(actual, expected)

if __name__ == "__main__":
    unittest.main()

When I run the script, it finishes without errors.

The problem appears when I change the expected data frame so that the assertion should fail.

If I do this in the test for merge_df(), it raises an error, as expected.

However, if I do it in the test for subtract_column(), the output still shows:

----------------------------------------------------------------------
Ran 1 test in 0.005s

OK

CodePudding user response:

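The problem was the name of the second method. unittest only collects methods whose names start with test, so subtract_column() was never run. That is also why the output reports "Ran 1 test" even though the class contains two methods.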

In the code it was:

 def subtract_column(self):

For the test to run, it needs to be:

 def test_subtract_column(self):
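
The naming rule can be checked without pandas. The sketch below uses a made-up test class (DiscoveryDemo and its method names are illustrative only) and asks the standard unittest loader which methods it would collect.

```python
import unittest


class DiscoveryDemo(unittest.TestCase):

    def test_is_collected(self):
        # The name starts with "test", so the loader runs it.
        self.assertEqual(1 + 1, 2)

    def is_never_collected(self):
        # No "test" prefix: the loader skips this method, so this
        # deliberately failing assertion is never executed.
        self.assertEqual(1 + 1, 3)


loader = unittest.TestLoader()

# Names of the methods the loader treats as tests.
collected = loader.getTestCaseNames(DiscoveryDemo)
print(collected)

# Running the suite reports one test and no failures.
suite = loader.loadTestsFromTestCase(DiscoveryDemo)
result = unittest.TextTestRunner(verbosity=2).run(suite)
print(result.testsRun, result.wasSuccessful())
```

Running the original script with the -v flag (python -m unittest -v) lists each collected test by name, which makes a silently skipped method easier to spot.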