Polymorphism and with statements

Time:09-25

Given a class that exports .csv files to a database:

import luigi
import csv

class CsvToDatabase(luigi.Task):
  # (...)
  
  def run(self):    
    ## (...)
    
    with open(self.input().some_attribute, 'r', encoding='utf-8') as some_dataframe:
      y = csv.reader(some_dataframe, delimiter=';')
      
      ### (...) <several lines of code>
  
  # (...)

I'm having problems trying to export a file with ISO-8859-1 encoding.

When I remove the encoding argument from the open() function, everything works fine, but I cannot make permanent changes to the class definition (other sectors of the firm use it). So I thought about using polymorphism to solve it, like:

from script_of_interest import CsvToDatabase

class LatinCsvToDatabase(CsvToDatabase):
  # code that uses everything in `run()` except the `some_dataframe` definition in the with statement

Does this possibility actually exist? How could I handle it without repeating the "several lines of code" inside the with statement?

CodePudding user response:

Thank you @martineau and @buran for the comments. Based on them, I will request a change in the base class definition that doesn't affect other sectors' work. It would look like this:

import luigi
import csv

class CsvToDatabase(luigi.Task):
  # (...)
  encoding_param = luigi.Parameter(default='utf-8') # as a class attribute
  
  # (...)
  
  def run(self):    
    ## (...)
    
    with open(self.input().some_attribute, 'r', encoding=self.encoding_param) as some_dataframe:
      y = csv.reader(some_dataframe, delimiter=';')
      
      ### (...) <several lines of code>
  
  # (...)

And finally, in my script, something like:

from script_of_interest import CsvToDatabase

class LatinCsvToDatabase(CsvToDatabase):
  pass


LatinCsvToDatabase.encoding_param = None
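
A possible variation on this last step, sketched without luigi so that it runs on its own: the override can be declared in the subclass body instead of being assigned after the class is created. `BaseReader`, `LatinReader` and `demo` are hypothetical stand-ins (not part of the original code), and a plain string attribute takes the place of `luigi.Parameter` here, so treat it as an illustration of the idea rather than the actual task.

```python
import csv
import os
import tempfile


class BaseReader:
    # Hypothetical stand-in for CsvToDatabase; a plain class attribute
    # replaces luigi.Parameter in this sketch.
    encoding_param = "utf-8"

    def __init__(self, path):
        self.path = path

    def run(self):
        # The encoding is looked up on the instance, so a subclass override wins.
        with open(self.path, "r", encoding=self.encoding_param, newline="") as f:
            return list(csv.reader(f, delimiter=";"))


class LatinReader(BaseReader):
    # Only the encoding changes; run() is inherited untouched.
    encoding_param = "ISO-8859-1"


def demo():
    # Write a small ISO-8859-1 file, then read it back through the subclass.
    with tempfile.NamedTemporaryFile(
        "w", suffix=".csv", delete=False, encoding="ISO-8859-1", newline=""
    ) as tmp:
        tmp.write("nome;cidade\nJoão;São Paulo\n")
        path = tmp.name
    try:
        return LatinReader(path).run()
    finally:
        os.remove(path)
```

The base class keeps its 'utf-8' default, so existing users of it see no change in behavior.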

CodePudding user response:

As an alternative, you might modify the original class to add a new method, get_cvs_encoding, which the run method calls:

class CsvToDatabase(luigi.Task):
    ...
    def get_cvs_encoding(self):
        # default:
        return 'utf-8'

    def run(self):
        ## (...)

        with open(self.input().some_attribute, 'r', encoding=self.get_cvs_encoding()) as some_dataframe:
            y = csv.reader(some_dataframe, delimiter=';')
        ...

And then subclass this as follows:

class MyCsvToDatabase(CsvToDatabase):
    def get_cvs_encoding(self):
        return 'ISO-8859-1' # or None

Then use an instance of the subclass. I think this is neater, and you can have multiple subclass instances "running" concurrently.
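
To see the overridable-method approach in isolation, here is a minimal sketch that leaves luigi out entirely. `CsvLoader`, `LatinCsvLoader`, `write_bytes` and `compare` are hypothetical names introduced only for this illustration; the real `run()` would of course do more than return the rows.

```python
import csv
import os
import tempfile


class CsvLoader:
    # Hypothetical stand-in for CsvToDatabase, without the luigi machinery.
    def __init__(self, path):
        self.path = path

    def get_cvs_encoding(self):
        # Default used by every existing caller.
        return "utf-8"

    def run(self):
        with open(self.path, "r", encoding=self.get_cvs_encoding(), newline="") as f:
            return list(csv.reader(f, delimiter=";"))


class LatinCsvLoader(CsvLoader):
    def get_cvs_encoding(self):
        return "ISO-8859-1"


def write_bytes(data):
    # Helper for the demo: write raw bytes to a temporary file, return its path.
    fd, path = tempfile.mkstemp(suffix=".csv")
    with os.fdopen(fd, "wb") as f:
        f.write(data)
    return path


def compare():
    latin_path = write_bytes("a;é\n".encode("ISO-8859-1"))
    utf8_path = write_bytes("a;é\n".encode("utf-8"))
    try:
        latin_rows = LatinCsvLoader(latin_path).run()
        utf8_rows = CsvLoader(utf8_path).run()
        try:
            # The unmodified base class cannot decode the Latin-1 bytes.
            CsvLoader(latin_path).run()
            base_failed = False
        except UnicodeDecodeError:
            base_failed = True
        return latin_rows, utf8_rows, base_failed
    finally:
        os.remove(latin_path)
        os.remove(utf8_path)
```

Each instance asks its own class for the encoding, so a base-class instance and a subclass instance can be used side by side without touching any shared setting.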
