Am I misusing try/except?

Time:09-03

I am making a small program to transcribe my rota, which gets sent to me in an oddly laid-out Excel document. I have written code to open the document and use the repeating patterns in the layout to extract and rearrange the information into a simple pandas dataframe.

In one step of this process I use a try/except block to differentiate cells. The shifts are laid out in a regular format and generally have two cells containing the start and end times. However, some days have the times replaced by a comment ('Annual leave', 'Teaching', etc). I found that I could separate these from the cells with actual times by calling float() on the cell content. Cells containing a number (although currently stored as a string) pass this conversion; cells containing text raise an error. This allows me to use except: and else: blocks to perform different actions based on the cell content.

I can therefore put the data from cells containing times into the start and end time in my new dataframe. For cells containing text, I can instead move the text to a new column labelled 'Comment' and keep the information.

Now this method works absolutely fine and serves its purpose. Great!

However, PyCharm gives me a little wavy yellow line under except and suggests that I shouldn't be using such a broad except clause. I suspect it wants me to specify the exception. I am of course not actually using this as an exception, and in fact it is expected behaviour for the except block to run under normal circumstances. This leaves me with the niggling feeling that I am fudging my solution rather than using appropriate code.

As I am really keen to improve my Python skills and to learn good habits, I wonder if anyone can point me towards a more elegant way of coding this?

See code below:

def sort_to_row(df):
    out_frame = pd.DataFrame(columns=["Date", "Start", "End", "Comment"])

    weekdays = ("Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday")

    n = 0
    for i in range(len(df)):
        comment = ""
        date = ""
        start = ""
        end = ""

        if df.iloc[i, 0] in weekdays:
            try:
                float(df.iloc[i + 2, 0])

            except:
                date = df.iloc[i + 1, 0]
                start = 0.0
                end = 0.0
                comment = df.iloc[i + 2, 0]

            else:
                date = df.iloc[i + 1, 0]
                start = df.iloc[i + 2, 0]
                end = df.iloc[i + 2, 1]

            finally:
                out_frame.loc[n, "Date"] = date
                out_frame.loc[n, "Start"] = start
                out_frame.loc[n, "End"] = end
                out_frame.loc[n, "Comment"] = comment
                n += 1

    out_frame["Date"] = pd.to_datetime(out_frame["Date"], format='%Y-%m-%d %H:%M:%S')
    out_frame = out_frame.sort_values(by="Date")
    out_frame['Date'] = out_frame['Date'].dt.strftime('%d/%m/%Y')
    return out_frame

CodePudding user response:

(Note: this question is probably more suited for https://codereview.stackexchange.com/ )

I've made a few changes that make the intent of the code clearer:

  • except ValueError: because you're really testing for the failure of the float conversion. Any other error should still show. For example, an empty dataframe might throw an IndexError, which should probably result in the program stopping.

  • I've only put the assignment of the relevant variables in the except-else parts that differ. So date = df.iloc[i + 1, 0] goes outside the try-except-else part, since it's the same for both. For consistency, though, I've put comment = "" in the else part, even if it's not necessary.

  • There is no need for a finally clause: the code will continue to that part anyway. A finally clause is useful when you e.g. return from a function inside an except clause, but you still want to clean up (close a file, for example); or if you (re)throw an exception inside an except clause, which will also leave that part of the code. The finally clause will then still be executed.
    Here, however, the part in the finally clause is just part of the normal flow of the code, so it can/should probably go outside of the try-except part.
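To see that last point in isolation, here is a minimal sketch of a case where finally does matter, because the except clause returns early. The names parse_time and log are made up for illustration and are not part of the question's code:

```python
def parse_time(cell, log):
    """Return the cell as a float, or None if it is not numeric."""
    try:
        return float(cell)
    except ValueError:
        # Early return: code placed after the try statement would be skipped.
        return None
    finally:
        # Runs on both paths, even though each of them returns.
        log.append(f"checked {cell!r}")


log = []
print(parse_time("9.5", log))           # 9.5
print(parse_time("Annual leave", log))  # None
print(log)                              # ["checked '9.5'", "checked 'Annual leave'"]
```

In the loop from the question nothing returns or re-raises, so the same lines can simply follow the try statement, as in the rewritten loop body: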

        if df.iloc[i, 0] in weekdays:
            try:
                float(df.iloc[i + 2, 0])
            except ValueError:
                start = 0.0
                end = 0.0
                comment = df.iloc[i + 2, 0]
            else:
                start = df.iloc[i + 2, 0]
                end = df.iloc[i + 2, 1]
                comment = ""
            date = df.iloc[i + 1, 0]
            out_frame.loc[n, "Date"] = date
            out_frame.loc[n, "Start"] = start
            out_frame.loc[n, "End"] = end
            out_frame.loc[n, "Comment"] = comment
            n += 1

It is basically an if-else statement, but the float conversion is harder to include in an if condition.

Moreover, if the float conversion fails only rarely, then the except clause really becomes the exception to the rule (of a float conversion), so it suits the name.
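If the if-else shape reads better to you, one way to get it is to hide the try/except in a small helper. This is only a sketch: is_number is a name I made up for illustration, not something provided by pandas or taken from the question's code.

```python
def is_number(value):
    """Return True if float() accepts the value, False otherwise."""
    try:
        float(value)
    except ValueError:
        # Text such as "Teaching" cannot be converted.
        return False
    return True


for cell in ("9.5", "Annual leave"):
    if is_number(cell):
        print(cell, "-> time")
    else:
        print(cell, "-> comment")
```

The exception handling is still there, it has just moved out of the loop body, so whether this is clearer is largely a matter of taste.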

If you prefer, you can make things even shorter inside the try-except statement, similar to what you already did with the comment = "" default near the top of your code, by putting the default (expected) assignments outside and before the try-except:

        if df.iloc[i, 0] in weekdays:
            date = df.iloc[i + 1, 0]
            start = df.iloc[i + 2, 0]
            end = df.iloc[i + 2, 1]
            comment = ""   # if not near the top already
            try:
                float(df.iloc[i + 2, 0])
            except ValueError:
                start = 0.0
                end = 0.0
                comment = df.iloc[i + 2, 0]
            out_frame.loc[n, "Date"] = date
            ...

Finally, there is the issue that you don't assign the result of the float conversion to anything. Instead you use start = df.iloc[i + 2, 0] or comment = df.iloc[i + 2, 0]. In a sense, that has more to do with rethinking and restructuring your code, since you assign one of two variables from a single value, depending on whether that value can be converted to a float. Usually, it's the other way around: a single variable is assigned one of two or more values, depending on a condition. But I think that would go well beyond this answer. See also my note at the top.
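For what it's worth, a rough sketch of that direction could look like the following. The function parse_shift is hypothetical, and it changes behaviour compared to the question's code: it stores the converted floats rather than the original strings, and it also converts the end cell, which the original never checks.

```python
def parse_shift(start_cell, end_cell):
    """Return (start, end, comment) for one rota entry."""
    try:
        # Keep the results of the conversion instead of discarding them.
        start = float(start_cell)
        end = float(end_cell)
    except ValueError:
        # Not a time: keep the text as the comment, with 0.0 placeholders.
        return 0.0, 0.0, start_cell
    return start, end, ""


print(parse_shift("9.0", "17.5"))       # (9.0, 17.5, '')
print(parse_shift("Annual leave", ""))  # (0.0, 0.0, 'Annual leave')
```

Inside the loop this would be called as start, end, comment = parse_shift(df.iloc[i + 2, 0], df.iloc[i + 2, 1]), so each variable is assigned exactly once.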
