Unable to split data using Python

I have data like the following:

data = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000"""

I want to sum the numbers in each group, where groups are separated by an empty line, and track the largest group sum in max_sum. For example, the first group is 1000 + 2000 + 3000 = 6000, which is compared with the initial max_sum (e.g. 0). The next group is just 4000, so max_sum stays at max(6000, 4000) = 6000, and so on. The running sum has to be reset whenever an empty line is encountered.

Below is my code:

max_num = 0
sum = 0
for line in data:
    # print(line)
    sum = sum + int(line)
    if line in ['\n', '\r\n']:
        sum=0
    max_num = max(max_num, sum)

This gives an error:

sum = sum + int(line)
ValueError: invalid literal for int() with base 10: '\n'

CodePudding user response：

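You are trying to cast empty lines to int. Also, because data is a string, iterating over it directly yields single characters rather than lines, so split it into lines first:

max_num = 0
total = 0  # renamed from sum to avoid shadowing the builtin
for line in data.splitlines():
    if line.strip():
        # Non-empty line: add its value to the current group's sum.
        total = total + int(line)
    else:
        # Empty line: the group has ended, so reset the running sum.
        total = 0
    max_num = max(max_num, total)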

CodePudding user response：

Here's a quick one-liner:

data = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000"""

max(
    sum(
        int(i) for i in l.split('\n')
    ) for l in data.split('\n\n')
)

which gives 24000

It first splits the data on \n\n to get the groups, then splits each group on \n, sums the numbers in each group, and finally picks the largest sum.
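To see what the one-liner does at each stage, it can be unrolled into separate steps. This is only an illustrative sketch: the names groups, numbers, group_sums and largest are introduced here and are not part of the answer above.

```python
data = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000"""

# Step 1: split on blank lines to get one string per group.
groups = data.split('\n\n')

# Step 2: split each group on newlines and convert the pieces to int.
numbers = [[int(i) for i in g.split('\n')] for g in groups]

# Step 3: sum each group, then take the largest sum.
group_sums = [sum(n) for n in numbers]
largest = max(group_sums)

print(group_sums)
print(largest)
```

Note that this approach assumes '\n' line endings and no trailing newline at the end of data; a trailing newline would leave an empty string in the last group, and int('') raises ValueError.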

CodePudding user response：

Note that int() is impervious to leading and trailing whitespace - e.g., int('\n99\n') will result in 99 without error. However, a string comprised entirely of whitespace will result in ValueError. That's what is happening here. You're trying to parse a string that just contains a newline character.
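A small check makes this behaviour easy to verify. The helper can_parse below is a hypothetical name used only for this illustration:

```python
def can_parse(text):
    """Return True if int() accepts the string, False if it raises ValueError."""
    try:
        int(text)
        return True
    except ValueError:
        return False


print(int('\n99\n'))    # whitespace around the digits is ignored
print(can_parse('\n'))  # only whitespace: int() raises ValueError
print(can_parse(''))    # empty string: int() raises ValueError
```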

You can take advantage of ValueError for these data as follows:

data = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000"""

current_sum = 0
max_sum = float('-inf')

for t in data.splitlines():
    try:
        x = int(t)
        current_sum += x
    except ValueError:
        max_sum = max(max_sum, current_sum)
        current_sum = 0

print(f'Max sum = {max(max_sum, current_sum)}')

Output:

Max sum = 24000

CodePudding user response：

Some lines consist only of '\n', and you are trying to convert them to int. Move the test for a blank line above the int conversion, and continue without casting to int if the line is '\n' or '\r\n'.
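A minimal sketch of that reordering might look like the following. It assumes data is the multi-line string from the question, splits it with splitlines() so that each iteration sees a whole line, and uses the name total (an assumption, chosen to avoid shadowing the builtin sum):

```python
data = """1000
2000
3000

4000

5000
6000

7000
8000
9000

10000"""

max_num = 0
total = 0
for line in data.splitlines():
    # Test for a blank line before any int conversion.
    if not line.strip():
        total = 0  # the group has ended, so reset the running sum
        continue
    total += int(line)
    max_num = max(max_num, total)

print(max_num)
```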

CodePudding user response：

Don't use builtin names like sum as variable names. Split the data on \n to get a list of lines, loop over it, and remove surrounding whitespace from each line with strip(). If the line contains only digits, it is added to the running sum; otherwise the running sum is reset to 0.

max_num = 0
sum_val = 0


for line in data.split("\n"):
    line = line.strip()
    sum_val = int(line) + sum_val if line and line.isdigit() else 0
    max_num = max(max_num, sum_val)
print(max_num)

CodePudding user response：

You can try:

data = """1000
    2000
    3000
    
    4000
    
    5000
    6000
    
    7000
    8000
    9000
    
    10000
    """

data = data.splitlines()

max_sum = 0
group = []

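for single_data in data:
    # Drop the indentation spaces so blank lines become empty strings.
    single_data = single_data.replace(" ","")
    if single_data == "":
        # Blank line: the group is complete, so compare it and start a new one.
        max_sum = max(max_sum, sum(group))
        group = []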
    else:
        group.append(int(single_data))

print(max_sum)

Output: