Python: Computation in for loop doesn't match result of manual computation-CodePudding

I'm currently working on a project researching properties of some gas mixtures. Testing my code with different inputs, I came upon a bug(?) which I fail to be able to explain. Basically, it's concerning a computation on a numpy array in a for loop. When it computed the for-loop, it yields a different (and wrong) result as opposed to the manual construction of the result, using the same exact code snippets as in the for-loop, but indexing manually. I have no clue, why it is happening and whether it is my own mistake, or a bug within numpy.

It's super weird, that certain instances of the desired input objects run through the whole for loop without any problem, while others run perfectly up to a certain index and others fail to even compute the very first loop.

For instance, one input always stopped at index 16, throwing a:

ValueError: could not broadcast input array from shape (25,) into shape (32,)

Upon further investigation I could confirm, that the previous 15 loops threw the correct results, the results in loop of index 16 were wrong and not even of the correct size. When running loop 16 manually through the console, no errors occured...

The lower array shows the results for index 16, when it's running in the loop. These are the results for index 16, when running the code in the for loop manually in the console. These are, what one would expect to get.

The important part of the code is really only the np.multiply() in the for loop - I left the rest of it for context but am pretty sure it shouldn't interfere with my intentions.

def thermic_dissociation(input_gas, pressure):
# Copy of the input_gas object, which may not be altered out of scope
gas = copy.copy(input_gas)
# Temperature range
T = np.logspace(2.473, 4.4, 1000)
# Matrix containing the data over the whole range of interest
moles = np.zeros((gas.gas_cantera.n_species, len(T)))
# Array containing other property of interest
sum_particles = np.zeros(len(T))

# The troublesome for-loop:
for index in range(len(T)):
    print(str(index)   ' start')
    # Set temperature and pressure of the gas
    gas.gas_cantera.TP = T[index], pressure
    # Set gas mixture to a state of chemical equilibrium
    gas.gas_cantera.equilibrate('TP')
    # Sum of particles = Molar Density * Avogadro constant for every temperature
    sum_particles[index] = gas.gas_cantera.density_mole * ct.avogadro
    
    #This multiplication is doing the weird stuff, printed it to see what's computed before it puts it into the result matrix and throwing the error
    print(np.multiply(list(gas.gas_cantera.mole_fraction_dict().values()), sum_particles[index]))

    # This is where the error is thrown, as the resulting array is of smaller size, than it should be and thus resulting in the error
    moles[:, index] = np.multiply(list(gas.gas_cantera.mole_fraction_dict().values()), sum_particles[index])
    print(str(index)   ' end')

# An array helping to handle the results
molecule_order = list(gas.gas_cantera.mole_fraction_dict().keys())

return [moles, sum_particles, T, molecule_order]

Help will be very appreciated!

CodePudding user response：

This particular issue is not related to numpy. The call to mole_fraction_dict returns a standard python dictionary. The number of elements in the dictionary depends on the optional threshold argument, which has a default value of 0.0.

The source code of Cantera can be inspected to see what happens exactly.

mole_fraction_dict

getMoleFractionsByName

In other words, a value ends up in the dictionary if x > threshold. Maybe it would make more sense if >= was used here instead of >. And maybe this would have prevented the unexpected outcome in your case.

As confirmed in the comments, you can use mole_fraction_dict(threshold=-np.inf) to get all of the desired values in the dictionary. Or -float('inf') can also be used.

In your code you proceed to call .values() on the dictionary but this would be problematic if the order of the values is not guaranteed. I'm not sure if this is the case. It might be better to make the order explicit by retrieving values out of the dict using their key.

CodePudding user response：

If you want the array of all species mole fractions, you should use the X property of the cantera.Solution object, which always returns that full array directly. You can see the documentation for that method: cantera.Solution.X`.

The mole_fraction_dict method is specifically meant for cases where you want to refer to the species by name, rather than their order in the Solution object, such as when relating two different Solution objects that define different sets of species.