I am looking for an efficient way to sort the data in a 2D array. The array can have many rows and columns but, in this example, I will limit it to 6 rows and 5 columns. All values are stored as strings because some columns contain words. I only include one word column below, but the real data has a few columns of words. I realise that, when we sort, the numeric columns should be compared as numbers rather than as strings?
string[,] WeatherDataArray = new string[6,5];
The data is a set of weather data that is read every day and logged. This data goes through many parts of their system which I cannot change and it arrives to me in a way that it needs sorting. An example layout could be:
Day number, temperature, rainfall, wind, cloud
The matrix of data could look like this
3,20,0,12,cumulus
1,20,0,11,none
23,15,0,8,none
4,12,0,1,cirrus
12,20,0,12,cumulus
9,15,2,11,none
They now want the data sorted so it will have temperature in descending order and day number in ascending order. The result would be
1,20,0,11,none
3,20,0,12,cumulus
12,20,0,12,cumulus
9,15,2,11,none
23,15,0,8,none
4,12,0,1,cirrus
The array is stored and later they can extract it to a table and do lots of analysis on it. The extraction side is not changing so I cannot sort the data in the table, I have to create the data in the correct format to match the existing rules they have.
I could parse each row of the array and sort them but this seems a very long-handed method. There must be a quicker more efficient way to sort this 2D array by two columns? I think I could send it to a function and get returned the sorted array like:
private string[,] SortData(string[,] Data)
{
//In here we do the sorting
}
Any ideas please?
CodePudding user response:
I would suggest parsing the data into objects that can be sorted by conventional methods. Like using LINQ:
myObjects.OrderBy(obj => obj.Property1)
.ThenBy(obj=> obj.Property2);
Treating data as a table of strings will just make processing more difficult, since at every step you would need to parse values, handle potential errors since a string may be empty or contain an invalid value etc. It is a much better design to do all this parsing and error handling once when the data is read, and convert it to text-form again when writing it to disk or handing it over to the next system.
If this is a legacy system with lots of parts that handle the data in text-form I would still argue to parse the data first, and do it in a separate module so it can be reused. This should allow the other parts to be rewritten part by part to use the object format.
If this is completely infeasible, you either need to convert the data to a jagged array, i.e. string[][], or write your own sorting that can swap rows in a multidimensional array.
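As a rough illustration of the jagged-array route, here is a minimal sketch. It assumes column 0 is the day number and column 1 is the temperature, as in the question, and that both always parse; the class and method names (JaggedSortSketch, ToJagged, ToRectangular, SortByTemperatureThenDay) are made up for this example.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Hypothetical helper: names are illustrative, not part of any existing API.
public static class JaggedSortSketch
{
    // Copies each row of the rectangular array into its own string[].
    public static string[][] ToJagged(string[,] data)
    {
        int rows = data.GetLength(0), cols = data.GetLength(1);
        var jagged = new string[rows][];
        for (int r = 0; r < rows; r++)
        {
            jagged[r] = new string[cols];
            for (int c = 0; c < cols; c++)
                jagged[r][c] = data[r, c];
        }
        return jagged;
    }

    // Copies the rows back into a rectangular array with 'cols' columns.
    public static string[,] ToRectangular(string[][] jagged, int cols)
    {
        var result = new string[jagged.Length, cols];
        for (int r = 0; r < jagged.Length; r++)
            for (int c = 0; c < cols; c++)
                result[r, c] = jagged[r][c];
        return result;
    }

    // Assumption: column 1 = temperature (descending), column 0 = day (ascending).
    public static string[,] SortByTemperatureThenDay(string[,] data)
    {
        string[][] sorted = ToJagged(data)
            .OrderByDescending(row => double.Parse(row[1], CultureInfo.InvariantCulture))
            .ThenBy(row => int.Parse(row[0], CultureInfo.InvariantCulture))
            .ToArray();
        return ToRectangular(sorted, data.GetLength(1));
    }
}
```

Calling JaggedSortSketch.SortByTemperatureThenDay(WeatherDataArray) would return a new string[,] and leave the input untouched. Once the rows are separate arrays, the ordinary LINQ ordering operators apply directly, at the cost of two extra copies of the data.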
CodePudding user response:
I agree with the other answer that it's probably best to parse each row of the data into an instance of a class that encapsulates the data, creating a new 1D array or list from that data. Then you'd sort that 1D collection and convert it back into a 2D array.
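A minimal sketch of that parse, sort and convert-back round trip might look like the following. The WeatherReading and WeatherTable names are hypothetical, and the column types (int day, double for temperature, rainfall and wind) are assumptions based on the sample data in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical type: column names come from the question, the types are assumed.
public sealed class WeatherReading
{
    public int Day { get; set; }
    public double Temperature { get; set; }
    public double Rainfall { get; set; }
    public double Wind { get; set; }
    public string Cloud { get; set; } = "";
}

public static class WeatherTable
{
    static readonly CultureInfo Inv = CultureInfo.InvariantCulture;

    // Parses every row once; throws if a numeric column is not parsable.
    public static List<WeatherReading> Parse(string[,] data)
    {
        var readings = new List<WeatherReading>();
        for (int r = 0; r < data.GetLength(0); r++)
        {
            readings.Add(new WeatherReading
            {
                Day = int.Parse(data[r, 0], Inv),
                Temperature = double.Parse(data[r, 1], Inv),
                Rainfall = double.Parse(data[r, 2], Inv),
                Wind = double.Parse(data[r, 3], Inv),
                Cloud = data[r, 4]
            });
        }
        return readings;
    }

    // Converts the typed readings back into the 5-column string layout.
    public static string[,] ToStringArray(IReadOnlyList<WeatherReading> readings)
    {
        var result = new string[readings.Count, 5];
        for (int r = 0; r < readings.Count; r++)
        {
            result[r, 0] = readings[r].Day.ToString(Inv);
            result[r, 1] = readings[r].Temperature.ToString(Inv);
            result[r, 2] = readings[r].Rainfall.ToString(Inv);
            result[r, 3] = readings[r].Wind.ToString(Inv);
            result[r, 4] = readings[r].Cloud;
        }
        return result;
    }

    // Temperature descending, then day number ascending.
    public static string[,] SortData(string[,] data)
    {
        List<WeatherReading> sorted = Parse(data)
            .OrderByDescending(x => x.Temperature)
            .ThenBy(x => x.Day)
            .ToList();
        return ToStringArray(sorted);
    }
}
```

One thing to watch with this sketch: numbers are re-formatted on the way out, so a value stored as "20.0" would come back as "20". If the downstream system needs the text byte-for-byte, keep the original strings alongside the parsed sort keys instead.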
However, another approach is to write an IComparer class that you can use to compare two rows in a 2D array, like so:
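// Requires: using System.Collections; (for the non-generic IComparer)
public sealed class WeatherComparer: IComparer
{
    readonly string[,] _data;

    public WeatherComparer(string[,] data)
    {
        _data = data;
    }

    // x and y are boxed row indices into _data, not the rows themselves.
    public int Compare(object? x, object? y)
    {
        int row1 = (int)x;
        int row2 = (int)y;

        // Temperature (column 1) in descending order.
        double temperature1 = double.Parse(_data[row1, 1]);
        double temperature2 = double.Parse(_data[row2, 1]);
        if (temperature1 < temperature2)
            return 1;
        if (temperature2 < temperature1)
            return -1;

        // Equal temperatures: day number (column 0) in ascending order.
        int day1 = int.Parse(_data[row1, 0]);
        int day2 = int.Parse(_data[row2, 0]);
        return day1.CompareTo(day2);
    }
}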
Note that this includes a reference to the 2D array to be sorted, and parses the elements for sorting as necessary.
Then you need to create a 1D array of indices, which is what you are actually going to sort. (You can't sort a 2D array, but you CAN sort a 1D array of indices that reference the rows of the 2D array.)
// Requires: using System; using System.Linq;
public static string[,] SortData(string[,] data)
{
    // Row indices 0..rowCount-1; these are what actually gets sorted.
    int[] indexer = Enumerable.Range(0, data.GetLength(0)).ToArray();
    var comparer = new WeatherComparer(data);
    Array.Sort(indexer, comparer);
    string[,] result = new string[data.GetLength(0), data.GetLength(1)];
    for (int row = 0; row < indexer.Length; ++row)
    {
        // After sorting, indexer[row] is the source row that belongs at position 'row'.
        int source = indexer[row];
        for (int col = 0; col < data.GetLength(1); ++col)
            result[row, col] = data[source, col];
    }
    return result;
}
Then you can call SortData to sort the data:
public static void Main()
{
    string[,] weatherDataArray = new string[6, 5]
    {
        { "3", "20", "0", "12", "cumulus" },
        { "1", "20", "0", "11", "none" },
        { "23", "15", "0", "8", "none" },
        { "4", "12", "0", "1", "cirrus" },
        { "12", "20", "0", "12", "cumulus" },
        { "9", "15", "2", "11", "none" }
    };
    var sortedWeatherData = SortData(weatherDataArray);
    // Print each sorted row on its own line, values separated by ", ".
    for (int i = 0; i < sortedWeatherData.GetLength(0); ++i)
    {
        for (int j = 0; j < sortedWeatherData.GetLength(1); ++j)
            Console.Write(sortedWeatherData[i, j] + ", ");
        Console.WriteLine();
    }
}
Output:
1, 20, 0, 11, none,
3, 20, 0, 12, cumulus,
12, 20, 0, 12, cumulus,
9, 15, 2, 11, none,
23, 15, 0, 8, none,
4, 12, 0, 1, cirrus,
Note that this code does not contain any error checking - it assumes there are no nulls in the data, and that all the parsed data is in fact parsable. You might want to add appropriate error handling.
Try it on .NET Fiddle: https://dotnetfiddle.net/mwXyMs
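On the error-handling note above: one possible way to check the data before sorting is to use TryParse on the two sort columns and collect the rows that fail. This is only a sketch; RowValidation, TryGetSortKeys and FindInvalidRows are hypothetical names, and it assumes the day is in column 0 and the temperature in column 1, written in a culture-invariant format.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for validating the sort columns before sorting.
public static class RowValidation
{
    // Returns true and the parsed keys when both sort columns are valid.
    // A null or empty cell makes TryParse return false rather than throw.
    public static bool TryGetSortKeys(string[,] data, int row, out int day, out double temperature)
    {
        day = 0;
        temperature = 0;
        return int.TryParse(data[row, 0], NumberStyles.Integer, CultureInfo.InvariantCulture, out day)
            && double.TryParse(data[row, 1], NumberStyles.Float, CultureInfo.InvariantCulture, out temperature);
    }

    // Lists the indices of rows whose day or temperature cannot be parsed.
    public static List<int> FindInvalidRows(string[,] data)
    {
        var invalid = new List<int>();
        for (int r = 0; r < data.GetLength(0); r++)
        {
            if (!TryGetSortKeys(data, r, out _, out _))
                invalid.Add(r);
        }
        return invalid;
    }
}
```

What to do with the invalid rows (reject the whole batch, log them, or sort them to the end) depends on the rules of the system that consumes the array, so that decision is left out here.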