What you need to know:
My application uses a database of foods, which exists in a .txt file. Each food has about 170 data values (2-3 digit numbers) separated by tabs, and the foods are separated by \n, so each line in the .txt file holds the data for one food.
The application's target platform is Android, it needs to work offline, and I use Unity with C# for coding.
My two problems are:
- Getting access to the .txt file
As it is not possible for Android applications to access a .txt file via
$"{Application.dataPath}/textFileName.txt"
I assigned the .txt file as a TextAsset (name: txtFile) in the Inspector. When the app is started for the first time, I load all the data of the TextAsset into a JSON (name: jsonStringList), which contains a List of strings:
for (int i = 0; i < amountOfLinesInTextFile; i++)
{
    // Split('\n') re-splits the entire file text on every iteration
    jsonStringList.Add(txtFile.text.Split('\n')[i]);
}
Technically that does work, but unfortunately the txtFile has a total of about 15000 lines, which makes it really slow (Stopwatch time for the for-loop: ≈750000 ms, which is about 12.5 minutes...).
Obviously it is not an option to let the user wait that long when opening the app for the first time...
- Searching in that jsonStringList
In the app it is possible to make your own food by putting multiple foods together. To do that, the user has to search for a food and can then press the result to add it.
Currently I check in a for-loop whether the input of the user's search bar InputField (name: searchBar) matches a food in the jsonStringList and whether that food is not already displayed. If both are true, I add the name of the food to a List<string> (name: results), which is what I use to display the matching foods. (As the data values of a food, including the name, are separated by tabs, I use .Split('\t') to get the correct data for the name of the food.)

for (int i = 0; i < amountOfLinesInTextFile; i++)
{
    string name = jsonStringList[i].Split('\t')[nameIndex].ToLower();
    if (name.Equals(searchBar.text.ToLower()) && !results.Contains(name))
    {
        results.Add(name);
    }
}
Again: that technically works, but it is also too slow, even though it's much faster than problem 1 (Stopwatch time for the for-loop: ≈1600 ms).
I'd be very happy about any help to improve the time of those two actions! Maybe there is a whole different approach to handling such large .txt files, but every bit of decreased time would be helpful!
CodePudding user response:
15000 lines is not a big file, really. You just do too many unnecessary readings/transformations. You need to do the work once, cache it (save it in a variable, in your case), and reuse it:
var foodIndex = txtFile
    .text
    .Split('\n')                // get rows
    .Select(x => x.Split('\t')) // get columns for each row
    .ToDictionary(x => x[nameIndex], StringComparer.OrdinalIgnoreCase); // build case-insensitive search index

var myFood = foodIndex["aPpLe"];

This produces a Dictionary<string, string[]> search index.
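Built once per launch (for example in Awake) and kept in a field, this should also fix the first problem: the file is split a single time instead of being re-split for every one of the 15000 lines. A minimal sketch of the caching pattern in a Unity component - the class and field names here are illustrative, and it assumes food names are unique, since ToDictionary throws on duplicate keys:

using System;
using System.Collections.Generic;
using System.Linq;
using UnityEngine;

public class FoodDatabase : MonoBehaviour
{
    public TextAsset txtFile;        // assigned in the Inspector, as before
    private const int nameIndex = 0; // whichever column holds the food name

    private Dictionary<string, string[]> foodIndex;

    void Awake()
    {
        // One pass over the whole file: split into rows, then into columns.
        // Note: trim '\r' from each line if the file has Windows line endings.
        foodIndex = txtFile.text
            .Split('\n')
            .Select(line => line.Split('\t'))
            .ToDictionary(cols => cols[nameIndex], StringComparer.OrdinalIgnoreCase);
    }
}

If names can repeat in the file, a GroupBy before ToDictionary (keeping, say, the first row per name) avoids the duplicate-key exception.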
Better approach
Deserialize the CSV format (your file is obviously a CSV table, just tab-separated) into a POCO row:
using System.Runtime.Serialization; // for [DataMember]

public class Food
{
    [DataMember(Order = 1)] // here is your nameIndex
    public string Name { get; set; }

    [DataMember(Order = 2)]
    public int Amount { get; set; }

    //...
}
var foodIndex = SomeCSVParse<Food>(txtFile.text)
    .ToDictionary(x => x.Name, StringComparer.OrdinalIgnoreCase);

var myFood = foodIndex["aPpLe"];
This produces a Dictionary<string, Food> search index, which looks better and is easier to use.
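Wired into the search code from the question, the whole per-search loop then collapses to a single dictionary lookup (a sketch reusing the question's searchBar and results names):

Food food;
if (foodIndex.TryGetValue(searchBar.text, out food) && !results.Contains(food.Name))
{
    results.Add(food.Name); // O(1) lookup instead of scanning all 15000 rows
}

The OrdinalIgnoreCase comparer already makes the lookup case-insensitive, so the ToLower() calls are no longer needed.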
This way all the conversion from string to int/double/datetime/etc., the order of columns, the separators (comma, tab, whitespace), cultures (in case there are float/double values), efficient reading, headers, etc. can simply be handed off to a 3rd-party framework. Someone did this here - Parsing CSV files in C#, with header
There is also a plethora of frameworks on NuGet; just pick whatever is small/popular, or copy-paste from the sources - https://www.nuget.org/packages?q=CSV
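As one concrete possibility - a sketch assuming the CsvHelper package, one popular result of that search; other libraries look similar - parsing the tab-separated text into Food rows could look like this. Note that CsvHelper uses its own [Index] attribute for column order instead of [DataMember(Order = ...)]:

using System;
using System.Globalization;
using System.IO;
using System.Linq;
using CsvHelper;
using CsvHelper.Configuration;
using CsvHelper.Configuration.Attributes;

public class Food
{
    [Index(0)] public string Name { get; set; }
    [Index(1)] public int Amount { get; set; }
    //...
}

// ...

var config = new CsvConfiguration(CultureInfo.InvariantCulture)
{
    Delimiter = "\t",        // tab-separated values
    HasHeaderRecord = false  // the .txt file has no header row
};

using (var reader = new StringReader(txtFile.text))
using (var csv = new CsvReader(reader, config))
{
    var foodIndex = csv.GetRecords<Food>()
        .ToDictionary(f => f.Name, StringComparer.OrdinalIgnoreCase);
}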
And read more about data structures in C# - https://docs.microsoft.com/en-us/dotnet/standard/collections/