Home > Blockchain >  How to speed up searching in List<string> with Regex
How to speed up searching in List<string> with Regex

Time:02-16

yes, I search any name/surname/patronymic starts with "jo"

Taking regex out might simplify that. If we upgrade FullName with a method that returns true if any of the names it knows start with any of the prefixes it is given:

public struct FullName
{
    public string Name;
    public string Surname;
    public string Patronymic;

    public override string ToString()
    {
        var str = string.Join(' ', Surname?.Trim(), Name?.Trim(), Patronymic?.Trim()).Trim();
        return str;
    }
}

And we then use it to filter the list:

//speed - 65902
public List<string> Search(string prefix)
{
    var items = _fullNames.ToArray();
    List<string> result = new List<string>();

    if (string.IsNullOrEmpty(prefix) || string.IsNullOrWhiteSpace(prefix))
        throw new ArgumentNullException();

    if (prefix.Length > 100)
        throw new ArgumentException();


    for(int i = 0; i < items.Length; i  )
    {
        if ((items[i].Surname != null | items[i].Name != null | items[i].Patronymic != null) && items[i].ToString().Contains(prefix))
            result.Add(items[i].ToString());
    }
    return result;
}

I moved some of your code around - better to check input is null before you split it than after..

CodePudding user response:

yes, I search any name/surname/patronymic starts with "jo"

Taking regex out might simplify that. If we upgrade FullName with a method that returns true if any of the names it knows start with any of the prefixes it is given:

public class FullName
{
    private string[] _names = new string[3];

    public string Name { get => _names[0]; set => _names[0] = value?.Trim(); }
    public string Surname { get => _names[1]; set => _names[1] = value?.Trim(); }
    public string Patronymic { get => _names[2]; set => _names[2] = value?.Trim(); }

        
    public string GetPersonalDataAsString()
    {
        var str = string.Join(' ', Surname, Name, Patronymic).Trim();
        return str;
    }
    public bool AnyStartsWithAny(string[] prefixes){
        return _names.Any(n => prefixes.Any(p => n?.StartsWith(p, StringComparison.OrdinalIgnoreCase) ?? false));
    }
}

And we then use it to filter the list:

public List<string> Search(string prefixStr)
{
    if (string.IsNullOrEmpty(prefixStr))
        throw new ArgumentNullException(nameof(prefixStr));

    var prefixes = prefixStr.Split();

    if (prefixes.Length > 100)
        throw new ArgumentException("Cannot supply more than 100 prefixes; be more decisive");

    return _fullnames.Where(fn => fn.AnyStartsWithAny(prefixes)).ToList();

}

I moved some of your code around - better to check input is null before you split it than after..

CodePudding user response:

Since you only need to perform a StartsWith, you do not need regex. Instead, you can do something of the like of

.Any(f => (f != null) && f.StartsWith(prefix))

for your checking. If you use a database, then it is highly advisable that you index your table by your sort criteria, that would boost up speed for real.

If this is not in a database, then, instead of an array as data-structure, you could have a tree, of which each node is a letter. So, if you search for jo, then you would find the child in the root whose value is j and then find the child of j whose value is o and then flatten its subtree to get the results.

  • Related