Remove non-numeric characters from string in Delphi

Time:01-19

I have these three functions that successfully remove all non-numeric characters from a given string:

The first function loops through the characters of the input string, and if the current character is a number, it adds it to a new string that is returned as the result of the function.

  function RemoveNonNumericChars(const s: string): string;
  begin
    Result := '';
    for var i := 1 to Length(s) do
    begin
      if s[i] in ['0'..'9'] then
        Result := Result + s[i];
    end;
  end;

The second function loops through the characters of the input string from right to left, and if the current character is not a number, it uses the Delete function to remove it from the string.

  function RemoveNonNumericChars(const s: string): string;
  begin
    Result := s;
    for var i := Length(Result) downto 1 do
    begin
      if not(Result[i] in ['0'..'9']) then
        Delete(Result, i, 1);
    end;
  end;

The third function uses a regular expression to replace all non-numeric characters with nothing, thus removing them. TRegEx is from the System.RegularExpressions unit.

  function RemoveNonNumericChars(const s: string): string;
  begin
    var RegEx := TRegEx.Create('[^0-9]');
    Result := RegEx.Replace(s, '');
  end;

All three of them do what I need, but I want to know if there is maybe a built-in function in Delphi for this... Or maybe even a better way to do it than the way I'm doing it. What's the best and/or fastest way to remove non-numeric characters from a string in Delphi?

Answer:

Your first two approaches are slow because they constantly change the length of the string (one append or one Delete call per character). Also, all three only recognise the ASCII digits 0 to 9 (Western Arabic digits).

To solve the performance issue, preallocate the maximum result length:

function RemoveNonDigits(const S: string): string;
begin
  SetLength(Result, S.Length);
  var LActualLength := 0;
  for var i := 1 to S.Length do
    if CharInSet(S[i], ['0'..'9']) then
    begin
      Inc(LActualLength);
      Result[LActualLength] := S[i];
    end;
  SetLength(Result, LActualLength);
end;
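
As a quick check of the preallocating version, a minimal console sketch could look like the following. The program name and the sample inputs are made up for illustration, and it assumes Delphi 10.3 or later because of the inline variable declarations.

```pascal
program RemoveNonDigitsDemo; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  System.SysUtils; // CharInSet and the string helper (S.Length)

function RemoveNonDigits(const S: string): string;
begin
  // Reserve room for the worst case: every character is a digit.
  SetLength(Result, S.Length);
  var LActualLength := 0;
  for var i := 1 to S.Length do
    if CharInSet(S[i], ['0'..'9']) then
    begin
      Inc(LActualLength);
      Result[LActualLength] := S[i];
    end;
  // Trim the unused tail with a single length change.
  SetLength(Result, LActualLength);
end;

begin
  // Sample inputs are illustrative only.
  Writeln(RemoveNonDigits('Tel: +1 (555) 010-9999')); // prints 15550109999
  Writeln(RemoveNonDigits('no digits here') = '');    // prints TRUE
end.
```

The string length is set only twice per call, regardless of how many characters are dropped.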

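To support non-Arabic digits, use IsDigit from the System.Character unit, called here through the Char helper as S[i].IsDigit (the counterpart of TCharacter.IsDigit):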

function RemoveNonDigits(const S: string): string;
begin
  SetLength(Result, S.Length);
  var LActualLength := 0;
  for var i := 1 to S.Length do
    if S[i].IsDigit then
    begin
      Inc(LActualLength);
      Result[LActualLength] := S[i];
    end;
  SetLength(Result, LActualLength);
end;
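
To see the difference between the two checks, the sketch below runs both on a string that mixes ASCII digits with Arabic-Indic digits (U+0661 and U+0662). The function names KeepAsciiDigits and KeepUnicodeDigits, the program name, and the sample string are hypothetical; the sketch assumes that IsDigit accepts any character in the Unicode decimal digit category.

```pascal
program UnicodeDigitsDemo; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  System.SysUtils,   // CharInSet and the string helper (S.Length)
  System.Character;  // Char helper that provides IsDigit

const
  // 'a', ARABIC-INDIC DIGIT ONE, ARABIC-INDIC DIGIT TWO, 'b', '3'
  Sample = 'a'#$0661#$0662'b3';

// Hypothetical name: the CharInSet variant, keeps only '0'..'9'.
function KeepAsciiDigits(const S: string): string;
begin
  SetLength(Result, S.Length);
  var LActualLength := 0;
  for var i := 1 to S.Length do
    if CharInSet(S[i], ['0'..'9']) then
    begin
      Inc(LActualLength);
      Result[LActualLength] := S[i];
    end;
  SetLength(Result, LActualLength);
end;

// Hypothetical name: the IsDigit variant, keeps Unicode decimal digits.
function KeepUnicodeDigits(const S: string): string;
begin
  SetLength(Result, S.Length);
  var LActualLength := 0;
  for var i := 1 to S.Length do
    if S[i].IsDigit then
    begin
      Inc(LActualLength);
      Result[LActualLength] := S[i];
    end;
  SetLength(Result, LActualLength);
end;

begin
  // Lengths are printed instead of the strings themselves, because many
  // consoles cannot display the Arabic-Indic digits.
  Writeln(Length(KeepAsciiDigits(Sample)));   // only '3' survives
  Writeln(Length(KeepUnicodeDigits(Sample))); // should also count the two Arabic-Indic digits
end.
```

Note that the result of the IsDigit variant can contain digits outside '0'..'9', so it suits display or filtering better than code that expects ASCII digits afterwards.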

To optimise even further, as suggested by Stefan Glienke, you can bypass the RTL's string handling machinery and write each character directly with some loss of code readability:

function RemoveNonDigits(const S: string): string;
begin
  SetLength(Result, S.Length);
  var ResChr := PChar(Result);
  var LActualLength := 0;
  for var i := 1 to S.Length do
    if CharInSet(S[i], ['0'..'9']) then
    begin
      Inc(LActualLength);
      ResChr^ := S[i];
      Inc(ResChr);
    end;
  SetLength(Result, LActualLength);
end;
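
Whether the pointer version is worth the lost readability is best measured on your own data. The following is a rough timing sketch, not a rigorous benchmark: the names RemoveByConcat, RemoveByPointer and RunTimings, the test input, and the round count are all made up for illustration, and the figures it prints will vary with compiler version, memory manager, and input.

```pascal
program RemoveNonDigitsTiming; // hypothetical program name

{$APPTYPE CONSOLE}

uses
  System.SysUtils,     // CharInSet and the string helper (S.Length)
  System.Diagnostics;  // TStopwatch

// Hypothetical name: the concatenating approach from the question.
function RemoveByConcat(const S: string): string;
begin
  Result := '';
  for var i := 1 to S.Length do
    if CharInSet(S[i], ['0'..'9']) then
      Result := Result + S[i];
end;

// Hypothetical name: the pointer-based approach shown above.
function RemoveByPointer(const S: string): string;
begin
  SetLength(Result, S.Length);
  var ResChr := PChar(Result);
  var LActualLength := 0;
  for var i := 1 to S.Length do
    if CharInSet(S[i], ['0'..'9']) then
    begin
      Inc(LActualLength);
      ResChr^ := S[i];
      Inc(ResChr);
    end;
  SetLength(Result, LActualLength);
end;

procedure RunTimings;
const
  Rounds = 200; // arbitrary repeat count
begin
  // Arbitrary test input: a short mixed pattern repeated many times.
  var Input := '';
  for var i := 1 to 50000 do
    Input := Input + 'ab12-cd34 ';

  // Both versions must agree before timing them makes sense.
  Assert(RemoveByConcat(Input) = RemoveByPointer(Input));

  var SW := TStopwatch.StartNew;
  for var n := 1 to Rounds do
    RemoveByConcat(Input);
  Writeln('concat : ', SW.ElapsedMilliseconds, ' ms');

  SW := TStopwatch.StartNew;
  for var m := 1 to Rounds do
    RemoveByPointer(Input);
  Writeln('pointer: ', SW.ElapsedMilliseconds, ' ms');
end;

begin
  RunTimings;
end.
```

Build in Release configuration when comparing, since debug settings can distort the relative timings.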