Home > other >  C# Not able to extract cell address with $ sign from formula
C# Not able to extract cell address with $ sign from formula

Time:07-22

I have a excel formula and i am extracting list of cell address from that formula and store in list. my code is working fine but when there is $ sign with cell address then regex fail to extract that cell address.

here is my sample code

List<string> lstConstituent = null;
string formula "=S$10 X11";
lstConstituent = e.Cell.Formula.GetFormulaConstituents().Distinct().ToList();

Here is body of GetFormulaConstituents() function

public static List<string> GetFormulaConstituents(this string formula)
{
    List<string> strCellAddress = new List<string>();

    string rxCellPattern = @"(?<![$])       # match if prefix is absent: $ symbol (prevents matching $A1 type of cells)
                                    # (if all you have is $A$1 type of references, and not $A1 types, this negative look-behind isn't needed)
                    \b              # word boundary (prevents matching Excel functions with a similar pattern to a cell)
                    (?<col>[A-Z] )  # named capture group, match uppercase letter at least once
                                    # (change to [A-Za-z] if you have lowercase cells)
                    (?<row>\d )     # named capture group, match a number at least once
                    \b              # word boundary
                    ";


    Regex rxCell = new Regex(rxCellPattern, RegexOptions.IgnorePatternWhitespace);

    if (rxCell.IsMatch(formula))
    {
        //Console.WriteLine("Formula: {0}", formula);
        foreach (Match cell in rxCell.Matches(formula))
        {
            strCellAddress.Add(cell.Value);
        }
    }

    return strCellAddress;
}

Though I removed this (?<![$]) from regex but still could not extract the cell address S$10 from formula.

what to alter in my routine? as a result i should be able to extract cell address with $ sign.

CodePudding user response:

You can match an optional dollar sign in in the first group for example, without using a lookbehind assertion:

\b(?<col>[A-Z] \$?)(?<row>\d )\b

Regex demo

  • Related