Home > Blockchain >  Google sheets - Extract numbers with their units of measurment
Google sheets - Extract numbers with their units of measurment

Time:11-18

I want a function that can extract numbers with their units of measurment from a text.

For example in A2 i have:

This box weights 5kg and the other box weights 10 kg.

So i want a function that will return:

5kg 10kg

NOTE: I want the function to work with any unit of measurment, not just "kg".

I am a begginer in google sheets so it would be really helpful if you could provide me with a working function.

CodePudding user response:

You can use this sample custom function that extracts words that starts with a number followed by a character/s.

/**
 * @customfunction
 */
function EXTRACTMEASUREMENT(input) {
  // match all words which starts with a number
  var result = input.match(/\d [a-zA-Z] \S*/g)
  // combine array into a string separated with spaces
  result = result.join(' ');
  // Remove special characters(except whitespace) in the string
  result = result.replace(/[^a-zA-Z0-9\s]/g, '')
  return result;
}

Output:

enter image description here

Limitations:

  • Measurements with spaces between the value and the unit cannot be detected. (See result in cell A5 when space exist in 10 kg.)
  • Regardless whether the character/s after the number is a valid unit or not, it will be extracted. (See result in cell A5 where 20yy is not a valid measurement unit)

Note:

This can be improved if you can list valid measurement units that should be supported.

CodePudding user response:

try:

=INDEX(SPLIT(FLATTEN(QUERY(TRANSPOSE(IFERROR(SUBSTITUTE(
 REGEXEXTRACT(SPLIT(REGEXREPLACE(A1:A, "(\d .\d |\d )", "×$1"), "×"), 
 TEXTJOIN("|", 1, {"\d .\d  ","\d .\d ","\d  ","\d "}&
 SORT({"nm";"mm";"cm";"dm";"km";"m";"t";"kg";"dg";"g";"l";"ml";"dl"}, 
  LEN({"nm";"mm";"cm";"dm";"km";"m";"t";"kg";"dg";"g";"l";"ml";"dl"}), 0))), 
 " ", ))),,9^9)), " "))

enter image description here

CodePudding user response:

One more option:

=ArrayFormula(IF(LEN(A:A),
  SPLIT(
   REGEXREPLACE(
    REGEXREPLACE(A:A,"("&
      REGEXREPLACE(
        REGEXREPLACE(
         REGEXREPLACE(A:A,"(\d [.,]*\d*(?:\w |\s\w ))",""),
        "\s \.","|\\."),
      "\s ","|")
    &")",""),
   "(\d)\s","$1")
 ," ",,1)
,))

enter image description here

CodePudding user response:

Try

=arrayformula(substitute(transpose(query(flatten(split(
REGEXREPLACE(A1,"([0-9.,/] [ ]{0,1}[a-z1-3/.\-""] )","♣♦$1♣")
,"♣")),"select * where Col1 like '♦%' ")),"♦",""))

enter image description here

  • Related