I want a function that can extract numbers with their units of measurment from a text.
For example in A2 i have:
This box weights 5kg and the other box weights 10 kg.
So i want a function that will return:
5kg 10kg
NOTE: I want the function to work with any unit of measurment, not just "kg".
I am a begginer in google sheets so it would be really helpful if you could provide me with a working function.
CodePudding user response:
You can use this sample custom function that extracts words that starts with a number followed by a character/s.
/**
* @customfunction
*/
function EXTRACTMEASUREMENT(input) {
// match all words which starts with a number
var result = input.match(/\d [a-zA-Z] \S*/g)
// combine array into a string separated with spaces
result = result.join(' ');
// Remove special characters(except whitespace) in the string
result = result.replace(/[^a-zA-Z0-9\s]/g, '')
return result;
}
Output:
Limitations:
- Measurements with spaces between the value and the unit cannot be detected. (See result in cell
A5
when space exist in10 kg.
) - Regardless whether the character/s after the number is a valid unit or not, it will be extracted. (See result in cell
A5
where20yy
is not a valid measurement unit)
Note:
This can be improved if you can list valid measurement units that should be supported.
CodePudding user response:
try:
=INDEX(SPLIT(FLATTEN(QUERY(TRANSPOSE(IFERROR(SUBSTITUTE(
REGEXEXTRACT(SPLIT(REGEXREPLACE(A1:A, "(\d .\d |\d )", "×$1"), "×"),
TEXTJOIN("|", 1, {"\d .\d ","\d .\d ","\d ","\d "}&
SORT({"nm";"mm";"cm";"dm";"km";"m";"t";"kg";"dg";"g";"l";"ml";"dl"},
LEN({"nm";"mm";"cm";"dm";"km";"m";"t";"kg";"dg";"g";"l";"ml";"dl"}), 0))),
" ", ))),,9^9)), " "))
CodePudding user response:
One more option:
=ArrayFormula(IF(LEN(A:A),
SPLIT(
REGEXREPLACE(
REGEXREPLACE(A:A,"("&
REGEXREPLACE(
REGEXREPLACE(
REGEXREPLACE(A:A,"(\d [.,]*\d*(?:\w |\s\w ))",""),
"\s \.","|\\."),
"\s ","|")
&")",""),
"(\d)\s","$1")
," ",,1)
,))
CodePudding user response:
Try
=arrayformula(substitute(transpose(query(flatten(split(
REGEXREPLACE(A1,"([0-9.,/] [ ]{0,1}[a-z1-3/.\-""] )","♣♦$1♣")
,"♣")),"select * where Col1 like '♦%' ")),"♦",""))