I loaded data from an XML file into a column. The column holds a two-dimensional array such as [3.0, 1], [3.0, 2], [3.0, 3], where the trailing 1, 2, 3 in each pair is its index. I simulated the data with the following assignment:
scala> val df1 = sc.parallelize(List(("[34.0, 1], [34.0, 2], [175.0, 3]", 30), ("[3.0, 1], [3.0, 2], [3.0, 3]", 36), ("[127.0, 1], [127.0, 2], [127.0, 3]", 27))).toDF("infoComb", "age")
df1: org.apache.spark.sql.DataFrame = [infoComb: string, age: int]
scala> df1.show(false)
+----------------------------------+---+
|infoComb                          |age|
+----------------------------------+---+
|[34.0, 1], [34.0, 2], [175.0, 3]  |30 |
|[3.0, 1], [3.0, 2], [3.0, 3]      |36 |
|[127.0, 1], [127.0, 2], [127.0, 3]|27 |
+----------------------------------+---+
My question is: in Scala, how do I extract the 127.0 from [127.0, 1]?
Thanks in advance to anyone who can take a look.
CodePudding user response:
Although this is simulated, the data types differ: in the database, the infoComb field is an array.
CodePudding user response:
You may want to try a regular expression.
CodePudding user response:
Following up on my earlier answer: since Spark assigns a defined type to the column, you can of course traverse it.
As I recall, an array column comes back as a wrapped sequence (something like WrappedArray).
You can try a map operation on the DataFrame.
Use the getAs[T] function to convert the column you need to the appropriate sequence type.
Once the type is converted, you can call whatever functions you need on the values (for example, numeric conversions).
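The map plus getAs[T] approach described above might look roughly like the sketch below. It assumes Spark 2.x or later (SparkSession) and that infoComb really is an array of arrays of doubles, as the first reply says it is in the database. The object name FirstValueFromArray, the helper firstOf, and the sample rows are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object FirstValueFromArray {
  // Hypothetical helper: first element of the first inner pair, if any.
  def firstOf(pairs: scala.collection.Seq[scala.collection.Seq[Double]]): Option[Double] =
    pairs.headOption.flatMap(_.headOption)

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("first-value-from-array")
      .getOrCreate()
    import spark.implicits._

    // Assumption: the real column is array<array<double>>, not a string.
    val df = Seq(
      (Seq(Seq(34.0, 1.0), Seq(34.0, 2.0), Seq(175.0, 3.0)), 30),
      (Seq(Seq(3.0, 1.0), Seq(3.0, 2.0), Seq(3.0, 3.0)), 36),
      (Seq(Seq(127.0, 1.0), Seq(127.0, 2.0), Seq(127.0, 3.0)), 27)
    ).toDF("infoComb", "age")

    // Read the column as a nested sequence, then take the value of the first pair.
    // Rows where infoComb is null are not handled in this sketch.
    val firsts = df.map { row =>
      val pairs = row.getAs[scala.collection.Seq[scala.collection.Seq[Double]]]("infoComb")
      firstOf(pairs).getOrElse(Double.NaN)
    }

    firsts.show()
    spark.stop()
  }
}
```

If the column is a true array type, a column expression such as col("infoComb").getItem(0).getItem(0) should give the same value without going through map. On Spark 1.x, df.map returns an RDD rather than a Dataset, so the last step would differ.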
If you use the IntelliJ IDEA editor, it can display the type directly:
for example, place the cursor on the object and press Alt + Enter,
and it will show the object's type.
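For the string column simulated in the question, the regular-expression suggestion from the second reply could be sketched as follows. This is plain Scala with no Spark dependency. The object name FirstValueRegex and the pattern are assumptions for illustration, and the pattern only covers plain decimal numbers like those in the sample data.

```scala
object FirstValueRegex {
  // Matches the first "[<number>," and captures the number.
  private val FirstNumber = """\[\s*(-?\d+(?:\.\d+)?)\s*,""".r

  // Returns the first value of the first pair, or None if nothing matches.
  def firstValue(infoComb: String): Option[Double] =
    FirstNumber.findFirstMatchIn(infoComb).map(_.group(1).toDouble)

  def main(args: Array[String]): Unit = {
    val rows = List(
      "[34.0, 1], [34.0, 2], [175.0, 3]",
      "[3.0, 1], [3.0, 2], [3.0, 3]",
      "[127.0, 1], [127.0, 2], [127.0, 3]"
    )
    rows.foreach(r => println(firstValue(r)))
  }
}
```

Inside Spark, the same pattern could probably be passed to regexp_extract from org.apache.spark.sql.functions with group index 1, followed by a cast to double.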