Spark does not show all content in a line

Time:05-18

I have a DataFrame with the following content:

scala> true_nomar.show(1)
+--------+--------------+--------------------+------+------+--------------------+
|category|topicUpPredict|               topic|ciTrue|upTrue|              normal|
+--------+--------------+--------------------+------+------+--------------------+
|the_thao|      the_thao|[the_thao, the_gioi]|  true|  true| Khi các mục sư m...|
+--------+--------------+--------------------+------+------+--------------------+
only showing top 1 row

But when I show the untruncated content, the text in the normal column is not displayed in full, and the other columns show no content at all:

scala> true_nomar.show(1,false)
 -------- -------------- -------------------- ------ ------ -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
|category|topicUpPredict|topic               |ciTrue|upTrue|normal                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              |
 -------- -------------- -------------------- ------ ------ -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
Thích thú trước hai vị học trò đặc biệt này, ông Eriksson nói: "Bóng đá cần nhiều người như là hai vị mục sư Charles và Tim để tạo cho trẻ em thật nhiều cơ hội đến với bóng đá”. Thậm chí Geoff Hurst, cựu ngôi sa|ổi lại, hai mục sư Crosland và Smith cùng các con chiên sẽ cầu nguyện cho đội tuyển Anh trong VCK World Cup 2006 mà trước mắt là cầu nguyện cho chấn thương của tiền đạo Michael Owen sớm hồi phục. 
 -------- -------------- -------------------- ------ ------ -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- 
only showing top 1 row

Answer:

This is most likely due to one or more carriage return (CR) symbols (\r in Scala string literals) embedded somewhere in the text. When a CR is encountered, the terminal moves the caret to the beginning of the line, which messes up the output:

scala> "123\r456"
4560: String = 123

Here, the output should be res0: String = 123..., but the caret position gets reset after 123 and 456 overwrites res. The same happens when a dataframe is printed:

scala> Seq(("baz", "foofoofoo\rbarbar")).toDF("cat", "normal").show(false)
+---+----------------+
|cat|normal          |
+---+----------------+
barbar|ofoofoo
+---+----------------+
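
The overwriting described above can be reproduced without a terminal. The following is a minimal sketch in plain Scala; renderWithCR is a hypothetical helper written for this illustration (it is not part of Spark or the Scala library) and it models only the CR behaviour, not any other control character:

```scala
// Hypothetical helper: mimics how a terminal draws one line that contains
// carriage returns. A CR moves the write position back to column 0, and the
// characters that follow overwrite whatever was already printed there.
def renderWithCR(line: String): String = {
  val screen = new StringBuilder
  var col = 0
  for (ch <- line) {
    if (ch == '\r') {
      col = 0                                   // jump back to the line start
    } else {
      if (col < screen.length) screen.setCharAt(col, ch) // overwrite
      else screen.append(ch)                             // extend the line
      col += 1
    }
  }
  screen.toString
}

// What the REPL and show(false) effectively send to the terminal:
println(renderWithCR("res0: String = 123\r456"))  // 4560: String = 123
println(renderWithCR("|baz|foofoofoo\rbarbar|"))  // barbar|ofoofoo
```

Both results match the garbled lines in the transcripts: the text after the CR is still there, it has simply been drawn on top of the start of the line.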

If you look closer at your output, you'll find the closing |, so it is full text, just garbled:

--------------------

--------------------
 cựu ngôi sa|ổi lại,
--------------------
            ^
            ^
   end of "normal" column

Use regexp_replace($"normal", "\r", "\\\\r") to replace all CRs with the escaped representation \r:

scala> val df = Seq(("baz", "foofoofoo\rbarbar")).toDF("cat", "normal")
df: org.apache.spark.sql.DataFrame = [cat: string, normal: string]

scala> df.show(false)
+---+----------------+
|cat|normal          |
+---+----------------+
barbar|ofoofoo
+---+----------------+


scala> df.withColumn("normal", regexp_replace($"normal", "\r", "\\\\r")).show(false)
+---+-----------------+
|cat|normal           |
+---+-----------------+
|baz|foofoofoo\rbarbar|
+---+-----------------+
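
The four backslashes in the replacement string are easy to get wrong. As a rough illustration, the same escaping rules can be tried with String.replaceAll in plain Scala, on the assumption that Spark's regexp_replace follows the same java.util.regex replacement semantics (the identical result in the transcript above suggests it does). The names raw, escaped and swallowed are made up for this sketch:

```scala
val raw = "foofoofoo\rbarbar"   // 16 characters, one of them a real CR

// Scala literal "\\\\r" is the three characters \ \ r. In a regex replacement
// string, \\ stands for one literal backslash, so each CR becomes the two
// visible characters \ and r.
val escaped = raw.replaceAll("\r", "\\\\r")

// Scala literal "\\r" is the two characters \ r. In a regex replacement string
// a single backslash only escapes the next character, so the CR is replaced
// by a plain r and the backslash is lost.
val swallowed = raw.replaceAll("\r", "\\r")

println(escaped)    // foofoofoo\rbarbar
println(swallowed)  // foofoofoorbarbar
```

With the escaped form the column value grows from 16 to 17 characters, which is why the normal column in the last table is one dash wider than before.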