A text file with 40 billion lines: how to quickly calculate the value distribution of each field

Time:09-23

I have a text file of structured data: five columns, separated by vertical bars (|), 40 billion lines in total.
I want to analyze the value distribution of each column and the proportion of non-empty values. Which technology would be fastest for this?

I was thinking of doing this in Python, but I don't know whether that is the most efficient option. Are there other processing technologies better suited to this?
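
To make the Python idea concrete, here is a minimal sketch of a single streaming pass that counts the values in each column and works out the non-empty proportion. It is only an illustration under some assumptions: the delimiter is taken to be "|", a missing trailing field is treated as empty, and each column is assumed to have few enough distinct values for its counts to fit in memory. The name column_stats is made up for this example.

```python
from collections import Counter
from typing import Iterable, List, Tuple

NUM_COLUMNS = 5   # five columns, as stated in the question
DELIMITER = "|"   # assumption: fields are separated by a vertical bar


def column_stats(
    lines: Iterable[str],
    num_columns: int = NUM_COLUMNS,
    delimiter: str = DELIMITER,
) -> Tuple[List[Counter], List[float], int]:
    """Stream over lines once, counting how often each value occurs per column.

    Returns (per-column counters, per-column non-empty proportion, row count).
    Assumes the number of distinct values per column fits in memory.
    """
    counters = [Counter() for _ in range(num_columns)]
    total = 0
    for line in lines:
        fields = line.rstrip("\n").split(delimiter)
        # Pad short rows so that missing trailing fields are counted as empty.
        fields += [""] * (num_columns - len(fields))
        # zip stops at num_columns, so any extra fields are ignored.
        for counter, value in zip(counters, fields):
            counter[value] += 1
        total += 1
    # An empty string is treated as "empty"; everything else is non-empty.
    non_empty = [
        (total - counter[""]) / total if total else 0.0
        for counter in counters
    ]
    return counters, non_empty, total
```

For a real file this could be called as column_stats(f) inside a "with open(path) as f:" block, since a file object yields its lines one at a time. A single Python process will probably be slow on 40 billion lines, so one option is to split the file into chunks, run the same function on each chunk in parallel, and add the resulting Counter objects together (Counter supports +), summing the row counts separately.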