Very confused about big data programming and algorithms; asking the MapReduce and Spark experts for help
Time:09-23
I have only just started studying big data and writing programs for its algorithms, and there is a lot I do not understand. I thought of a problem today and am posting it here to ask for help. Thanks in advance.
Suppose there are 80 kinds of equipment, 10 fault types and 50,000 kinds of parts, with the data stored in a database. Using association rule mining (FP-growth, for example), I want to find the cases where a given fault type on a given kind of equipment involves a certain part more than N times, say recording every case with N > 600. This could be done by looping over select count(*) ... where queries against the database, but that takes 80 * 10 * 50000 iterations. It can still be done at this size, but as the number of dimensions and the counts grow, processing time and database I/O both become problems.
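For reference, the counting part of this problem might not need a loop of separate queries at all: one grouped query can return every (equipment, fault, part) combination above the threshold in a single pass. Below is a minimal sketch using Python's built-in sqlite3 module. The table name `repairs`, its three columns, the function name `frequent_triples` and the sample rows are all hypothetical, since the real schema is not given here.

```python
import sqlite3

def frequent_triples(rows, n):
    """Return (equipment, fault, part, count) for combinations seen more than n times.

    `rows` is an iterable of (equipment, fault, part) tuples. The `repairs`
    table and its columns are assumptions made for this illustration only.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE repairs (equipment TEXT, fault TEXT, part TEXT)")
    conn.executemany("INSERT INTO repairs VALUES (?, ?, ?)", rows)
    # One GROUP BY replaces the 80 * 10 * 50000 separate count(*) queries;
    # HAVING keeps only the combinations whose count exceeds the threshold.
    result = conn.execute(
        "SELECT equipment, fault, part, COUNT(*) AS cnt "
        "FROM repairs "
        "GROUP BY equipment, fault, part "
        "HAVING COUNT(*) > ? "
        "ORDER BY cnt DESC, equipment, fault, part",
        (n,),
    ).fetchall()
    conn.close()
    return result

# Made-up sample data, not real records.
sample = (
    [("E1", "F1", "P1")] * 3
    + [("E1", "F1", "P2")] * 1
    + [("E2", "F3", "P1")] * 2
)
print(frequent_triples(sample, 1))
```

With the threshold from the question, the call would be `frequent_triples(rows, 600)`. Whether this is fast enough on the real data depends on the table size and indexes, which are not known here.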
So I would like to ask how this kind of algorithm should be done: 1. How should the data in the database be handled? If it has to be converted into a mapping file, how do I do that? 2. If I use MapReduce, how do I implement it? 3. What if I use Spark? (Spark MLlib seems to have an FP-growth algorithm)
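To make questions 2 and 3 more concrete, here is a rough sketch of how the counting step could look in a map/reduce style, written in plain Python so it runs without a cluster. It only illustrates the map, shuffle and reduce phases. It is not Hadoop or Spark code, and all function names and sample records are hypothetical. In Spark the same shape would roughly correspond to a `map` followed by `reduceByKey` and a `filter` on an RDD.

```python
from collections import defaultdict
from itertools import chain

def map_record(record):
    """Map phase: emit ((equipment, fault, part), 1) for one input record."""
    equipment, fault, part = record
    yield (equipment, fault, part), 1

def shuffle(pairs):
    """Shuffle phase: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_group(key, values, n):
    """Reduce phase: sum the counts and keep the key only if the total exceeds n."""
    total = sum(values)
    if total > n:
        yield key, total

def run_job(records, n):
    """Run map, shuffle and reduce in one process and return {key: count}."""
    mapped = chain.from_iterable(map_record(r) for r in records)
    grouped = shuffle(mapped)
    output = {}
    for key, values in grouped.items():
        for out_key, total in reduce_group(key, values, n):
            output[out_key] = total
    return output

# Made-up sample records, not real data.
records = (
    [("E1", "F1", "P1")] * 3
    + [("E1", "F1", "P2")] * 1
    + [("E2", "F3", "P1")] * 2
)
print(run_job(records, 1))
```

This only covers the threshold count described above. Full association rule mining with FP-growth goes further, since it finds frequent item sets of any size rather than fixed triples.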
As a big data beginner I am very confused by all this. I hope the experts passing by can point me in the right direction, in as much detail as you can. Thank you.