Table (1) the sample data
Item ID
T1 A, B, C, D
T2. A, B, E, F
T3, A, C, E, F
T4 B, C, D, E,
...
Sample data tables (2)
Item ID
T1 A, B, C, E
T2. A, B, D, F
T3, A, C, E, K
T4 A, C, D, E,
...
Table (3) after the connection table
Item ID
T1 A, B, C, D, E,
T2. A, C, E, F, K
...
In the traditional table join operations, requires the time complexity is O (n2),
Requirements: design algorithm, and reduce the complexity of the join operation, improve efficiency,
CodePudding user response:
What is item? Four fields separated by commas or a field content?CodePudding user response:
Is a field of contentCodePudding user response:
Since is the database, or should be how to make effective index, effective as long as the index to reduce the complexity of O (n),Can consider to intercept respectively before 3 items and the last item, establish joint index, and then use the explain see scan lines
CodePudding user response:
If you don't consider the index, but the problem can be converted to merge two order table and remove duplicate elements, have what good method?CodePudding user response:
The same, also direct merger have to find out the first three of the same data, hide not open connections,As long as doing connection driver table O (n) is necessary, optimization is to find a way to avoid traverse table driven into O (n2), so the core still make effective index
Connection and then delete the duplicates will only more slowly
CodePudding user response:
Before my understanding of this subject is based on ID to connect the two tables, after the connection to merge, I this should understanding is wrong, please bosses to talk about your overall train of thought?CodePudding user response:
The complexity of the convenient to explain the n square is how come?CodePudding user response:
Imagine table 1 in table 2 are two text files, if let you manually for this match, what do you do,Removed first table 1 first, before going to the second list to find three same, if don't have to search, then read the data in table 2 from the beginning again, this complexity is n, the data in table 1 is n times, each one all check is n, n x
And then you can find using the search, is to take the first of A, B, C to directly search in table 2, 4 and see after finding matches, this operation according to the big O notation is O (1), ignored, table 1 data after each take check complexity is n,
Database to remove all kinds of optimization, the basic principle of connection is the same with the train of thought,
CodePudding user response:
Well ok, understand, bosses, is there no additional space or take up less space extra way?CodePudding user response:
No additional footprint is the reconstruction of table 2, interception of first three Settings made a field and a primary key, and item 4 as the second fieldCodePudding user response: