Time complexity of lookup and then insertion in a std::map<std::string, std::set<std::string&g-CodePudding

So I have a graph (std::map) that stores strings as the keys and the values are sets of strings (adjacency lists for each node in a graph):

std::map<std::string, std::set<std::string>> graph

My insertion code is simply:

graph[from].insert(to) where from and to are both strings. So I insert to in the set of strings associated at the key from.

For the average and worst-case time complexities for this insertion bit of code, I think they are O(x*log(V) * y*log(n)), where x is the length of the string from, y is the length of the string to, V is the number of nodes in the graph, and n is the number of edges in the graph (I included the string lengths because I think we also have to consider std::string comparisons which are O(length of either string))?

And the best-case time complexity is either the same as the average/worst-cases or it's O(x*y), which occurs if we get lucky and the root node of the red-black tree for the map and sets implementations are the strings from and to, respectively -- although, I'm the least confident about this one.

I'd appreciate some help. Thanks.

CodePudding user response：

Time complexity is in terms of something, for example key comparisons. Finding a key in a std::map or std::set is in O(log n) key comparisons. If your key comparison function compares two strings character by character, then you can say that finding a key in this case is in O(s log n) with s the average size of the key strings (and not just the key you are trying to find).

In your scenario, you first find from in O(s log V) and then insert to in O(t log N). s is the average size of the key strings in the map, t is the average size of the key strings in the specific set (the set that corresponds to from), V is the size of the map (or number of vertices), and N is the size of the specific set.

These are two separate steps, which is why I say "specific set" and not "all the sets in the map", and so the total time complexity is simply O(s log V) O(t log N) or O(s log V t log N).

If you don't know the characteristics of the set you will insert to in advance, then you could do some more averaging and end up with O(s log V u log(E/V)), where u is the average size of the key strings in all the sets in the map, and E is the number of these keys (number of edges). Reminder: this complexity is in terms of character comparisons. It may or may not be useful, especially because it has four variables.