im struggleling with some "best practice" idea's
only posting a small piece of the code the original is very and complicated.
See below a litte test function
TEST1 runs in 5ms
TEST2 runs in 1405ms
to me TEST2 feels like the best practice but the performace diffence are so big! in the my full code the Functions are in the Header file and the main in the source only the function will ever be writing to the "TEST123" the Main will only read it right after it is called, the code is not running 100000 times in the full code but around 24 times, but the faster the better (inverse kinematics of a 6 Axis Robot)
What is the best way to do this? Or is there even better ways?
Any advice is appreciated
double TEST123[12];
void TESTFUNTC1A(int ID) {
for (int x = 0; x < 12; x ) {
TEST123[x] = 1.123;
}
}
void TESTFUNTC1A() {
int64 starttimetest2 = timeSinceEpochMillisec();
vector<double> TEST125(12);
double test123456;
for (int y = 0; y < 100000; y) {
TESTFUNTC1A(0);
for (int x = 0; x < 12; x ) {
test123456 = TEST123[x];
}
}
std::cout << "TEST1 " << timeSinceEpochMillisec() - starttimetest2 << endl;
}
vector<double> TESTFUNTC2A(int ID) {
vector<double> TEST124(12);
for (int x = 0; x < 12; x ) {
TEST124[x] = 1.123;
}
return TEST124;
}
void TESTFUNTC2A() {
int64 starttimetest2 = timeSinceEpochMillisec();
vector<double> TEST125(12);
double test123456;
for (int y = 0; y < 100000; y) {
TEST125 = TESTFUNTC2A(0);
for (int x = 0; x < 12; x ) {
test123456 = TEST125[x];
}
}
std::cout << "TEST2 " << timeSinceEpochMillisec()- starttimetest2 << endl;
}
int main()
{
TESTFUNTC1A();
TESTFUNTC2A();
}
CodePudding user response:
First of all, best approach does not make much sense; it normally calls for personal opinion. Secondly, there are no eternal rules; each project must be considered individually and handled based on its constraints. If you need to work on a limited MCU - with outdated toolset and ineffective toolchain - things can be different.
But:
- Having all functions defined in header means either they are template, or inline or the toolchain cannot handle multiple TU per project, or you are just doing wrong.
- Global objects(more generally, static duration objects) are supposed to be avoided; they create lots of maintenance issues due to implicit couplings generated. I can not name and count them, but implicit shared reference is one of them; initialization order is another.
Next:
If you need an array with compile-time known size, std::array
can be a good choice. If the size is supposed to be immutable but relatively large, const std::unique_ptr<T[]>
can be considered. std::vector
is good when the size is supposed to mutate during runtime; such size mutations can be runtime costly. Raw arrays are not value types; they can't be directly transferred by-value across function boundaries: you'll need to wrap them in struct
to make them by-value and that is part of what std::array
does.
CodePudding user response:
Your code has some code smell:
vector<double> TESTFUNTC2A(int ID) {
vector<double> TEST124(12);
for (int x = 0; x < 12; x ) {
TEST124[x] = 1.123;
}
return TEST124;
}
You create TEST123
with size 12 and initialize every element to 0.0
. Then you overwrite it with 1.123
. Why not just write vector<double> TEST124(12, 1.123);
? If you need to initialize the elements to different values then you should create an empty vector, reserve the right size and then emplace_back()
.
void TESTFUNTC2A() {
int64 starttimetest2 = timeSinceEpochMillisec();
vector<double> TEST125(12);
double test123456;
for (int y = 0; y < 100000; y) {
TEST125 = TESTFUNTC2A(0);
for (int x = 0; x < 12; x ) {
test123456 = TEST125[x];
}
}
std::cout << "TEST2 " << timeSinceEpochMillisec()- starttimetest2 << endl;
}
You create TEST125
as vector with 12 elements initialized to 0.0. But then you assign it a new vector. The initial vector should be empty so you don't needlessly allocate heap memory. Also you could create a new vector inside the loop so the vector is created in-place instead of move assigned.
The size of the vector is known at compile time. Why not use std::array
?
Note that the difference in time is a failure of the compiler to properly optimize the code. It's easier in TEST1 for the compiler to see that the code has no side effect so you probably get an empty loop there but not in the TEST2 case. Did you try using clang? It's a bit better at eliminating unused std::vector
.
When you write your tests you should always run your loops for different number of iterations to see if the time spend increases with the number of iterations as expected. Other than plain looking at the compiler output that's the simplest way to see if your testcase is optimized away or not.