How to know when a char* library function arg needs an array it can modify, not a char pointer?

I am new to C programming, and I know char * and char[] array are different. Yet a char[] decays to a char * when it is used as a function parameter, so the function declarations could be the same.

But how do I know if the function is specifically expecting a char array versus char * by looking at the signature (declaration)?

For example, if I am using a library header file and the function is below. How do I know which one to pass?

// somelib.h
void foo(char *bar);

Because if the function is modifying the bar parameter and if I pass a char *, it will get a segfault. This is for C, but would it be the same for C++?

CodePudding user response：

If the function modifies the array, the argument should be declared char *. If it doesn't modify the array, the argument should be declared const char *.

When the argument is declared const char * it will be safe to pass a string literal. Otherwise you should pass an array or a pointer to memory dynamically allocated with malloc() (in C) or new (in C++).

In C++ you'll get a compilation error if you try to pass a literal to a parameter without the const modifier, because string literals are const there. But for historical reasons, the type of string literals in C is non-const, even though modifying them results in undefined behavior, so there's no error (but some compilers will produce warnings).
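The distinction can be sketched in C as follows. The function names count_spaces, upcase_in_place and writable_copy are hypothetical and only illustrate the convention described above; they are not part of any real library.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reader: const char * promises the characters are not
   modified, so a string literal is a safe argument. */
size_t count_spaces(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; ++s)
        if (*s == ' ')
            ++n;
    return n;
}

/* Hypothetical writer: plain char * signals that the characters may be
   overwritten, so the caller must supply writable storage. */
void upcase_in_place(char *s)
{
    assert(s != NULL);
    for (; *s != '\0'; ++s)
        *s = (char)toupper((unsigned char)*s);
}

/* Returns a malloc()ed, writable copy of s, or NULL on allocation
   failure. The caller is responsible for calling free(). */
char *writable_copy(const char *s)
{
    assert(s != NULL);
    char *copy = malloc(strlen(s) + 1);
    if (copy != NULL)
        strcpy(copy, s);
    return copy;
}
```

With these definitions, char buf[] = "hello world"; upcase_in_place(buf); is fine because buf is a writable array initialized from the literal, and so is passing the result of writable_copy("hello world"). Calling upcase_in_place("hello world") directly would compile in C but is undefined behavior, which is typically where the segfault comes from.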

CodePudding user response：

as function param

I know char * and char[] array are different

Although char * and char[] are different types, they are not different as a function parameter, because every function parameter declared as an array is adjusted to be a pointer to the array's element type. As such, a char[] function parameter is adjusted by the compiler to be char *. This adjustment applies only to function parameters and not to other contexts.
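A minimal sketch of that adjustment, using made-up function names: the two prototypes below are accepted as declarations of the same function, while outside a parameter list an array and a pointer remain distinct objects with different sizes.

```c
#include <assert.h>
#include <stddef.h>

/* These two prototypes declare the same function: in a parameter list
   the array form is adjusted to char *. */
size_t length_of(char text[]);
size_t length_of(char *text);

size_t length_of(char text[])
{
    assert(text != NULL);
    size_t n = 0;
    while (text[n] != '\0')
        ++n;
    return n;
}

/* Outside a parameter list the types differ: this is the size of the
   whole 16-element array. */
size_t array_object_size(void)
{
    char array[16] = "abc";
    return sizeof array;
}

/* ...while this is only the size of a pointer object. */
size_t pointer_object_size(void)
{
    char array[16] = "abc";
    char *pointer = array;
    return sizeof pointer;
}
```

Because the parameter is really a pointer, length_of has no way to recover the size of the caller's array; it has to rely on the nul terminator instead.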

But how do I know if the function is specifically expecting a char array versus char * by looking at the signature (declaration)?

A function that expects a char array is a function that expects a char *. I suppose that you mean to ask how to know whether the function expects/requires that the char * will be pointing to an element of an array. And also, perhaps whether the array needs to be null terminated, or have a minimum size.

The answer is that you cannot know it by looking at the signature alone.

how do I know which one to pass?

By reading the documentation that describes what the function does. If there is no documentation, then read the implementation of the function. If there is no implementation either, then you could attempt reverse-engineering the binary. If you don't know what the function does, then you shouldn't call it.

because if the function is modifying the bar parameter and if I pass a char *, it will get a segfault.

I suppose that you mean that the function modifies the pointed char or chars. Modifying the pointer itself should be fine in this case.

If the array whose element the char * points to is modifiable, then passing it into a function that modifies the array shouldn't cause a segfault. If the array is const (or is a string literal), then you shouldn't pass it into a function that modifies the array.
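To illustrate the difference between modifying the pointer and modifying the pointed-to chars, here is a small sketch with two hypothetical functions. The first only changes its own local copy of the pointer, so it is safe even with a string literal; the second writes through the pointer, so it needs a modifiable array.

```c
#include <assert.h>
#include <stddef.h>

/* Advances its own copy of the pointer but never writes through it.
   The caller's pointer and the pointed-to chars are left untouched. */
size_t count_leading_spaces(const char *bar)
{
    assert(bar != NULL);
    size_t n = 0;
    while (*bar == ' ') {
        ++bar;   /* modifies only the local parameter */
        ++n;
    }
    return n;
}

/* Writes through the pointer: the caller must pass writable storage. */
void blank_first_char(char *bar)
{
    assert(bar != NULL);
    if (*bar != '\0')
        *bar = ' ';
}
```

Passing "  hi" to count_leading_spaces is fine. Passing the same literal to blank_first_char would be undefined behavior, whereas char buf[] = "xyz"; blank_first_char(buf); is well defined.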

CodePudding user response：

When used as a function argument, char *x and char x[] are exactly equivalent. In a number of other contexts, they are not.

For an argument described as char *x (or char x[]) it is impossible for the function to know what the caller actually passed. The caller might have passed an array (which is passed as a pointer to the first element of that array). It might have passed &v where v is a variable of type char. It might have passed something else.

All the function can do is make an assumption, and proceed with that assumption. It might assume it is passed the address of a single char, and only modify that char. Alternatively, it might assume it has been passed an array of 1000 char, and modify all of those characters. The function might assume (as the standard C string functions do in a lot of cases) that the caller has passed a nul-terminated array (an array that may have any length, but the end is marked by a sentinel char with numeric value zero). The function might use another argument to specify the length of array (so rely on an assumption that the caller passed both a valid array AND a correct length).

Regardless of what assumptions it makes, there is potential for the function to have undefined behaviour (e.g. accessing or overwriting non-existent characters) if the caller passes something different from what the function assumes.
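As a rough illustration, the three hypothetical functions below all take a char * yet make different assumptions about what it points to. Nothing in the parameter type distinguishes them; only the comments (standing in for documentation) do.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumes p is the address of at least one writable char. */
void set_flag(char *p)
{
    assert(p != NULL);
    *p = 'Y';
}

/* Assumes s points to a writable, nul-terminated array.
   Reverses the string in place. */
void reverse_string(char *s)
{
    assert(s != NULL);
    size_t i = 0;
    size_t j = strlen(s);
    while (i < j) {
        --j;
        char tmp = s[i];
        s[i] = s[j];
        s[j] = tmp;
        ++i;
    }
}

/* Assumes buf points to at least len writable chars.
   No terminator is read or written. */
void fill_buffer(char *buf, size_t len, char value)
{
    assert(buf != NULL || len == 0);
    for (size_t i = 0; i < len; ++i)
        buf[i] = value;
}
```

Passing the address of a single char to reverse_string, or a too-small array with a large len to fill_buffer, compiles without complaint and then has undefined behavior, which is exactly the mismatch described above.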

The situation is the same for the caller (or the programmer trying to write code to call the function). There are only two ways to know what CAN be passed safely to the function.

1. Read the documentation for the function, and pass something consistent with what the documentation says. The risk of this (even if there are no bugs in the code) is that programmers and code maintainers are notoriously terrible at keeping function documentation consistent with what the function actually does: the documentation may say one thing while the function does something else.
2. Examine the code of the function. This is definitive, unless the compiler has a bug and translates the code incorrectly. The bigger problem is that working out what the function does can take significant effort and is error prone (for humans) if the code is complicated. There is also a lot of code written that is difficult to read.

Some style guides encourage authors of functions to use char *x for an argument that is treated as the address of a single char, and char x[] for a function that expects an array. To a compiler, it makes no difference. And a lot of programmers don't adhere to such guidelines.

CodePudding user response：

You read the documentation. Short of using a proof language like Agda, the type system will never fully describe the function's contract. In C, read the docs.

In C++ you should never use char*. Functions expecting a string should take a std::string. Functions expecting a character that they wish to modify should use a char&, and in the rare case where you need an array of characters that is not a valid string, std::vector<char> should be used. There's never a use case for a raw pointer as a public function argument in modern C++.

CodePudding user response：

This all stems from C where pointers are the only choice. In short: you can't. Which is why you shouldn't use this.

But you can do some guessing:

Any function reading a C string should use const char *. Passing a pointer to a single const char makes no sense as you could just pass the char directly. So it's safe to assume it's a 0 terminated string.

Any function writing to a char array will have to also know the size. So if you see char *ptr, size_t len you can be sure it's a pointer to a buffer of memory.
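A sketch of the pointer-plus-length pattern, using a hypothetical format_greeting function built on the standard snprintf. The length parameter is what lets the function stay inside the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical writer: stores "Hello, <name>!" in out, writing at most
   out_len bytes including the terminating nul.
   Returns 1 if the whole text fit, 0 if it was truncated. */
int format_greeting(char *out, size_t out_len, const char *name)
{
    assert(out != NULL && name != NULL);
    if (out_len == 0)
        return 0;
    /* snprintf never writes more than out_len bytes and returns the
       length the full text would have had. */
    int needed = snprintf(out, out_len, "Hello, %s!", name);
    return needed >= 0 && (size_t)needed < out_len;
}
```

A caller would typically write char buf[32]; format_greeting(buf, sizeof buf, "C"); so that the length always matches the real array. The signature still cannot enforce that the caller passes a correct length; that part remains a matter of documentation.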

Now the next question is: Will the function free the pointer? Will it store the pointer for later use? Those you can't even guess.

Which brings us to the not using it part and entering the world of modern C++:

The C++ Core Guidelines make some suggestions about how to write your interface to make it clear in code what the semantics are.

First forget all about C strings and arrays. They are bad. Don't use them.

For strings use std::string. For string literals you can use

using namespace std::literals;
auto s = "this is a std::string literal"s;

Simple C string literals ("Hello World!") will be converted to std::string automatically when needed. But they can just as easily be passed to a function taking a const char *. The s suffix makes it explicit that you want a std::string so no accidents can happen.

For arrays use std::vector or std::array instead. And even if you have to use C arrays for some reason, use std::span to pass them around. It follows that any char * should be a pointer to a single char.

If the function takes ownership of the pointer it should be std::unique_ptr<char> instead, and if the function stores the pointer for later use but doesn't handle freeing it then it should be std::shared_ptr<char>. So any char * means a pointer to a single char that is only used for the duration of the function and that you have to free later.

Note: this applies to any type, not just char.

Note2: If you can, passing a char& would be better, as that signals the "pointer" can't be nullptr. A char* argument should always be checked for nullptr.