Hi guys! Programming question:

In C, is an array _exactly_ the same thing as a pointer to the first element of the array? How does the program know the size of the array? Where is this information stored?

I just found out the difference between the length of a string and the size of the buffer reserved for that string. But if `char s[]` and `char *s` are the same thing, who controls the size of that?

(boost appreciated)

@camelo003 They're similar - but the one difference is that the compiler knows the size of the array but nothing knows the size of the things a pointer points to.

@penguin42 @camelo003 The constructive answer is that for the program to know the lengths and sizes, they must be tracked explicitly. So a function that works on a buffer typically takes 2 arguments, not just one: thing(size_t len, uint8_t *buf); or the same in a struct with 2 members.
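
Here's a minimal sketch of that convention. The names `sum_bytes` and `sum_example` are made up for illustration and don't come from any library. The caller works out the element count while the array type is still visible, then passes it alongside the pointer.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: the length travels with the pointer,
   because the pointer alone says nothing about how many bytes follow. */
unsigned long sum_bytes(size_t len, const uint8_t *buf) {
    unsigned long total = 0;
    for (size_t i = 0; i < len; i++) {
        total += buf[i];
    }
    return total;
}

/* Hypothetical caller: here `data` is still a real array,
   so sizeof can compute the element count to pass along. */
unsigned long sum_example(void) {
    uint8_t data[] = {1, 2, 3, 4};
    return sum_bytes(sizeof data / sizeof data[0], data);
}
```

If the wrong `len` is passed, nothing in the language catches it, which is why the two values are usually kept together.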

@camelo003 Pointers are just pointers (the address of a memory location). Unlike in many other programming languages, arrays in C don't store their own length. You could create your own type, e.g. `struct Arr { uint8_t *mem; size_t size; };`, but you are then responsible for writing the functions that operate on it.

An array declared as `char s[256]` just tells the compiler to reserve 256 bytes of memory at that location. The size is known at compile time, which is why sizeof works.
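
To make that concrete, here's a small sketch. The function names are invented for illustration, and `struct Arr` is just the shape suggested above, not a standard type. It shows sizeof on an array, sizeof on a pointer, and one way to capture the size in a struct before it is lost.

```c
#include <stddef.h>
#include <stdint.h>

/* A pointer bundled with its length. */
struct Arr {
    uint8_t *mem;
    size_t size;
};

/* sizeof on a real array gives the whole reservation: 256 here. */
size_t array_size(void) {
    char s[256];
    return sizeof s;
}

/* sizeof on a pointer gives the size of the pointer itself
   (commonly 4 or 8), no matter what it points to. */
size_t pointer_size(void) {
    char s[256];
    char *p = s;
    return sizeof p;
}

/* Capture the size while the array type is still in scope,
   then carry it around in the struct. */
size_t wrapped_size(void) {
    static uint8_t storage[256];
    struct Arr arr = { storage, sizeof storage };
    return arr.size;
}
```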

@camelo003

No, C actually has the concept of a traditional array hidden in it. Consider:

float a[3][4], *b = a;

(Your compiler won't like that assignment to b!)

and try saying a[1][2] versus b[1][2], and so forth.

Also take a look at sizeof(a)

C understands types, including multi-dimensional arrays. It has for a long, long time.

Arrays and pointers intersect in C, but are not identical.
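
As a sketch of what the compiler does accept: the pointer type that matches `a` is "pointer to a row of 4 floats". The function names below are made up for illustration.

```c
#include <stddef.h>

/* a decays to a pointer to its first row, so the matching pointer type
   is "pointer to array of 4 floats", not plain float *. */
float read_via_row_pointer(void) {
    float a[3][4] = {{0}};
    float (*b)[4] = a;
    a[1][2] = 7.5f;
    return b[1][2];      /* same element as a[1][2] */
}

/* sizeof sees the full array type: 3 rows of 4 floats. */
size_t whole_size(void) {
    float a[3][4];
    return sizeof a;
}

/* One row is itself an array of 4 floats. */
size_t row_size(void) {
    float a[3][4];
    return sizeof a[0];
}
```

With `float (*b)[4]`, b[1][2] works because the compiler knows each step of b skips a whole row of 4 floats.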

@camelo003 If you mean the size of one element, the compiler determines that at compile time and bakes it into the pointer arithmetic done at run time. As for the total size of an array, no record of it is kept at run time. Heap allocations are tracked by the standard library, but only so that free() works after a malloc(), and so the allocator knows when to move the program break if it needs more memory from the operating system.
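
A quick sketch of the element-size part, with made-up function names: indexing is pointer arithmetic, and each step is scaled by the element size.

```c
#include <stddef.h>

/* a + 1 and &a[1] are the same address: indexing is pointer arithmetic. */
int index_is_arithmetic(void) {
    int a[4] = {10, 20, 30, 40};
    return (a + 1 == &a[1]) && (*(a + 2) == a[2]);
}

/* Stepping one element forward moves sizeof(int) bytes, a factor the
   compiler works out from the element type. */
size_t bytes_per_step(void) {
    int a[4] = {0};
    return (size_t)((char *)(a + 1) - (char *)a);
}
```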

@camelo003

Welcome to the worst part of C.

The [] syntax is used for arrays. The * syntax is used for pointers. You can treat pointers and arrays the same way, most of the time, but not when declaring them. That's because when you are declaring an array, the compiler sets aside some space for the array's items. But when you are declaring a pointer, you don't get any extra space set aside. The pointer is just a place for you to store a memory address.

In other words,

// Sets aside space for one memory address. (Likely 4 or 8 bytes).
char *a;

// Sets aside space for an array of ten characters. (Likely 10 bytes).
char b[10];

// As a local variable, this is an error because the compiler doesn't know how much space to set aside.
char c[];

// If you say "b", the compiler will automatically return to you
// the memory address of b's first entry, so you can do this:
a = b;

@camelo003

Meanwhile, functions are perfectly happy to accept arrays of unknown size, because they don't have to set aside any space for the array -- that was their caller's job.

// Here's a function that takes the memory address of a character.
int f1(char *aa) {
}

f1(a); // Passing a memory address to this function -- works fine.

f1(b); // Passing "an array" to this function -- works because b is automatically converted from an array to a memory address before it is passed to f1. Same reason you were able to assign a = b earlier. We call this "the array decaying to a pointer".

// Here's another function. You can use this syntax when you are declaring
// a function, because it doesn't try to set aside any storage. It is
// expecting to be passed an existing array. You can use it the same way
// as f1.
int f2(char bb[]) {
}

int f3(char bb[10]) {
// But giving a size here is misleading: the parameter is still just a pointer.
// The compiler will not do size checking for you.
}
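
One way to convince yourself that the [10] changes nothing, sketched with made-up function names: inside the function, taking the address of the parameter gives a `char **`, which would not compile if it were a real array of 10 chars.

```c
#include <stddef.h>

/* Despite the [10], bb is a char *. Taking its address yields a char **,
   which would not compile if bb were really an array of 10 chars. */
size_t size_seen_inside(char bb[10]) {
    char **pp = &bb;
    return sizeof *pp;   /* size of a pointer, not 10 */
}

/* Hypothetical caller: buf decays to a pointer on the way in. */
size_t call_with_array(void) {
    char buf[10] = {0};
    return size_seen_inside(buf);
}
```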

@camelo003 And here's an English explanation instead of code.

char s[] (as a function parameter) and char *s are both unsized.

char s[10] is sized. The compiler knows its size, but it doesn't store the size anywhere for your running program to look up later! After all, storing the size would cost a whole extra word of memory, and C thinks it's still the year 1975, so that kind of overhead is not acceptable! (cries)
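
This is also where string length and buffer size part ways. A small sketch, with made-up function names: sizeof reports what the declaration reserved, while strlen has to walk the bytes at run time looking for the terminating '\0'.

```c
#include <stddef.h>
#include <string.h>

/* The buffer size is fixed by the declaration. */
size_t buffer_size(void) {
    char s[10] = "hi";
    return sizeof s;     /* 10 */
}

/* The string length is found at run time by scanning for '\0'. */
size_t string_length(void) {
    char s[10] = "hi";
    return strlen(s);    /* 2 */
}
```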

So the C run-time doesn't give you array bounds checking. The C compiler might be able to perform some primitive bounds-checks on your char s[10], but any code that accepts a user-supplied offset (known only at runtime) won't be automatically checked.
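
If you want the check, you write it yourself. Here's a minimal sketch, assuming a hypothetical helper `checked_get` (not a standard function) that takes the length explicitly and refuses out-of-range offsets.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: copies s[idx] into *out only when idx is in range. */
bool checked_get(const char *s, size_t len, size_t idx, char *out) {
    if (idx >= len) {
        return false;
    }
    *out = s[idx];
    return true;
}

/* Hypothetical caller: idx stands in for a value only known at run time.
   Returns the character read, or -1 if idx is out of range. */
int try_read(size_t idx) {
    char s[10] = "abcdefghi";
    char c = 0;
    return checked_get(s, sizeof s, idx, &c) ? c : -1;
}
```

Here try_read(0) gives 'a', while try_read(10) gives -1 instead of reading past the end of the buffer.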

Does that help at all? It's super hard to explain this stuff in a way that makes sense to people who don't already know it.

@camelo003 Finally, other C programmers feel free to correct any errors that I made here.
