License : Creative Commons Attribution 4.0 International (CC BY-NC-SA 4.0)
Copyright : Hervé Frezza-Buet, CentraleSupelec
Last modified : April 9, 2024 11:25
Link to the source : index.md

Life cycle in RAM

Talking about RAM

Variables

// A variable is something stored in the RAM.
int i;
i = i + 1; // means 40 <- 24 + 1
Symbol and variables

Addresses

int i = 24;
int* p;
p = &i;

// *p and i are the same thing.

*p = *p + 1; // 40 <- 24 + 1, i.e. i = i + 1
Pointers
int f(int x, int y) {
  // some text
  return ...;
}

// f is a symbol with no l-value
// Its value is the address of the text.

int (*p)(int, int); // This is a C-like function pointer; prefer higher-level C++ tools instead.
p = f;
int c = (*p)(3, 8);
Function symbols

References

int j = 38;

int& i;     // The compiler complains, references must be initialized.
int& i = j; // This is not an assignment, but an initialization.

/* 
   There is a _i hidden pointer, not accessible... 
      int* _i = &j;
   Then, in every piece of code, *_i <=> j
*/

j = 5; // These two lines are exactly...
i = 5; // ...the same operations.
References as a hidden pointer

The constness of values

int i,ii;
int*       j = &i;
const int* k = &i;
int&       l = i;
const int& m = i;

*j = 5; // ok
*k = 5; // compile-time error
 l = 5; // ok
 m = 5; // compile-time error

ii = *j + *k + l + m; // reading is ok.

// References can be pointers to the text data segment...

int& i       = 38; // fails, 38 has no l-value, it is written in the text.
const int& i = 38; // ok... since the reference is const.

void display(const Matrix& m) {...} // avoids a copy, but m is write protected.

void f(Matrix& m)       {...} 
void g(Matrix& m)       {...} // Overloading allows for 2 definitions...
void g(const Matrix& m) {...} // ... of g, since args are of distinct types.

void h(const Matrix& m) {
  f(m); // Fails at compile time.
  g(m); // Calls the second g.
}
	

Constness is only checked at compile time; the compiled code is not influenced by it.

Value categories

C++ expressions mainly refer to memory chunks. These chunks may have different statuses, which need to be distinguished and clearly understood. These statuses are called value categories (see cppreference.com).

Primary categories

lvalue

The so-called “lvalue” expressions are those which may appear on the left of an assignment (the l in lvalue means left). Such an expression represents a value that is stored somewhere in the stack or in the heap. In other words, the expressions below are legal for an expression expr which is an lvalue.

expr = ...;
&expr   

prvalue

The so-called “prvalue” expressions are “pure right values”, i.e. they cannot be on the left of an assignment.

&i    // where i is a lvalue
3
a+b   // where a and b are of type int  
f     // where f is the name of a function (not assignable, although
      // the standard formally classifies function names as lvalues)
a++   // if the post increment is classically implemented.
this
[...](...) {}

xvalue

The so-called “xvalue” expressions refer to memory chunks that will soon no longer be available (i.e. “expiring”).

f(g(x)+h(x))   // The result of g(x)+h(x) will be released after the computation done by f.
std::move(p)   // This is similar to p, but the compiler considers it as expiring.
               // It may apply optimization of memory use accordingly.
           
SomeClass(a,b) 
SomeClass(a,b).some_attr

Mixed categories

glvalue = lvalue or xvalue

So-called “glvalue” expressions (generalized left values) refer to any expression that has an address in the heap or the stack.

rvalue = prvalue or xvalue

So-called “rvalue” expressions (right values) refer to any expression that cannot be assigned to.

Setting up a value in the RAM.

The code used in this section is in the example-*.cpp files of the archive life-cycle-in-RAM.tar.gz. To test the code, first extract the archive and move into the directory:

mylogin@mymachine:~$ tar zxvf life-cycle-in-RAM.tar.gz
mylogin@mymachine:~$ rm life-cycle-in-RAM.tar.gz
mylogin@mymachine:~$ cd life-cycle-in-RAM

And then compile and run, as done here for the first example

mylogin@mymachine:~$ g++ -o test -std=c++17 num.cpp example-001-001-allocation.cpp
mylogin@mymachine:~$ ./test

The system allocates the memory for you first, and then so called “constructors” initialize it.

Before the system releases the memory, it calls the so-called “destructors” on the piece of memory that is about to be released, for cleaning up.

Default and external allocation : building memory from scratch

// example-001-001-allocation.cpp
#include "num.hpp"

int main(int argc, char* argv[]) {
  rem("Default allocation");
  num x;
  
  rem("Allocations from another type");
  num a("a", 10); 
  num b   {"b", 11}; 
  num c = {"c", 12};
  
  rem("Affectation");
  num d = {"d", 13}; // Not here, this is allocation...
  d = c;             // ... but here, since d already exists now.

  ___;
  
  return 0;
}

Building memory from an existing instance, i.e. using the copy/move constructors

Sometimes, you need to “clone” a value, i.e. to duplicate its memory representation. This happens quite often during the execution of C++ code: once the memory is allocated for the new value, a call to the copy constructor makes it equal to the original.

// example-001-002-copy.cpp
#include "num.hpp"

void f(num arg) {
  fun_scope;
  rem("the function does nothing...");
}

int main(int argc, char* argv[]) {
  
  num a {"a", 10};
  num x {"x", 10};

  ___;
  
  rem("Copy at declaration time");
  num b(a);
  num c {a};
  num d = a; // This is not an assignment.
  
  ___;
  
  rem("Copy at function call");
  f(x);
  ___;
  
  return 0;
}

The question is: “Do you really need the original memory after the cloning?” Indeed, some memory is sometimes cloned for whatever reason and then released right away. In this case, the cloning process can take advantage of this, i.e. it can reuse some of the internals of the original.

Such cloning is called “moving”, i.e. you move some data from the origin to the clone. After moving, the origin has to be considered as “empty”, i.e. it may be quite similar to default-constructed data.

Cloning by moving rather than copying can be triggered explicitly (see std::move) or implicitly, when the compiler knows that the original data is about to be released (i.e. it is expiring).

// example-002-001-move.cpp
#include "num.hpp"

int main(int argc, char* argv[]) {
  num a {"a", 10};
  num b {"b", 11};
  
  ___;

  rem("This does nothing to a.");
  std::cout << scope_indent << "a = " << std::move(a) << std::endl;
  ___;
  rem("This does nothing to a.");
  num c = a; // Usual copy
  rem("This alterates a and b.");
  num aa = std::move(a); // Move construction
  c      = std::move(b); // Move assignment
  ___;
  
  return 0;
}

Moving becomes relevant with function overloading.

// example-002-002-movefun.cpp
#include "num.hpp"

void f(const num& x) {
  scope("f(const num& x)");
  num i = x; // x is not considered as expiring.
}

void f(num&& x) {
  scope("f(num&& x)");
  num i = x; // x is not considered as expiring, in spite of the && type.
}

void g(const num& x) {
  scope("g(const num& x)");
  num i = std::move(x); // x is const: std::move(x) cannot be moved from, so i is built by copying.
}

void g(num&& x) {
  scope("g(num&& x)");
  num i = std::move(x); // std::move leads to an expiring value, i is built by moving.
}

int main(int argc, char* argv[]) {
  num a {"a", 10};
  num b {"b", 11};
  num c {"c", 12};
  num d {"d", 13};

  f(a);            // calls f(const num& x)
  f(std::move(b)); // calls f(num&& x)
  g(c);            // calls g(const num& x)
  g(std::move(d)); // calls g(num&& x)
  
  return 0;
}

Efficient binary operators are a classical use case.

// example-002-003-moveop.cpp
#include "num.hpp"

// Operators +,* exist for type num, we use ours for the sake of
// illustration.

num plus(const num& a, const num& b)  {fun_scope; return {"res", (int)a + (int)b};}

num add(const num&  a, const num&  b) {fun_scope; return {"res", (int)a + (int)b};}
num add(      num&& a, const num&  b) {fun_scope; num res = std::move(a); res += b; return res;}
num add(const num&  a,       num&& b) {fun_scope; num res = std::move(b); res += a; return res;}
num add(      num&& a,       num&& b) {fun_scope; num res = std::move(a); res += b; return res;}


int main(int argc, char* argv[]) {
  {
    scope("Plus");
    num a {"a",    1};
    num b {"b",   10};
    num c {"c",  100};
    num d {"d", 1000};
    num res = plus(a, plus(b, plus(c, d)));
  }
  {
    scope("Add");
    num a {"a",    1};
    num b {"b",   10};
    num c {"c",  100};
    num d {"d", 1000};
    num res = add(a, add(b, add(c, d)));
  }

  return 0;
}

Building memory in the text ?!?!

The text contains constant values (such as 3 in the code i = i + 3). This can be done for any type, using constructors.

// example-003-001-text.cpp
#include "num.hpp"
#include <iostream>

int main(int argc, char* argv[]) {
  {
    scope("arguments in the text");
    auto c = num("zero", 0) * num() + num("one", 1); // auto is guessed as type num....
    ___;
    std::cout << scope_indent << "c = " << c << std::endl;
  }
  {
    scope("copy constructors from text values");
    num a = num("a", 10); // useless, since ...
    num b = {"a", 10};    // ... does the same.

    // But with auto, you can have this.
    auto c = num("c", 10); // note: direct construction here (copy elision), not a move construction.
    auto d = c;
  }
  {
    scope("The compiler is clever");
    auto a = num(num(num(num(num(num(num(num("c", 10))))))));
  }

  return 0;
}

Building memory in the heap (using new)

There is no garbage collector in C++. Prefer smart pointers to the basic pointers introduced here.

// example-004-001-new.cpp
#include "num.hpp"

int main(int argc, char* argv[]) {
  num* a_ptr = new num("a", 10);
  auto b_ptr = new num(*a_ptr);  // copy...

  {
    scope("Pointer affectation");
    auto c_ptr = a_ptr;
    c_ptr = b_ptr;
  } // No num values released here.

  {
    scope("Pointer free");
    auto c_ptr = a_ptr;
    delete c_ptr; 
  }

  // Uncomment this to avoid a memory leak.
  // delete b_ptr;

  // Keep this commented out in order to avoid double free errors.
  // delete a_ptr;

  return 0;
}

Range of contiguous values

When size is available at compile time

// example-005-001-array.cpp
#include "num.hpp"
#include <array>

#define ARRAY_SIZE 3 // Avoid magic numbers in your code...

// Only constants can be used where we use ARRAY_SIZE. The value has
// to be known at compile time (i.e. it is hard-coded in the recipe).

int main(int argc, char* argv[]) {
  {
    scope("non-STL arrays");
    num tab[ARRAY_SIZE]; // tab has no l-value !
    tab[0] = 3;
    tab[2] = 2;
    num* first_ptr  = tab;
    num* second_ptr = tab + 1;
    num* last_ptr   = tab + (ARRAY_SIZE - 1);
  }
  {
    scope("STL arrays");
    std::array<num, ARRAY_SIZE> tab;
    tab[0] = 3;
    tab[2] = 2;
    auto first_ptr  = tab.begin();
    auto second_ptr = tab.begin() + 1;
    auto last_ptr   = tab.end()   - 1;
  }
  {
    scope("Range initialization");
    std::array<num, ARRAY_SIZE> tab {{{"a", 0}, {"b", 1}, {"c", 2}}};
  }
  
  return 0;
}

Usually, arrays live in the stack, as in the examples above. In this case, there is no need for an explicit memory release.

When size is available at execution time only

// example-006-001-vector.cpp
#include "num.hpp"
#include <vector>
#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
  int nb = 0;
  std::cout << "Enter a number of elements (>= 3): " << std::flush;
  std::cin >> nb;

  if(nb <= 3) nb = 3;
  
  // nb is known at execution time only! We need the heap.

  {
    scope("non-STL dynamical arrays");
    num* tab = new num[nb]; // tab is a pointer, it has an l-value !
    tab[0] = 3;
    tab[2] = 2;
    num* first_ptr  = tab;
    num* second_ptr = tab + 1;
    num* last_ptr   = tab + (nb - 1);
    delete [] tab; // Do not forget releasing when you are done...
    //                ... and DO NOT FORGET THE [] !!!
  }
  {
    scope("STL dynamical arrays (i.e. vectors)");
    std::vector<num> tab(nb);
    tab[0] = 3;
    tab[2] = 2;
    auto first_ptr  = tab.begin();
    auto second_ptr = tab.begin() + 1;
    auto last_ptr   = tab.end()   - 1;

    // No delete here, the vector class does the release of the memory
    // it handles internally in the heap.
  }
  {
    scope("Range initialization");
    // Ok... here we know the size at compiling time.
    std::vector<num> tab {{{"a", 0}, {"b", 1}, {"c", 2}}};
    ___;
    // But we can add more elements, for example according to nb.
    for(int i = 0; i < nb; ++i)
      // We add {"new_i", i} at the end... reallocation of the whole vector may occur !
      tab.push_back({std::string("new_")+std::to_string(i), i});
    ___;
  }
  
  return 0;
}

You can put anything in vectors, so you can put arrays. In this case, their content lives in the heap, although their size is known at compile time.

// example-006-002-vector.cpp
#include "num.hpp"
#include <array>
#include <vector>

#define ARRAY_SIZE 3 

int main(int argc, char* argv[]) {
  int nb = 0;
  std::cout << "Enter a number of elements (>= 3): " << std::flush;
  std::cin >> nb;

  if(nb <= 3) nb = 3;

  ___;
  std::array<num, ARRAY_SIZE> val {{{"a", 13}, {"b", 14}, {"c", 13}}};
  ___;

  std::vector<std::array<num, ARRAY_SIZE>> tab(nb);
  // There are nb*ARRAY_SIZE num in the heap here.
  
  tab[1] = val;
  ___;
  

  return 0;
}

Initializing an already existing memory (in-place construction)

Keep in mind that constructors are only initializers.

So they can apply to a memory that already exists.

// example-007-001-emplace.cpp
#include "num.hpp"

#define BUF_SIZE 1024
#define OFFSET    341

int main(int argc, char* argv[]) {
  char buffer[BUF_SIZE]; // a std::array could have been used here.

  void* storage_place = buffer + OFFSET;

  ___;
  rem("In place initialization.");
  // Let us initialize the bytes from address storage_place with a num
  // constructor.
  num* a_ptr = new (storage_place) num("a", 10);
  std::cout << scope_indent << a_ptr << " should be equal to " << storage_place << std::endl;
  ___;
  rem("Example of standard use as a num.");
  num b = *a_ptr; // Copy
  ___;
  rem("Cleaning up.");

  // delete a_ptr;  THIS IS A BUG !!!
  //                We just need to clean before buffer is released.
  a_ptr->~num();
  ___;
  rem("The buffer is released here too !");
  
  
  return 0;
}

The STL dynamic collections take advantage of this.

// example-007-002-emplace-back.cpp
#include "num.hpp"
#include <vector>

int main(int argc, char* argv[]) {
  std::vector<num> v;

  // Let us allocate a big chunk of memory at once.
  ___;
  v.reserve(1024);
  std::cout << scope_indent << "Vector size = " << v.size() << std::endl;
  ___;
  
  for(int i = 0; i < 5 /* avoid magic numbers */; ++i)
    // The arguments of emplace_back should fit the arguments of one of the num constructors.
    v.emplace_back(std::string("x_")+std::to_string(i), i);

  // No memory allocation is done here, since we did it when we
  // reserved 1024 nums previously.
  ___;
  std::cout << scope_indent << "Vector size = " << v.size() << std::endl;
  ___;
  
  
  return 0;
}

The efficient way for iterations

Pointers are iterators… so avoid tab[i]-like expressions in loops.

#define SIZE_1 10
#define SIZE_2 20

#include <list>
#include <algorithm> // for std::copy


int main() {

  int tab1[SIZE_1];
  // tab1 is an array that decays to an int*, values are allocated in the stack.
  // tab1 has no l-value (tab1 = ... is illicit).

  int* tab2 = new int[SIZE_2];
  // tab2 is an int*, values are allocated in the heap.
  // tab2 has an l-value. It can be assigned any int* value.
  // For example : tab2 = tab1;
  // After such an assignment, access to the memory in the heap
  // containing the SIZE_2 integers is lost (memory leak).

  int* ptr_4 = tab1 + 3;
  // A pointer to the 4th integer in the stack.
  // Its computation involves a product : ptr_4 = tab1 + 3*sizeof(int).
  // The "*sizeof(int)" is implicit.... but computed.

  int* ptr_2 = tab1 + 1;
  // No implicit product here : ptr_2 = tab1 + sizeof(int).

  int v4 = *(tab1 + 3); // or v4 = *ptr_4
  // Get the value of the 4th integer in the stack.

  *(tab1 + 3) = 12; // or *ptr_4 = 12
  // Set 12 as the content of the 4th integer in the stack.

  // Brackets simplify the pointer notation.
  v4 = tab1[3];
  tab1[3] =  12;

  // The same could have been written for tab2 which refers to the heap.

  ///////////
  //       //
  // Loops //
  //       //
  ///////////
  /* Note: the code below redefines some variables, so it will not compile as is.
     The redefinitions allow each piece of code to be copy-pasted on its own. */

  // Bad for loop.
  for(int i = 0; i < SIZE_1; ++i)
    tab1[i] = i; // Implicit products here.

  // Good for loop.
  int* begin = tab1;
  int* end   = tab1 + SIZE_1;
  int  i     = 0;
  for(int* it = begin; it != end; ++it, ++i) *it = i;

  // A more compact one...
  int* begin = tab1;
  int* end   = tab1 + SIZE_1;
  int* it    = begin;
  while(it != end) *(it++) = i++; // it acts as an OUTPUT ITERATOR !

  // Copying the first 5 elements in the stack to the heap, starting at
  // the 3rd integer in the heap.
  int* begin = tab1;
  int* end   = tab1 + 5;
  int* out   = tab2 + 2; // out points to the 3rd element in the heap.
  for(int* it = begin; it != end; *(out++) = *(it++));

  // This can be done by off-the-shelf STL algorithms.
  std::copy(tab1, tab1 + 5, tab2 + 2);

  // STL collections provide iterators that have pointer semantics.
  std::list<int> l = {1, 2, 3, 4, 5};
  auto begin = l.begin();
  auto end   = l.end();
  auto it    = begin;
  int second_elem = *(++it);
  // ...

  ////////////////////
  //                //
  // Freeing memory //
  //                //
  ////////////////////

  // Nothing to be done for tab1, it will be popped out of the stack
  // at return time (when the closing brace is encountered).

  // The heap must be released.
  delete [] tab2;

  // Nothing to be done for l. When l is popped out from the stack,
  // internal functions (i.e. list destructors) clean the memory that
  // was possibly allocated in the heap. This is the case for STL
  // containers.
  
  return 0;
}

Returning values from functions

The compiler optimizes memory use: it picks the most efficient combination of copies and moves, and elides them when it can.

// example-008-001-return.cpp
#include "num.hpp"

num f() {
  fun_scope;
  num res {"res", 10};
  return res;
}

num g() {
  fun_scope;
  return {"res", 10};
}

num h(num x) {
  fun_scope;
  return x;
}


int main(int argc, char* argv[]) {
  {
    scope("Testing f");
    num a = f();
    ___;
    num b;
    b = f();
    ___;
  }
  {
    scope("Testing g");
    num a = g();
    ___;
    num b;
    b = g();
    ___;
  }
  {
    scope("Testing h");
    num a {"a", 10};
    ___;
    num b = h(a);
    ___;
  }

  return 0;
}

Functions returning references can be convenient.

// example-008-002-returnref.cpp
#include "num.hpp"

num& max(num& a, num& b, num& c) {
  fun_scope;
  if(b >= a && b >= c) return b;
  if(c >= a && c >= b) return c;
  return a;
}

int main(int argc, char* argv[]) {
  num a {"a", 10};
  num b {"b", 20};
  num c {"c",  9};

  ___;
  
  max(a, b, c) = 18;
  ___;

  return 0;
}

Common bugs

Bad usage of reference return type

#include "num.hpp"

// Error : reference to a local returned.
num& f(int i) {
  num res("i", i);
  return res;
}

// Error : reference to a temporary returned.
const num& g(int i) {
  return num("i", i);
}

// No warnings, but the call of h leads to a memory leak.
num& h(int i) {
  return *(new num("i", i));
}

int main(int argc, char* argv[]) {
  h(10) = 3;
  return 0;
  // The num ("i", 10) allocated in h is not released. Memory leak.
}

			

Bad usage of delete

#include "num.hpp"


int main(int argc, char* argv[]) {
  {
    num i;
    num* ptr = &i;
    delete ptr; // Error, ptr is the address of something in the stack, not in the heap.
  }
  {
    num* ptr1 = new num("i", 10);
    num* ptr2 = new num[10];
    delete [] ptr1; // Error, delete [] used while ptr1 is not allocated as an array.
    delete    ptr2; // Error, ptr2 is allocated as an array, delete [] should have been used.
  }
  {
    num* a    = new num("a", 10);
    num* copy = a;  // This is not a copy of the ("a", 10) object.
    delete a;
    delete copy;    // double free
  }
  {
    num* a = nullptr;
    num* b; 
    delete a; // Ok
    delete b; // b is uninitialized: it may contain a non-null value, in which case delete is an error.
  }

  return 0;
}

Inconsistent memory size

#include "num.hpp"
#include <array>


int main(int argc, char* argv[]) {

  std::array<num, 10> tab;
  tab[10] = num("a", 10); // Error, index out of range (valid indices are 0 to 9).

  int*  i = new int; // line #alloc
  void* p = i; 
  num* j = (num*)p;

  num& a = *j; // a refers to more memory than the size allocated at line #alloc.

  return 0;
}