C++Guns – RoboBlog blogging the bot

13.09.2018

FORTRAN: GDB conditional watchpoint

Filed under: Allgemein — Tags: — Thomas @ 13:09

To set a conditional watchpoint on local variable i.
Example Code:

subroutine func
  integer :: i

  do i=1, 100
    write(*,*) i
  end do
end subroutine

program test
  implicit none

  call func()
end program

Compile with -ggdb
Set a breakpoint on subroutine "func". After the debugger stop on this point, the local variable "i" is in scope so one can set a watchpoint.

$ gdb ./example
(gdb) break func
Breakpoint 1 at 0x8bb: file gdb.F90, line 4.

ODER
(gdb) break example.F90:4

(gdb) run
Starting program: /home/kater/example
Breakpoint 1, func () at example.F90:4
4 do i=1, 100
(gdb) watch i if i.gt.10
Hardware watchpoint 2: i
(gdb) c
continuing.
1
2
3
4
5
6
7
8
9
10

Hardware watchpoint 2: i

Old value = 10
New value = 11
0x0000555555554948 in func () at gdb.F90:4
4 do i=1, 100

30.08.2018

FORTRAN: 3x3 matmul intrinsic vs. hand crafted

Filed under: Allgemein — Tags: — Thomas @ 20:08

Verglichen wird der erzeugte Assembler Code der internen Fortran matmul() Routine mit einer von Hand geschriebenen Matrix Multiplikation.
Getestet wird für 3x3 Matrizen mit zur Compilezeit bekannter Größe. Sonst könnte man den vom FORTRAN Compiler erzeugten Assembler Code gar nicht mehr nachvollziehen.

subroutine intrinsicmatmul(a,b,c) 
  implicit none
  real(8), intent(in) :: a(3,3), b(3,3)
  real(8), intent(inout) :: c(3,3)
  
  c = matmul(a,b)
end subroutine

Und hier von Hand

subroutine handmatmul(a,b,c)
  implicit none
  real(8), intent(in) :: a(3,3), b(3,3)
  real(8), intent(inout) :: c(3,3)    
  integer :: i,j,k
  
  c(:,:) = 0
  do i=1, 3
    do j=1, 3
      do k=1, 3
        c(j,i) = c(j,i) + a(j,k) * b(k,i)
      end do
    end do
  end do
end subroutine

Getestet wird mit gfortran 7.3.0 -O1. Die Hand Version hat 13 Instruktionen mehr. Hiervon sind 6 Instruktionen für das Sichern und Wiederherstellen von Register. Der Rest geht wohl auf eine andere Art Adressberechnung drauf. Da die intrinsic matmul Routine selbst bei -O1 geinlint wird, ich sehe kein Funktionsaufruf im Assembler Code, wird sie wohl schneller sein.
Tests mit Praxiscode zeigen aber, dass das nicht so sein muss. Es wurden mehrere Matrizen multipliziert.

Es gilt: messen, messen messen. Bei dem nächsten Compiler/CPU kann es wieder anders sein.

intrinsic

intrinsicmatmul_:
	movq	$0, (%rdx)  
	movq	$0, 8(%rdx) 
	movq	$0, 16(%rdx)
	movq	$0, 24(%rdx)
	movq	$0, 32(%rdx)
	movq	$0, 40(%rdx)
	movq	$0, 48(%rdx)
	movq	$0, 56(%rdx)
	movq	$0, 64(%rdx)
	leaq	24(%rdx), %rcx
	addq	$24, %rsi
	leaq	96(%rdx), %r9
	leaq	96(%rdi), %r10
.L4:
	leaq	24(%rdi), %rdx
	movq	%rsi, %r8
.L3:
	movsd	-24(%r8), %xmm1
	movl	$0, %eax
.L2:
	addq	$1, %rax
	movapd	%xmm1, %xmm0
	mulsd	-32(%rdx,%rax,8), %xmm0
	addsd	-32(%rcx,%rax,8), %xmm0
	movsd	%xmm0, -32(%rcx,%rax,8)
	cmpq	$3, %rax
	jne	.L2
	addq	$8, %r8
	addq	$24, %rdx
	cmpq	%r10, %rdx
	jne	.L3
	addq	$24, %rcx
	addq	$24, %rsi
	cmpq	%r9, %rcx
	jne	.L4
	rep ret
Hand

handmatmul_:
	pushq	%r12               # -
	pushq	%rbp               # |- Register r12, rbp, rbx für Laufvariablen i,j,k freimachen
	pushq	%rbx               # -
	movq	$0, (%rdx)         # -
	movq	$0, 8(%rdx)        # |
	movq	$0, 16(%rdx)       # |
	movq	$0, 24(%rdx)       # |
	movq	$0, 32(%rdx)       # |- c(:,:) = 0
	movq	$0, 40(%rdx)       # |
	movq	$0, 48(%rdx)       # |
	movq	$0, 56(%rdx)       # |
	movq	$0, 64(%rdx)       # -
	leaq	24(%rsi), %r9
	leaq	96(%rsi), %r12
	movq	%rdx, %rcx
	subq	%rsi, %rcx
	leaq	-24(%rcx), %rbp
	movq	%rcx, %rbx
.L4:                               # i Schleife Anfang
    leaq	0(%rbp,%r9), %r8
	movq	%rdi, %r10
	leaq	(%rbx,%r9), %rdx
.L3:                               # j Schleife Anfang
	movq	%r8, %r11 
	movsd	(%r8), %xmm1       # c(i,j) nach xmm1 laden
	movq	%rsi, %rax
	movq	%r10, %rcx
.L2:                               # k Schleife Anfang
	movsd	(%rcx), %xmm0      # -
	mulsd	(%rax), %xmm0      # |- c(j,i) + a(j,k) * b(k,i)
	addsd	%xmm0, %xmm1       # -
	addq	$24, %rcx          # 3x double = 24Byte
	addq	$8, %rax           # 1x double = 8byte. 
	cmpq	%r9, %rax          # rax wird gleichzeitig als Pointer und Laufvariable benutzt.
                               # In r9 steht der Wert den rax erreicht, wenn die Schleife terminiert
	jne	.L2                # k Schleife Ende
	movsd	%xmm1, (%r11)      # c(j,i) = xmm1 
	addq	$8, %r8
	addq	$8, %r10
	cmpq	%rdx, %r8
	jne	.L3                # j Schleife Ende
	addq	$24, %rsi
	addq	$24, %r9
	cmpq	%r12, %r9
	jne	.L4                # i Schleife Ende
	popq	%rbx               # -
	popq	%rbp               # |- Register aus dem Stack wieder herstellen
	popq	%r12               # -
	ret 

Für Optimierung O2 gilt 35 Instruktionen für intrinsic und 43 für Hand.

27.07.2018

C++ Guns: use FORTRAN debug techniques in C++

Filed under: Allgemein — Tags: , — Thomas @ 15:07

On of the first things one learn about FORTRAN: you can very easy write stuff to files without getting worried about file creation of filenames.
Just write to a random number, the file handle. You can do this everywhere in the code. The output is append to the file. And the file gets automatic cleaned on program start.
Here is one example:

program debug
  integer :: x
  x = 1
  write(1001,*) "This is string", x
  write(1001,*) "The End"
end program

$ gfortran debug.f90
$ ./a.out
$ cat fort.1001
This is string 1
The End

As you can see, the file handle number get prefixed with "fort.". This is pretty nice for some nasty debug session, so lets do this with C++!

The implementation in C++ is surprisingly easy, just a single line of code and we are done:

#include <fstream>

template<int fhdl>
inline std::ofstream& write() {
    static std::ofstream ostrm("cpp."+std::to_string(fhdl), std::ios::out);
    return ostrm;
}

int main() {
    int x = 1;
    write<1001>() << "This is string " << x << "\n";
    write<1001>() << "The End\n";
}

$ g++ debug.cpp
$ ./a.out
$ cat cpp.1001
This is string 1
The End

The file handle number is passed as template parameter. I guess it is always know at compile time during the debug session. Then the write() function simply returns a reference to a static standard ofstream object. Which is created during the first call to write(). The created files are prefixed with "cpp." And you can use this with any custom type with overloaded operator<<. Thats all.

10.05.2018

C++ Guns: schlecht generiertes Assember von Fortran Code

Filed under: Allgemein — Tags: , — Thomas @ 13:05

Analog zum letzten Beispiel mit C++ hier die angefangene Funktion mit Fortran. Auf die Getter Funktionen und ctors habe ich erst mal verzichtet. In Fortran ist es ohnehin nicht möglich immer vereinheitlichten Code zu schreiben, und es ist auch so schon schlimm genug.

module geometrymodule
  implicit none 
  type Point2D_t
    real(8) :: xp, yp
    
    contains
    
    procedure :: Point2Dminus
    generic :: operator(-) => Point2Dminus
  end type
  
  type Line2D_t
    type(Point2D_t) :: pt1, pt2
  end type

  contains
  
  pure function Point2Dminus(this, point1) result(point2)
    implicit none
    class(Point2D_t), intent(in) :: this
    type(Point2D_t), intent(in) :: point1
    type(Point2D_t) :: point2

    point2%xp = this%xp - point1%xp
    point2%yp = this%yp - point1%yp
  end function
end module

function func(line1, line2) result(denominator) 
  use geometrymodule
  implicit none
  type(Line2D_t), intent(in) :: line1, line2
  type(Point2D_t) :: a, b
  real(8) :: denominator
  
  a = line1%pt2 - line1%pt1
  b = line2%pt1 - line2%pt2
  denominator = a%yp * b%xp - a%xp * b%yp
end function

Und hier der Fortran Code. Zur Erinnerung: Wir brauchen fünf Subtraktionen, zwei Multiplikationen und vier explizite Kopier-Befehle in C++. In Fortran zähle ich nur vier Subtraktionen, dafür eine Addition, zwei Multiplikationen, 15 Kopier (MOV) Befehle, ein push/pop Paar für den Stack, zwei Funktionsaufrufe, mit Point2D_t sind es sogar vier, und vier leaq Aufrufe. LEA steht für Load Effective Address. Hell NO. Das war mit Optimierung O1.

Point2d_t:
	movq	(%rdi), %rax
	movq	8(%rdi), %rdx
	movq	%rax, (%rsi)
	movq	%rdx, 8(%rsi)
	ret

point2dminus:
	movq	(%rdi), %rax
	movsd	8(%rax), %xmm1
	movsd	(%rax), %xmm0
	subsd	(%rsi), %xmm0
	subsd	8(%rsi), %xmm1
	ret

func_:
	pushq	%rbx
	subq	$48, %rsp
	movq	%rsi, %rbx
	movq	Point2d_t, 24(%rsp)
	leaq	16(%rdi), %rax
	movq	%rax, 16(%rsp)
	movq	%rdi, %rsi
	leaq	16(%rsp), %rdi
	call	point2dminus
	movsd	%xmm0, (%rsp)
	movsd	%xmm1, 8(%rsp)
	movq	Point2d_t, 40(%rsp)
	movq	%rbx, 32(%rsp)
	leaq	16(%rbx), %rsi
	leaq	32(%rsp), %rdi
	call	point2dminus
	mulsd	8(%rsp), %xmm0
	mulsd	(%rsp), %xmm1
	subsd	%xmm1, %xmm0
	addq	$48, %rsp
	popq	%rbx
	ret

Mit Optimierung O2 wird es besser. Auf einmal sind es sieben Subtraktionen, dafür keine Addition mehr. Es bleibt bei zwei Multiplikationen. 13 Kopier MOV Befehle. Der Stack, LEA und die Funktionsaufrufe fallen weg. Hmmm, aber die Funktionen Point2d_t und point2dminus sind immer noch da. Dead Code elimination funktioniert also nicht. Damit bleiben effektiv fünf Subtraktionen, zwei Multiplikationen und 6 Kopier/MOV übrig. Naja immerhin.

Point2d_t:
	movq	(%rdi), %rax
	movq	8(%rdi), %rdx
	movq	%rax, (%rsi)
	movq	%rdx, 8(%rsi)
	ret

point2dminus:
	movq	(%rdi), %rax
	movsd	8(%rax), %xmm1
	movsd	(%rax), %xmm0
	subsd	8(%rsi), %xmm1
	subsd	(%rsi), %xmm0
	ret

func_:
	movsd	(%rsi), %xmm0
	movapd	%xmm0, %xmm1
	movsd	24(%rdi), %xmm0
	subsd	16(%rsi), %xmm1
	subsd	8(%rdi), %xmm0
	mulsd	%xmm1, %xmm0
	movsd	8(%rsi), %xmm1
	movapd	%xmm1, %xmm2
	movsd	16(%rdi), %xmm1
	subsd	24(%rsi), %xmm2
	subsd	(%rdi), %xmm1
	mulsd	%xmm2, %xmm1
	subsd	%xmm1, %xmm0
	ret

Werden hingegen Getter Funktionen für Point2D x und y implementiert, um ein einheitliches Interface für alle geometrischen Datentypen bereit zu stellen, was in C++ immer ohne Overhead möglich ist, kommen wieder Funktionsaufrufe und LEA Instruktionen in den Code. Und die gehn auch nicht mehr weg, egal mit welcher Optimierungsstufe.

08.09.2017

C&P Bug of the Day

Filed under: Allgemein — Tags: — Thomas @ 15:09

Keine Ahnung wie das passiert ist. git diff zeigte es dann. Ich wusste garnicht, dass so etwas überhaupt compiliert. FORTRAN ist scheiße :D


-      ContAdvection(1) = NABLA_U(1,2) + NABLA_U(2,3)
+      ContAdvection(1) =  (1,2) + NABLA_U(2,3)

04.09.2017

C++ Guns - Semantik und concepts - Part 2

Filed under: Allgemein — Tags: , — Thomas @ 14:09

Im letzten Post wollte ich deutlich machen, welche Art von Fehler passieren, wenn zwischen den TypenPoint3D und Vector3D keinen semantischen Unterschied gemacht wird. Also, wenn ein Rechenergebnis von der Bedeutung ein Vektor ist, es aber in einem Punkt Datentyp gespeichert wird.

Die erste Lösung war, einen zusätzlichen Datentyp Vector3D zu erstellen, welcher sich genau wie ein Point3D verhält, aber von der Bedeutung eine andere hat. Dies lässt sich auch noch ohne concepts realisieren. Dazu das selbe Beispiel von Part1 diesmal in FORTRAN. Und im nächsten Teil gibts dann das erste concept.

Da ich nicht weiß ob es in FORTRAN möglich ist, von einem Array zu erben, noch ob man operator()() überladen kann, hier eine Version mit etwas mehr Tipparbeit.

Wichtig ist aber erstmal nur, dass die selbe Art von Fehlermeldung erzeugt werden kann.

conceptpart2.F90
normal = normalenVector(tri)
1
Error: Can't convert TYPE(vector3d_t) to TYPE(point3d_t) at (1)

conceptpart2.F90.gz

module test_m
 implicit none
 
  type Point3D_t
    real(8) :: xyz(3)
  end type

  type Vector3D_t
    real(8) :: xyz(3) 
  end type


  type Triangle_t
      type(Point3D_t) :: points(3)
  end type

  interface operator(-)
    module procedure minus1
  end interface
  
  contains

  function minus1(p1, p2) result(p3)
    implicit none
    type(Point3D_t), intent(in) :: p1, p2
    type(Point3D_t) :: p3

    p3 = Point3D_t( (/ p1%xyz(1)-p2%xyz(1), p1%xyz(2)-p2%xyz(2), p1%xyz(3)-p2%xyz(3) /) )
  end function

  function normalenVector(tri) result(normal)
    implicit none
    type(Triangle_t), intent(in) :: tri
    type(Point3D_t) :: vec1, vec2
    type(Vector3D_t) :: normal
    
      vec1 = tri%points(2)-tri%points(1)
      vec2 = tri%points(3)-tri%points(1)

      normal = Vector3D_t( (/ vec1%xyz(2)*vec2%xyz(3)-vec1%xyz(3)*vec2%xyz(2), &
&                  vec1%xyz(3)*vec2%xyz(1)-vec1%xyz(1)*vec2%xyz(3), &
&                  vec1%xyz(1)*vec2%xyz(2)-vec1%xyz(2)*vec2%xyz(1) /) )
  end function
 
end module

program main
  use test_m
  implicit none
  type(Triangle_t) :: tri
  type(Point3D_t) :: normal
    
  tri = Triangle_t((/ Point3D_t((/0, 0, 0/)), Point3D_t((/10, 0, 0/)), Point3D_t((/10, 10, 0/)) /))
  normal = normalenVector(tri)
end program


15.06.2017

FORTRAN: parameter variables in module evaluated at compile or runtime?

Filed under: Allgemein — Tags: — Thomas @ 10:06

Suggest you habe a fortran module with a collection of mathematical constant like PI.


module constant_m
  real, parameter :: PI = 3.1314_8
end module

Now you use this module in a function which calculate the circumcenter of a circle with given radius.


function U(r) 
  use constant_m
  implicit none
  real, intent(in) :: r
  real :: U
  U = 2*PI*r
end function

Will the expression 2*PI be evaluated a compile or runtime?
Lets check the generated assember code!
...
And you'll see only crap.

Let's start with an simpler example. Replace every real to integer. Truncate PI to 3, thats fair enough. And remove the "2*" expression. Somewhere in the assembler code are there must be the expression y=3*x.
Here it is:
gfortran --save-temps -O1 -c constants.F90 circle.F90


  movl	4(%esp), %eax        % Store the adress of r from stack pointer into eax
  movl	(%eax), %eax         % Dereference adress and store value from r into eax
  leal	(%eax,%eax,2), %eax  % eax = eax + 2*eax 

The last line is the one. The compiler optimize the multiplicaton with 3 to an assembler instrctuin which performs a multiplication with 2 and an addition.
What we not see is a constant "3" in assember code. Or a assembler register filled with "3".

Lets add the "2*" and see whats happens.


  movl	4(%esp), %eax
  movl	(%eax), %eax
  leal	(%eax,%eax,2), %eax
  addl	%eax, %eax

Only the last line is new. The compiler decide that x+x is more performance than 2*x. Cute.

Okay lets have some fun. We force the compiler to evaluate (2*PI) first. (2*PI)*r
And the resulting assembler code is the same HAHA.

Okay back to topic. We see that integer, parameter variables in modules are evaluated at compile time. Lets switch back to real.

Now the assembler code looks a litte bit more complicated:


  flds	.LC0@GOTOFF(%eax)   % Load constant into floading point register
  movl	4(%esp), %eax       % Store value from r into eax
  fmuls	(%eax)              % Multiply the constant with r and put the result back in eax
.LC0:
  .long	1086918230          % The constand 2*PI

Now you can see a constant in assember as a 4byte iteger. Its only one constant, so I guess its 2*PI here.
This constant is loaded into a floting point register. The value from r is loaded into a second register. Then both get multiplyed. Thats all. Only a single multiplaction.

PARAMETER VARIABLES IN MODULE ARE EVALUATED AT COMPILE TIME! :D

31.05.2017

FORTRAN - NaN and min() max()

Filed under: Allgemein — Tags: — Thomas @ 19:05

What happens if one put NaN into min() or max()? Will it return NaN or the other number? Is it deterministic or random?
Sounds easy to test but its hard to implement. For example: How one get a quite-NaN number? A quite-NaN dosen't trigger debug traps, signaling-NaN do. There is no standard way in pre Fortran 2003! One have to google some nasty integer-float bit tricks. But with Fortran 2003 there is an intrinsic module ieee_arithmetic, which has the function ieee_value() which can return a quite-NaN. And one can test for Nan wich the function ieee_is_nan().

Remeber, C++ provide this functionality with std::numeric_limits::quite_nan() in C++1998 or earlier.

So the test ist quite simple. We put a number and NaN into min() and compare the result with our expected number. We do this for every combination and for min() and max(). The result for gfortran 6.3.2 32bit is: The returned number is alway not NaN, expect for the case where both parameter are NaN. Then, the function has no choice and returns NaN.

Back to the implementation. We want an automatic test, to quick start the program on different hardware and software settings. With a nice console output what happens, and what happens wrong.
There are two ways to do this. The easy one with lots of code duplication. And the right on, with only on test() function which get called for one test.
The test() function gets a varliable a, b which we wanna test. x als the expected result. A procedure pointer to either min() or max() and a short string with the function name we wanna test. For example:

call test(func, "min", 3.0, NAN, 3.0);

But... one can not get a procedure pointer on a intrinsic function!?? And if you implement your own function, it has to be a subroutine!??

Fortran is broken by design.

This ends up in lot more code and paint than is has to. See my next post where I implement this test in C++. Without pain and only half of code size.

Here is the 56line problem:


module test_m
  use, intrinsic :: IEEE_ARITHMETIC
  contains
  
  subroutine test(func, funcname, a, b, x) 
    implicit none
    procedure(), pointer :: func => null()  
    real, intent(in) :: a, b, x
    character(len=*), intent(in) :: funcname
    real :: res
    
    call func(a,b,res)
    write(*,*) funcname, "(",a, ",", b, ") =", res
    
    if(ieee_is_nan(x)) then
      if(.not. ieee_is_nan(res)) then
        write(*,*) "But sould be", x
      endif
    else if(res /= x) then
      write(*,*) "But should be", x
    endif    
  end subroutine

  subroutine mymin(a,b, x)
    implicit none
    real, intent(in) :: a, b
    real, intent(out) :: x
    x = min(a,b)
  end subroutine
  
  subroutine mymax(a,b, x)
    implicit none
    real, intent(in) :: a, b
    real, intent(out) :: x
    x = max(a,b)
  end subroutine
end module

program fmin    
  use test_m
  implicit none
  real :: NAN
  procedure(), pointer :: func => null()
  
  NAN = IEEE_VALUE(NAN, IEEE_QUIET_NAN)

  func => mymin  
  call test(func, "min", -3.0, NAN, -3.0)
  call test(func, "min", NAN, -3.0, -3.0)
  call test(func, "min", NAN, NAN,  NAN)

  func => mymax  
  call test(func, "max", -3.0, NAN, -3.0)
  call test(func, "max", NAN, -3.0, -3.0)
  call test(func, "max", NAN, NAN,  NAN)
end program

11.05.2017

C++ Guns - Why isn't there a swap function in Fortran and C?

Filed under: Allgemein — Tags: , , — Thomas @ 09:05

Because Fortran and C sucks.
To implement a generic swap function which works for any type and any custom type, one needs templates. Think about usually variables which store a value. Templates are variables which store types. But Fortran and C do not have template techniques. So one must implement a swap function for every type and any future custom type by himself. And pay the cost of a not-inline function call.

With C++ you don't even have to think about it.


template<class T>
inline void std::swap(T& a, T& b) noexcept;

26.06.2016

malloc info fortran

Filed under: Allgemein — Tags: — Thomas @ 00:06

Code snipped to get detailed information about memory usage

mallocinfo.f95.zip

Related: Code snipped to read /proc status with fortran.


module MallocInfo_m
  use :: iso_c_binding
  implicit none
  
    !> This structure type is used to return information about the dynamic memory allocator.
    type, bind(c) :: MallInfo_t
      !> This is the total size of memory allocated with sbrk by malloc, in bytes.
      integer(c_int) :: arena                  
      !> This is the number of chunks not in use. (The memory allocator internally gets chunks of memory from the operating system, and then carves them up to satisfy individual malloc requests; see Efficiency and Malloc.)
      integer(c_int) :: ordblks        
      !> This field is unused.
      integer(c_int) :: smblks        
      !> This is the total number of chunks allocated with mmap.
      integer(c_int) :: hblks        
      !> This is the total size of memory allocated with mmap, in bytes.
      integer(c_int) :: hblkhd        
      !> This field is unused.
      integer(c_int) :: usmblks        
      !> This field is unused.
      integer(c_int) :: fsmblks        
      !> This is the total size of memory occupied by chunks handed out by malloc.
      integer(c_int) :: uordblks        
      !> This is the total size of memory occupied by free (not in use) chunks.
      integer(c_int) :: fordblks        
      !> This is the size of the top-most releasable chunk that normally borders the end of the heap (i.e., the high end of the virtual address space’s data segment).
      integer(c_int) :: keepcost      
    end type
    
    interface
      function mallinfo() bind(c, name="mallinfo") result(data)
        use :: iso_c_binding
        implicit none
        
        type, bind(c) :: MallInfo_t      
          integer(c_int) :: arena                        
          integer(c_int) :: ordblks              
          integer(c_int) :: smblks        
          integer(c_int) :: hblks        
          integer(c_int) :: hblkhd        
          integer(c_int) :: usmblks        
          integer(c_int) :: fsmblks        
          integer(c_int) :: uordblks        
          integer(c_int) :: fordblks        
          integer(c_int) :: keepcost      
        end type
        type(MallInfo_t) :: data
      end function
    end interface
  
  contains
  
  subroutine getMallocInfo(malinfo)
    implicit none
    type(MallInfo_t), intent(out) :: malinfo
    malinfo = mallinfo()
  end subroutine
  
  subroutine printMallInfo(malinfo)
    implicit none
    type(MallInfo_t), intent(in) :: malinfo
    write(*,*) "Total size of memory allocated with sbrk by malloc in byte.  ", malinfo%arena
    write(*,*) "Total size of memory allocated with mmap, in bytes.          ", malinfo%hblkhd
    write(*,*) "Total size of memory occupied by chunks handed out by malloc.", malinfo%uordblks
    write(*,*) "Total number of chunks allocated with mmap.                  ", malinfo%hblks
    write(*,*) "Number of chunks not in use.                                 ", malinfo%ordblks
    write(*,*) "Total size of memory occupied by free (not in use) chunks.   ", malinfo%fordblks
    write(*,*) "Size of the top-most releasable chunk borders end of the heap", malinfo%keepcost    
  end subroutine
  
  
end module

program test
  use MallocInfo_m
  implicit none
  type(MallInfo_t) :: mallinfos(10000)  
  integer :: i, nInfos
  integer, allocatable :: data(:)
  
  allocate(data(0))
  nInfos = 0
  do i=1, 10
    write(*,*) "Iteration",i
    deallocate(data)
    allocate(data(i*100000))
    nInfos = nInfos+1
    call getMallocInfo(mallinfos(nInfos))    
    call printMallInfo(mallInfos(nInfos))
    call sleep(1)
  end do
  
   do i=10, 1, -1
    write(*,*) "Iteration",i
    deallocate(data)
    allocate(data(i*100000))
    nInfos = nInfos+1
    call getMallocInfo(mallinfos(nInfos))    
    call printMallInfo(mallInfos(nInfos))    
    call sleep(1)
  end do
  
  write(*,*) "Total size of memory allocated with sbrk. min, mean, max", minval(mallinfos(1:nInfos)%arena), sum(mallinfos(1:nInfos)%arena)/nInfos, maxval(mallinfos(1:nInfos)%arena)
end program

./a.out
Iteration 1
Total size of memory allocated with sbrk by malloc in byte. 135168
Total size of memory allocated with mmap, in bytes. 401408
Total size of memory occupied by chunks handed out by malloc. 7080
Total number of chunks allocated with mmap. 1
Number of chunks not in use. 1
Total size of memory occupied by free (not in use) chunks. 128088
Size of the top-most releasable chunk borders end of the heap 128088
Iteration 2
Total size of memory allocated with sbrk by malloc in byte. 135168
Total size of memory allocated with mmap, in bytes. 802816
Total size of memory occupied by chunks handed out by malloc. 7080
Total number of chunks allocated with mmap. 1
Number of chunks not in use. 1
Total size of memory occupied by free (not in use) chunks. 128088
Size of the top-most releasable chunk borders end of the heap 128088
.
.
.
Total size of memory allocated with sbrk. min, mean, max 135168 155443 540672

« Newer PostsOlder Posts »

Powered by WordPress