C++Guns – RoboBlog blogging the bot

26.07.2024

C++ Guns: Overloading friend operator for class template - the right way

Filed under: Allgemein — Tags: — Thomas @ 14:07

Goal

Overload operators such as += and << as free functions for a class template with private member variables, without boilerplate code.
A free function is chosen because it couples less tightly to the class than a member function.
For a concrete use case, imagine Type in the example below as an Array2D.

Solution

template<typename T>
class Type {
public:
    // free function form of the operator
    // no repeat of the template parameters for class Type necessary
    // with the friend keyword this is still a free function but can 
    // access private variables like a member function
    friend Type& operator+=(Type& lhs, double value) {
        lhs.data += value; // can access private member variable
        return lhs;
    }

private:
    double data = 0;
};


void f(Type<double>& x) {
    x += 2.0;
}
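
The same pattern carries over to the stream operator mentioned in the goal. The following is only a minimal sketch: the class name Box, its constructor and the helper toString are made up for illustration and stand in for Type or Array2D.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the Type/Array2D class of this post.
template<typename T>
class Box {
public:
    explicit Box(T v) : data(v) {}

    // Free function defined inside the class: no template header to repeat.
    friend Box& operator+=(Box& lhs, const T& value) {
        lhs.data += value;
        return lhs;
    }

    // Same pattern for stream output, still with access to private data.
    friend std::ostream& operator<<(std::ostream& os, const Box& rhs) {
        return os << "Box(" << rhs.data << ")";
    }

private:
    T data;
};

// Illustration helper: format any streamable object into a string.
template<typename U>
std::string toString(const U& x) {
    std::ostringstream os;
    os << x;
    return os.str();
}

void example() {
    Box<int> b(1);
    b += 2;                          // found via argument-dependent lookup
    assert(toString(b) == "Box(3)");
}
```

Both operators are found only through argument-dependent lookup, so they do not clutter overload resolution for unrelated types.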

Not a Solution

// forward declaration, not usual
template<typename T>
class Type;

// The definition of the function is placed before the definition of the type.
// This is also not usual but needed for the friend declaration later in the code.
// Repeat of the template parameter.
template<typename T>
inline Type<T>& operator+=(Type<T>& lhs, double value) {
    lhs.data += value;    
    return lhs;
}

template<typename T>
class Type {

private:
    double data = 0;

    // templated friend declaration. watch the necessary but very unusual <> here.    
    friend Type& operator+= <>(Type& lhs, double value);
};

void f(Type<double>& x) {
    x += 2.0;
}

C.4: Make a function a member only if it needs direct access to the representation of a class https://isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines#Rc-member
Sources: https://stackoverflow.com/questions/4660123/overloading-friend-operator-for-class-template

03.04.2024

C++ Guns: OmegaException class by Peter Muldoon

Filed under: Allgemein — Tags: — Thomas @ 21:04

Let's try out whether this makes exceptions easier to handle...
From: Exceptionally Bad: The Misuse of Exceptions in C++ & How to Do Better - Peter Muldoon - CppCon 2023

Conclusion of the talk:

What should they be used for?
• Error tracing (logging)
• Stack unwinding (reset / termination)
• Data passing / Control flow (when necessary)

When should they be used?
• As “Rarely” as possible
• Serious/infrequent/unexpected errors
• With as few exception types as possible (as determined by catch functionality)

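#include <iostream>
#include <source_location> // C++20
#include <stacktrace>      // C++23
#include <string>
#include <utility>

template<typename DATA_T>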
class OmegaException {
public:
  OmegaException(std::string str, DATA_T data, const std::source_location& loc = std::source_location::current(), std::stacktrace trace = std::stacktrace::current())
:err_str_{std::move(str)}, user_data_{std::move(data)}, location_{loc}, backtrace_{trace} { }

  std::string& what() { return err_str_; }
  const std::string& what() const noexcept { return err_str_; }

  const std::source_location& where() const noexcept { return location_; }
  const std::stacktrace& stack() const noexcept { return backtrace_; }

  DATA_T& data(){ return user_data_;}
  const DATA_T& data() const noexcept { return user_data_; }

private:
  std::string err_str_;
  DATA_T user_data_;
  const std::source_location location_;
  const std::stacktrace backtrace_;
};

std::ostream& operator << (std::ostream& os, const std::source_location& location) {
  os << location.file_name() << "("
  << location.line() << ":"
  << location.column() << "), function `"
  << location.function_name() << "`";
  return os;
}

std::ostream& operator << (std::ostream& os, const std::stacktrace& backtrace) {
  for(auto iter = backtrace.begin(); iter != (backtrace.end()-3); ++iter) {
    os << iter->source_file() << "(" << iter->source_line()
   << ") : " << iter->description() << "\n";
  }
  return os;
}

///////////////

enum Errs1 { bad = 1, real_bad };
enum class Errs2 { not_bad = 10, not_good };

using MyExceptionErrs1 = OmegaException<Errs1>;
using MyExceptionErrs2 = OmegaException<Errs2>;

throw MyExceptionErrs1("Bad Order id", real_bad);

catch(const MyExceptionErrs1& e) {
  std::cout << "Failed to process with code (" << e.data() << ") : "
  << e.what() << "\n" << e.where() << std::endl;
}

/*
Failed to process with code (2) : Bad Order id
/app/example.cpp(76:69), function `Order findOrder(unsigned int
*/


//////////////

struct bucket {
  int id_;
  Msg msg_;
};

using MyExceptionBucket = OmegaException< bucket >;

throw MyExceptionBucket ("bad error", bucket{cliendId, amsg});

catch(MyExceptionBucket& eb) {
  std::cout << "Failed to process id (" << eb.data().id_ << ") : "
  << eb.what() << "\n" << eb.stack() << std::endl;
  send_error(eb.data().msg_);
}

/*
Failed to process id (222) : Bad Order id
example.cpp(76) : findOrder(unsigned int)
example.cpp(82) : processOrder(unsigned int)
example.cpp(97) : main
*/


27.02.2024

C++ Guns: Analyse AVX _mm256_i32gather_pd instruction

Filed under: Allgemein — Tags: , — Thomas @ 17:02

In scientific applications that use triangular grids, it is often necessary to load a value from each of the three vertices of a triangle, and these values are scattered across RAM. To use SIMD instructions to speed up the program, the three values must be placed in a single SIMD register, e.g. to perform a parallel multiplication (for a scalar product). One approach to achieve this is to load the values one by one into multiple registers and combine them in an additional step. Would it be better to load the three values in one step into one register? This can be done with the AVX2 gather instruction.

As I don't know how to use SIMD intrinsics in Fortran, this example is C++ only.

We start with the naive approach of loading three values into a SIMD register using intrinsic data types such as __m256d, which can hold up to four 64-bit double-precision floating-point values, of which only three are actually used.

__m256d f1(const std::vector<double>& values, const std::array<int,3>& idx) {
    const __m256d vec{ values[idx[0]], values[idx[1]], values[idx[2]]};
    return vec;
}

This generates the following assembly code, independent of the optimization flag passed (-O1, -O2, -O3):

        movq    (%rdi), %rax                         #                                                           
        movslq  (%rsi), %rcx                         # load idx0 in register rcx                                 
        movslq  4(%rsi), %rdx                        # load idx1 in register rdx                                 
        movslq  8(%rsi), %rsi                        # load idx2 in register rsi                                 
        vmovsd  (%rax,%rcx,8), %xmm0                 # load values[idx[0]] into xmm0                             
        vmovq   (%rax,%rsi,8), %xmm1                 # load values[idx[2]] into xmm1
        vmovhpd (%rax,%rdx,8), %xmm0, %xmm0          # load values[idx[1]] into high part of xmm0
        vinsertf128     $0x1, %xmm1, %ymm0, %ymm0    # combine xmm0 and xmm1 into 256bit register ymm0
        ret 

To load 3 values at once we can use the _mm256_mask_i32gather_pd intrinsic, which takes a base address and the offsets in a variable of type __m128i. To load only 3 instead of 4 values, there is an additional mask argument.

The C++ code looks like this now:

__m256d f2(const std::vector<double>& values, const std::array<int,3>& idx) {
    // mask to load 3 instead of 4 values
    const __m128i mask = _mm_setr_epi32(-1, -1, -1, 0);
    // Copy Element Node IDs into SIMD register
    const __m128i vindex = _mm_maskload_epi32(idx.data(), mask);
    // Load 3 double values
    constexpr int scale = 8;
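    // mask to load 3 values, this time as a __m256d variable (the docs seem to say it must be __m256i, bug?)
    // only the sign bit of each element is evaluated, so -1 selects a lane and 0 skips it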
    const __m256d mask2 = _mm256_setr_pd (-1, -1, -1, 0);
    const __m256d vec{_mm256_mask_i32gather_pd (__m256d{}, values.data(), vindex, mask2, scale)};
    return vec;
}

This generates the following assembly code, independent of the optimization flag passed (-O1, -O2, -O3):

        movq    (%rdi), %rax                          #                                                     
        vmovapd .LC1(%rip), %ymm2                     # load mask2 into ymm2
        vxorpd  %xmm0, %xmm0, %xmm0                   # zero register xmm0                                    
        vmovdqa .LC0(%rip), %xmm1                     # load mask1 into xmm1
        vpmaskmovd      (%rsi), %xmm1, %xmm1          # load idx into xmm1  
        vgatherdpd      %ymm2, (%rax,%xmm1,8), %ymm0  # load values[idx] into ymm0
        ret
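
To make the semantics of the masked gather concrete, here is a scalar model in plain C++. It is only a sketch for illustration: maskedGather is a made-up helper, not an intrinsic, and it says nothing about performance.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Scalar model of a 4-lane masked gather (hypothetical helper, not an intrinsic).
// Lane i receives base[index[i]] if mask[i] is set, otherwise it keeps src[i].
std::array<double,4> maskedGather(const std::array<double,4>& src,
                                  const double* base,
                                  const std::array<int,4>& index,
                                  const std::array<bool,4>& mask) {
    std::array<double,4> dst = src;
    for(std::size_t lane = 0; lane < dst.size(); ++lane) {
        if(mask[lane]) {
            dst[lane] = base[index[lane]];   // masked-off lanes never touch memory
        }
    }
    return dst;
}

void example() {
    const std::vector<double> values{10.0, 11.0, 12.0, 13.0, 14.0};
    const std::array<int,4> idx{4, 0, 2, 0};                 // the 4th index is unused
    const std::array<bool,4> mask{true, true, true, false};  // load 3 of 4 lanes
    const std::array<double,4> zero{0.0, 0.0, 0.0, 0.0};
    const auto vec = maskedGather(zero, values.data(), idx, mask);
    assert(vec[0] == 14.0 && vec[1] == 10.0 && vec[2] == 12.0 && vec[3] == 0.0);
}
```

The fourth lane keeps the value from the source register, which corresponds to the zeroed first argument in f2.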

With AVX512 it is even possible to gather and scatter 3 values at once. This time the mask variable is easier to create.

void f3(std::vector<double>& values, const std::array<int,3>& idx) {
    // Copy Element Node IDs into SIMD register
    constexpr int mask = 0b00000111;
    const __m128i vindex = _mm_maskz_loadu_epi32(mask, idx.data());
    // Load 3 double values
    constexpr int scale = 8;
    const __m256d vec{_mm256_mmask_i32gather_pd (__m256d{}, mask, vindex, values.data(), scale)};        
    // Store 3 double values
    _mm256_mask_i32scatter_pd(values.data(), mask, vindex, vec, scale);
}

This generates the following assembly code, independent of the optimization flag passed (-O1, -O2, -O3):

        movl    $7, %eax                              # load mask into eax
        vxorpd  %xmm0, %xmm0, %xmm0                   # zero register xmm0
        kmovw   %eax, %k1                             # convert mask into the mask register k1
        movq    (%rdi), %rax                          # 
        vmovdqu32       (%rsi), %xmm1{%k1}{z}         # load idx into xmm1
        kmovw   %k1, %k2                              # copy mask register: the gather clears its mask, the scatter needs its own
        vgatherdpd      (%rax,%xmm1,8), %ymm0{%k2}    # load values[idx] into ymm0
        vscatterdpd     %ymm0, (%rax,%xmm1,8){%k1}    # store ymm0 into values[idx]
        vzeroupper                                    # could this be optimized away?
        ret

To measure this, I created a small test case with the presented variants and ran it on the following CPUs: Intel Core i5-10500, Intel Xeon E5-2695 v4, Intel Core i9-7900X. Unfortunately, there is no speedup using AVX. There is no real slowdown either (apart from 1-2%).

https://www.intel.com/content/www/us/en/docs/intrinsics-guide/index.html#text=m256_i32gather&expand=4474&ig_expand=4252,4250,2125,3720,4058,3723,3723&techs=AVX_ALL

24.07.2023

C++ Guns: DOS codepage 437 to UTF8

Filed under: Allgemein — Tags: , — Thomas @ 09:07

If the usual encodings such as UTF-8, Latin-1 or ISO-8859-15 do not work, CP437 is worth a try. It is the original character set of the IBM PC, introduced in 1981. It contains umlauts that are not displayed when one of the commonly configured encodings is used.

Here is my first attempt at converting CP437 to UTF-8. The source file, the compiler and the whole operating system are of course set to UTF-8, otherwise it does not work.

#include <string>
#include <string_view>

std::string cp437toUTF8(std::string_view str) {
    std::string result;
    result.reserve(str.size());
    for(unsigned char c : str) {
        switch(c) {
        case 129: result.append("ü"); break;
        case 132: result.append("ä"); break;
        case 142: result.append("Ä"); break;
        case 148: result.append("ö"); break;
        case 153: result.append("Ö"); break;
        case 154: result.append("Ü"); break;
        case 225: result.append("ß"); break;
        default: result.push_back(c);
        }
    }

    return result;
}
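
The dependency on the source file encoding can be avoided by writing the UTF-8 bytes as escape sequences. The following variant is only a sketch; the name cp437toUTF8Escaped is made up here, and like the function above it handles just the German umlauts and passes all other bytes through unchanged (so other CP437 bytes above 127 would still produce invalid UTF-8).

```cpp
#include <cassert>
#include <string>
#include <string_view>

// Variant of cp437toUTF8 with the UTF-8 bytes spelled out as escapes,
// so the result does not depend on the encoding of the source file.
std::string cp437toUTF8Escaped(std::string_view str) {
    std::string result;
    result.reserve(str.size());
    for(unsigned char c : str) {
        switch(c) {
        case 129: result.append("\xC3\xBC"); break; // ü U+00FC
        case 132: result.append("\xC3\xA4"); break; // ä U+00E4
        case 142: result.append("\xC3\x84"); break; // Ä U+00C4
        case 148: result.append("\xC3\xB6"); break; // ö U+00F6
        case 153: result.append("\xC3\x96"); break; // Ö U+00D6
        case 154: result.append("\xC3\x9C"); break; // Ü U+00DC
        case 225: result.append("\xC3\x9F"); break; // ß U+00DF
        default: result.push_back(static_cast<char>(c));
        }
    }
    return result;
}

void example() {
    // "Grüße" in CP437: ü is byte 0x81, ß is byte 0xE1.
    // The literals are split so that the hex escapes do not swallow the letter 'e'.
    const std::string utf8 = cp437toUTF8Escaped("Gr\x81\xE1" "e");
    assert(utf8 == "Gr\xC3\xBC\xC3\x9F" "e");
}
```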

[1] https://de.wikipedia.org/wiki/Codepage_437
[2] https://www.ascii-code.com/CP437

19.06.2023

C++ Guns: How to convert from UTC to local time in C++?

Filed under: Allgemein — Tags: — Thomas @ 11:06

Convert a broken-down date/time structure from UTC to local time in C++.

See Stack Overflow: How to convert from UTC to local time in C?
I converted the code from C to C++ and made it shorter.

// replace non std function strptime with std::get_time

#include <cerrno>
#include <ctime>
#include <iomanip>
#include <iostream>
#include <sstream>
#include <stdexcept>
#include <string>
#include <system_error>

/*
  Convert from a broken down date time structure from UTC to localtime.
  See https://stackoverflow.com/questions/9076494/how-to-convert-from-utc-to-local-time-in-c
*/
std::tm UTC2localtime(std::tm tp) {
  // make sure Daylight saving time flag is not accidentally switched on in UTC time
  tp.tm_isdst = 0;

  // get seconds since EPOCH for this time
  const time_t utc = std::mktime(&tp);
  std::cout << "UTC date and time in seconds since EPOCH: " << utc << "\n";

  // convert UTC date and time (Jan. 1, 1970) to local date and time
  std::tm e0{};
  e0.tm_mday = 1;
  e0.tm_year = 70;

  // get time_t EPOCH value for e0. This handles daylight saving stuff.
  // The value is e.g. -3600 for 1h difference between the timezones
  const time_t diff = std::mktime(&e0);

  // calculate local time in seconds since EPOCH
  const time_t local = utc - diff;
  std::cout << "local date and time in seconds since EPOCH: " << local << "\n";

  // convert seconds since EPOCH for local time into local_tm time structure
  std::tm local_tm;
  if(localtime_r(&local, &local_tm) == nullptr) {
    throw std::system_error(errno, std::generic_category(), "UTC2localtime(): in conversion from UTC to localtime");
  }
  return local_tm;
}

int main() {
  // hard coded date and time in UTC
  std::string datetime = "2013 11 30 23 30 26";
  std::cout << "UTC date and time to be converted in local time: " << datetime << "\n";

  // put values of datetime into time structure
  std::tm UTC_tm{};
  std::istringstream ss(datetime);
  ss >> std::get_time(&UTC_tm, "%Y %m %d %H %M %S");

  if(ss.fail()) {
    throw std::runtime_error("Cannot parse datetime '" + datetime + "' with format %Y %m %d %H %M %S");
  }

  const std::tm local_tm = UTC2localtime(UTC_tm);
  std::cout << "local date and time: " << std::put_time(&local_tm, "%Y-%m-%d %H:%M:%S %Z") << "\n";
}
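
On systems that provide the non-standard timegm() function (glibc and the BSDs do; this is an assumption about the platform), the epoch offset trick can be skipped, because timegm() interprets the broken-down time directly as UTC. The following is only a sketch of that alternative; the name utcToLocal and the fixed time zone in the example are made up for illustration.

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>
#include <ctime>
#include <system_error>

// Assumes a POSIX-like system: timegm(), localtime_r(), setenv() and tzset()
// are not part of standard C++.
std::tm utcToLocal(std::tm utc_tm) {
    // interpret the fields as UTC and get seconds since the epoch
    const std::time_t t = ::timegm(&utc_tm);

    // convert the time stamp into broken-down local time
    std::tm local_tm{};
    if(::localtime_r(&t, &local_tm) == nullptr) {
        throw std::system_error(errno, std::generic_category(), "utcToLocal()");
    }
    return local_tm;
}

void example() {
    // Fix the time zone for a reproducible result (POSIX TZ string for Central Europe).
    ::setenv("TZ", "CET-1CEST,M3.5.0,M10.5.0/3", 1);
    ::tzset();

    std::tm utc{};
    utc.tm_year = 2013 - 1900;
    utc.tm_mon  = 10;            // November, months count from 0
    utc.tm_mday = 30;
    utc.tm_hour = 23;
    utc.tm_min  = 30;
    utc.tm_sec  = 26;

    const std::tm local = utcToLocal(utc);
    // 23:30:26 UTC on 30 November is 00:30:26 on 1 December in CET (UTC+1)
    assert(local.tm_year == 113 && local.tm_mon == 11 && local.tm_mday == 1);
    assert(local.tm_hour == 0 && local.tm_min == 30 && local.tm_sec == 26);
}
```

Since the conversion goes through a real time stamp, daylight saving time is applied for the date in question rather than for 1 January 1970.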

17.09.2022

C++ Guns: throw and catch all standard exceptions for fun

Filed under: Allgemein — Tags: — Thomas @ 19:09

This example throws and catches all standard exceptions, just for fun.
The exception list is taken from https://en.cppreference.com/w/cpp/error/exception
and sorted by C++ standard.

Looking for a stack trace? See https://en.cppreference.com/w/cpp/utility/basic_stacktrace

/* This example throws and catches all standard exceptions, just for fun.
 * The exception list is taken from https://en.cppreference.com/w/cpp/error/exception
 * and sorted by C++ standard.
 *
 * Looking for a stack trace? See https://en.cppreference.com/w/cpp/utility/basic_stacktrace
 */

#include <iostream>
#include <exception>
#include <future>
#include <regex>
#include <filesystem>
#include <chrono>
#include <any>
#include <optional>
#include <variant>
//#include <format> // C++20. Not implemented in GCC 12
// #include <expected> // C++23


int nCatched = 0;

template<typename T>
void check(const T& ex, std::string_view exceptionType) {
  nCatched++;
  std::string_view what(ex.what());
  // only compare the first chars of the what() message, we are not interested in anything after the exception type
  if(what.substr(0,exceptionType.size()) != exceptionType) {
      std::cerr << exceptionType << " what(): " << ex.what() << "\n";
      std::cerr << "\nERROR: Not all exception derived from " << exceptionType << " catched...\n";
      ::exit(EXIT_FAILURE);
    }
}

int main() {

  bool finish = false;
  int nThrow = 0;
  while(not finish){
    try {
      switch(nThrow + 1) { // nThrow starts at 0, the case labels start at 1
        case  1: ++nThrow; throw std::exception();
        case  2: ++nThrow; throw std::logic_error("std::logic_error");
        case  3: ++nThrow; throw std::invalid_argument("std::invalid_argument");
        case  4: ++nThrow; throw std::domain_error("std::domain_error");
        case  5: ++nThrow; throw std::length_error("std::length_error");
        case  6: ++nThrow; throw std::out_of_range("std::out_of_range");
        case  7: ++nThrow; throw std::runtime_error("std::runtime_error");
        case  8: ++nThrow; throw std::range_error("std::range_error");
        case  9: ++nThrow; throw std::overflow_error("std::overflow_error");
        case 10: ++nThrow; throw std::underflow_error("std::underflow_error");
        case 11: ++nThrow; throw std::bad_typeid();
        case 12: ++nThrow; throw std::bad_alloc();
        case 13: ++nThrow; throw std::bad_exception();
        case 14: ++nThrow; throw std::regex_error(std::regex_constants::error_collate); // C++11
        case 15: ++nThrow; throw std::system_error(ENOENT, std::system_category(), "std::system_error"); // C++11
        case 16: ++nThrow; throw std::ios_base::failure("std::ios_base::failure"); // C++11
        case 17: ++nThrow; throw std::future_error(std::future_errc::broken_promise); // C++11
        case 18: ++nThrow; throw std::bad_weak_ptr(); // C++11
        case 19: ++nThrow; throw std::bad_function_call(); // C++11
        case 20: ++nThrow; throw std::bad_array_new_length(); // C++11
        case 21: ++nThrow; throw std::filesystem::filesystem_error("std::filesystem::filesystem_error", std::error_code(ENOENT, std::system_category())); // C++17
        case 22: ++nThrow; throw std::bad_any_cast(); // C++17
        case 23: ++nThrow; throw std::bad_optional_access(); // C++17
        case 24: ++nThrow; throw std::bad_variant_access(); // C++17
        // case 25: throw std::chrono::nonexistent_local_time(); // C++ 20. Not implemented in GCC 12
        // case 26: throw std::chrono::ambiguous_local_time(); // C++20. Not implemented in GCC 12
        // case 27: throw std::format_error(); // C++20. Not implemented in GCC 12
        // case 28: throw std::bad_expected_access(); // C++23
        // case 29: throw std::tx_exception(); TODO
        default: {
          finish = true;
        }
      }
    }
    catch(std::bad_variant_access& ex) {
      check(ex, "bad variant access");
    }
    catch(std::bad_exception& ex) {
      check(ex, "std::bad_exception");
    }
    catch(std::bad_array_new_length& ex) {
      check(ex, "std::bad_array_new_length");
    }
    catch(std::bad_alloc& ex) {
      check(ex, "std::bad_alloc");
    }
    catch(std::bad_function_call& ex) {
      check(ex, "bad_function_call");
    }
    catch(std::bad_weak_ptr& ex) {
      check(ex, "bad_weak_ptr");
    }
    catch(std::bad_optional_access& ex) {
      check(ex, "bad optional access");
    }
    catch(std::bad_any_cast& ex) {
      check(ex, "bad any_cast");
    }
    catch(std::bad_typeid& ex) {
      check(ex, "std::bad_typeid");
    }
    catch(std::filesystem::filesystem_error& ex) {
      check(ex, "filesystem error");
    }
    catch(std::ios_base::failure& ex) {
      check(ex, "std::ios_base::failure");
    }
    catch(std::system_error& ex) {
      check(ex, "std::system_error");
    }
    catch(std::regex_error& ex) {
      check(ex, "Invalid collating element in regular expression"); // regex_error.what does not print the exception type first...
    }
    catch(std::underflow_error& ex) {
      check(ex, "std::underflow_error");
    }
    catch(std::overflow_error& ex) {
      check(ex, "std::overflow_error");
    }
    catch(std::range_error& ex) {
      check(ex, "std::range_error");
    }
    catch(std::runtime_error& ex) {
      check(ex, "std::runtime_error");
    }
    catch(std::future_error& ex) {
      check(ex, "std::future_error");
    }
    catch(std::out_of_range& ex) {
      check(ex, "std::out_of_range");
    }
    catch(std::length_error& ex) {
      check(ex, "std::length_error");
    }
    catch(std::domain_error& ex) {
      check(ex, "std::domain_error");
    }
    catch(std::invalid_argument& ex) {
      check(ex, "std::invalid_argument");
    }
    catch(std::logic_error& ex) {
      check(ex, "std::logic_error");
    }
    catch(std::exception& ex) {
      check(ex, "std::exception");
    }
  } // while


  if(nThrow != nCatched) {
    std::cerr << nThrow << " exception thrown but " << nCatched << " catched\n";
  } else {
    std::cout << "All exceptions which was thrown was catched\n";
  }

  return EXIT_SUCCESS;
}

$ ./a.out
All exceptions which was thrown was catched
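
The order of the catch blocks above matters: a handler for a base class also matches all derived exceptions, so the derived types have to come first. A small sketch of that rule, with a made-up helper caughtBy that reports which handler was selected:

```cpp
#include <cassert>
#include <functional>
#include <new>
#include <stdexcept>
#include <string>

// Report which handler caught the exception thrown by f (illustration only).
std::string caughtBy(const std::function<void()>& f) {
    try {
        f();
    }
    catch(const std::out_of_range&)  { return "out_of_range"; }   // most derived first
    catch(const std::logic_error&)   { return "logic_error"; }
    catch(const std::runtime_error&) { return "runtime_error"; }
    catch(const std::exception&)     { return "exception"; }      // base class last
    return "nothing";
}

void example() {
    assert(caughtBy([]{ throw std::out_of_range("x"); })     == "out_of_range");
    // invalid_argument has no handler of its own, its base class logic_error matches
    assert(caughtBy([]{ throw std::invalid_argument("x"); }) == "logic_error");
    assert(caughtBy([]{ throw std::overflow_error("x"); })   == "runtime_error");
    assert(caughtBy([]{ throw std::bad_alloc(); })           == "exception");
    assert(caughtBy([]{})                                    == "nothing");
}
```

If the std::exception handler were listed first, it would catch everything and the more specific handlers would never run.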

07.09.2022

C++ Guns: Streams display the format flags

Filed under: Allgemein — Tags: — Thomas @ 12:09

Display which format flags are currently set on a stream, e.g. fixed, scientific, dec, hex.

std::ostream& operator<<(std::ostream& s, const std::ios::fmtflags f) {
   if(f & std::ios::boolalpha) s << "boolalpha ";
   if(f & std::ios::dec) s << "dec ";
   if(f & std::ios::hex) s << "hex ";
   if(f & std::ios::oct) s << "oct ";
   if(f & std::ios::fixed) s << "fixed ";
   if(f & std::ios::scientific) s << "scientific ";
   if(f & std::ios::right) s << "right ";
   if(f & std::ios::left) s << "left ";
   if(f & std::ios::internal) s << "internal ";
   if(f & std::ios::showbase) s << "showbase ";
   if(f & std::ios::showpoint) s << "showpoint ";
   if(f & std::ios::showpos) s << "showpos ";
   if(f & std::ios::uppercase) s << "uppercase ";
   if(f & std::ios::unitbuf) s << "unitbuf ";
   if(f & std::ios::skipws) s << "skipws ";
   return s;
}

std::cout << std::cout.flags() << "\n";

Example output

dec fixed skipws 

https://en.cppreference.com/w/cpp/io/ios_base/flags
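
Some of the flags belong to groups in which only one flag should be set at a time, for example dec, hex and oct in std::ios_base::basefield. A short sketch of how such a group can be queried and how manipulators change it; the helper name baseName is made up for illustration:

```cpp
#include <cassert>
#include <ios>
#include <sstream>
#include <string>

// Name of the integer base currently selected on a stream (illustration only).
std::string baseName(const std::ios_base& s) {
    const std::ios_base::fmtflags base = s.flags() & std::ios_base::basefield;
    if(base == std::ios_base::hex) return "hex";
    if(base == std::ios_base::oct) return "oct";
    if(base == std::ios_base::dec) return "dec";
    return "unset";
}

void example() {
    std::ostringstream os;
    assert(baseName(os) == "dec");   // a freshly constructed stream uses dec
    os << std::hex;
    assert(baseName(os) == "hex");   // std::hex clears dec and sets hex
    os.setf(std::ios_base::fixed, std::ios_base::floatfield);
    assert(bool(os.flags() & std::ios_base::fixed));
    assert(!bool(os.flags() & std::ios_base::scientific));
}
```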

30.05.2022

C++ Guns: MPI Datatype; send struct

Filed under: Allgemein — Tags: — Thomas @ 10:05

I took the example from https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node425.htm
Every process (called thread in the code) creates an MPI datatype that contains the offset addresses of the struct's member variables.
Process 1 sends data to process 0.
Process 0 receives the data from process 1 together with its own data, so that everything ends up in one array.

// This example is based on https://www.mpi-forum.org/docs/mpi-3.1/mpi31-report/node425.htm
// Every process creates an MPI datatype that contains the offset addresses of the struct's member variables.
// Process 1 sends data to process 0.
// Process 0 receives the data from process 1 together with its own data, so that everything ends up in one array.

#include <array>
#include <cstdlib>
#include <cstring>
#include <iostream>
#include <vector>

#include <mpi.h>

int my_rank;
int nThreads;

struct basetype_t {
  double volstep = 0;
  double volTot = 0;
};

// Inheritance is not allowed
struct type_t  {
  char ID[50];
  int i = 0;
  float x = 0;
  // A container such as std::vector between the data members seems to work, but it is explicitly not allowed in Fortran.
//   std::vector<int> unused;
  bool l = false;
  double d = 0;
  basetype_t base;
};

// Check MPI Error code
void check(int ierr) {
  if (ierr != MPI_SUCCESS) {
    char err_recvbuffer[MPI_MAX_ERROR_STRING];
    int resultlen;
    MPI_Error_string(ierr, err_recvbuffer, &resultlen);
    std::cerr << err_recvbuffer << "\n";
    MPI_Finalize();
  }
}

// create new MPI datatype based on the addresses of the member variables of the type we want to send
MPI_Datatype createMPItyp() {
  type_t foo;
  MPI_Aint base;
  check(MPI_Get_address(&foo, &base));

  // For every member variable that is to be sent, determine its type and address
  const int nMembervarsToSend = 7;
  std::array<MPI_Datatype, nMembervarsToSend> types;
  std::array<int,nMembervarsToSend> blocklen;
  std::array<MPI_Aint, nMembervarsToSend> disp;

  types[0] = MPI_INT;
  blocklen[0] = 1;
  check(MPI_Get_address(&foo.i, &disp[0]));

  types[1] = MPI_FLOAT;
  blocklen[1] = 1;
  check(MPI_Get_address(&foo.x, &disp[1]));

  types[2] = MPI_CXX_BOOL; // C++ bool; MPI_LOGICAL is the Fortran LOGICAL type
  blocklen[2] = 1;
  check(MPI_Get_address(&foo.l, &disp[2]));

  types[3] = MPI_DOUBLE;
  blocklen[3] = 1;
  check(MPI_Get_address(&foo.d, &disp[3]));

  types[4] = MPI_CHAR;
  blocklen[4] = sizeof(foo.ID);
  check(MPI_Get_address(&foo.ID, &disp[4]));

  types[5] = MPI_DOUBLE;
  blocklen[5] = 1;
  check(MPI_Get_address(&foo.base.volstep, &disp[5]));

  types[6] = MPI_DOUBLE;
  blocklen[6] = 1;
  check(MPI_Get_address(&foo.base.volTot, &disp[6]));

  if(my_rank == 0) {
      std::cout << "Base Address " << std::hex << base << "\n";
      std::cout << "Addresses   ";
      for(auto& x : disp) {
        std::cout << " " << std::hex << x;
      }
      std::cout << std::dec << "\n";
  }

  // Convert addresses to offsets
  for(auto& x : disp) {
    x -= base;
  }

  if(my_rank == 0) {
    std::cout << "Displacement";
    for(auto& x : disp) {
      std::cout << " " << x;
    }
    std::cout << "\n";
  }

  MPI_Datatype newMPItype;
  check(MPI_Type_create_struct(nMembervarsToSend, blocklen.data(), disp.data(), types.data(), &newMPItype));
  check(MPI_Type_commit(&newMPItype));

  return newMPItype;
}

void doRank0(MPI_Datatype newMPItype) {
    type_t sendbuffer;
    strcpy(sendbuffer.ID, "Kreis100");
    sendbuffer.i = 10;
    sendbuffer.x = 1.2f;
    sendbuffer.d = 1.23;
    sendbuffer.l = true;
    sendbuffer.base.volstep = 1.34;
    sendbuffer.base.volTot = 1.56;

    int  displacements[nThreads], counts[nThreads];

    std::vector<type_t> recvbuffer(2);
    std::cout << my_rank << " Receiving...\n";

    int root_rank = 0;
    displacements[0] = 0;
    displacements[1] = 1;
    counts[0] = 1;
    counts[1] = 1;
    // MPI_Gatherv(recvbuffer_send,count_send, datatype_send, recvbuffer_recv, counts_recv, displacements, datatype_recv, root,comm)
    check(MPI_Gatherv(&sendbuffer, 1, newMPItype, recvbuffer.data(), counts, displacements, newMPItype, root_rank, MPI_COMM_WORLD));
    std::cout << my_rank << " Done receiving\n";

    std::cout << my_rank << " content of struct:\n";
    for(const type_t& buf : recvbuffer) {
      std::cout << "ID "  << buf.ID << "\n";
      std::cout << "i "  << buf.i << "\n";
      std::cout << "x "  << buf.x << "\n";
      std::cout << "d "  << buf.d << "\n";
      std::cout << "l "  << buf.l << "\n";
      std::cout << "volstep "  << buf.base.volstep << "\n";
      std::cout << "volTot "  << buf.base.volTot << "\n\n";
    }
}

void doRank1(MPI_Datatype newMPItype) {
    type_t sendbuffer;
    int  displacements[nThreads], counts[nThreads];

    strcpy(sendbuffer.ID, "Kreis200");
    sendbuffer.i = 20;
    sendbuffer.x = 2.2;
    sendbuffer.d = 2.23;
    sendbuffer.l = true;
    sendbuffer.base.volstep = 2.34;
    sendbuffer.base.volTot = 2.56;

    std::cout << my_rank << " Sending...\n";
    // MPI_Gatherv(recvbuffer_send,count_send, datatype_send, recvbuffer_recv, counts_recv, displacements, datatype_recv, root,comm)
    int root_rank = 0;
    check(MPI_Gatherv(&sendbuffer, 1, newMPItype, NULL, counts, displacements, newMPItype, root_rank, MPI_COMM_WORLD));
    std::cout << my_rank << " Done sending\n";
}

int main(int argc, char* argv[]) {
    MPI_Init(&argc, &argv);

    // Get number of processes and check only 2 processes are used
    MPI_Comm_size(MPI_COMM_WORLD, &nThreads);
    if(nThreads != 2) {
        std::cout << "Start with 2 threads.\n";
        MPI_Abort(MPI_COMM_WORLD, EXIT_FAILURE);
    }

    MPI_Comm_rank(MPI_COMM_WORLD, &my_rank);
    MPI_Datatype newMPItype = createMPItyp();

    switch(my_rank) {
      case 0: doRank0(newMPItype); break;
      case 1: doRank1(newMPItype); break;
    };

    MPI_Finalize();

    return EXIT_SUCCESS;
}

$ mpic++ -g -ggdb -Wall test_MPI_struct.cpp

$ mpiexec -n 2 ./a.out
1 Sending...
1 Done sending
Base Address 7ffd83f7e180
Addresses 7ffd83f7e1b4 7ffd83f7e1b8 7ffd83f7e1bc 7ffd83f7e1c0 7ffd83f7e180 7ffd83f7e1c8 7ffd83f7e1d0
Displacement 52 56 60 64 0 72 80
0 Receiving...
0 Done receiving
0 content of struct:
ID Kreis100
i 10
x 1.2
d 1.23
l 1
volstep 1.34
volTot 1.56

ID Kreis200
i 20
x 2.2
d 2.23
l 1
volstep 2.34
volTot 2.56
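
The displacements printed above are simply the byte offsets of the members inside the struct. As a cross-check without MPI they can also be computed with offsetof. This is only a sketch: the function memberOffsets is made up here, the structs are copied from the program above, and the concrete numbers depend on the platform ABI, so the checks only test properties that hold everywhere.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <type_traits>

struct basetype_t {
  double volstep = 0;
  double volTot = 0;
};

struct type_t {
  char ID[50];
  int i = 0;
  float x = 0;
  bool l = false;
  double d = 0;
  basetype_t base;
};

// offsetof is only guaranteed to work for standard-layout types
static_assert(std::is_standard_layout<type_t>::value, "type_t must be standard layout");

// Offsets in the same order as the disp array in createMPItyp()
std::array<std::size_t, 7> memberOffsets() {
  return {
    offsetof(type_t, i),
    offsetof(type_t, x),
    offsetof(type_t, l),
    offsetof(type_t, d),
    offsetof(type_t, ID),
    offsetof(type_t, base) + offsetof(basetype_t, volstep),
    offsetof(type_t, base) + offsetof(basetype_t, volTot)
  };
}

void example() {
  const auto disp = memberOffsets();
  assert(disp[4] == 0);                  // ID is the first member
  assert(disp[0] >= sizeof(type_t::ID)); // i comes after the 50 chars plus padding
  assert(disp[0] < disp[1] && disp[1] < disp[2] && disp[2] < disp[3]);
  assert(disp[3] < disp[5] && disp[5] < disp[6]);
  assert(disp[6] + sizeof(double) <= sizeof(type_t));
}
```

On the machine that produced the output above, the same values 52 56 60 64 0 72 80 would be expected, but other compilers or architectures may pad differently.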

14.05.2022

Coffee cup warmer optimization problem

Filed under: Allgemein — Tags: , — Thomas @ 07:05

Nobody likes cold coffee! So a coffee cup warmer is needed! But the stuff you can buy is no good at all! USB 1/2 does not deliver enough power, and USB 3 is an abuse of those thin little cables. On top of that, my laptop has no USB 3 port. A simple hot plate with an ON/OFF switch is perfectly sufficient. Like the one in my coffee machine....

The parts bin still holds a few high-power resistors. A toy transformer that delivers about 16 W is also available. Surely a suitable combination of resistors can be found?! My attempts by hand give a possible but not optimal result: no resistor is overloaded and the total power is okay too, but the resistors are loaded very unevenly. Some stay almost cold, others get very hot.

A computer program shall solve it!

Trying every combination of resistors in parallel and series connection should not be a problem in terms of run time. But since I currently have no idea how to enumerate all the possibilities, I want to describe it as an optimization problem first.

The constraints are that no resistor is overloaded, i.e. that the electrical power, formed from voltage and current, does not exceed the maximum power rating of the resistor.
The fitness function computes the electrical power of all resistors; it is to be maximized.

Since the randomly generated circuit can become quite complex, the simple rules for parallel and series connections are not enough. An electrical network will form, governed by node and loop rules. Kirchhoff's laws ... these are translated into a matrix which can then be solved. I think that is how it was.

My first attempt looks like this:

--+--| 47 |--+----+--| 68 |------------+---
  +--| 47 |--+    |                    |
  +--| 47 |--+    +--| 33 |--+--| 1 |--+
  +--| 47 |--+    +--| 12 |--+

A parallel connection of four 47 Ohm resistors. That gives 47/4=11.75 Ohm. In series with it, a parallel connection of 68 Ohm with a branch consisting of 33 Ohm parallel to 12 Ohm, in series with 1 Ohm. In total this gives (measured) 20.9 Ohm. At a voltage of 16V this results in a current of 16/20.9=0.77A and a power of 16*0.77=12.3W. Whether that is enough to keep the coffee warm remains to be determined ;)

Here is a picture of the flying-wire setup:
kaffeetasse1
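
The first attempt can still be evaluated with the simple series and parallel rules, which also yields the power of every single resistor for the constraint check. The following is only a sketch with ideal resistors; all names are made up, and the power ratings of the real resistors are not known here, so no limit is checked. The ideal total comes out at about 20.3 Ohm, slightly below the measured 20.9 Ohm; the difference may be due to component tolerances and wiring.

```cpp
#include <array>
#include <cassert>
#include <cmath>

double series(double a, double b)   { return a + b; }
double parallel(double a, double b) { return (a * b) / (a + b); }

struct Circuit {
    double resistance;            // total resistance in Ohm
    double current;               // total current in A
    std::array<double, 8> power;  // W per resistor: 4x 47, 68, 33, 12, 1 Ohm
};

// Evaluate the circuit of the first attempt for a given supply voltage.
Circuit firstAttempt(double voltage) {
    const double block47 = 47.0 / 4.0;                       // four equal resistors in parallel
    const double pair    = parallel(33.0, 12.0);
    const double branch  = series(pair, 1.0);                // (33 || 12) + 1
    const double block68 = parallel(68.0, branch);
    const double total   = series(block47, block68);

    const double current = voltage / total;
    const double u47     = current * block47;                // voltage across the 47 Ohm block
    const double u68     = current * block68;                // voltage across the second block
    const double iBranch = u68 / branch;                     // current through the lower branch
    const double uPair   = iBranch * pair;                   // voltage across 33 || 12

    Circuit c{total, current, {}};
    for(int k = 0; k < 4; ++k) c.power[k] = u47 * u47 / 47.0;
    c.power[4] = u68 * u68 / 68.0;
    c.power[5] = uPair * uPair / 33.0;
    c.power[6] = uPair * uPair / 12.0;
    c.power[7] = iBranch * iBranch * 1.0;
    return c;
}

void example() {
    const Circuit c = firstAttempt(16.0);
    assert(std::fabs(c.resistance - 20.3156) < 1e-3);

    // the individual powers must add up to the total power U * I
    double sum = 0;
    for(double p : c.power) sum += p;
    assert(std::fabs(sum - 16.0 * c.current) < 1e-9);

    // the 12 Ohm resistor carries the largest load, the 1 Ohm resistor the smallest
    for(double p : c.power) assert(p <= c.power[6] && p >= c.power[7]);
}
```

The individual powers show the uneven load mentioned above: in this ideal model the 12 Ohm resistor has to dissipate several times the power of the 68 Ohm and 1 Ohm resistors.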
