C++Guns – RoboBlog

06.10.2015

Infiniband Cluster

Filed under: Allgemein — Thomas @ 20:10

[Images: IBcluster5, IBcluster2, IBcluster, staubfilter1]

LD_LIBRARY_PATH=/home/kater/bin/ las2txt64 -parse xyz -odir . -i 47605560_all_thin.las -keep_class 2

GIT

Repair Permissions
The repository isn't configured to be a shared repository (see core.sharedRepository in git help config). If the output of:
git config core.sharedRepository
is not group or true or 1 or some mask, try running:
git config core.sharedRepository group

cd /path/to/repo.git
chgrp -R groupname .
chmod -R g+rwX .
find . -type d -exec chmod g+s '{}' +

Show the Git branch permanently in the prompt, in ~/.bashrc:

parse_git_branch() {
    git branch 2> /dev/null | sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
}
export PS1="\u@\h \[\e[32m\]\w \[\e[91m\]\$(parse_git_branch)\[\e[00m\] $ "
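The sed filter can be tried without a repository. A minimal sketch: it feeds sample `git branch` output through the two sed expressions, with the branch name captured in a group. The function name format_branch and the branch names are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: turn `git branch` output (on stdin) into "(branch)".
format_branch() {
    # 1st expression: drop every line that does not start with "*"
    # 2nd expression: capture the name after "* " and wrap it in parentheses
    sed -e '/^[^*]/d' -e 's/* \(.*\)/(\1)/'
}

# Sample input as printed by `git branch` with "main" checked out.
printf '  develop\n* main\n  testing\n' | format_branch
```

Outside a repository `git branch` prints nothing on stdout, so the prompt simply shows no branch.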

samba SMB

To enable the old SMB version 1 protocol, add vers=1.0 to the mount options.
The user option has a double meaning. On its own, it means that no root privileges are needed to mount. If a username is given with user=, it refers to the account on the server. The users option, with an s at the end, means that no root privileges are needed and that any user can unmount the partition.
uid=
gid=

For multi-user systems the noperm option is helpful for cifs mounts.
Check the write permissions!

192.168.10.49:/perm     /perm           nfs     noauto,nosuid,hard,user         0       0
192.168.10.49:/karten   /karten         nfs     noauto,nosuid,hard,user         0       0

//192.168.10.162/user           /laufwerkW      cifs    noauto,nosuid,hard,users,user=standard,password=,vers=1.0,gid=quad       0       0
//192.168.10.7/data             /laufwerkG      cifs    noauto,nosuid,hard,users,user=standard,password=,vers=1.0,gid=quad       0       0
//192.168.10.162/org            /laufwerkO      cifs    noauto,nosuid,hard,users,user=standard,password=,vers=1.0,gid=quad       0       0
//192.168.10.163/karten         /laufwerkK      cifs    noauto,nosuid,hard,users,user=standard,password=,vers=1.0,gid=quad       0       0
//192.168.10.161/modelldata     /laufwerkJ      cifs    noauto,nosuid,hard,users,user=standard,password=,vers=1.0,gid=quad       0       0

IGM samba SMB

Create a new user with adduser. Change the default group to users
# sudo usermod -g users mhu
Convert the new user in Webmin
Restart the service in Webmin
Forward ports 139 and 445
Test in Konqueror with smb://mhu@localhost/
or on the console with sudo mount -t cifs -o user=mhu //localhost/IGM-2 mnt/

RAID

General status:
$ cat /proc/mdstat
The entries in the [] brackets are per-disk information, e.g. [UU] for 2 disks.
U - Up
S - spare
_ - Down
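The bracket notation can be checked automatically. A minimal sketch, assuming the usual /proc/mdstat layout: it prints every array whose status brackets contain a "_". The function name degraded_arrays and the sample data are made up.

```shell
#!/bin/sh
# Hypothetical helper: read /proc/mdstat-style text on stdin and print the
# name of every array whose status brackets contain a "_" (member down).
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }            # remember the current array
        match($0, /\[[U_]+\]/) {               # status field such as [UU] or [U_]
            if (index(substr($0, RSTART, RLENGTH), "_") > 0) print name
        }
    '
}

# Made-up sample; on a real machine use: degraded_arrays < /proc/mdstat
degraded_arrays <<'EOF'
Personalities : [raid1]
md0 : active raid1 sdb1[1] sda1[0]
      976630336 blocks super 1.2 [2/2] [UU]

md1 : active raid1 sdd1[1]
      976630336 blocks super 1.2 [2/1] [_U]
EOF
```

Empty output means no array reports a missing member.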

Information about a single RAID
$ mdadm -D /dev/md0

Always worth a try
$ mdadm --assemble --scan -v

Examine each partition individually
$ mdadm -E /dev/sda1
$ mdadm -E /dev/sdb1

Setting up a RAID:
https://www.thomas-krenn.com/de/wiki/Linux_Software_RAID

Create a partition of type Linux RAID on each disk
Create the RAID with mdadm
mdadm --create /dev/md0 --level=0 --raid-devices=3 /dev/sdb1 /dev/sdc1 /dev/sdd1

Create the file system
mkfs.ext4 /dev/md0

Find the UUID of a disk
root@login02:~# blkid /dev/sda
/dev/sda: UUID="7578c0b0-8fbe-44ba-93f4-cd55fe64e220" BLOCK_SIZE="4096" TYPE="ext4"

BACKUP and EMAIL notify

Backup Script:
rsync --numeric-ids --stats -hPave ssh /IGM2 192.168.://share/CACHEDEV1_DATA/data > `date "+%y%m%d"_copyprotocol`
echo -e "Subject: Backup PS2NAS\n\n" > tmp; cat `date "+%y%m%d"_copyprotocol` >> tmp; cat tmp | msmtp mail@me
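The mail can also be assembled without the tmp file. A minimal sketch of that idea using printf instead of echo -e; the function name build_mail and the demo log content are made up, the subject is the one used above.

```shell
#!/bin/sh
# Hypothetical helper: print a minimal mail (subject header, empty line, body).
# $1 = subject, $2 = file whose content becomes the body.
build_mail() {
    printf 'Subject: %s\n\n' "$1"
    cat "$2"
}

# Demo with a throwaway log file; the real log would be the rsync protocol.
log=$(mktemp)
echo "Number of files: 3" > "$log"
build_mail "Backup PS2NAS" "$log"     # pipe this into: msmtp mail@me
rm -f "$log"
```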

On this error:
msmtp: TLS certificate verification failed: the certificate fingerprint does not match
msmtp: could not send mail (account default from /root/.msmtprc)

Do this:
msmtp --serverinfo --tls --tls-certcheck=off --host=smtp.1und1.de --port 587

And you get the new fingerprint, which goes into /root/.msmtprc as tls_fingerprint:
SHA256: 6F:5C:80:76:FB:8A:82:02:E9:23:AB:4A:68:F4:E2:6C:B6:29:72:70:82:5D:02:F8:F5:A5:AB:AC:67:D3:72:61

Logical Volume Management

$ vgdisplay
Alloc PE / Size 29440 / 115.00 GiB
Free PE / Size 207491 / 810.51 GiB

$ lvextend -L910.51G /dev/mapper/vg00-home
Rounding size to boundary between physical extents: 910.51 GiB
Size of logical volume vg00/home changed from 100.00 GiB (25600 extents) to 910.51 GiB (233091 extents).
Logical volume home successfully resized.

$ resize2fs /dev/mapper/vg00-home
Filesystem at /dev/mapper/vg00-home is mounted on /home; on-line resizing required
old_desc_blocks = 7, new_desc_blocks = 57
The filesystem on /dev/mapper/vg00-home is now 238685184 (4k) blocks long.

$ df -h
/dev/mapper/vg00-home 897G 560M 860G 1% /home
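The target size 910.51G follows from the numbers above: the home volume had 25600 extents (100 GiB) and the volume group had 207491 free extents. A small sketch of that arithmetic, assuming the 4 MiB extent size implied by 29440 PE = 115.00 GiB; the function name extents_to_gib is made up.

```shell
#!/bin/sh
# Hypothetical helper: convert a number of physical extents to GiB.
# $1 = number of extents, $2 = extent size in MiB
extents_to_gib() {
    awk -v n="$1" -v mib="$2" 'BEGIN { printf "%.2f\n", n * mib / 1024 }'
}

# Old size of the home volume plus all free extents of the volume group.
new_extents=$((25600 + 207491))
echo "$new_extents extents = $(extents_to_gib "$new_extents" 4) GiB"
```

lvextend also accepts -l +100%FREE, which avoids computing the size by hand.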

Software

Put
apt::install-recommends "false";
in /etc/apt/apt.conf after installing the base system to keep unnecessary recommended packages off a server.
Check with "aptitude why " and apt-config dump | fgrep -i recommend


Don't install these

kde-runtime

Don't uninstall

# IB
libmlx4-1 libmthca1

# obviously
meld

# for deptherror script
python-matplotlib
python-numpy
python-tk

Check for

*kde*
*gnome*
*gtk*
*vlc*
*wx*
libqt4*-dev
libqt4* keep for all quads
libqt5*


# checkinstall: builds a Debian package (.deb) from source

Desktop machines
apt-get install konsole kate libreoffice kcalc konqueror cifs-utils nfs-common lm-sensors im net-tools

Cluster
apt-get install flex build-essential libhwloc5 bc libnuma1 ganglia-monitor apt-file ipmitool libconfuse-dev

/etc/modules
ipmi_devintf

# timeserver
apt-get install ntpdate
$ cat /etc/cron.daily/ntp
#!/bin/sh
ntpdate time.fu-berlin.de

chmod +x /etc/cron.daily/ntp


# stuff you probably want to uninstall
mlocale collectd*

Cluster: general

Copying results (needs update)
$ rsync -aPz ergzus.bin ergqh.bin ergqhmax.bin *.nml *.txt fort.4242 abenheim3979.* hydstructs thomas@.ddns.net:/hastenichtgesehn/

Debugging MPI

Starts an xterm window for each MPI process.
-ex "run" starts the program in the debugger right away


mpiexec  -n 16 xterm -e gdb -ex "run" --args  /home/kater/bin/unrunoff

Infiniband

For the new cards in quad21-26
81:00.0 Infiniband controller: Mellanox Technologies MT27600 [Connect-IB]

Update the card's firmware, see https://www.thomas-krenn.com/de/wikiDE/index.php?title=Mellanox_Firmware_Tools_-_Firmware_Upgrade_unter_Linux&xtxsearchselecthit=1

Download, unpack and install mft-4.23.0-104-x86_64-deb.tgz. This requires the following packages:
apt-get install gcc make dkms linux-headers-5.10.0-19-amd64 libucx0 libgfortran5 libevent-pthreads-2.1-7 libevent-core-2.1-7 libhwloc15 htop

Determine the firmware version with mlxfwmanager
Download the new firmware:
mlxfwmanager --download-os linux_x64 --download-type self_extractor -y
cd linux_x64/
linux_x64# ./mlxup
Versions: Current Available
FW 10.10.4020 10.16.1200
reboot
root@quad21:~# ibstatus
Infiniband device 'mlx5_0' port 1 status:
default gid: fe80:0000:0000:0000:e41d:2d03:000c:0c20
base lid: 0x4
sm lid: 0x1
state: 4: ACTIVE
phys state: 5: LinkUp
rate: 40 Gb/sec (4X FDR10)
link_layer: InfiniBand

====================================

https://wiki.debian.org/RDMA
Install rdma-core

Mellanox Technologies MT26428 Installation

apt-get install ibutils infiniband-diags perftest libmlx4-1 libmthca1 libipathverbs1 rdmacm-utils cpufrequtils ibverbs-utils firmware-linux-free
### newer? older? package: ibverbs-providers

modprobe ib_umad ib_mthca mlx4_ib ib_uverbs ib_ipoib rdma_ucm mlx4_core (load the modules one at a time; plain modprobe takes a single module name and reads the rest as module parameters)

/etc/modules
ib_umad
ib_mthca
mlx4_ib
ib_uverbs
ib_ipoib
rdma_ucm 
mlx4_core

(No, we don't need this)
Users who want to run MPI jobs need write permissions for the following devices:

/dev/infiniband/uverbs*
/dev/infiniband/rdma_cm*

The simplest way to do this is to add the users to the rdma group:
usermod -a -G rdma aron

OpenMPI will need to pin memory. Edit /etc/security/limits.conf and add the line:

* hard memlock unlimited 
* soft memlock unlimited 

# to ping: start the server on one node, then ping it from another node using the port GUID from ibstat
ibping -S
ibping -G fe80::2:c903:10:5fa4
Pong from quad02.(none) (Lid 5): time 0.602 ms


Programs:
$ ibdiagnet
$ iblinkinfo
$ ibnetdiscover - to get the LID
$ ibportstate - to set the link speed
Usage: ibportstate <lid> <port> speed <speed>
speed is the speed of the port: 1 for 2.5 Gbit/s, 2 for 5.0 Gbit/s, and 4 for 10.0 Gbit/s.
ibportstate 14 1 speed 7
The "enable" operation helps to activate the new link speed.
$ perfquery
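
The speed value given to ibportstate is a bit mask, so the 7 in the example enables all three speeds (1+2+4). A minimal sketch that decodes such a mask, covering only the three values listed here (per-lane rates in Gbit/s); the function name decode_speed_mask is made up.

```shell
#!/bin/sh
# Hypothetical helper: list the per-lane speeds (Gbit/s) enabled by a mask.
decode_speed_mask() {
    mask=$1
    out=""
    if [ $((mask & 1)) -ne 0 ]; then out="$out 2.5"; fi
    if [ $((mask & 2)) -ne 0 ]; then out="$out 5.0"; fi
    if [ $((mask & 4)) -ne 0 ]; then out="$out 10.0"; fi
    echo "${out# }"        # strip the leading space
}

decode_speed_mask 7     # as in: ibportstate 14 1 speed 7
```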


Ganglia

IB plugin https://github.com/ULHPC/ganglia_infiniband_module
So that ganglia can run perfquery without sudo
$ chmod u+s /usr/sbin/perfquery

Needs a symlink for libconfuse.so.0
thomas@quad11:/usr/lib/x86_64-linux-gnu$ ll libconfuse.*
-rw-r--r-- 1 root root 51K Aug 15 2018 libconfuse.so.1.0.0
lrwxrwxrwx 1 root root 19 Aug 15 2018 libconfuse.so -> libconfuse.so.1.0.0
lrwxrwxrwx 1 root root 13 Jul 24 09:11 libconfuse.so.0 -> libconfuse.so

/etc/ganglia/gmond.conf

modules {
   module {
     name = "ib_module"
     language = "C/C++"
     path = "/root/modInfiniband.so"
  }
}

collection_group {
  collect_every = 20
  time_threshold = 60
  metric {
    name = "ib_bytes_out"
    value_threshold = 1000000000000
    title = "Bytes Sent(infiniband)"
  }
  metric {
    name = "ib_bytes_in"
    value_threshold = 1000000000000
    title = "Bytes Received(infiniband)"
  }
  metric {
    name = "ib_pkts_in"
    value_threshold = 100000000000
    title = "Packets Received(infiniband)"
  }
  metric {
    name = "ib_pkts_out"
    value_threshold = 100000000000
    title = "Packets Sent(infiniband)"
  }
}

crontab -e
* * * * * /home/thomas/bin/gmetric-cpu-temp.sh
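
The script gmetric-cpu-temp.sh itself is not shown in these notes. What follows is only a guess at how such a script could look, not the one used here: it reads a temperature from sysfs and publishes it with gmetric. The sensor path, the metric name cpu_temp and the function name are all assumptions.

```shell
#!/bin/sh
# Convert millidegrees Celsius (as found in sysfs) to degrees with one decimal.
millideg_to_c() {
    awk -v m="$1" 'BEGIN { printf "%.1f\n", m / 1000 }'
}

# Assumed sensor path; it differs between machines.
zone=/sys/class/thermal/thermal_zone0/temp

# Only publish when the sensor is readable and gmetric is installed.
if [ -r "$zone" ] && command -v gmetric > /dev/null 2>&1; then
    gmetric --name cpu_temp --value "$(millideg_to_c "$(cat "$zone")")" \
            --type float --units Celsius
fi
```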

Trouble
=======

[quad01:05373] Error: unknown option "--hnp-topo-sig"
Use full path for mpiexec and inka_kopplung

latency and bandwidth test
==========================
# make sure every CPU core runs at the same frequency
for ((i=0; i<32;i++)); do cpufreq-set -g performance -c $i; done
# On one PC
ib_send_lat -a
# On other PC
ib_send_lat 192.168.20.112 -a
# Same for ib_send_bw
ib_send_lat -a 192.168.20.112 
---------------------------------------------------------------------------------------
                    Send Latency Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 1
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 236[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x07 QPN 0x0145 PSN 0x40bbdc
 remote address: LID 0x08 QPN 0x0127 PSN 0x625da0
---------------------------------------------------------------------------------------
 #bytes #iterations    t_min[usec]    t_max[usec]  t_typical[usec]
 2       1000          1.32           29.75        1.98   
 4       1000          1.24           15.99        1.54   
 8       1000          1.24           8.66         1.42   
 16      1000          1.19           11.14        1.40   
 32      1000          1.24           12.68        1.58   
 64      1000          1.37           13.84        1.66   
 128     1000          1.53           14.81        1.83   
 256     1000          3.08           33.22        5.29   
 512     1000          3.25           19.88        4.01   
 1024    1000          3.73           14.94        4.12   
 2048    1000          5.21           14.80        5.57   
 4096    1000          8.23           22.35        8.64   
 8192    1000          10.94          32.77        11.47  
 16384   1000          16.45          49.51        17.58  
 32768   1000          27.41          74.35        29.15  
 65536   1000          50.14          147.59       51.21  
 131072  1000          94.49          268.19       96.09  
 262144  1000          183.19         472.53       184.69 
 524288  1000          360.21         832.79       365.68 
 1048576 1000          714.91         1399.75      721.41 
 2097152 1000          1422.95        2865.38      1437.44
 4194304 1000          2840.72        4905.65      2907.37
 8388608 1000          5807.65        10473.58      5936.33
---------------------------------------------------------------------------------------

ib_send_bw 192.168.20.112  -a
---------------------------------------------------------------------------------------
                    Send BW Test
 Dual-port       : OFF          Device         : mlx4_0
 Number of qps   : 1            Transport type : IB
 Connection type : RC           Using SRQ      : OFF
 TX depth        : 128
 CQ Moderation   : 100
 Mtu             : 4096[B]
 Link type       : IB
 Max inline data : 0[B]
 rdma_cm QPs     : OFF
 Data ex. method : Ethernet
---------------------------------------------------------------------------------------
 local address: LID 0x07 QPN 0x0148 PSN 0xd4c92e
 remote address: LID 0x08 QPN 0x012a PSN 0x51f56e
---------------------------------------------------------------------------------------
 #bytes     #iterations    BW peak[MB/sec]    BW average[MB/sec]   MsgRate[Mpps]
 2          1000             7.64               5.88               3.080942
 4          1000             7.48               6.75               1.769229
 8          1000             48.74              42.44              5.562193
 16         1000             42.13              39.38              2.580683
 32         1000             170.37             140.12             4.591350
 64         1000             248.02             185.61             3.041025
 128        1000             92.72              55.54              0.454995
 256        1000             500.02             435.57             1.784099
 512        1000             1109.73            1050.87            2.152184
 1024       1000             1237.52            1217.68            1.246901
 2048       1000             1267.55            1251.44            0.640736
 4096       1000             1342.55            1336.95            0.342258
 8192       1000             1371.66            1354.30            0.173351
 16384      1000             1299.78            1183.89            0.075769
 32768      1000             1311.13            1302.48            0.041679
 65536      1000             1361.31            1361.30            0.021781
 131072     1000             1387.05            1281.65            0.010253
 262144     1000             1354.87            1291.06            0.005164
 524288     1000             1311.00            1297.36            0.002595
 1048576    1000             1344.04            1312.48            0.001312
 2097152    1000             1320.18            1312.31            0.000656
 4194304    1000             1349.24            1328.86            0.000332
 8388608    1000             1281.43            1281.43            0.000160
---------------------------------------------------------------------------------------
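
To pick the best row out of such a table without reading it by eye, a small awk filter is enough. A minimal sketch, assuming the five-column layout shown above; the function name best_average_bw is made up.

```shell
#!/bin/sh
# Hypothetical helper: read ib_send_bw output on stdin and print the message
# size with the highest average bandwidth, followed by that bandwidth.
best_average_bw() {
    awk '
        # data rows: five columns, the first one a plain number
        NF == 5 && $1 ~ /^[0-9]+$/ {
            if ($4 + 0 > best) { best = $4 + 0; size = $1; bw = $4 }
        }
        END { if (size != "") print size, bw }
    '
}

# Three rows copied from the table above.
best_average_bw <<'EOF'
 8192       1000             1371.66            1354.30            0.173351
 65536      1000             1361.31            1361.30            0.021781
 8388608    1000             1281.43            1281.43            0.000160
EOF
```

On a live run the output of ib_send_bw can be piped straight into the function.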

FTP

Very Secure FTP Daemon (vsftpd)

chroot_local_user=YES
seccomp_sandbox=no
userlist_deny=NO
userlist_enable=YES
userlist_file=/etc/vsftpd.user_list
check_shell=NO
local_root=/srv/ftp/

Config files

# Disable IPv6
$ cat /etc/sysctl.d/01-disable-ipv6.conf
net.ipv6.conf.all.disable_ipv6 = 1

# sshd_config
PermitRootLogin no
StrictModes no
PubkeyAuthentication yes
PermitEmptyPasswords no
X11Forwarding yes

User Stuff

Make sure the user can log in automatically from every quad to every quad, because orted logs in from host to host in a tree-like fashion.
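
A minimal sketch for checking the automatic login from one node: it tries a non-interactive ssh login on every host and reports failures. The function name check_logins and the SSH_CMD override are made up; BatchMode and ConnectTimeout are regular ssh options.

```shell
#!/bin/sh
# Hypothetical helper: try a non-interactive login on every host given as
# argument and report the ones that fail. SSH_CMD can be overridden for a
# dry run; by default it is ssh with password prompts disabled.
SSH_CMD=${SSH_CMD:-ssh -o BatchMode=yes -o ConnectTimeout=5}

check_logins() {
    failed=0
    for host in "$@"; do
        # $SSH_CMD is deliberately unquoted so its options are split into words
        if $SSH_CMD "$host" true > /dev/null 2>&1; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}

# Example (host names as used in these notes):
#   check_logins quad01 quad02 quad11
```

To cover every quad to every quad, the check has to be run once on each node.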


adduser --uid 1001 thomas
addgroup --gid 998 quad
# Change default group of user
usermod -g quad aron
usermod -g quad thomas


## ~/.bashrc
# stupid systemd idiots
# No, this has absolutely no effect! Only screen helps.
#loginctl enable-linger $USER

# Change the default permissions so that new files and directories are group-writable
umask 0002


screen script that always attaches to the already open session

#!/bin/sh

if [ $# -eq 0 ]; then
    # no arguments supplied: attach to a running session, otherwise start a new one
    if /usr/bin/screen -x; then
        echo ""
    else
        /usr/bin/screen
    fi
else
    exec /usr/bin/screen "$@"
fi


# net quad00
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE

# mpich debug
./configure --prefix=/home6/datahome/aroland/opt/mpich-3.2.1_gcc-7.2.0_dbg/ --enable-fortran=all --enable-g=all --enable-fast=O0 --enable-error-checking=all --enable-error-messages=all --enable-debuginfo

# netcdf hdf5
FC=/home/zanke/opt/openmpi_gfortran/bin/mpif90 CC=/home/zanke/opt/openmpi_gfortran/bin/mpicc ./configure --prefix=/home/zanke/opt/hdf5-1.8.19_gfortran6.3.0/ --enable-fortran --enable-fortran2003 --enable-parallel
make
make check

# netcdf
FC=/home/zanke/opt/openmpi_gfortran/bin/mpif90 CC=/home/zanke/opt/openmpi_gfortran/bin/mpicc CPPFLAGS=-I/home/zanke/opt/hdf5-1.8.19_gfortran6.3.0/include LDFLAGS=-L/home/zanke/opt/hdf5-1.8.19_gfortran6.3.0/lib ./configure --prefix=/home/zanke/opt/netcdf-4.4.1_gfortran6.3.0/

# netcdff
FC=/home/zanke/opt/openmpi_gfortran/bin/mpif90 CC=/home/zanke/opt/openmpi_gfortran/bin/mpicc CPPFLAGS=-I/home/zanke/opt/netcdf-4.4.1_gfortran6.3.0/include LDFLAGS=-L/home/zanke/opt/netcdf-4.4.1_gfortran6.3.0/lib ./configure --prefix=/home/zanke/opt/netcdf-4.4.1_gfortran6.3.0/

S.M.A.R.T

Important values for non-SSD disks that, IMO, need long-term monitoring with e.g. Ganglia:
188 Command_Timeout
197 Current_Pending_Sector
198 Offline_Uncorrectable
5 Reallocated_Sector_Ct
187 Reported_Uncorrect
10 Spin_Retry_Count
199 UDMA_CRC_Error_Count

# smartctl -a /dev/sda | egrep "Command_Timeout|Current_Pending_Sector|Offline_Uncorrectable|Reallocated_Sector_Ct|Reported_Uncorrect|Spin_Retry_Count|UDMA_CRC_Error_Count"
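
For monitoring, only the attributes with a non-zero raw value are interesting. A minimal sketch that filters them out of smartctl output, assuming the usual ten-column attribute table with the raw value in the last column; the function name smart_warnings is made up.

```shell
#!/bin/sh
# Hypothetical helper: read `smartctl -A` style output on stdin and print the
# watched attributes whose raw value (10th column) is not zero.
smart_warnings() {
    awk '
        $2 ~ /^(Command_Timeout|Current_Pending_Sector|Offline_Uncorrectable|Reallocated_Sector_Ct|Reported_Uncorrect|Spin_Retry_Count|UDMA_CRC_Error_Count)$/ && $10 + 0 > 0 {
            print $2, $10
        }
    '
}

# Two attribute lines in smartctl format (sample values).
smart_warnings <<'EOF'
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       1
EOF
```

On a real disk: smartctl -A /dev/sda | smart_warnings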

29.10.2019

sda / /home

=== START OF INFORMATION SECTION ===
Device Model:     ST8000AS0002-1NA17Z
Serial Number:    Z840AG52
LU WWN Device Id: 5 000c50 08734ecdb
Firmware Version: AR15
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5980 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Oct 29 06:59:44 2019 CET

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   114   099   006    Pre-fail  Always       -       72037664
  3 Spin_Up_Time            0x0003   091   090   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       43
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   084   060   030    Pre-fail  Always       -       50041550840
  9 Power_On_Hours          0x0032   062   062   000    Old_age   Always       -       33792
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       43
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   091   091   000    Old_age   Always       -       9
190 Airflow_Temperature_Cel 0x0022   072   050   045    Old_age   Always       -       28 (Min/Max 28/36)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       1180
193 Load_Cycle_Count        0x0032   095   095   000    Old_age   Always       -       11969
194 Temperature_Celsius     0x0022   028   050   000    Old_age   Always       -       28 (0 17 0 0 0)
195 Hardware_ECC_Recovered  0x001a   114   099   000    Old_age   Always       -       72037664
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       31769 (216 109 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       43960187200
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       35616462371

SMART Error Log Version: 1
No Errors Logged

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Extended offline    Completed without error       00%     25167         -
# 2  Extended offline    Completed without error       00%     22711         -

sdb /perm

=== START OF INFORMATION SECTION ===
Device Model:     ST8000AS0002-1NA17Z
Serial Number:    Z840AMJ5
LU WWN Device Id: 5 000c50 08744d481
Firmware Version: AR15
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5980 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Oct 29 07:00:52 2019 CET

SMART Attributes Data Structure revision number: 10
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   118   099   006    Pre-fail  Always       -       187522464
  3 Spin_Up_Time            0x0003   091   090   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       43
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   087   060   030    Pre-fail  Always       -       4888765593
  9 Power_On_Hours          0x0032   062   062   000    Old_age   Always       -       33791
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       43
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   099   000    Old_age   Always       -       1
189 High_Fly_Writes         0x003a   100   100   000    Old_age   Always       -       0
190 Airflow_Temperature_Cel 0x0022   071   049   045    Old_age   Always       -       29 (Min/Max 28/38)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       289
193 Load_Cycle_Count        0x0032   088   088   000    Old_age   Always       -       25492
194 Temperature_Celsius     0x0022   029   051   000    Old_age   Always       -       29 (0 18 0 0 0)
195 Hardware_ECC_Recovered  0x001a   118   099   000    Old_age   Always       -       187522464
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       22376 (153 118 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       96619086856
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       63050316253

sdc /karten

=== START OF INFORMATION SECTION ===
Device Model:     ST8000AS0002-1NA17Z
Serial Number:    Z840AF12
LU WWN Device Id: 5 000c50 08735770e
Firmware Version: AR15
User Capacity:    8,001,563,222,016 bytes [8.00 TB]
Sector Sizes:     512 bytes logical, 4096 bytes physical
Rotation Rate:    5980 rpm
Device is:        Not in smartctl database [for details use: -P showall]
ATA Version is:   ACS-2, ACS-3 T13/2161-D revision 3b
SATA Version is:  SATA 3.1, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is:    Tue Oct 29 07:01:14 2019 CET

ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x000f   119   099   006    Pre-fail  Always       -       229844224
  3 Spin_Up_Time            0x0003   091   090   000    Pre-fail  Always       -       0
  4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always       -       43
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x000f   079   060   030    Pre-fail  Always       -       86831809
  9 Power_On_Hours          0x0032   062   062   000    Old_age   Always       -       33791
 10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always       -       43
183 Runtime_Bad_Block       0x0032   100   100   000    Old_age   Always       -       0
184 End-to-End_Error        0x0032   100   100   099    Old_age   Always       -       0
187 Reported_Uncorrect      0x0032   100   100   000    Old_age   Always       -       0
188 Command_Timeout         0x0032   100   100   000    Old_age   Always       -       0
189 High_Fly_Writes         0x003a   084   084   000    Old_age   Always       -       16
190 Airflow_Temperature_Cel 0x0022   072   055   045    Old_age   Always       -       28 (Min/Max 27/36)
191 G-Sense_Error_Rate      0x0032   100   100   000    Old_age   Always       -       0
192 Power-Off_Retract_Count 0x0032   100   100   000    Old_age   Always       -       28
193 Load_Cycle_Count        0x0032   084   084   000    Old_age   Always       -       33889
194 Temperature_Celsius     0x0022   028   045   000    Old_age   Always       -       28 (0 17 0 0 0)
195 Hardware_ECC_Recovered  0x001a   119   099   000    Old_age   Always       -       229844224
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always       -       0
240 Head_Flying_Hours       0x0000   100   253   000    Old_age   Offline      -       18326 (15 246 0)
241 Total_LBAs_Written      0x0000   100   253   000    Old_age   Offline      -       21871056144
242 Total_LBAs_Read         0x0000   100   253   000    Old_age   Offline      -       3261163264

SWAP

fallocate -l 100G /home/swapfile
mkswap /home/swapfile
swapon /home/swapfile

debian/devuan vs BSD

My personal impression of where some of the differences are.

* On D/D, log files are deleted regularly. Monthly? After a reboot? On BSD the default is that log files are still there even after years. That is good for spotting long-term changes to the system, e.g. from hardware (bus errors) or malware.

Testing hard disks

To test hard disks I use the destructive test of the badblocks program. The disk is completely overwritten four times. I save the disk's SMART values before testing with badblocks so that I can compare them afterwards.
Since badblocks prints the block number of every bad block to the console, and with a lot of output the progress indicator is no longer visible, this output can be redirected to a file with the -o option.

badblocks -swv -o badblocks.txt /dev/sdb

Test with random data, 131072 * 4096 bytes = 512 MiB at a time
badblocks -svw -b 4096 -o badblocks.txt -c 131072 -t random /dev/sdb
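
The SMART comparison before and after the test can be sketched as follows. This is a minimal example, assuming both files hold the usual ten-column attribute table from smartctl; the function name smart_raw_changes and the file names are made up.

```shell
#!/bin/sh
# Hypothetical helper: compare two saved `smartctl -A` outputs and print every
# attribute whose raw value (10th column) differs. $1 = before, $2 = after.
smart_raw_changes() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            if (FNR == NR) {
                before[$2] = $10                      # first file: remember
            } else if (($2 in before) && before[$2] != $10) {
                print $2 ": " before[$2] " -> " $10   # second file: compare
            }
        }
    ' "$1" "$2"
}

# Typical use around a badblocks run (file names are made up):
#   smartctl -A /dev/sdb > smart_before.txt
#   badblocks -swv -o badblocks.txt /dev/sdb
#   smartctl -A /dev/sdb > smart_after.txt
#   smart_raw_changes smart_before.txt smart_after.txt
```

Counters such as Power_On_Hours will always show up as changed; the interesting ones are the sector and error counts.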
