ODE SpaceCloud® OS short instructions

This section gives short instructions for SpaceCloud® OS on ODE/iX5-100 and other Qseven-based Unibap compute products.

For detailed information, see the ODE User’s Guide. This page collects frequently asked questions and short instructions for common tasks.

Default login information

The default login information for the SpaceCloud® OS upgrade disk image is:

username: unibap

password: unibap


Keyboard settings

The default keyboard layout is US. To change it to a layout of your choice, run:

$ sudo dpkg-reconfigure keyboard-configuration


Network configuration

SpaceCloud® OS uses netplan for network configuration.

The network configuration can be changed by editing this file:

$ sudo nano /etc/netplan/00-installer-config.yaml

Example (accounting for an added PCIe device in the x4 slot):

# This is the network config written by 'subiquity'
network:
  renderer: networkd
  ethernets:
    enp1s0:
      dhcp4: true
      optional: true
      dhcp6: false
    enp2s0:
      dhcp4: true
      optional: true
      dhcp6: false
    enp3s0:
      dhcp4: true
      optional: true
      dhcp6: false
  version: 2

Power management of the GPU

SpaceCloud® OS supports dynamic power management (DPM) of the AMD GPU.

The user can use the provided scripts or change the settings manually.

The default DPM setting for the GPU is “auto”; “low” and “high” are also available.

The setting can be changed with the provided scripts:

$ gpu_high.sh

$ gpu_low.sh

$ gpu_auto.sh

or directly through sysfs (root privileges are required):

$ echo 'high' | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level
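The sysfs write above can also be scripted. The following is a minimal Python sketch, assuming the sysfs path shown above and the three levels named in this section. The helper names get_dpm_level and set_dpm_level are illustrative and not part of SpaceCloud® OS. Writing to sysfs normally requires root.

```python
from pathlib import Path

# Path taken from the text above; card0 is assumed to be the AMD GPU.
DPM_PATH = Path("/sys/class/drm/card0/device/power_dpm_force_performance_level")

# Levels named in this section.
VALID_LEVELS = ("auto", "low", "high")


def get_dpm_level(path=DPM_PATH):
    """Return the current DPM level as a string, e.g. 'auto'."""
    return Path(path).read_text().strip()


def set_dpm_level(level, path=DPM_PATH):
    """Write a DPM level to sysfs and return the value read back."""
    if level not in VALID_LEVELS:
        raise ValueError(f"unsupported DPM level: {level!r}")
    Path(path).write_text(level + "\n")
    return get_dpm_level(path)
```

Called as set_dpm_level("high") before a benchmark and set_dpm_level("auto") afterwards, this restores the default once the measurement is done.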

The GPU performance is affected by the power saving settings. The default setting in the image is to turn off screen saver and switch off display. These settings can be changed by the user if desired.

Monitoring the GPU load

The user can monitor the GPU load using the tool radeontop.

$ radeontop


OpenCL overrides

SpaceCloud® OS provides the common Mesa environment variables for overriding the reported OpenCL device version and CLC version. They can be tested with these commands:

$ CLOVER_DEVICE_VERSION_OVERRIDE=1.1 CLOVER_DEVICE_CLC_VERSION_OVERRIDE=1.1 clinfo

$ CLOVER_DEVICE_VERSION_OVERRIDE=1.2 CLOVER_DEVICE_CLC_VERSION_OVERRIDE=1.2 clinfo
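The same overrides can be applied from a script by passing them in the environment of the child process. This is a minimal Python sketch, assuming clinfo is on the PATH. The function names clover_env and run_clinfo are illustrative only.

```python
import os
import subprocess


def clover_env(version, base_env=None):
    """Return a copy of the environment with both Clover overrides set."""
    env = dict(os.environ if base_env is None else base_env)
    env["CLOVER_DEVICE_VERSION_OVERRIDE"] = version
    env["CLOVER_DEVICE_CLC_VERSION_OVERRIDE"] = version
    return env


def run_clinfo(version):
    """Run clinfo with the overrides applied and return its text output."""
    result = subprocess.run(
        ["clinfo"],
        env=clover_env(version),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,  # text mode; works on Python 3.6
        check=True,
    )
    return result.stdout
```

Comparing run_clinfo("1.1") with run_clinfo("1.2") shows which reported fields change with the override.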


Test machine learning with TensorFlow Lite using GPU

In addition to the complete TensorFlow v2.2.0 (optimized CPU version), Unibap provides support for CPU and GPU accelerated TensorFlow Lite.

Go to the test directory, which also includes example source code. In this case it is cpp/tflite_gpu.cpp, which includes a warm-up run for the GPU caches and a benchmark using TensorFlow Lite.

$ cd /opt/spacecloud/extras/ml/performancetests

Make sure the GPU is set to high speed:

$ gpu_high.sh

$ make clean && make -j4

$ ./tflite      # CPU version

$ ./tflite_gpu  # GPU version

INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
110239 us

Note that the GPU and CPU are fairly equal in performance in this case. The CPU and GPU can be run simultaneously; this yields inference times of ~160 ms for the CPU and ~110 ms for the GPU, with an edge to the GPU. This corresponds to ~6.3 fps and ~9.1 fps respectively, or a combined ~15 fps if run in parallel. Note that this is using 32-bit precision.
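The frame rates quoted above follow directly from the inference times. The short Python sketch below reproduces the arithmetic, assuming the CPU and GPU run in parallel without slowing each other down.

```python
def fps_from_ms(latency_ms):
    """Frames per second for one inference taking latency_ms milliseconds."""
    return 1000.0 / latency_ms


cpu_fps = fps_from_ms(160)        # 6.25, quoted as ~6.3 in the text
gpu_fps = fps_from_ms(110)        # about 9.09
combined_fps = cpu_fps + gpu_fps  # about 15.3 when both run in parallel

print(f"CPU {cpu_fps:.2f} fps, GPU {gpu_fps:.2f} fps, combined {combined_fps:.2f} fps")
```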

Test full TensorFlow

$ python3 -c 'import tensorflow as tf; print(tf.__version__)'

2.2.0

$ python3
Python 3.6.9 (default, Apr 18 2020, 01:56:04)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
2020-06-23 06:58:30.434656: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 1197805000 Hz
2020-06-23 06:58:30.435507: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4fd2a80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-23 06:58:30.435585: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-06-23 06:58:30.435932: I tensorflow/core/common_runtime/process_util.cc:147] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
3


clFFT

A demo of GPU accelerated FFT processing is also included in the performancetest library. This example processes a 4096 x 4096 two dimensional buffer (16 Mpixel) on the GPU.

Go to the test directory, which also includes example source code, in this case cpp/fft.cpp:

$ cd /opt/spacecloud/extras/ml/performancetests

Make sure the GPU is set to high speed:

$ gpu_high.sh

$ make clean && make -j4

$ ./fft

Execute the plan on GPU - Optionally Monitor with radeontop to see the GPU load.
Elapsed time: 0,30 seconds.

The clFFT library is capable of processing the 16 Mpixel buffer on the GPU in about 0.3 seconds in this case.


PyTorch

A test of the installed PyTorch is also included for reference in the performancetest package.

Go to the test directory, which also includes example source code, in this case py/torchtest.py:

$ cd /opt/spacecloud/extras/ml/performancetests

$ python py/torchtest.py


Memory protection EDAC

The user can dynamically switch the memory protection on and off through the standard Linux sysfs interfaces provided by the AMD EDAC driver.

The default behavior is to enable PCI parity and background RAM memory ECC scrubbing. However, these settings can be changed at runtime.

To change the background ECC scrubbing rate, use this command with bytes / second as input (root privileges are required). The driver will automatically round the value to the closest possible value.

$ echo 97650 | sudo tee /sys/devices/system/edac/mc/mc0/sdram_scrub_rate

The actual settings can be read with the provided script:

$ unibap-edac-info.sh
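Because the driver rounds the requested scrub rate, it can be useful to read the value back after writing it. The following Python sketch does this, assuming the sysfs path from the command above. The function name set_scrub_rate is illustrative, and root privileges are normally needed.

```python
from pathlib import Path

# Path taken from the command above; mc0 is the first memory controller.
SCRUB_RATE_PATH = Path("/sys/devices/system/edac/mc/mc0/sdram_scrub_rate")


def set_scrub_rate(bytes_per_second, path=SCRUB_RATE_PATH):
    """Request a scrub rate and return the rate the driver reports afterwards."""
    if bytes_per_second < 0:
        raise ValueError("scrub rate must not be negative")
    path = Path(path)
    path.write_text(f"{int(bytes_per_second)}\n")
    # The driver may round the request, so read the effective value back.
    return int(path.read_text().strip())
```

The returned value can differ from the requested one, which is the rounding described above.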


Hardware accelerated video encoding/decoding (H.264)

The AMD SoC found on ODE/iX5-100 products supports hardware accelerated video encoding and decoding of H.264 video. The Video Codec Engine (VCE) v2.0 provided by the AMD SoC can encode two full-HD (1080p) videos in parallel using the VA library and GStreamer. The following profiles are supported:

  • H264ConstrainedBaseline
  • H264Main
  • H264High

The H.264 encoding capability can be tested with this command:

# Hardware encode with VCE for two H.264 video streams
$ FILE_A=uab-q7-1804-vaapih264_testA.avi

$ FILE_B=uab-q7-1804-vaapih264_testB.avi

$ DISPLAY= WAYLAND_DISPLAY= gst-launch-1.0 -e videotestsrc num-buffers=100 ! 'video/x-raw, width=1920, height=1080, format=NV12' ! timeoverlay ! tee name=streams streams. ! queue ! vaapih264enc ! h264parse ! queue ! avimux ! filesink location=$FILE_A streams. ! queue ! vaapih264enc ! h264parse ! queue ! avimux ! filesink location=$FILE_B
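The pipeline above has one tee branch per output file. As a sketch of that structure, the following Python helper assembles the same gst-launch-1.0 arguments for a list of output files. The name build_encode_command is hypothetical and not part of SpaceCloud® OS. The caps string is written without spaces so that it stays a single argument. This section only states that two 1080p streams can be encoded in parallel, so more than two branches is untested here.

```python
def build_encode_command(outputs, num_buffers=100, width=1920, height=1080):
    """Return a gst-launch-1.0 argument list with one encoder branch per file."""
    caps = f"video/x-raw,width={width},height={height},format=NV12"
    cmd = [
        "gst-launch-1.0", "-e",
        "videotestsrc", f"num-buffers={num_buffers}", "!", caps, "!",
        "timeoverlay", "!", "tee", "name=streams",
    ]
    for location in outputs:
        # Each branch pulls from the tee, encodes with VA-API, and muxes to AVI.
        cmd += [
            "streams.", "!", "queue", "!", "vaapih264enc", "!", "h264parse",
            "!", "queue", "!", "avimux", "!", "filesink", f"location={location}",
        ]
    return cmd
```

The list can be passed to subprocess.run together with env=dict(os.environ, DISPLAY="", WAYLAND_DISPLAY=""), which mirrors the empty DISPLAY and WAYLAND_DISPLAY settings of the command above.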

The resulting videos from the command above should look like this:

[Figure: uab-q7-1804-vaapih264_test output]


Performance (perf) counters

SpaceCloud® OS supports Linux performance counters for analyzing CPU cache level events and more. This command lists the supported Linux perf features:

$ perf list


DON’TS

SpaceCloud® OS uses LLVM 10.0.1, which is pre-installed. The package is marked as held by default (using the command $ sudo apt-mark hold llvm). Do not upgrade this package.


Other test programs

There are some other tests installed in the OS, primarily for testing the GPU. These are found in the directory /opt/spacecloud/extras/bin and include clBLAS, CLBlast and clFFT test programs.

[example_saxpy, example_chbmv] are clBLAS tests.

[dgemv, dtrsm, haxpy] are CLBlast tests.

[cl_fft1d, cl_fft2d, cl_fft2d-2048x2048, cl_fft3d] are clFFT tests.


Optimized OpenCV

Unibap has optimized OpenCV for the AMD Jaguar microarchitecture found in the ODE/iX5 products and enabled GPU acceleration using clBLAS and clFFT. This can be verified with this command:

$ opencv_version --opencl
4.1.2
OpenCL Platforms:
    Clover
        iGPU: AMD KABINI (DRM 3.35.0, 5.4.44-050444-generic, LLVM 10.0.1) (OpenCL 1.2 Mesa 20.1.1 (git-127c2be9c5))
Current OpenCL device:
    Type = iGPU
    Name = AMD KABINI (DRM 3.35.0, 5.4.44-050444-generic, LLVM 10.0.1)
    Version = OpenCL 1.2 Mesa 20.1.1 (git-127c2be9c5)
    Driver version = 20.1.1
    Address bits = 64
    Compute units = 2
    Max work group size = 256
    Local memory size = 32 KB
    Max memory allocation size = 901 MB 477 KB 204 B
    Double support = Yes
    Host unified memory = Yes
    Device extensions:
        cl_khr_byte_addressable_store
        cl_khr_global_int32_base_atomics
        cl_khr_global_int32_extended_atomics
        cl_khr_local_int32_base_atomics
        cl_khr_local_int32_extended_atomics
        cl_khr_int64_base_atomics
        cl_khr_int64_extended_atomics
        cl_khr_fp64
        cl_khr_fp16
    Has AMD Blas = Yes
    Has AMD Fft = Yes
    Preferred vector width char = 16
    Preferred vector width short = 8
    Preferred vector width int = 4
    Preferred vector width long = 2
    Preferred vector width float = 4
    Preferred vector width double = 2

Health monitoring

Unibap provides elaborate health monitoring and, optionally, advanced features for SafetyBoot and SafetyChip associated with the flight configuration of the products. Please contact us for information regarding these advanced flight-related features.

Depending on the FPGA programming version, the user can try these commands for additional health information.

This requires that the FPGA is turned on before the AMD SoC is booted.

$ sudo systemctl start sensor-server.service

Check the FPGA serial number:

$ fetch_fpga

(This prints something like the following if supported by the installed FPGA bitfile.)

FPGA serial id: 0x11a5709ab61836421a7bf80d4b122046

Check the detailed health status:

$ unibap_health_monitor

(This prints something like the following if supported by the installed FPGA bitfile.)

Unibap health monitor started
Printing all data fetched from the Unibap modbus tcp server
2020.06.22 22:57:43 UTC

Q7 data:
Fpga Serial number is: 0x11a5709ab61836421a7bf80d4b122046
5V voltage: 5278 mV
3,3V voltage: 3274 mV
1,5V voltage: 1527 mV
FPGA current: 917 mA
Temp1: 33981 mC
Temp2: 33754 mC

Soc voltage: 958 mV
Soc voltage 5V: 5215 mV
Soc voltage 3,3V: 3257 mV
Soc voltage 1,8V: 1797 mV
Soc voltage 0,95V: 958 mV
Soc IO voltage : 1533 mV
Soc NB voltage: 866 mV
Soc current: 1643 mA
Soc Temp: 34512 mC
Soc adc in: 1733 mV

AMD Power plane: 8568 mW
FPGA Power plane: 4839 mW
CPU power: 9338 mW
CPU temp: 47625 mC
GPU temp: 4094 mC

Ode data:
fintec_in0 is: 1656 mV
fintec_in1 is: 1344 mV
fintec_in2 is: 1056 mV
fintec_in3 is: 1000 mV
fintec_in4 is: 1048 mV
fintec_in5 is: 1600 mV
fintec_in6 is: 1544 mV
fintec_in7 is: 1696 mV
fintec_fan_rpm is: 5415 rpm
fintec_temp1 is: 36000 mC
fintec_temp2 is: 34000 mC
fintec_temp3 is: 36000 mC

ssd_1 data:
ssd1_serial is: B0518176100000000DB4
ssd1_temp is: 99000mC
ssd1_health is: 0
ssd1_power_hours is: 3838 h
ssd1_power_cycle is: 30 cycles

RAM usage
Total RAM available: 1800 MB
Free RAM available: 1103 MB
Used RAM: 696 MB
Ram usage: 38.6897 %
Total Virtual Memory: 2874 MB
Free Virtual Memory: 2177 MB
Used Virtual Memory: 696 MB

CPU usage
CPU load 23.9796 %
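Most readings in the output above are integers in milli-units (mV, mA, mC, mW). For logging or threshold checks, the text can be parsed with a few lines of Python. This is a sketch only: the line format is inferred from the sample output above, mC is assumed to mean millidegrees Celsius, and the function names are illustrative.

```python
import re

# Matches lines such as "Temp1: 33981 mC" or "fintec_in0 is: 1656 mV".
READING_RE = re.compile(
    r"^(?P<name>.+?)(?: is)?\s*:\s*(?P<value>-?\d+)\s*(?P<unit>mV|mA|mC|mW)\s*$"
)


def parse_readings(text):
    """Map each reading name to (integer value, unit) for milli-unit lines."""
    readings = {}
    for line in text.splitlines():
        match = READING_RE.match(line.strip())
        if match:
            readings[match.group("name")] = (
                int(match.group("value")),
                match.group("unit"),
            )
    return readings


def temperatures_celsius(readings):
    """Convert every millidegree (mC) reading to degrees Celsius."""
    return {
        name: value / 1000.0
        for name, (value, unit) in readings.items()
        if unit == "mC"
    }


sample = """\
5V voltage: 5278 mV
Temp1: 33981 mC
CPU temp: 47625 mC
fintec_temp1 is: 36000 mC
ssd1_temp is: 99000mC
"""
print(temperatures_celsius(parse_readings(sample)))
```

Lines without a milli-unit value, such as the serial number or the RAM usage figures, are skipped by this pattern.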


Accessing the FPGA board controller and built in FlashPro

Newer versions of the ODE provide an FTDI USB interface for serial/UART communication with the FPGA board controller running on the Cortex-M3 MCU. The same port also supports Microsemi FPGA programming through the FlashPro JTAG interface. The FTDI USB connector is illustrated with a green oval in this picture.

The mini-USB to USB cable is included in the delivery box.

The UART is recognized by both Windows and Linux and should be set to 115200 baud.

Microsemi FlashPro is supported up to version 11.9, but not later versions (e.g. 12 or higher). However, do not reprogram the FPGA without contacting Unibap, as the bitfile must have certain IO in flash freeze configuration for power management to function properly. Wrong settings may brick the system and require a support task. Upgrading the FPGA with a Unibap certified ODE FPGA upgrade bitfile is safe. Currently, SpaceCloud® OS v1.0.0 does not include an FPGA bitfile upgrade.

The default bare metal FPGA firmware version outputs something like this on the UART:

—  UNIBAP Heterogeneous Computing System  —

— FPGA Output, SmartFusion2 ARM Cortex-M3 —

Usage:
[1,2,3], toggle led
[b], print traced BIOS codes
[f], reset FPGA
[F], power off FPGA
[h], print health status
[i], print FPGA device info
[l], toggle led blink
[p], perform ISP of bitfile from spi-flash
[r], print raw adc values
[s], SoC power button
[S], SoC reset button
Device serial number:       0x1388D8040000CD5A00440017001XXXXX
User code number:           0x013377BB
User design version number: 0xF003
SoC:off bios-code-cnt:0 tmr_cnt:1s

Optional FreeRTOS based FPGA firmware

Unibap can provide an FPGA firmware running FreeRTOS, which is used for flight units like the iX5-100 to support SafetyChip and SafetyBoot. In this case the firmware outputs the following menu on the FTDI UART:

— Welcome to Unibap UART Debug output console —
— ODE SmartFusion2 Cortex-M3 Embedded Application (FreeRTOS) —
support@unibap.com
Version Core: 2. Board: 0
Usage:
[1,2,3], toggle led
[b], print traced BIOS codes
[f], reset FPGA
[F], Power off FPGA
[h], print health status
[i], print FPGA device info
[l], toggle led blink
[r], print raw adc values
[s], SoC power button
[S], SoC reset button
[a], Toggle AMD autostart: 0
[Z], Format CFG Filesystem in SPI flash

AMD SoC:on port 80 bios-code-cnt:3 uptime:10211s
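Both firmware variants end their output with a status line made of key:value tokens. When logging the UART, such a line can be split into fields with a short Python sketch like the one below. The token format is inferred only from the two sample lines on this page, and the function names are illustrative.

```python
import re

# key:value tokens such as "SoC:on" or "bios-code-cnt:3".
TOKEN_RE = re.compile(r"([\w-]+):(\S+)")


def parse_status_line(line):
    """Return the key:value tokens of a firmware status line as a dict."""
    return dict(TOKEN_RE.findall(line))


def soc_is_on(line):
    """True if the status line reports the SoC as powered on."""
    return parse_status_line(line).get("SoC") == "on"
```

Words without a colon, such as "AMD" or "port 80", are ignored by this pattern.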