ODE SpaceCloud® OS short instructions
This section gives short instructions for the SpaceCloud® OS on ODE/iX5-100 and other Qseven-based Unibap compute products.
You can always turn to the ODE User’s Guide for detailed information. Here we present frequent questions and answers and short common instructions.
Default login information
The default login information for the SpaceCloud® OS upgrade disk image is
username: unibap
password: unibap
Keyboard settings
The default keyboard setting is US locale. To change this to a keyboard setting of your choice, do the following:
$ sudo dpkg-reconfigure keyboard-configuration
Network configuration
SpaceCloud® OS uses netplan for network configuration.
The network configuration can be changed in this file:
$ sudo nano /etc/netplan/00-installer-config.yaml
Example (accounting for an added PCIe device in the x4 slot):
# This is the network config written by 'subiquity'
network:
  renderer: networkd
  ethernets:
    enp1s0:
      dhcp4: true
      optional: true
      dhcp6: false
    enp2s0:
      dhcp4: true
      optional: true
      dhcp6: false
    enp3s0:
      dhcp4: true
      optional: true
      dhcp6: false
  version: 2
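An edited netplan file only takes effect once it has been applied. The following is a minimal sketch of one way to do that, not a procedure taken from the ODE User's Guide. The function name apply_netplan is hypothetical; the netplan subcommands generate and try are standard netplan commands.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the OS): validate the edited
# netplan YAML, then apply it with automatic rollback.
apply_netplan() {
    # Parse the YAML and generate the backend configuration; fails on errors.
    sudo netplan generate || return 1
    # Apply the configuration; netplan reverts it after a timeout unless
    # the change is confirmed at the prompt.
    sudo netplan try
}
```

Using netplan try rather than netplan apply is a cautious choice for units reached over the network, since a faulty configuration is rolled back instead of cutting off the connection.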
Power management of the GPU
SpaceCloud® OS supports dynamic power management (DPM) of the AMD GPU.
The user can use the provided scripts or change the settings manually.
The default DPM setting for the GPU is "auto"; "low" and "high" are also available.
The setting can be changed with the provided scripts gpu_high.sh, gpu_low.sh or gpu_auto.sh.
Alternatively, write the level directly to sysfs (this requires root privileges):
$ echo 'high' | sudo tee /sys/class/drm/card0/device/power_dpm_force_performance_level
The GPU performance is affected by the power saving settings. The default setting in the image is to turn off screen saver and switch off display. The user can change these settings if desired.
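The manual sysfs setting can be wrapped in a small helper that rejects anything other than the three levels named above and reads the value back. This is a minimal sketch, not one of the provided scripts. The function name set_gpu_dpm and its optional second argument (an alternative file path, useful for a dry run) are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the OS): set the GPU DPM level.
# Usage: set_gpu_dpm auto|low|high [FILE]
set_gpu_dpm() {
    level="$1"
    file="${2:-/sys/class/drm/card0/device/power_dpm_force_performance_level}"
    case "$level" in
        auto|low|high) ;;
        *) echo "usage: set_gpu_dpm auto|low|high [file]" >&2; return 2 ;;
    esac
    # The redirection is done by the calling shell, so on the real sysfs
    # file this must run in a root shell.
    printf '%s\n' "$level" > "$file" || return 1
    # Read back the level that is now in effect.
    cat "$file"
}
```

Reading the file back after the write gives a quick confirmation that the driver accepted the requested level.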
Monitoring the GPU load
The user can monitor the GPU load using the tool radeontop.
$ radeontop
OpenCL overrides
The SpaceCloud® OS supports the common Mesa environment variables for overriding the reported OpenCL device version and CLC version. This can be tested with these commands:
$ CLOVER_DEVICE_VERSION_OVERRIDE=1.1 CLOVER_DEVICE_CLC_VERSION_OVERRIDE=1.1 clinfo
$ CLOVER_DEVICE_VERSION_OVERRIDE=1.2 CLOVER_DEVICE_CLC_VERSION_OVERRIDE=1.2 clinfo
Test machine learning with TensorFlow Lite using GPU
In addition to the complete TensorFlow v2.2.0 (optimized CPU version), Unibap provides support for CPU- and GPU-accelerated TensorFlow Lite.
Go to the test directory, which also includes example source code, in this case cpp/tflite_gpu.cpp, which includes a warm-up run for the GPU caches and a benchmark using TensorFlow Lite.
$ cd /opt/spacecloud/extras/ml/performancetests
Make sure the GPU is set to high speed:
$ gpu_high.sh
$ make clean && make -j4
$ ./tflite (cpu version)
$ ./tflite_gpu (gpu version)
INFO: Initialized OpenCL-based API.
INFO: Created 1 GPU delegate kernels.
110239 us
Note that the GPU and CPU are fairly equal in performance in this case. CPU and GPU can be run simultaneously, which yields inference times of ~160 ms for the CPU and ~110 ms for the GPU, with an edge to the GPU. This corresponds to ~6.3 fps and ~9.1 fps respectively, or a combined ~15 fps if run in parallel. Note that this is using 32-bit precision.
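The frame rates quoted above follow directly from the inference times: fps = 1000 / (time in ms), and the combined rate is the sum of the two. The sketch below reproduces that arithmetic so it can be repeated with your own measurements. The function names ms_to_fps and combined_fps are hypothetical.

```shell
#!/bin/sh
# Convert one inference time in milliseconds to frames per second.
ms_to_fps() {
    LC_ALL=C awk -v ms="$1" 'BEGIN { printf "%.2f\n", 1000 / ms }'
}

# Combined throughput of two devices running in parallel, given the
# inference time of each in milliseconds.
combined_fps() {
    LC_ALL=C awk -v a="$1" -v b="$2" 'BEGIN { printf "%.2f\n", 1000 / a + 1000 / b }'
}
```

With the times above, ms_to_fps 160 prints 6.25, ms_to_fps 110 prints 9.09, and combined_fps 160 110 prints 15.34, which matches the rounded figures in the text.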
Test full TensorFlow
$ python3 -c 'import tensorflow as tf; print(tf.__version__)'
2.2.0
$ python3
Python 3.6.9 (default, Apr 18 2020, 01:56:04)
[GCC 8.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import tensorflow as tf
>>> tf.add(1, 2).numpy()
2020-06-23 06:58:30.434656: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 1197805000 Hz
2020-06-23 06:58:30.435507: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4fd2a80 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2020-06-23 06:58:30.435585: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
2020-06-23 06:58:30.435932: I tensorflow/core/common_runtime/process_util.cc:147] Creating new thread pool with default inter op setting: 2. Tune using inter_op_parallelism_threads for best performance.
3
clFFT
A demo of GPU-accelerated FFT processing is also included in the performancetest library. This example processes a 4096 x 4096 two-dimensional buffer (16 Mpixel) on the GPU.
Go to the test directory, which also includes example source code, in this case cpp/fft.cpp:
$ cd /opt/spacecloud/extras/ml/performancetests
Make sure the GPU is set to high speed:
$ gpu_high.sh
$ make clean && make -j4
$ ./fft
Execute the plan on GPU - Optionally Monitor with radeontop to see the GPU load.
Elapsed time: 0,30 seconds.
In this case the clFFT library processes the 16 Mpixel buffer on the GPU in about 0.3 seconds.
PyTorch
A test of the installed PyTorch is also included for reference in the performancetest package.
Go to the test directory, which also includes example source code, in this case py/torchtest.py:
$ cd /opt/spacecloud/extras/ml/performancetests
$ python py/torchtest.py
Memory protection EDAC
The user can switch memory protection on and off dynamically through the standard Linux sysfs interfaces provided by the AMD EDAC driver.
The default behavior is to enable PCI parity and background RAM memory ECC scrubbing. However, these settings can be changed at runtime.
To change the background ECC scrubbing rate, use this command with the rate in bytes/second as input (this requires root privileges). The driver will automatically round the value to the closest possible value.
$ echo 97650 | sudo tee /sys/devices/system/edac/mc/mc0/sdram_scrub_rate
The current settings can be read with the provided script:
$ unibap-edac-info.sh
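Because the driver rounds the requested scrub rate, it can be useful to compare the requested value with what the driver reports afterwards. The following is a minimal sketch under that assumption. The function name set_scrub_rate and its optional second argument (an alternative file path for a dry run) are hypothetical and not part of the provided scripts.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with the OS): request a background ECC
# scrub rate in bytes/second and show the rate the driver reports back.
# Usage: set_scrub_rate RATE [FILE]
set_scrub_rate() {
    rate="$1"
    file="${2:-/sys/devices/system/edac/mc/mc0/sdram_scrub_rate}"
    case "$rate" in
        ''|*[!0-9]*)
            echo "rate must be a non-negative integer (bytes/second)" >&2
            return 2 ;;
    esac
    # On the real sysfs file this write needs a root shell.
    printf '%s\n' "$rate" > "$file" || return 1
    echo "requested: $rate B/s, driver reports: $(cat "$file") B/s"
}
```

On the target the two numbers may differ, which is the rounding described above rather than an error.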
Hardware accelerated video encoding/decoding (H.264)
The AMD SoC found on ODE/iX5-100 products supports hardware-accelerated encoding and decoding of H.264 video. The Video Codec Engine (VCE) v2.0 provided by the AMD SoC can encode two full-HD (1080p) videos in parallel using the VA library and GStreamer. The following profiles are supported:
- H264ConstrainedBaseline
- H264Main
- H264High
The H.264 encoding capability can be tested with this command:
# Hardware encode with VCE for two H.264 video streams
$ FILE_A=uab-q7-1804-vaapih264_testA.avi
$ FILE_B=uab-q7-1804-vaapih264_testB.avi
$ DISPLAY= WAYLAND_DISPLAY= gst-launch-1.0 -e videotestsrc num-buffers=100 ! 'video/x-raw, width=1920, height=1080, format=NV12' ! timeoverlay ! tee name=streams streams. ! queue ! vaapih264enc ! h264parse ! queue ! avimux ! filesink location=$FILE_A streams. ! queue ! vaapih264enc ! h264parse ! queue ! avimux ! filesink location=$FILE_B
The resulting videos from the command above should look like this:
[Image: uab-q7-1804-vaapih264_test output]
Performance (perf) counters
SpaceCloud® OS supports Linux performance counters for analyzing events such as CPU cache hits and misses. This command lists the supported Linux perf features:
$ perf list
DON'TS
The SpaceCloud® OS uses LLVM 10.0.1, which is pre-installed. This package is put on hold by default with the command $ sudo apt-mark hold llvm. Do not upgrade this package.
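Before running a system upgrade it may be worth confirming that the hold is still in place. The sketch below assumes the list of held packages is piped in from the standard apt-mark showhold command. The function name is_held is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed if PACKAGE appears in a list of held
# packages read from standard input (one package name per line, as
# printed by `apt-mark showhold`).
# Usage: apt-mark showhold | is_held PACKAGE
is_held() {
    # -F fixed string, -x whole-line match, -q no output
    grep -Fxq -- "$1"
}
```

For example, apt-mark showhold | is_held llvm && echo "llvm is on hold" prints the message only while the hold is active.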
Other test programs
Some other tests are installed in the OS, primarily for testing the GPU. These are found in the directory /opt/spacecloud/extras/bin and include clBLAS, CLBlast and clFFT test programs.
[example_saxpy, example_chbmv] are clBLAS tests.
[dgemv, dtrsm, haxpy] are CLBlast tests.
[cl_fft1d, cl_fft2d, cl_fft2d-2048×2048, cl_fft3d] are clFFT tests.
Optimized OpenCV
Unibap has optimized OpenCV for the AMD Jaguar microarchitecture found in the ODE/iX5 products and enabled GPU acceleration using clBLAS and clFFT. This can be verified with this command:
$ opencv_version --opencl
4.1.2
OpenCL Platforms:
Clover
iGPU: AMD KABINI (DRM 3.35.0, 5.4.44-050444-generic, LLVM 10.0.1) (OpenCL 1.2 Mesa 20.1.1 (git-127c2be9c5))
Current OpenCL device:
Type = iGPU
Name = AMD KABINI (DRM 3.35.0, 5.4.44-050444-generic, LLVM 10.0.1)
Version = OpenCL 1.2 Mesa 20.1.1 (git-127c2be9c5)
Driver version = 20.1.1
Address bits = 64
Compute units = 2
Max work group size = 256
Local memory size = 32 KB
Max memory allocation size = 901 MB 477 KB 204 B
Double support = Yes
Host unified memory = Yes
Device extensions:
cl_khr_byte_addressable_store
cl_khr_global_int32_base_atomics
cl_khr_global_int32_extended_atomics
cl_khr_local_int32_base_atomics
cl_khr_local_int32_extended_atomics
cl_khr_int64_base_atomics
cl_khr_int64_extended_atomics
cl_khr_fp64
cl_khr_fp16
Has AMD Blas = Yes
Has AMD Fft = Yes
Preferred vector width char = 16
Preferred vector width short = 8
Preferred vector width int = 4
Preferred vector width long = 2
Preferred vector width float = 4
Preferred vector width double = 2
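The two lines that matter for the clBLAS and clFFT acceleration are "Has AMD Blas" and "Has AMD Fft". As a minimal sketch, the check can be automated by filtering the command output. The function name check_opencv_ocl is hypothetical, and the sketch assumes the output format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read the output of `opencv_version --opencl` from
# standard input and report whether clBLAS and clFFT support is enabled.
# Exit status is 0 only if both lines say "Yes".
check_opencv_ocl() {
    awk '
        /Has AMD Blas/ { blas = $NF }   # last field is Yes or No
        /Has AMD Fft/  { fft  = $NF }
        END {
            printf "clBLAS=%s clFFT=%s\n", (blas ? blas : "missing"), (fft ? fft : "missing")
            ok = (blas == "Yes" && fft == "Yes")
            exit (ok ? 0 : 1)
        }'
}
```

On the target it would be used as opencv_version --opencl | check_opencv_ocl.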
Health monitoring
Unibap provides elaborate health monitoring and, optionally, advanced features for SafetyBoot and SafetyChip associated with the flight configuration of the products. Please contact us for information regarding these advanced flight-related features.
Depending on the FPGA programming version, the user can try these commands for additional health information.
This requires that the FPGA is turned on before the AMD SoC is booted.
$ sudo systemctl start sensor-server.service
Check the FPGA serial number
$ fetch_fpga
(will print out something like this if supported by the installed FPGA bitfile)
FPGA serial id: 0x11a5709ab61836421a7bf80d4b122046
Check the detailed health status
$ unibap_health_monitor
(which will print out something like this if supported by the installed FPGA bitfile)
Unibap health monitor started
Printing all data fetched from the Unibap modbus tcp server
2020.06.22 22:57:43 UTC
Q7 data:
Fpga Serial number is: 0x11a5709ab61836421a7bf80d4b122046
5V voltage: 5278 mV
3,3V voltage: 3274 mV
1,5V voltage: 1527 mV
FPGA current: 917 mA
Temp1: 33981 mC
Temp2: 33754 mC
Soc voltage: 958 mV
Soc voltage 5V: 5215 mV
Soc voltage 3,3V: 3257 mV
Soc voltage 1,8V: 1797 mV
Soc voltage 0,95V: 958 mV
Soc IO voltage : 1533 mV
Soc NB voltage: 866 mV
Soc current: 1643 mA
Soc Temp: 34512 mC
Soc adc in: 1733 mV
AMD Power plane: 8568 mW
FPGA Power plane: 4839 mW
CPU power: 9338 mW
CPU temp: 47625 mC
GPU temp: 4094 mC
Ode data:
fintec_in0 is: 1656 mV
fintec_in1 is: 1344 mV
fintec_in2 is: 1056 mV
fintec_in3 is: 1000 mV
fintec_in4 is: 1048 mV
fintec_in5 is: 1600 mV
fintec_in6 is: 1544 mV
fintec_in7 is: 1696 mV
fintec_fan_rpm is: 5415 rpm
fintec_temp1 is: 36000 mC
fintec_temp2 is: 34000 mC
fintec_temp3 is: 36000 mC
ssd_1 data:
ssd1_serial is: B0518176100000000DB4
ssd1_temp is: 99000mC
ssd1_health is: 0
ssd1_power_hours is: 3838 h
ssd1_power_cycle is: 30 cycles
RAM usage
Total RAM available: 1800 MB
Free RAM available: 1103 MB
Used RAM: 696 MB
Ram usage: 38.6897 %
Total Virtual Memory: 2874 MB
Free Virtual Memory: 2177 MB
Used Virtual Memory: 696 MB
CPU usage
CPU load 23.9796 %
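The temperatures in this output are reported in millidegrees Celsius (mC). As a minimal sketch, the filter below extracts those lines and converts them to degrees Celsius. The function name temps_celsius is hypothetical, and the sketch assumes the "name: value mC" line format shown above.

```shell
#!/bin/sh
# Hypothetical helper: read unibap_health_monitor output from standard
# input and print every temperature line converted from mC to degrees C.
temps_celsius() {
    LC_ALL=C awk -F': *' '
        /mC[ \t]*$/ {
            value = $NF
            sub(/[ \t]*mC[ \t]*$/, "", value)   # strip the unit
            name = $1
            sub(/ is$/, "", name)               # "ssd1_temp is" -> "ssd1_temp"
            printf "%s: %.1f C\n", name, value / 1000
        }'
}
```

On the target it would be used as unibap_health_monitor | temps_celsius, which reduces the full report to one line per temperature sensor.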
Accessing the FPGA board controller and built in FlashPro
Newer versions of the ODE provide an FTDI USB interface for serial/UART communication with the FPGA board controller running on the Cortex-M3 MCU. The same port also provides support for Microsemi FPGA programming through the FlashPro JTAG interface. The FTDI USB connector is illustrated with a green oval in this picture.
The mini-USB to USB cable is included in the delivery box.
The UART is recognized by both Windows and Linux and should be set to 115200 baud.
Microsemi supports FlashPro up to version 11.9 but no later (e.g. not version 12 or higher). However, do not reprogram the FPGA without contacting Unibap, as the bitfile must have certain IO in flash freeze configuration for power management to function properly. Wrong settings may brick the system and require a support task. Upgrading the FPGA with a Unibap-certified ODE FPGA upgrade bitfile is safe. Currently the SpaceCloud® OS v1.0.0 does not include an FPGA bitfile upgrade.
The default bare-metal FPGA firmware version outputs something like this on the UART:
— UNIBAP Heterogeneous Computing System —
— FPGA Output, SmartFusion2 ARM Cortex-M3 —
Usage:
[1,2,3], toggle led
[b], print traced BIOS codes
[f], reset FPGA
[F], power off FPGA
[h], print health status
[i], print FPGA device info
[l], toggle led blink
[p], perform ISP of bitfile from spi-flash
[r], print raw adc values
[s], SoC power button
[S], SoC reset button
Device serial number: 0x1388D8040000CD5A00440017001XXXXX
User code number: 0x013377BB
User design version number: 0xF003
SoC:off bios-code-cnt:0 tmr_cnt:1s
Optional FreeRTOS based FPGA firmware
Unibap can provide an FPGA firmware running FreeRTOS, which is used for flight units like the iX5-100 to support SafetyChip and SafetyBoot. This firmware outputs the following menu on the FTDI UART:
— Welcome to Unibap UART Debug output console —
— ODE SmartFusion2 Cortex-M3 Embedded Application (FreeRTOS) —
support@unibap.com
Version Core: 2. Board: 0
Usage:
[1,2,3], toggle led
[b], print traced BIOS codes
[f], reset FPGA
[F], Power off FPGA
[h], print health status
[i], print FPGA device info
[l], toggle led blink
[r], print raw adc values
[s], SoC power button
[S], SoC reset button
[a], Toggle AMD autostart: 0
[Z], Format CFG Filesystem in SPI flash
AMD SoC:on port 80 bios-code-cnt:3 uptime:10211s