Java process crashing on Odroid HC1 / XU4


I recently released a „j-lawyer.BOX 2“, a Linux-based server with the j-lawyer.org case management software pre-installed, based on the Odroid HC1. Until now, there was a j-lawyer.BOX 1 based on the Odroid C2, which has proven to be very reliable.

Two pilot users / customers reported that the Wildfly Application Server (the Java process serving the j-lawyer.org services) was crashing sporadically. Unfortunately, nothing was logged, neither in the Wildfly logs nor in /var/log, so I had no easy way to track this down.

One common pattern was that the process seemed to crash after longer periods of low load / idling. The HC1 has eight cores: four ARM Cortex-A15 cores and four ARM Cortex-A7 cores. I suspected that the Java process was being moved to the „smaller“ A7 cores, and that this might be causing the crashes.

Now, let’s try to pin the Java process to the four A15 cores:

  • Determine current CPU affinity: get the PID of your process and check which cores are utilized
pidof java   # prints the PID
taskset -p <PID>

This will output a mask of ff (binary 11111111, one bit per core), which means the process may run on any of the eight cores. This is the default for any process unless CPU affinity has been configured explicitly.
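To make such masks easier to read, here is a minimal sketch that decodes a hexadecimal affinity mask into CPU numbers. The helper name mask_to_cpus is made up for this example and is not part of taskset; the only assumption is that bit n of the mask stands for CPU n.

```shell
#!/bin/sh
# Decode a hexadecimal affinity mask (as printed by `taskset -p`) into a
# space-separated list of CPU numbers.
# mask_to_cpus is a hypothetical helper name, not a taskset feature.
mask_to_cpus() {
    mask=$((0x$1))   # interpret the argument as a hex number
    cpu=0
    out=""
    while [ "$mask" -gt 0 ]; do
        # If the lowest bit is set, the current CPU is part of the mask.
        if [ $((mask & 1)) -eq 1 ]; then
            out="$out${out:+ }$cpu"
        fi
        mask=$((mask >> 1))
        cpu=$((cpu + 1))
    done
    echo "$out"
}
```

For example, mask_to_cpus ff prints 0 1 2 3 4 5 6 7, and mask_to_cpus f0 prints 4 5 6 7. Alternatively, taskset -cp <PID> reports the affinity as a CPU list instead of a mask.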

 

  • Create a new file /etc/systemd/system/cpuset.service with the following content:
[Unit]
Description=Setup CPU groups
Before=sysinit.target
After=local-fs.target
DefaultDependencies=no

[Service]
Type=oneshot
ExecStart=/bin/true
ExecStartPost=-/bin/mkdir -p /sys/fs/cgroup/cpuset/littlecores /sys/fs/cgroup/cpuset/bigcores
ExecStartPost=-/bin/sh -c '/bin/echo "0-3" > /sys/fs/cgroup/cpuset/littlecores/cpuset.cpus'
ExecStartPost=-/bin/sh -c '/bin/echo "0" > /sys/fs/cgroup/cpuset/littlecores/cpuset.mems'
ExecStartPost=-/bin/sh -c '/bin/chmod -R 777 /sys/fs/cgroup/cpuset/littlecores'
ExecStartPost=-/bin/sh -c '/bin/echo "4-7" > /sys/fs/cgroup/cpuset/bigcores/cpuset.cpus'
ExecStartPost=-/bin/sh -c '/bin/echo "0" > /sys/fs/cgroup/cpuset/bigcores/cpuset.mems'
ExecStartPost=-/bin/sh -c '/bin/chmod -R 777 /sys/fs/cgroup/cpuset/bigcores'

[Install]
WantedBy=sysinit.target
  • Make sure the file is readable by systemd (unit files do not need to be executable; systemd logs a warning if they are):
chmod 644 /etc/systemd/system/cpuset.service
  • Enable and start the new service:
systemctl enable cpuset
systemctl start cpuset
  • Install cgroup-tools
apt-get install cgroup-tools
  • Make sure your service is assigned to the A15 cores

To do this, you will have to edit your service's startup script (or unit file) so that it determines the PID and uses cgroups to control the affinity.

systemd: add a line to the [Service] section of your unit file

ExecStartPost=-/bin/sh -c 'echo $MAINPID | tee -a /sys/fs/cgroup/cpuset/bigcores/tasks'

init.d: in the „start“ section of your script, once the process is launched, do something like

for pid in $(pidof java); do echo "$pid" | sudo tee -a /sys/fs/cgroup/cpuset/bigcores/tasks; done
  • Reboot and check the affinity again

Reboot the system and use taskset again to determine the affinity of the Java process. It should now output a mask of f0 – voilà, Java is now pinned to the A15 cores.
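As a cross-check that does not require reading hex masks, the kernel also lists the allowed CPUs in the Cpus_allowed_list field of /proc/<PID>/status. The following is only a sketch: allowed_cpus and the PROC_ROOT override are names introduced for this example (PROC_ROOT exists purely so the function can be tried against a copy of a status file).

```shell
#!/bin/sh
# Print the list of CPUs a process is allowed to run on, e.g. "4-7".
# allowed_cpus is a hypothetical helper; PROC_ROOT is an assumed override
# that defaults to the real /proc.
allowed_cpus() {
    awk '/^Cpus_allowed_list:/ { print $2 }' "${PROC_ROOT:-/proc}/$1/status"
}
```

With the cpuset configuration from this post, allowed_cpus "$(pidof -s java)" should print 4-7 once the Java process has been moved into the bigcores group.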

 

Since doing this, I have no longer seen crashes. Fingers crossed 🙂

By the way, although this pins a „large“ process to four specific cores, you should not see a major difference in overall performance: plenty of other processes will happily use the other four cores (unless you pin additional processes).

Disabling the little cores entirely

If you really want to use only the big cores, you can uncomment the line with CPUAffinity in /etc/systemd/system.conf and change the value to

CPUAffinity=4 5 6 7

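After a reboot, taskset will report a mask of f0 for all processes started by systemd, which in practice means all user-space processes. Note that the little cores are not switched off; they are just no longer used by these processes.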
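If you prefer to script this change, the edit can be sketched as follows. set_cpu_affinity is a hypothetical helper written for this example; it assumes the file already contains a CPUAffinity= line (commented out or active) and does nothing otherwise. Run it as root and keep a backup of the file.

```shell
#!/bin/sh
# Usage: set_cpu_affinity FILE "CPU LIST"
# Rewrites an existing "#CPUAffinity=..." or "CPUAffinity=..." line in FILE.
# Hypothetical helper; the CPU list must not contain "/" or "&", because it
# is pasted into a sed replacement.
set_cpu_affinity() {
    conf=$1
    cpus=$2
    tmp=$(mktemp) || return 1
    # Write the modified copy to a temp file first, then copy the content
    # back so that owner and permissions of FILE stay unchanged.
    sed "s/^#\{0,1\}CPUAffinity=.*/CPUAffinity=$cpus/" "$conf" > "$tmp" &&
        cat "$tmp" > "$conf"
    status=$?
    rm -f "$tmp"
    return "$status"
}
```

On the HC1 this would be called as set_cpu_affinity /etc/systemd/system.conf "4 5 6 7", followed by a reboot.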


4 thoughts on “Java process crashing on Odroid HC1 / XU4”

  • Stuart

    Hi

    I got as far as..

    systemctl start cpuset

    Then I get an error warning from
    systemctl start cpuset

    cpuset.service – Setup CPU groups
    Loaded: error (Reason: Invalid argument)
    Active: inactive (dead)

    Aug 18 10:25:05 ODroid systemd[1]: [/etc/systemd/system/cpuset.service:10] Failed to parse service type, ignoring: oneshot ExecStart=/bin/true
    Aug 18 10:25:05 ODroid systemd[1]: cpuset.service: Service lacks both ExecStart= and ExecStop= setting. Refusing.
    Aug 18 10:25:05 ODroid systemd[1]: cpuset.service: Cannot add dependency job, ignoring: Unit cpuset.service is not loaded properly: Invalid argument

    can you help?

  • j-dimension.com (post author)

    I just had an endless boot loop after activating the CPU affinity on an Odroid HC1. It seems you MUST do a

    apt-get update && apt-get upgrade && apt-get dist-upgrade

    before enabling CPU affinity.

    • j-dimension.com (post author)

      Turns out that was not enough. If you downloaded the latest image you might have kernel 4.14.5-92 (check with uname -a). I had to upgrade the kernel to make the CPU affinity feature work again. Here’s how:

      sudo apt update
      sudo apt upgrade
      sudo apt dist-upgrade
      sudo apt install linux-image-xu3
      (answer the question with „No“)
      sudo reboot

      Now running on 4.14.73-136 and everything seems fine.