Releases: clustervision/trinityX

Release 15.1u1

27 Jun 10:41

Bug fixes

  • Hot-fix: pin Open OnDemand to version 4.0.3
  • Added the CRB repository for non-OpenHPC based Slurm installations
  • The AlertX Drainer no longer starts when Slurm is not installed

Release 15.1

05 Jun 14:27

New features

  • Login image support. The login image provides Open OnDemand functionality and shell access for users
  • openSUSE image support. A playbook is now provided to build an openSUSE image
  • External Floating IP support for HA setups where access to the active controller is needed
  • Improved AlertX functionality:
    • Silencing Alerts is now supported
    • A more scalable approach in Overview for larger clusters
    • Rules for HA related degradations
  • Introduction of python3 libraries to render configurations for slurm, genders and more:
    • Luna based configurations are now extendable by pre-configured defaults
    • Support for GRES resources, based on a dedicated gres.conf file
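The GRES item above mentions rendering a dedicated gres.conf file from Python. As a rough illustration only, the sketch below shows how such rendering might look. The function name `render_gres_conf` and the input layout are hypothetical and are not taken from the TrinityX or luna libraries; only the `NodeName`, `Name`, `Type` and `File` keys follow Slurm's gres.conf syntax.

```python
def render_gres_conf(nodes):
    """Render gres.conf lines from a mapping of node name to GRES entries.

    The input layout is a hypothetical example, not the TrinityX data model.
    """
    lines = []
    for node_name, gres_list in sorted(nodes.items()):
        for gres in gres_list:
            # NodeName and Name are always written; Type and File are optional.
            parts = [f"NodeName={node_name}", f"Name={gres['name']}"]
            if gres.get("type"):
                parts.append(f"Type={gres['type']}")
            if gres.get("file"):
                parts.append(f"File={gres['file']}")
            lines.append(" ".join(parts))
    return "\n".join(lines) + "\n"


# Hypothetical cluster description used only for this illustration.
example = {
    "node001": [{"name": "gpu", "type": "a100", "file": "/dev/nvidia[0-1]"}],
    "node002": [{"name": "gpu", "file": "/dev/nvidia0"}],
}
print(render_gres_conf(example), end="")
```

Keeping the rendering in a pure function makes it easy to layer pre-configured defaults on top of the input before the file is written.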

Release 15u2

07 Apr 11:51

New features

  • Support for panfs, lustre, gpfs and beegfs as an external source for shared_fs_disk, used in HA setups.
  • Introduction of the manual fstype in shared_fs_disk, used in HA setups to tell TrinityX that the admin will take care of the mount point.
  • A task to disable legacy Prometheus rules, paving the way for upgrades from TrinityX 14.x to 15.
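To make the distinction between the fstype values above concrete, here is a minimal sketch that classifies a shared_fs_disk entry. The dictionary layout, the function name `classify_shared_fs_disk` and the returned labels are assumptions made for illustration. Only the fstype values themselves come from these notes, and the actual TrinityX configuration schema may differ.

```python
# fstype values named in the release notes as external sources.
EXTERNAL_FSTYPES = {"panfs", "lustre", "gpfs", "beegfs"}


def classify_shared_fs_disk(disk):
    """Return who is responsible for the disk, based on its fstype.

    'disk' is a hypothetical dict with an 'fstype' key.
    """
    fstype = disk.get("fstype", "")
    if fstype == "manual":
        # The admin takes care of the mount point.
        return "admin-managed"
    if fstype in EXTERNAL_FSTYPES:
        # Backed by an external parallel or shared filesystem.
        return "external"
    return "other"


print(classify_shared_fs_disk({"fstype": "lustre"}))
print(classify_shared_fs_disk({"fstype": "manual"}))
```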

Bug fixes

  • luna 2.1 introduced a newer way of setting up interfaces during iPXE, but this broke cloud provisioning. Fixed by allowing the looped interface discovery to be skipped.

Release 15u1

04 Mar 14:29

Bug fix

  • Fixed the fallback mechanism: setting up HA without using any provided shared_fs_disks resulted in a playbook termination

Release 15

27 Feb 14:44

New features

  • AlertX - command-line and graphical application to manage Prometheus alerts, rules and Node Health Checking (NHC)
  • NHC drainer - nodes that trigger the NHC rule are drained of jobs. Currently Slurm is supported.
  • Per-job statistics - detailed breakdown per job of resource utilization and power consumption
  • Beta ARM support. Note that currently only homogeneous clusters are supported. Controller(s) and nodes are expected to be the same architecture, i.e. ARM+ARM or x86+x86
  • Additional Prometheus exporters for collecting more metrics, including GPU, hardware configuration and state
  • OOD application for changing a user’s password
  • Improved/extended grafana panels
  • luna 2.1
  • Open OnDemand 4.0.0
  • Latest OpenHPC releases: 2.9 for EL8 and 3.2.1 for EL9
  • Prometheus 3.1.0
  • HA setups support cross-mounted shared disk exports, allowing passive/standby controllers to access the shared filesystems
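For readers unfamiliar with draining, the NHC drainer item in the list above can be illustrated with Slurm's own `scontrol update` command, which sets a node to the DRAIN state together with a reason. The sketch below only builds that command line and does not run it. It is not AlertX's implementation, and the helper name `build_drain_command` is hypothetical.

```python
import shlex


def build_drain_command(node, reason):
    """Build the scontrol argument list that drains a single Slurm node."""
    if not node or not reason:
        # A reason is supplied alongside the DRAIN state.
        raise ValueError("both node and reason are required")
    return [
        "scontrol",
        "update",
        f"NodeName={node}",
        "State=DRAIN",
        f"Reason={reason}",
    ]


# Hypothetical node name and reason, for illustration only.
command = build_drain_command("node001", "NHC: health check failed")
print(shlex.join(command))
```

Returning an argument list rather than a single string avoids quoting problems if the command is later passed to `subprocess.run`.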

Release 14.4u4

04 Feb 11:56

Fixes

  • Fix for NFS or direct mounts for shared_fs_disks
  • Added lchroot wrapper for the OOD OSImage app
  • OpenLDAP role now better handles filesystems that are not POSIX/attr compliant
  • MariaDB datadir overlap fix for HA setups

Release 14.4u3

23 Jan 13:56

Fixes

  • A recent Code Server commit broke form.yml.erb. TrinityX now uses a fixed (pinned) release.

Release 14.4u2

13 Nov 18:29

Fixes

  • Better reporting on SELinux mismatch
  • Better hostname checking
  • Improved external FQDN discovery, which is a key component for Grafana and OOD
  • Fixes for reported issues

Release 14.4u1

22 Oct 14:37

Fixes

  • Ubuntu image creation workarounds for RH family debootstrap/Ubuntu repository name changes
  • Multi-host Ansible runs based on an inventory hosts file (e.g. for HA controllers) work as intended again
  • Removal of legacy roles and code

Release 14.4

25 Sep 14:52

New features

  • Kubernetes K3s integration
  • ZFS grafana panel
  • InfiniBand exporter added to support the InfiniBand analyzer link troubleshooter
  • Introducing password quality constraints
  • Retries added to dnf/yum Ansible calls to overcome failures caused by troubled or busy repositories
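The retry behaviour mentioned in the last item above follows a common pattern: repeat a failing call a bounded number of times, pausing between attempts. The sketch below shows that pattern in plain Python. It is not the Ansible implementation used by TrinityX, and `run_with_retries` and `flaky_install` are hypothetical names.

```python
import time


def run_with_retries(action, retries=3, delay=5.0, sleep=time.sleep):
    """Call 'action' until it succeeds or 'retries' attempts have failed."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    last_error = None
    for attempt in range(1, retries + 1):
        try:
            return action()
        except Exception as error:
            last_error = error
            if attempt < retries:
                # Wait before the next attempt, e.g. for a busy repository.
                sleep(delay)
    # All attempts failed: surface the last error to the caller.
    raise last_error


# Simulated package installation that fails twice before succeeding.
calls = {"count": 0}


def flaky_install():
    calls["count"] += 1
    if calls["count"] < 3:
        raise RuntimeError("repository busy")
    return "installed"


result = run_with_retries(flaky_install, retries=5, delay=0)
print(result, calls["count"])
```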

Fixes

  • OpenLDAP fix to not start at boot time on HA pairs
  • EPEL repo link fix to point to the latest release