Discussion:
[PATCH v2] arm64: compat: Implement misalignment fixups for multiword loads
Arnd Bergmann
2022-07-01 14:20:01 UTC
v2: - drop some obsolete comments
- emit a perf alignment-fault event for every handled instruction
- use arm64_skip_faulting_instruction() to get the correct behavior
wrt IT state and single step
- use types with correct endianness annotation (instructions are
always little endian on v7/v8+)
Reviewed-by: Arnd Bergmann <***@arndb.de>
gene heskett
2022-07-01 14:50:04 UTC
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned. This is not something that is supported by the architecture,
but was done anyway to increase compatibility with user space software,
which mostly targeted x86 at the time and did not care about aligned
accesses.
This is a jolly good idea. LinuxCNC is one app that would benefit
from this, provided it doesn't have a huge effect on latency when we
build a preempt-rt kernel. I built this one on a Pi for the Pi several
years ago now, and it's worked so well I've not diligently searched
for a newer one.

4.19.71-rt24-v7l+ #1 SMP PREEMPT RT Thu Feb 6 07:09:18 EST 2020 armv7l
Installed on top of the armhf raspios buster, it Just Works.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
Note that the intent is to implement the exact same handling of
misaligned multi-word loads and stores as the 32-bit kernel does,
including what appears to be missing support for user space programs
that rely on SETEND to switch to a different byte order and back. Also,
like the 32-bit ARM version, we rely on the faulting address reported by
the CPU to infer the memory address, instead of decoding the instruction
fully to obtain this information.
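To illustrate the point about relying on the reported fault address, here is a minimal user-space sketch of an LDRD fixup. It is not the kernel code: `struct fake_regs`, `emulate_ldrd()`, `load_le32()` and the flat `mem` buffer are hypothetical stand-ins for `struct pt_regs`, the handler, `get_user()` and user memory. Only the ARM-mode immediate form of LDRD is covered.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the 32-bit register view of struct pt_regs. */
struct fake_regs {
	uint32_t r[16];
};

/* Bit positions follow the ARM load/store encoding used in the patch. */
#define P_BIT(i)	((i) & (1u << 24))	/* pre-indexed */
#define U_BIT(i)	((i) & (1u << 23))	/* add offset */
#define W_BIT(i)	((i) & (1u << 21))	/* writeback */
#define RN(i)		(((i) >> 16) & 15)
#define RD(i)		(((i) >> 12) & 15)

/* Byte-wise little-endian load: works at any alignment on any host. */
static uint32_t load_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Emulate an ARM-mode LDRD (immediate) whose access faulted at 'addr'.
 * 'addr' plays the role of the fault address reported by the CPU, so the
 * base register and offset are only needed for the writeback step.
 * Returns 0 on success, -1 for anything this sketch rejects.
 */
int emulate_ldrd(const uint8_t *mem, size_t len, uint32_t addr,
		 uint32_t instr, struct fake_regs *regs)
{
	uint32_t rd = RD(instr);
	uint32_t offset = ((instr & 0xf00) >> 4) | (instr & 15);

	assert(mem && regs);
	if ((rd & 1) || rd == 14)	/* Rd must be even and must not be LR */
		return -1;
	if ((size_t)addr + 8 > len)	/* stay inside the fake memory */
		return -1;

	regs->r[rd] = load_le32(mem + addr);
	regs->r[rd + 1] = load_le32(mem + addr + 4);

	/* Base register update, mirroring do_alignment_finish_ldst(). */
	if (!U_BIT(instr))
		offset = -offset;
	if (!P_BIT(instr))
		addr += offset;
	if (!P_BIT(instr) || W_BIT(instr))
		regs->r[RN(instr)] = addr;
	return 0;
}
```

For a pre-indexed access the fault address already includes the offset, so it is written back unchanged; for a post-indexed access the offset is applied afterwards.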
This implementation is taken from the 32-bit ARM tree, with all pieces
removed that deal with instructions other than LDRD/STRD and LDM/STM, or
that deal with alignment exceptions taken in kernel mode.
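As a rough guide to which ARM-mode encodings remain after that reduction, the top-level decode can be sketched as a standalone function. The bit masks are the ones used in the patch; the `fixup_kind` enum and `classify_arm()` are hypothetical names for illustration, and Thumb decoding is left out.

```c
#include <stdint.h>

/* Hypothetical result type; the patch selects a handler pointer instead. */
enum fixup_kind { FIXUP_NONE, FIXUP_LDRD_STRD, FIXUP_LDM_STM };

/*
 * Classify an ARM-mode instruction word the way the fixup entry point
 * does: bits 27:25 select the instruction class, then bit 20 and
 * bits 7:4 single out LDRD and STRD among the "extra" load/stores.
 */
enum fixup_kind classify_arm(uint32_t instr)
{
	switch (instr & 0x0e000000) {
	case 0x00000000:	/* load/store instruction extensions */
		if ((instr & 0x001000f0) == 0x000000d0 ||	/* LDRD */
		    (instr & 0x001000f0) == 0x000000f0)		/* STRD */
			return FIXUP_LDRD_STRD;
		return FIXUP_NONE;
	case 0x08000000:	/* LDM or STM */
		return FIXUP_LDM_STM;
	default:		/* everything else is left alone */
		return FIXUP_NONE;
	}
}
```

Word-sized and smaller loads and stores fall into the default case because ARMv6 and later CPUs can perform those accesses unaligned in hardware.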
---
Note to cc'ees: if this is something you would like to see merged,
please indicate so. This stuff is unlikely to get in if there are no
users.
v2: - drop some obsolete comments
- emit a perf alignment-fault event for every handled instruction
- use arm64_skip_faulting_instruction() to get the correct behavior
wrt IT state and single step
- use types with correct endianness annotation (instructions are
always little endian on v7/v8+)
arch/arm64/Kconfig | 4 +
arch/arm64/include/asm/exception.h | 1 +
arch/arm64/kernel/Makefile | 1 +
arch/arm64/kernel/compat_alignment.c | 389 ++++++++++++++++++++
arch/arm64/mm/fault.c | 3 +
5 files changed, 398 insertions(+)
diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index 1652a9800ebe..401e4f8fa149 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -1508,6 +1508,10 @@ config THUMB2_COMPAT_VDSO
Compile the compat vDSO with '-mthumb -fomit-frame-pointer' if y,
otherwise with '-marm'.
+config COMPAT_ALIGNMENT_FIXUPS
+ bool "Fix up misaligned multi-word loads and stores in user space"
+ default y
+
menuconfig ARMV8_DEPRECATED
bool "Emulate deprecated/obsolete ARMv8 instructions"
depends on SYSCTL
diff --git a/arch/arm64/include/asm/exception.h b/arch/arm64/include/asm/exception.h
index d94aecff9690..e92ca08f754c 100644
--- a/arch/arm64/include/asm/exception.h
+++ b/arch/arm64/include/asm/exception.h
@@ -70,6 +70,7 @@ void do_sysinstr(unsigned long esr, struct pt_regs *regs);
void do_sp_pc_abort(unsigned long addr, unsigned long esr, struct pt_regs *regs);
void bad_el0_sync(struct pt_regs *regs, int reason, unsigned long esr);
void do_cp15instr(unsigned long esr, struct pt_regs *regs);
+int do_compat_alignment_fixup(unsigned long addr, struct pt_regs *regs);
void do_el0_svc(struct pt_regs *regs);
void do_el0_svc_compat(struct pt_regs *regs);
void do_ptrauth_fault(struct pt_regs *regs, unsigned long esr);
diff --git a/arch/arm64/kernel/Makefile b/arch/arm64/kernel/Makefile
index fa7981d0d917..58b472fa34fe 100644
--- a/arch/arm64/kernel/Makefile
+++ b/arch/arm64/kernel/Makefile
@@ -40,6 +40,7 @@ $(obj)/%.stub.o: $(obj)/%.o FORCE
obj-$(CONFIG_COMPAT) += sys32.o signal32.o \
sys_compat.o
obj-$(CONFIG_COMPAT) += sigreturn32.o
+obj-$(CONFIG_COMPAT_ALIGNMENT_FIXUPS) += compat_alignment.o
obj-$(CONFIG_KUSER_HELPERS) += kuser32.o
obj-$(CONFIG_FUNCTION_TRACER) += ftrace.o entry-ftrace.o
obj-$(CONFIG_MODULES) += module.o
diff --git a/arch/arm64/kernel/compat_alignment.c b/arch/arm64/kernel/compat_alignment.c
new file mode 100644
index 000000000000..3b41557803a3
--- /dev/null
+++ b/arch/arm64/kernel/compat_alignment.c
@@ -0,0 +1,387 @@
+// SPDX-License-Identifier: GPL-2.0-only
+// based on arch/arm/mm/alignment.c
+
+#include <linux/compiler.h>
+#include <linux/errno.h>
+#include <linux/kernel.h>
+#include <linux/init.h>
+#include <linux/perf_event.h>
+#include <linux/uaccess.h>
+
+#include <asm/exception.h>
+#include <asm/ptrace.h>
+#include <asm/traps.h>
+
+/*
+ * 32-bit misaligned trap handler (c) 1998 San Mehat (CCC) -July 1998
+ *
+ * Speed optimisations and better fault handling by Russell King.
+ */
+#define CODING_BITS(i) (i & 0x0e000000)
+
+#define LDST_P_BIT(i) (i & (1 << 24)) /* Preindex */
+#define LDST_U_BIT(i) (i & (1 << 23)) /* Add offset */
+#define LDST_W_BIT(i) (i & (1 << 21)) /* Writeback */
+#define LDST_L_BIT(i) (i & (1 << 20)) /* Load */
+
+#define LDST_P_EQ_U(i) ((((i) ^ ((i) >> 1)) & (1 << 23)) == 0)
+
+#define LDSTHD_I_BIT(i) (i & (1 << 22)) /* double/half-word immed */
+
+#define RN_BITS(i) ((i >> 16) & 15) /* Rn */
+#define RD_BITS(i) ((i >> 12) & 15) /* Rd */
+#define RM_BITS(i) (i & 15) /* Rm */
+
+#define REGMASK_BITS(i) (i & 0xffff)
+
+#define BAD_INSTR 0xdeadc0de
+
+/* Thumb-2 32 bit format per ARMv7 DDI0406A A6.3, either f800h,e800h,f800h */
+#define IS_T32(hi16) \
+ (((hi16) & 0xe000) == 0xe000 && ((hi16) & 0x1800))
+
+union offset_union {
+ unsigned long un;
+ signed long sn;
+};
+
+#define TYPE_ERROR 0
+#define TYPE_FAULT 1
+#define TYPE_LDST 2
+#define TYPE_DONE 3
+
+static void
+do_alignment_finish_ldst(unsigned long addr, u32 instr, struct pt_regs *regs,
+ union offset_union offset)
+{
+ if (!LDST_U_BIT(instr))
+ offset.un = -offset.un;
+
+ if (!LDST_P_BIT(instr))
+ addr += offset.un;
+
+ if (!LDST_P_BIT(instr) || LDST_W_BIT(instr))
+ regs->regs[RN_BITS(instr)] = addr;
+}
+
+static int
+do_alignment_ldrdstrd(unsigned long addr, u32 instr, struct pt_regs *regs)
+{
+ unsigned int rd = RD_BITS(instr);
+ unsigned int rd2;
+ int load;
+
+ if ((instr & 0xfe000000) == 0xe8000000) {
+ /* ARMv7 Thumb-2 32-bit LDRD/STRD */
+ rd2 = (instr >> 8) & 0xf;
+ load = !!(LDST_L_BIT(instr));
+ } else if (((rd & 1) == 1) || (rd == 14)) {
+ return TYPE_ERROR;
+ } else {
+ load = ((instr & 0xf0) == 0xd0);
+ rd2 = rd + 1;
+ }
+
+ if (load) {
+ unsigned int val, val2;
+
+ if (get_user(val, (u32 __user *)addr) ||
+ get_user(val2, (u32 __user *)(addr + 4)))
+ return TYPE_FAULT;
+ regs->regs[rd] = val;
+ regs->regs[rd2] = val2;
+ } else {
+ if (put_user(regs->regs[rd], (u32 __user *)addr) ||
+ put_user(regs->regs[rd2], (u32 __user *)(addr + 4)))
+ return TYPE_FAULT;
+ }
+ return TYPE_LDST;
+}
+
+/*
+ * LDM/STM alignment handler.
+ *
+ *
+ * B = rn pointer before instruction, A = rn pointer after instruction
+ *         ------ increasing address ----->
+ *         |    | r0 | r1 | ... | rx |    |
+ * PU = 01        B                    A
+ * PU = 11   B                    A
+ * PU = 00   A                    B
+ * PU = 10        A                    B
+ */
+static int
+do_alignment_ldmstm(unsigned long addr, u32 instr, struct pt_regs *regs)
+{
+ unsigned int rd, rn, nr_regs, regbits;
+ unsigned long eaddr, newaddr;
+ unsigned int val;
+
+ /* count the number of registers in the mask to be transferred */
+ nr_regs = hweight16(REGMASK_BITS(instr)) * 4;
+
+ rn = RN_BITS(instr);
+ newaddr = eaddr = regs->regs[rn];
+
+ if (!LDST_U_BIT(instr))
+ nr_regs = -nr_regs;
+ newaddr += nr_regs;
+ if (!LDST_U_BIT(instr))
+ eaddr = newaddr;
+
+ if (LDST_P_EQ_U(instr)) /* U = P */
+ eaddr += 4;
+
+ for (regbits = REGMASK_BITS(instr), rd = 0; regbits;
+ regbits >>= 1, rd += 1)
+ if (regbits & 1) {
+ if (LDST_L_BIT(instr)) {
+ if (get_user(val, (u32 __user *)eaddr))
+ return TYPE_FAULT;
+ if (rd < 15)
+ regs->regs[rd] = val;
+ else
+ regs->pc = val;
+ } else {
+ /*
+ * The PC register has a bias of +8 in ARM mode
+ * and +4 in Thumb mode. This means that a read
+ * of the value of PC should account for this.
+ * Since Thumb does not permit STM instructions
+ * to refer to PC, just add 8 here.
+ */
+ val = (rd < 15) ? regs->regs[rd] : regs->pc + 8;
+ if (put_user(val, (u32 __user *)eaddr))
+ return TYPE_FAULT;
+ }
+ eaddr += 4;
+ }
+
+ if (LDST_W_BIT(instr))
+ regs->regs[rn] = newaddr;
+
+ return TYPE_DONE;
+}
+
+/*
+ * Convert Thumb multi-word load/store instruction forms to equivalent ARM
+ * instructions so we can reuse ARM userland alignment fault fixups for Thumb.
+ *
+ * This implementation was initially based on the algorithm found in
+ * gdb/sim/arm/thumbemu.c. It is basically just a code reduction of same
+ * to convert only Thumb ld/st instruction forms to equivalent ARM forms.
+ *
+ * 1. Comments below refer to ARM ARM DDI0100E Thumb Instruction sections.
+ * 2. If for some reason we're passed an non-ld/st Thumb instruction to
+ * decode, we return 0xdeadc0de. This should never happen under normal
+ * circumstances but if it does, we've got other problems to deal with
+ * elsewhere and we obviously can't fix those problems here.
+ */
+
+static unsigned long thumb2arm(u16 tinstr)
+{
+ u32 L = (tinstr & (1<<11)) >> 11;
+
+ switch ((tinstr & 0xf800) >> 11) {
+ /* 6.6.1 Format 1: */
+ case 0xc000 >> 11: /* 7.1.51 STMIA */
+ case 0xc800 >> 11: /* 7.1.25 LDMIA */
+ {
+ u32 Rn = (tinstr & (7<<8)) >> 8;
+ u32 W = ((L<<Rn) & (tinstr&255)) ? 0 : 1<<21;
+
+ return 0xe8800000 | W | (L<<20) | (Rn<<16) |
+ (tinstr&255);
+ }
+
+ /* 6.6.1 Format 2: */
+ case 0xb000 >> 11: /* 7.1.48 PUSH */
+ case 0xb800 >> 11: /* 7.1.47 POP */
+ if ((tinstr & (3 << 9)) == 0x0400) {
+ static const u32 subset[4] = {
+ 0xe92d0000, /* STMDB sp!,{registers} */
+ 0xe92d4000, /* STMDB sp!,{registers,lr} */
+ 0xe8bd0000, /* LDMIA sp!,{registers} */
+ 0xe8bd8000 /* LDMIA sp!,{registers,pc} */
+ };
+ return subset[(L<<1) | ((tinstr & (1<<8)) >> 8)] |
+ (tinstr & 255); /* register_list */
+ }
+ fallthrough; /* for illegal instruction case */
+
+ default:
+ return BAD_INSTR;
+ }
+}
+
+/*
+ * Convert Thumb-2 32 bit LDM, STM, LDRD, STRD to equivalent instruction
+ * handlable by ARM alignment handler, also find the corresponding handler,
+ * so that we can reuse ARM userland alignment fault fixups for Thumb.
+ *
+ *
+ * 1. Comments below refer to ARMv7 DDI0406A Thumb Instruction sections.
+ * 2. Register name Rt from ARMv7 is same as Rd from ARMv6 (Rd is Rt)
+ */
+static void *
+do_alignment_t32_to_handler(u32 *pinstr, struct pt_regs *regs,
+ union offset_union *poffset)
+{
+ u32 instr = *pinstr;
+ u16 tinst1 = (instr >> 16) & 0xffff;
+ u16 tinst2 = instr & 0xffff;
+
+ switch (tinst1 & 0xffe0) {
+ /* A6.3.5 Load/Store multiple */
+ case 0xe880: /* STM/STMIA/STMEA,LDM/LDMIA, PUSH/POP T2 */
+ case 0xe8a0: /* ...above writeback version */
+ case 0xe900: /* STMDB/STMFD, LDMDB/LDMEA */
+ case 0xe920: /* ...above writeback version */
+ /* no need offset decision since handler calculates it */
+ return do_alignment_ldmstm;
+
+ case 0xf840: /* POP/PUSH T3 (single register) */
+ if (RN_BITS(instr) == 13 && (tinst2 & 0x09ff) == 0x0904) {
+ u32 L = !!(LDST_L_BIT(instr));
+ const u32 subset[2] = {
+ 0xe92d0000, /* STMDB sp!,{registers} */
+ 0xe8bd0000, /* LDMIA sp!,{registers} */
+ };
+ *pinstr = subset[L] | (1<<RD_BITS(instr));
+ return do_alignment_ldmstm;
+ }
+ /* Else fall through for illegal instruction case */
+ break;
+
+ /* A6.3.6 Load/store double, STRD/LDRD(immed, lit, reg) */
+ case 0xe860:
+ case 0xe960:
+ case 0xe8e0:
+ case 0xe9e0:
+ poffset->un = (tinst2 & 0xff) << 2;
+ fallthrough;
+
+ case 0xe940:
+ case 0xe9c0:
+ return do_alignment_ldrdstrd;
+
+ /*
+ * No need to handle load/store instructions up to word size
+ * since ARMv6 and later CPUs can perform unaligned accesses.
+ */
+ default:
+ break;
+ }
+ return NULL;
+}
+
+static int alignment_get_arm(struct pt_regs *regs, __le32 __user *ip, u32 *inst)
+{
+ __le32 instr = 0;
+ int fault;
+
+ fault = get_user(instr, ip);
+ if (fault)
+ return fault;
+
+ *inst = __le32_to_cpu(instr);
+ return 0;
+}
+
+static int alignment_get_thumb(struct pt_regs *regs, __le16 __user *ip, u16 *inst)
+{
+ __le16 instr = 0;
+ int fault;
+
+ fault = get_user(instr, ip);
+ if (fault)
+ return fault;
+
+ *inst = __le16_to_cpu(instr);
+ return 0;
+}
+
+int do_compat_alignment_fixup(unsigned long addr, struct pt_regs *regs)
+{
+ union offset_union offset;
+ unsigned long instrptr;
+ int (*handler)(unsigned long addr, u32 instr, struct pt_regs *regs);
+ unsigned int type;
+ u32 instr = 0;
+ u16 tinstr = 0;
+ int isize = 4;
+ int thumb2_32b = 0;
+ int fault;
+
+ instrptr = instruction_pointer(regs);
+
+ if (compat_thumb_mode(regs)) {
+ __le16 __user *ptr = (__le16 __user *)(instrptr & ~1);
+
+ fault = alignment_get_thumb(regs, ptr, &tinstr);
+ if (!fault) {
+ if (IS_T32(tinstr)) {
+ /* Thumb-2 32-bit */
+ u16 tinst2;
+ fault = alignment_get_thumb(regs, ptr + 1, &tinst2);
+ instr = ((u32)tinstr << 16) | tinst2;
+ thumb2_32b = 1;
+ } else {
+ isize = 2;
+ instr = thumb2arm(tinstr);
+ }
+ }
+ } else {
+ fault = alignment_get_arm(regs, (__le32 __user *)instrptr, &instr);
+ }
+
+ if (fault)
+ return 1;
+
+ switch (CODING_BITS(instr)) {
+ case 0x00000000: /* 3.13.4 load/store instruction extensions */
+ if (LDSTHD_I_BIT(instr))
+ offset.un = (instr & 0xf00) >> 4 | (instr & 15);
+ else
+ offset.un = regs->regs[RM_BITS(instr)];
+
+ if ((instr & 0x001000f0) == 0x000000d0 || /* LDRD */
+ (instr & 0x001000f0) == 0x000000f0) /* STRD */
+ handler = do_alignment_ldrdstrd;
+ else
+ return 1;
+ break;
+
+ case 0x08000000: /* ldm or stm, or thumb-2 32bit instruction */
+ if (thumb2_32b) {
+ offset.un = 0;
+ handler = do_alignment_t32_to_handler(&instr, regs, &offset);
+ } else {
+ offset.un = 0;
+ handler = do_alignment_ldmstm;
+ }
+ break;
+
+ default:
+ return 1;
+ }
+
+ type = handler(addr, instr, regs);
+
+ if (type == TYPE_ERROR || type == TYPE_FAULT)
+ return 1;
+
+ if (type == TYPE_LDST)
+ do_alignment_finish_ldst(addr, instr, regs, offset);
+
+ perf_sw_event(PERF_COUNT_SW_ALIGNMENT_FAULTS, 1, regs, regs->pc);
+ arm64_skip_faulting_instruction(regs, isize);
+
+ return 0;
+}
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index c5e11768e5c1..b25119b4beca 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -687,6 +687,9 @@ static int __kprobes do_translation_fault(unsigned long far,
static int do_alignment_fault(unsigned long far, unsigned long esr,
struct pt_regs *regs)
{
+ if (IS_ENABLED(CONFIG_COMPAT_ALIGNMENT_FIXUPS) &&
+ compat_user_mode(regs))
+ return do_compat_alignment_fixup(far, regs);
do_bad_area(far, esr, regs);
return 0;
}
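For anyone decoding the PU table in the LDM/STM comment of the patch, the address arithmetic can be tried out in user space. This is a sketch under assumptions: `struct ldm_span`, `ldm_stm_span()` and `popcount16()` are hypothetical names, and only the address calculation of `do_alignment_ldmstm()` is reproduced, not the register transfers.

```c
#include <stdint.h>

#define P_BIT(i)	((i) & (1u << 24))	/* pre-indexed */
#define U_BIT(i)	((i) & (1u << 23))	/* increment */
#define REGMASK(i)	((i) & 0xffff)

/* Hypothetical result: where the transfer starts and where Rn ends up. */
struct ldm_span {
	uint32_t first;		/* address of the lowest-numbered register */
	uint32_t new_rn;	/* value written to Rn when W is set */
};

/* Portable 16-bit population count, standing in for hweight16(). */
static unsigned int popcount16(uint32_t v)
{
	unsigned int n = 0;

	for (v &= 0xffff; v; v >>= 1)
		n += v & 1;
	return n;
}

/* Compute the transfer span of an LDM/STM from the base register value. */
struct ldm_span ldm_stm_span(uint32_t rn_val, uint32_t instr)
{
	uint32_t bytes = popcount16(REGMASK(instr)) * 4;
	int p = P_BIT(instr) != 0;
	int u = U_BIT(instr) != 0;
	struct ldm_span s;

	if (u) {			/* IA, IB: walk upwards from Rn */
		s.first = rn_val;
		s.new_rn = rn_val + bytes;
	} else {			/* DA, DB: block ends at or below Rn */
		s.new_rn = rn_val - bytes;
		s.first = s.new_rn;
	}
	if (p == u)			/* IB and DA skip one word first */
		s.first += 4;
	return s;
}
```

Registers are always transferred in ascending order from `first`, regardless of whether the instruction increments or decrements, which is why a single upward loop in the handler covers all four variants.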
Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/>
Wookey
2022-07-14 02:00:01 UTC
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
Note to cc'ees: if this is something you would like to see merged,
please indicate so. This stuff is unlikely to get in if there are no
users.
Decent 32-bit arm hardware is thin on the ground these days. Debian
still has some but it's getting old and flaky. Being able to build
reliably on 64-bit hardware is important and useful. Unaligned
accesses are much less of a problem than they used to be, but they can
still happen, so having these fixups available is definitely a good
thing.

Debian runs its 32-bit buildds with alignment fixups turned on. It
looks like the boxes still hit about 1 per day.

We also do 32 bit builds on 64-bit kernels (in 32-bit userspaces) and
it mostly works. We do have packages that fail on 64-bit kernels and
have to be built on real 32-bit hardware, but I don't know how much of
that would be fixed by this patch. Some, presumably.

So yes, cheers for this. It is helpful in the real world (or at least
it should be).

Wookey
--
Principal hats: Debian, Wookware, ARM
http://wookware.org/
LinAdmin
2022-07-15 08:10:01 UTC
Pi 4 has much more throughput in 32-bit modes but the so
called experts of Debian decided to abandon it :-(
Post by Wookey
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
Note to cc'ees: if this is something you would like to see merged,
please indicate so. This stuff is unlikely to get in if there are no
users.
Decent 32-bit arm hardware is thin on the ground these days. Debian
still has some but it's getting old and flaky. Being able to build
reliably on 64-bit hardware is important and useful. Unaligned
accesses are much less of a problem than they used to be, but they can
still happen, so having these fixups available is definitely a good
thing.
Debian runs its 32-bit buildds with alignment fixups turned on. It
looks like the boxes still hit about 1 per day.
We also do 32 bit builds on 64-bit kernels (in 32-bit userspaces) and
it mostly works. We do have packages that fail on 64-bit kernels and
have to be built on real 32-bit hardware, but I don't know how much of
that would be fixed by this patch. Some, presumably.
So yes, cheers for this. It is helpful in the real world (or at least
it should be).
Wookey
gene heskett
2022-07-15 09:40:01 UTC
Post by LinAdmin
Pi 4 has much more throughput in 32-bit modes but the so
called experts of Debian decided to abandon it :-(
Post by Wookey
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
I agree. So far, raspios is still available in armhf flavor, and for
running heavy machinery with just a few microseconds to respond to an
IRQ, armhf builds are a given. LinuxCNC is such an application.

I built this kernel for an rpi4b quite a while ago, but none more recent
have been as usable. uname -a:

Linux rpi4.coyote.den 4.19.71-rt24-v7l+ #1 SMP PREEMPT RT Thu Feb 6
07:09:18 EST 2020 armv7l GNU/Linux

latency-test shows about 12 u-secs as long as I stay away from firefox.

That's good enough to run a cnc converted 80 yo 11x56 Sheldon lathe,
making it do dance steps that were not in its vocabulary 80 years ago.

Yet that raspios buster install is the full-blown graphical install I
also use as a development platform, with big SSDs plugged into its USB3
ports for workspace.

Is it stable? Absolutely, no splats since the above date unless caused by
me, uptimes are in the many months category as I try newer stuff now
and then and find it wanting.

The exception is right now, as libc6 was replaced and I rebooted it
2 days ago.

It would be running bullseye, but the last time I switched boot cards to
try it, the Python was too new to build LinuxCNC with. The version built
on buster still worked, and so did the above kernel.

What I'd like to know, is why is armhf such a dirty word to debian?

Take care and stay well everybody.
Post by LinAdmin
Post by Wookey
Note to cc'ees: if this is something you would like to see merged,
please indicate so. This stuff is unlikely to get in if there are no
users.
Decent 32-bit arm hardware is thin on the ground these days. Debian
still has some but it's getting old and flaky. Being able to build
reliably on 64-bit hardware is important and useful. Unaligned
accesses are much less of a problem than they used to be, but they can
still happen, so having these fixups available is definitely a good
thing.
Debian runs its 32-bit buildds with alignment fixups turned on. It
looks like the boxes still hit about 1 per day.
We also do 32 bit builds on 64-bit kernels (in 32-bit userspaces) and
it mostly works. We do have packages that fail on 64-bit kernels and
have to be built on real 32-bit hardware, but I don't know how much of
that would be fixed by this patch. Some, presumably.
So yes, cheers for this. It is helpful in the real world (or at least
it should be).
Wookey
.
Cheers, Gene Heskett.
Arnd Bergmann
2022-07-15 10:20:01 UTC
Post by gene heskett
Post by LinAdmin
Pi 4 has much more throughput in 32-bit modes but the so
called experts of Debian decided to abandon it :-(
Please stop the name calling, and the spreading of misinformation on this list.
Post by gene heskett
I built this kernel for an rpi4b quite a while ago, but none more recent
Linux rpi4.coyote.den 4.19.71-rt24-v7l+ #1 SMP PREEMPT RT Thu Feb 6
07:09:18 EST 2020 armv7l GNU/Linux
latency-test shows about 12 u-secs as long as I stay away from firefox.
That's good enough to run a cnc converted 80 yo 11x56 Sheldon lathe,
making it do dance steps that were not in its vocabulary 80 years ago.
Yet that raspios buster install is the full-blown graphical install I
also use as a development platform, with big SSDs plugged into its USB3
ports for workspace.
Is it stable? Absolutely, no splats since the above date unless caused by
me, uptimes are in the many months category as I try newer stuff now
and then and find it wanting.
The exception is right now, as libc6 was replaced and I rebooted it
2 days ago.
It would be running bullseye but the last time I switched boot cards to try
it, the python was too new to build LinuxCNC with but the built on buster
version still worked and so did the above kernel.
What I'd like to know, is why is armhf such a dirty word to debian?
Ard's kernel patch is for the armhf target, and to keep it working
on modern hardware that runs a 64-bit kernel, as there is a specific
compatibility problem (specifically applications that trigger
undefined behavior in C with misaligned pointers) without this patch.
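A minimal sketch of the kind of code in question, assuming a typical serialisation routine (the function names are hypothetical). Dereferencing a cast pointer is undefined behaviour when the address is not suitably aligned, and a compiler targeting ARMv7 may then pick a multi-word load such as LDRD, which faults on a misaligned address. The `memcpy()` form is well defined for any alignment.

```c
#include <stdint.h>
#include <string.h>

/*
 * Undefined behaviour if 'p' is not aligned for uint64_t: the compiler
 * may assume alignment and emit a single multi-word load.
 */
uint64_t read_u64_cast(const void *p)
{
	return *(const uint64_t *)p;
}

/*
 * Well defined for any alignment: the compiler has to pick loads that
 * are valid for an arbitrary byte address.
 */
uint64_t read_u64_memcpy(const void *p)
{
	uint64_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}
```

Software written like `read_u64_cast()` is what the fixups keep running; converting such code to the `memcpy()` form avoids the trap altogether.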

If you see /other/ problems with the 64-bit kernel (using the
same user space, kernel source and kernel config as the 32-bit
kernel), please report those to the respective upstream kernel
maintainers so we can fix those as well.

Arnd
Paul Wise
2022-07-15 10:50:02 UTC
Post by Arnd Bergmann
If you see /other/ problems with the 64-bit kernel (using the
same user space, kernel source and kernel config as the 32-bit
kernel), please report those to the respective upstream kernel
maintainers so we can fix those as well.
Gene's complaint is unrelated to this thread: it is that Debian
refuses to support running the 32-bit ARMMP kernel on 64-bit hardware,
specifically on the Raspberry Pi 4B. Debian gave no justification in
the bug reports. It sounds like only a few build config options need
to be enabled, but Debian refuses to do that:

https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=971059#12
https://bugs.debian.org/981586
--
bye,
pabs

https://wiki.debian.org/PaulWise
Arnd Bergmann
2022-07-15 11:10:01 UTC
Post by Paul Wise
Post by Arnd Bergmann
If you see /other/ problems with the 64-bit kernel (using the
same user space, kernel source and kernel config as the 32-bit
kernel), please report those to the respective upstream kernel
maintainers so we can fix those as well.
Gene's complaint is unrelated to this thread, but it is that Debian
refuses to support running the 32-bit ARMMP kernel on 64-bit hardware,
specifically on the RaspberryPi 4b. There wasn't any justification from
Debian given in the bug reports, but it sounds like only build config
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=971059#12
https://bugs.debian.org/981586
I see, and I agree that this is frustrating when trying to pinpoint
a bug in the 64-bit kernel. On the other hand this is of course
a sensible decision, since users clearly should not actually
run the 32-bit kernel on this hardware other than for testing
purposes.

I suppose this is made worse by the lack of a 64-bit kernel
option in the armhf installer, which means one has to go through
a couple of extra steps to install the arm64 kernel and get a
booting system.

Arnd
Wookey
2022-07-15 16:20:02 UTC
Post by Paul Wise
Post by Arnd Bergmann
If you see /other/ problems with the 64-bit kernel (using the
same user space, kernel source and kernel config as the 32-bit
kernel), please report those to the respective upstream kernel
maintainers so we can fix those as well.
Gene's complaint is unrelated to this thread, but it is that Debian
refuses to support running the 32-bit ARMMP kernel on 64-bit hardware,
specifically on the RaspberryPi 4b. There wasn't any justification from
Debian given in the bug reports, but it sounds like only build config
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=971059#12
https://bugs.debian.org/981586
Ah thanks Paul. I was wondering why we were being accused of 'Debian
abandoning armhf' when it was news to me, and I'm just writing the
'ARM ports status' talk for Debconf next week.

Clearly one normally does not run foreign-arch kernels on hardware so
we don't have to support it, and Ben is right to say 'this is not a
bug'.

On the other hand, if the armhf kernel does work on RPi4 with a few
config options, and there is an actual use case, then the question is
what is the downside of enabling the config options?

Does this only work for the RPi4, or does it enable/prevent 32-bit kernels on other 64-bit machines?

Do i386 kernels work on amd64 machines?

Sounds like something that might be worth discussion at debconf next week. I'll mention it in the talk.

Wookey
--
Principal hats: Debian, Wookware, ARM
http://wookware.org/
gene heskett
2022-07-15 17:50:01 UTC
Post by Wookey
Post by Paul Wise
Post by Arnd Bergmann
If you see /other/ problems with the 64-bit kernel (using the
same user space, kernel source and kernel config as the 32-bit
kernel), please report those to the respective upstream kernel
maintainers so we can fix those as well.
Gene's complaint is unrelated to this thread, but it is that Debian
refuses to support running the 32-bit ARMMP kernel on 64-bit hardware,
specifically on the RaspberryPi 4b. There wasn't any justification from
Debian given in the bug reports, but it sounds like only build config
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=971059#12
https://bugs.debian.org/981586
Ah thanks Paul. I was wondering why we were being accused of 'Debian
abandoning armhf' when it was news to me, and I'm just writing the
'ARM ports status' talk for Debconf next week.
Clearly one normally does not run foreign-arch kernels on hardware so
we don't have to support it, and Ben is right to say 'this is not a
bug'.
On the other hand, if the armhf kernel does work on RPi4 with a few
config options, and there is an actual use case, then the question is
what is the downside of enabling the config options?
It, LinuxCNC, does indeed run on an armhf kernel built right on the pi
and has been since Jessie on a rpi3b.
Post by Wookey
Does this only work for the RPi4, or does it enable/prevent 32-bit kernels on other 64-bit machines?
No. It runs with the same armhf kernel on an rpi3b, but the 3b is
dragging its tongue on the floor where the 4b has some leftover zip.

I'm driving an 80 yo Sheldon lathe with it, and a 3 axis Mazak mill is
under construction/conversion by another person out on the left coast
as we're discussing this. But the lack of armhf in Debian is why we're
both running Raspbian; I built his boot cards.

Because our latency-test results are better on armhf than on arm64, we
use armhf for its performance.
Post by Wookey
Do i386 kernels work on amd64 machines?
Different architecture. No relevance here.
Post by Wookey
Sounds like something that might be worth discussion at debconf next week. I'll mention it in the talk.
Wookey
Cheers, Gene Heskett.
Wookey
2022-07-15 18:30:01 UTC
Post by gene heskett
Post by Wookey
Clearly one normally does not run foreign-arch kernels on hardware so
we don't have to support it, and Ben is right to say 'this is not a
bug'.
On the other hand, if the armhf kernel does work on RPi4 with a few
config options, and there is an actual use case, then the question is
what is the downside of enabling the config options?
It, LinuxCNC, does indeed run on an armhf kernel built right on the pi
and has been since Jessie on a rpi3b.
And it is now in debian: https://tracker.debian.org/pkg/linuxcnc
Post by gene heskett
Post by Wookey
Does this only work for the RPi4, or does it enable/prevent 32-bit kernels on other 64-bit machines?
No. It runs with the same armhf kernel on an rpi3b, but the 3b is
dragging its tongue on the floor where the 4b has some leftover zip.
Sorry. I meant arm64 hardware from vendors other than Broadcom (not other RPi flavours). I.e. does enabling
CONFIG_PCIE_BRCMSTB=y
CONFIG_RESET_RASPBERRYPI=y
CONFIG_RESET_BRCMSTB_RESCAL=y

cause the kernel any issues on other platforms?
Post by gene heskett
Because our latency-test results are better on armhf than on arm64, we use
armhf for its performance.
OK. How much better? What sort of performance difference are we talking about?

And how many other users care about this? Debian is a general-purpose
OS and has to choose options that are generally useful or at least not
generally harmful. One user with some interesting hardware can clearly
install a new kernel built with specific options.

The question from Debian's POV is how many other people want to use
non-native arm kernels (and for what?). How many platforms is it
relevant to? And if there is a downside, how many does that affect,
and how, and how much?

You say the kernel is 'a few kB bigger'. How many kB? Kernel size has
been critical on some armhf models in the past, so even if that's the
only cost, it's not necessarily negligible. We may have dropped all
the platforms this was critical for by now, in which case perhaps
'a few kB' doesn't matter.
Post by gene heskett
Post by Wookey
Do i386 kernels work on amd64 machines?
Different architecture. No relevance here.
It's not entirely irrelevant. If it works on x86 then it's not
entirely unreasonable for people to expect it to work on arm. We do
strive for parity to the degree that it is possible and reasonable.
If it doesn't work on x86 then that justification can't be used, and
indeed strengthens the argument that 'just about nobody runs
non-native kernels - if you want to, you are on your own'.

Wookey
--
Principal hats: Debian, Wookware, ARM
http://wookware.org/
Vagrant Cascadian
2022-07-15 19:00:01 UTC
Post by Wookey
The question from Debian's POV is how many other people want to use
non-native arm kernels (and for what?). How many platforms is it
relevant to? And if there is a downside, how many does that affect,
and how/how much?
For Reproducible Builds testing of armhf packages I run several machines
(some physical, some virtual) with arm64 kernel and armhf userland, and
it basically works.

It is a little tricky to set up multi-arch to be able to get the
linux-image-arm64 kernel from arm64 without pulling in all the
recommends on various :arm64 packages, but once it is set up, it works
fine...

I have no idea how difficult it would be to add multi-arch support to
debian-installer, but it is not too hard to build an image using
"mmdebstrap" that supports a linux-image-arm64:arm64 kernel on an armhf
userland.
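
A minimal sketch of the multi-arch setup described above, assuming a
Debian armhf system with apt and root access. The run helper and the
DRY_RUN variable are made up for this example and are not part of any
Debian tool; by default the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch: add arm64 as a foreign architecture on an armhf system and
# install the arm64 kernel without its recommended :arm64 packages.
# DRY_RUN and run() are hypothetical helpers for this example only.

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"          # only show what would be executed
    else
        "$@"
    fi
}

install_arm64_kernel() {
    run dpkg --add-architecture arm64
    run apt-get update
    run apt-get install --no-install-recommends linux-image-arm64:arm64
}

install_arm64_kernel
```

Running it with DRY_RUN=0 as root would execute the commands for real.
Skipping the recommends is what keeps the extra :arm64 packages out; a
given system may still need other packages that this sketch does not
cover.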


live well,
vagrant
Paul Wise
2022-07-16 02:00:01 UTC
Permalink
Post by gene heskett
Because our latency-test results are better on armhf than on arm64,
we use armhf for its performance.
Are these results for armhf kernel with armhf userland?

Are the results for arm64 kernel with armhf userland similar?

How much worse are the results for arm64 kernel and userland?
--
bye,
pabs

https://wiki.debian.org/PaulWise
gene heskett
2022-07-16 03:10:03 UTC
Permalink
Post by Paul Wise
Post by gene heskett
Because our latency-test results are better on armhf than on arm64,
we use armhf for its performance.
Are these results for armhf kernel with armhf userland?
The whole install of raspios is armhf. So I guess it's a yes.
Post by Paul Wise
Are the results for arm64 kernel with armhf userland similar?
I have not tried to build an aarch64 from the src I have.
Post by Paul Wise
How much worse are the results for arm64 kernel and userland?
I have no exact numbers, but it's roughly 4x longer when I can get it to
run, since it checks for a realtime preempt kernel and exits gracefully
if none is found. So it has not been run on a 64-bit Debian image since
stretch.

It, latency-test, has recently been worked on but I've not tested
it on a Debian image of either flavor since. It is now in testing so
those of you w/o 5 years of history to lose could try it for the price
of a 64G u-sd card.

I want to thank you /all/ for a civil discussion. It has not been all that
welcoming in the past.

Take care and stay well everybody.

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/>
Paul Wise
2022-07-16 10:00:02 UTC
Permalink
Post by gene heskett
The whole install of raspios is armhf. So I guess its yes.
I seem to remember them switching to arm64 recently?
Post by gene heskett
I have not tried to build an aarch64 from the src I have.
I think it would be helpful if someone with an RPi4 could do this.
Post by gene heskett
It, latency-test, has recently been worked on but I've not tested
it on a Debian image of either flavor since. It is now in testing so
those of you w/o 5 years of history to lose could try it for the price of a 64G u-sd card.
For those of you who are able to try this, sounds like you just install
linuxcnc-uspace and then run latency-test. Also install/try latencytop,
although Linux CONFIG_LATENCYTOP is not enabled in Debian probably.

$ apt-file search latency-test
linuxcnc-uspace: /usr/bin/latency-test
linuxcnc-uspace: /usr/share/doc/linuxcnc/examples/sample-configs/apps/latency/latency-test.demo
linuxcnc-uspace: /usr/share/man/man1/latency-test.1.gz
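
Since the numbers only mean something on a preempt-rt kernel, a quick
sanity check before running the test might look like the sketch below.
It only inspects the uname -v string (like the "SMP PREEMPT RT" shown
earlier in the thread). The is_rt_kernel helper is hypothetical, and
this is not necessarily how latency-test itself does its check.

```shell
#!/bin/sh
# Sketch: decide from a `uname -v` string whether the kernel was built with
# the preempt-rt patches. is_rt_kernel is a hypothetical helper; it only
# inspects the version string and does not measure latency.

is_rt_kernel() {
    case "$1" in
        *"PREEMPT RT"*|*"PREEMPT_RT"*) return 0 ;;
        *) return 1 ;;
    esac
}

if is_rt_kernel "$(uname -v)"; then
    echo "realtime kernel detected"
else
    echo "no realtime kernel detected; latency-test results will not be comparable"
fi

# Then, on a system where the package is available:
#   apt-get install linuxcnc-uspace
#   latency-test
```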
--
bye,
pabs

https://wiki.debian.org/PaulWise
gene heskett
2022-07-16 12:30:01 UTC
Permalink
Post by Paul Wise
Post by gene heskett
The whole install of raspios is armhf. So I guess its yes.
I seem to remember them switching to arm64 recently?
They may have. I've had a total loss of history here, with two 2T Seagate
drives dying within a couple of weeks of each other. So let me see if I have
the bullseye image for arm I've tried. Yes, and it's:
2021-10-30-raspios-bullseye-armhf-full.img
which ran fine, even with my now 2-year-old kernel installed, but the python
is too new for linuxcnc. With a second user wanting to dup what I did,
I've sent him two cards with the raspios buster for armhf, with my kernel
installed by my method. Linuxcnc has been advised of the python showstopper,
and that may have been fixed by now, but if so, it has not caught my attention
in the emails I get from a 4x a day git pull keeping the pi up to date.

Since that master is now in testing for bookworm, I have to assume it
has been fixed.

I also have 11-2 of yours, in netinstall flavor and labeled as armhf. As
I recall it had some sort of a showstopper, and I knew raspios worked, I
don't recall that I investigated yours further. Is there a later
netinstall for armhf or arm64 available now? u-sd cards I have.
Post by Paul Wise
Post by gene heskett
I have not tried to build an aarch64 from the src I have.
I think, since the 240G I'm using for workspace is over 50% full, that I
had better replace the 120G I've used for a backup with a 1T and copy the
240 to it before I do anything rash. That can be done, but I'm in the
middle of rebuilding a Prusa MK3S with a better print head and that has
priority at the moment. I am using it to make the nuts for a woodworker's
workbench vises, carving the screw from a 2x2" stick of hard maple about
20" long. See version #1 on my web page in the sig. Reshaping the threads
for a lot better fit, version #5 is waiting on a working printer to make
more nuts.
Post by Paul Wise
Post by gene heskett
I think it would be helpful if someone with an RPi4 could do this.
That's probably me but its also likely a week on down the log.
Post by Paul Wise
Post by gene heskett
It, latency-test, has recently been worked on but I've not tested
it on a Debian image of either flavor since. It is now in testing so
those of you w/o 5 years of history to lose could try it for the price of a 64G u-sd card.
For those of you who are able to try this, sounds like you just install
linuxcnc-uspace and then run latency-test. Also install/try latencytop,
although Linux CONFIG_LATENCYTOP is not enabled in Debian probably.
$ apt-file search latency-test
linuxcnc-uspace: /usr/bin/latency-test
linuxcnc-uspace: /usr/share/doc/linuxcnc/examples/sample-configs/apps/latency/latency-test.demo
linuxcnc-uspace: /usr/share/man/man1/latency-test.1.gz
latencytop I've heard of, but haven't found, so no comparison comment.


Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/>
Arnd Bergmann
2022-07-16 12:10:01 UTC
Permalink
Post by gene heskett
Post by Paul Wise
Post by gene heskett
Because our latency-test results are better on armhf than on arm64,
we use armhf for its performance.
Are these results for armhf kernel with armhf userland?
The whole install of raspios is armhf. So I guess its yes.
Post by Paul Wise
Are the results for arm64 kernel with armhf userland similar?
I have not tried to build an aarch64 from the src I have.
Post by Paul Wise
How much worse are the results for arm64 kernel and userland?
No, not exact, but its roughly 4x longer when I can get it to run,
as it did check for a realtime preempt kernel and did a graceful
exit if not found. So its not been run on 64 bit Debian image since
stretch.
There are unfortunately a number of variables here that make things
really hard to compare, any of these can have an effect that dominates
your results:

- 4.19 is four years old, and both the mainline kernel and the
preempt-rt patches have changed a lot in the meantime. It's
possible that a current preempt-rt has regressed compared to
the version you are running. If so, we can work on fixing the
regression for future kernels, but there won't be much interest
in working on the old kernel

- Raspberry Pi OS (and Raspbian before that) has a number of
platform specific kernel patches that are neither in mainline
Linux nor in the Debian kernel packages. It is possible that they
have already identified and fixed a source of latency in their
kernels but not managed to upstream that fix for a number of
reasons.

- A lot of kernel configuration options can have a huge impact
on latency, it's not just preempt-rt that can be turned on or off,
but any device driver that disables preemption for too long can
increase the maximum latency of the system.

- The raspbian user space should have very little effect on
latency but it's worth pointing out that you may see different
performance between armv6 (raspbian) and armv7 (debian
armhf), between vfpv3-d16 (raspbian and debian armhf) and
neon-d32 (fedora and others), and between a32 (raspbian),
t32 (debian armhf) and a64 (debian arm64), instruction sets
running the same code. In most applications the effect is very
small, and it's not always the same one that's fast either.
Post by gene heskett
It, latency-test, has recently been worked on but I've not tested
it on a Debian image of either flavor since.
Do you have your latency-test output available for reference
somewhere?

To establish a baseline, it would be good if someone could run
the same test using debian armhf userland on similar hardware
with this kernel:
https://packages.debian.org/bookworm/linux-image-rt-arm64

If that can reproduce the bad numbers you observed, the next
step would be to try a corresponding 32-bit kernel and see if
that is better, but that requires building a custom package.
There is a linux-image-rt-armmp package in bookworm, but to
get PCI and USB3 working on Raspberry Pi 4, one needs to
enable both the PCI driver and CONFIG_ARM_LPAE, possibly
more.
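
A small sketch for checking whether a given kernel config has those
options turned on. The check_kconfig helper is hypothetical, and the
/boot/config-* path is the usual Debian location, which is an assumption
for other distributions. The option list covers only the two symbols
named in this thread; as said above, more may be needed.

```shell
#!/bin/sh
# Sketch: report whether given options are enabled (=y or =m) in a kernel
# config file. check_kconfig is a hypothetical helper for this example.

check_kconfig() {
    config_file=$1
    shift
    missing=0
    for opt in "$@"; do
        if grep -Eq "^${opt}=(y|m)\$" "$config_file"; then
            echo "$opt: enabled"
        else
            echo "$opt: not enabled"
            missing=1
        fi
    done
    return "$missing"
}

# Typical Debian location of the running kernel's config (an assumption;
# adjust for other distributions).
config="/boot/config-$(uname -r)"
if [ -r "$config" ]; then
    check_kconfig "$config" CONFIG_PCIE_BRCMSTB CONFIG_ARM_LPAE || true
fi
```

The function returns non-zero if any option is missing, so it can also
be used as a gate in a build script.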
Post by gene heskett
It is now in testing so those of you w/o 5 years of history to lose
could try it for the price of a 64G u-sd card.
For some reason, linuxcnc is still missing for armhf. I managed
to build the source package, which had a minor issue finding the
libboost_python310 dependency, but it worked after I added
that. I don't have the right system to test on myself though.

[Side note: I hope you are not storing any important data on an
SD card. Even with "industrial grade" ones, I would recommend
doing regular backups to more permanent storage, and the
usual consumer cards are not designed to handle running a
general-purpose OS at all and will cause data corruption over
time]

Arnd
gene heskett
2022-07-16 12:50:01 UTC
Permalink
Post by Arnd Bergmann
Post by gene heskett
Post by Paul Wise
Post by gene heskett
Because our latency-test results are better on armhf than on arm64,
we use armhf for its performance.
Are these results for armhf kernel with armhf userland?
The whole install of raspios is armhf. So I guess its yes.
Post by Paul Wise
Are the results for arm64 kernel with armhf userland similar?
I have not tried to build an aarch64 from the src I have.
Post by Paul Wise
How much worse are the results for arm64 kernel and userland?
No, not exact, but its roughly 4x longer when I can get it to run,
as it did check for a realtime preempt kernel and did a graceful
exit if not found. So its not been run on 64 bit Debian image since
stretch.
There are unfortunately a number of variables here that make things
really hard to compare, any of these can have an effect that dominates
your results:
- 4.19 is four years old, and both the mainline kernel and the
preempt-rt patches have changed a lot in the meantime. It's
possible that a current preempt-rt has regressed compared to
the version you are running. If so, we can work on fixing the
regression for future kernels, but there won't be much interest
in working on the old kernel
- Raspberry Pi OS (and Raspbian before that) has a number of
platform specific kernel patches that are neither in mainline
Linux nor in the Debian kernel packages. It is possible that they
have already identified and fixed a source of latency in their
kernels but not managed to upstream that fix for a number of
reasons.
Their kernels are uniformly horrible, with latencies ranging to above a
millisecond.
Linuxcnc simply refuses to run on those kernels.
Post by Arnd Bergmann
- A lot of kernel configuration options can have a huge impact
on latency, it's not just preempt-rt that can be turned on or off,
but any device driver that disables preemption for too long can
increase the maximum latency of the system.
- The raspbian user space should have very little effect on
latency but it's worth pointing out that you may see different
performance between armv6 (raspbian)
I wish you would admit that the raspios I am running IS armhf (kernel7l)
I've no clue where you got the impression it was v6. It is not.
Post by Arnd Bergmann
and armv7 (debian
armhf), between vfpv3-d16 (raspbian and debian armhf) and
neon-d32 (fedora and others), and between a32 (raspbian),
t32 (debian armhf) and a64 (debian arm64), instruction sets
running the same code. In most applications the effect is very
small, and it's not always the same one that's fast either.
Post by gene heskett
It, latency-test, has recently been worked on but I've not tested
it on a Debian image of either flavor since.
Do you have your latency-test output available for reference
somewhere?
To establish a baseline, it would be good if someone could run
the same test using debian armhf userland on similar hardware
with this kernel:
https://packages.debian.org/bookworm/linux-image-rt-arm64
If that can reproduce the bad numbers you observed, the next
step would be to try a corresponding 32-bit kernel and see if
that is better, but that requires building a custom package.
There is a linux-image-rt-armmp package in bookworm, but to
get PCI and USB3 working on Raspberry Pi 4, one needs to
enable both the PCI driver and CONFIG_ARM_LPAE, possibly
more.
Post by gene heskett
It is now in testing so those of you w/o 5 years of history to lose
could try it for the price of a 64G u-sd card.
For some reason, linuxcnc is still missing for armhf. I managed
to build the source package, which had a minor issue finding the
libboost_python310 dependency, but it worked after I added
that. I don't have the right system to test on myself though.
[Side note: I hope you are not storing any important data on an
SD card. Even with "industrial grade" ones, I would recommend
doing regular backups to more permanent storage, and the
usual consumer cards are not designed to handle running a
general-purpose OS at all and will cause data corruption over
time]
The card it's running on right now is over 2 years old, with zero problems,
and has had all updates, including a daily update of linuxcnc from the
buildbot, or if the buildbot is down, my own scripts also building
installable debs. The secret is to use a big enough card that it has enough
room to do its maintenance. The 64G card has around 15G on it.

Take care and stay well everybody.
Post by Arnd Bergmann
Arnd
.
Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/>
Arnd Bergmann
2022-07-16 14:40:01 UTC
Permalink
Post by gene heskett
Post by Arnd Bergmann
- 4.19 is four years old, and both the mainline kernel and the
preempt-rt patches have changed a lot in the meantime. It's
possible that a current preempt-rt has regressed compared to
the version you are running. If so, we can work on fixing the
regression for future kernels, but there won't be much interest
in working on the old kernel
- Raspberry Pi OS (and Raspbian before that) has a number of
platform specific kernel patches that are neither in mainline
Linux nor in the Debian kernel packages. It is possible that they
have already identified and fixed a source of latency in their
kernels but not managed to upstream that fix for a number of
reasons.
Their kernels are uniformly horrible with latency's ranging to above a
millisecond.
Linuxcnc simply refuses to run on those kernels.
I think this is simply because raspbian does not ship any preempt-rt
kernel themselves, so clearly their kernel binaries won't be low-latency.

You did not say where you got the kernel that you are running
successfully, so as far as I could tell this might be a combination
of the raspbian patches and the preempt-rt patches.
Post by gene heskett
Post by Arnd Bergmann
- The raspbian user space should have very little effect on
latency but it's worth pointing out that you may see different
performance between armv6 (raspbian)
I wish you would admit that the raspios I am running IS armhf (kernel7l)
I've no clue where you got the impression it was v6. It is not.
This paragraph was about the user space, not the kernel.

The entire reason for Raspbian's existence is that it runs on armv6
hardware like the Raspberry Pi 1, which Debian armhf does not
run on. Building for armv6 means they can run the same user space
on all hardware generations from v6 to v8, and they advertise this
on their website:
https://www.raspberrypi.com/software/operating-systems/
ship at least two separate 32-bit kernels, since an LPAE-enabled
kernel is needed to access PCI and high memory but is incompatible
with Armv6 hardware.

Arnd
gene heskett
2022-07-16 16:10:01 UTC
Permalink
Post by Arnd Bergmann
Post by gene heskett
Post by Arnd Bergmann
- 4.19 is four years old, and both the mainline kernel and the
preempt-rt patches have changed a lot in the meantime. It's
possible that a current preempt-rt has regressed compared to
the version you are running. If so, we can work on fixing the
regression for future kernels, but there won't be much interest
in working on the old kernel
- Raspberry Pi OS (and Raspbian before that) has a number of
platform specific kernel patches that are neither in mainline
Linux nor in the Debian kernel packages. It is possible that they
have already identified and fixed a source of latency in their
kernels but not managed to upstream that fix for a number of
reasons.
Their kernels are uniformly horrible with latency's ranging to above a
millisecond.
Linuxcnc simply refuses to run on those kernels.
I think this is simply because raspbian does not ship any preempt-rt
kernel themselves, so clearly their kernel binaries won't be low-latency.
You did not say where you got the kernel that you are running
successfully, so as far as I could tell this might be a combination
of the raspbian patches and the preempt-rt patches.
I got this as 4.19.y, and applied the realtime patch kit. I've not been
able to find another kernel src any newer that even admits to having a
realtime preempt in its config, it is conspicuously absent in anything
newer, and I am subbed to linux-rt so I see all the new stuff being
announced. But the .configs have not included a realtime you can
see, let alone turn on. This particular 4.19.y was obtained from a git
link I was supplied by a forum msg, about a day before I was black holed.

I've been on my own since, several years now. It was me that figured
out how to build it, and it was me who figured out a way to install it.

Up until now, when I've asked arm questions here, I've essentially been
told to bug/buzz off. Which is why I made the comment about a civil
discussion.
It's a surprise, and I certainly welcome it. I've never had the intention
of being a PITA.

What you've seen in the uname -a output I've posted was my 2nd build. The
first one was built when 4.19 was fairly new but the video was still slow;
this one was after some patches that enable the specific gfx these arms
used. And TBT, I was amazed that it actually worked both times. I had been
led to believe that stability was not in the arm vocabulary, but this is
at least as solid as wintel stuff has ever been. OTOH, wheezy was a giant
step fwd in that department.

I can put that src tarball up on my web page, but 4.19.y s/b available at
faster servers. I've only a 10 megabaud adsl, meaning slow slow downloads
from my site.

Take care & stay well.
Post by Arnd Bergmann
Post by gene heskett
Post by Arnd Bergmann
- The raspbian user space should have very little effect on
latency but it's worth pointing out that you may see different
performance between armv6 (raspbian)
I wish you would admit that the raspios I am running IS armhf (kernel7l)
I've no clue where you got the impression it was v6. It is not.
This paragraph was about the user space, not the kernel.
The entire reason for Raspbian's existence is that it runs on armv6
hardware like the Raspberry Pi 1, which Debian armhf does not
run on. Building for armv6 means they can run the same user space
on all hardware generations from v6 to v8, and they advertise this
on their website:
https://www.raspberrypi.com/software/operating-systems/
ship at least two separate 32-bit kernels, since an LPAE-enabled
kernel is needed to access PCI and high memory but is incompatible
with Armv6 hardware.
Arnd
.
Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/>
Matthias Klein
2022-07-16 18:30:02 UTC
Permalink
Post by gene heskett
I've not been
able to find another kernel src any newer that even admits to having a
realtime preempt in its config, it is conspicuously absent in anything
newer, and I am subbed to linux-rt so I see all the new stuff being
announced.
You could try the following kernel:

https://github.com/kdoren/linux
https://github.com/kdoren/linux/discussions/11

(I have no own experience with the above kernel)

Best regards,
Matthias
Lennart Sorensen
2022-07-16 15:30:01 UTC
Permalink
Post by gene heskett
I wish you would admit that the raspios I am running IS armhf (kernel7l)
I've no clue where you got the impression it was v6. It is not.
You said raspios, which sure looks like Raspbian. Raspbian/Raspberry Pi
OS is armv6. If you are running Debian armhf, then it is armv7, but it
would be a lot less confusing to call it Debian and not raspios in
that case.

Doesn't matter what your kernel is. What is your userspace you are
actually running?
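
One way to answer that from a shell, as a rough sketch: dpkg reports the
name of the architecture, but since Raspbian and Debian both call theirs
armhf, the ELF build attributes of an installed binary are a more direct
hint. The cpu_arch_from_attrs helper is hypothetical. I would expect v6
on Raspbian and v7 on Debian armhf, but have not checked every image.

```shell
#!/bin/sh
# Sketch: extract the ARM architecture level from `readelf -A` output.
# cpu_arch_from_attrs is a hypothetical helper; it reads the attribute
# dump on stdin and prints the Tag_CPU_arch value (for example v6 or v7).

cpu_arch_from_attrs() {
    sed -n 's/^[[:space:]]*Tag_CPU_arch:[[:space:]]*//p' | head -n 1
}

# Debian's name for the installed user space (armhf, arm64, ...).
if command -v dpkg >/dev/null 2>&1; then
    dpkg --print-architecture
fi

# What a typical system binary was actually compiled for. On non-ARM
# systems readelf prints no ARM attributes and nothing is shown.
if command -v readelf >/dev/null 2>&1; then
    readelf -A "$(command -v ls)" 2>/dev/null | cpu_arch_from_attrs
fi
```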
Post by gene heskett
The card its running on right now is over 2 years old, zero problems,
and has had all updates, including a daily update of linuxcnc from the
the buildbot, or if the buildbot is down, my own scripts also building
installable deb's. The secret is use a big enough card that it has enough
room to do its maintenance. 64G card has around 15G's on it.
An occasional backup of the card is still nice to have.
--
Len Sorensen
gene heskett
2022-07-16 16:20:02 UTC
Permalink
Post by Lennart Sorensen
Post by gene heskett
I wish you would admit that the raspios I am running IS armhf (kernel7l)
I've no clue where you got the impression it was v6. It is not.
You said raspios which sure looks like raspian. Raspian/Raspberry Pi
OS is armv6. If you are running Debian armhf, then it is armv7 but it
would be a lot less confusing to call it Debian and not raspios in
that case.
raspian/raspios is available in all 3 flavors.
Post by Lennart Sorensen
Doesn't matter what your kernel is. What is your userspace you are
actually running?
Post by gene heskett
The card its running on right now is over 2 years old, zero problems,
and has had all updates, including a daily update of linuxcnc from the
the buildbot, or if the buildbot is down, my own scripts also building
installable deb's. The secret is use a big enough card that it has enough
room to do its maintenance. 64G card has around 15G's on it.
Backups once in a while of the card is still nice to have.
True, but when those two seagate 2T drives puked in quick succession, I
lost all my patched amanda sources. I'd only been running amanda since
1998. I figured if bullseye ever stabilizes I might see about starting it
up again. But UM sold it to zmanda so progress stopped, and then it was
sold to Betsol about 2 years back, and so far they've been all hat and no
cattle. As far as I'm concerned it's time to look for a new backup
strategy that mimics how amanda worked. IOW I'm still building this box,
almost from scratch. It's been very painful so far with bullseye.

Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/>
Lennart Sorensen
2022-07-16 18:30:01 UTC
Permalink
Post by gene heskett
raspian/raspios is available in all 3 flavors.
Oh right they do have the other ones, they are just not the ones they
recommend by default.
Post by gene heskett
True, but when those two seagate 2T drives puked in quick succession, I lost
all
my patched amanda sources. I'd only been running amanda since 1998.
I figured if bullseye ever stabilizes I might see about starting it up
again. But UM
sold it to zmanda so progress stopped, and then was sold to Betsol about 2
years back, and so far they've been all hat and no cattle. As far as I'm
concerned its time to look for a new backup strategy that mimics how amanda
worked. IOW I'm still building this box, almost from scratch. Its been very
painful
so far with bullseye.
Yeah, losing multiple drives sucks.

My important stuff sits on a raid6 setup and does automatic off site
(to my parents house) backups using rsnapshot.
--
Len Sorensen
Diederik de Haas
2022-07-16 18:40:01 UTC
Permalink
Post by gene heskett
Post by Lennart Sorensen
You said raspios which sure looks like raspian. Raspian/Raspberry Pi
OS is armv6. If you are running Debian armhf, then it is armv7 but it
would be a lot less confusing to call it Debian and not raspios in
that case.
raspian/raspios is available in all 3 flavors.
Raspbian(.org) was created by Peter Green (plugwash) (and Mike Thompson, whose
name is still attached to raspbian(.org)'s GPG key, but otherwise moved on)
precisely because the RPi 1 did not meet the armhf/armv7 qualifications that
Debian uses.
The Raspberry Pi Foundation (RPF) started with (Debian's) armel (armv5)
architecture, but that was slow and didn't optimally use the HW that was
available on the RPi 1.

So Plugwash (and Mike) started a recompilation of the Debian archive which
makes better use of the HW available in the RPi 1. Confusingly, they labeled
it armhf, while it was and is NOT the same as Debian's armhf.
To add to the confusion, RPF called their OS also Raspbian :-/

AFAIK it's still Plugwash that runs the buildd which compiles the packages for
Raspbian/RaspiOS, but those packages are now also mirrored on RPF servers/
archives. That is still ~armv6 (+hardfloat+sth IIRC).

The RPi 2 (and newer) can run Debian's armhf (armv7).

The RPi 3 and newer can also run arm64 and that is the same as Debian's.

I am *quite* sure RaspiOS is not available in normal/Debian's armhf, but only
in their own armv6+ (but labeled armhf) and arm64.

HTH
Lennart Sorensen
2022-07-16 19:10:01 UTC
Permalink
Post by Diederik de Haas
Raspbian(.org) was created by Peter Green (plugwash) (and Mike Thompson who's
name is still attached to raspbian(.org)'s GPG key, but otherwise moved on)
precisely because the RPi 1 did not meet the armhf/armv7 qualifications that
Debian uses.
The Raspberry Pi Foundation (RPF) started with (Debian's) armel (armv5)
architecture, but that was slow and didn't optimally use the HW that was
available on the RPi 1.
So Plugwash (and Mike) started a recompilation of the Debian archive which
makes better use of the HW available in the RPi 1. Confusingly, they labeled
it armhf, while it was and is NOT the same as Debian's armhf.
To add to the confusion, RPF called their OS also Raspbian :-/
AFAIK it's still Plugwash that runs the buildd which compiles the packages for
Raspbian/RaspiOS, but those packages are now also mirrored on RPF servers/
archives. That is still ~armv6 (+hardfloat+sth IIRC).
The RPi 2 (and newer) can run Debian's armhf (armv7).
The RPi 3 and newer can also run arm64 and that is the same as Debian's.
I am *quite* sure RaspiOS is not available in normal/Debian's armhf, but only
in their own armv6+ (but labeled armhf) and arm64.
Yeah looking at raspberrypi.com they seem to have 64 bit and 32 bit
builds. The 32 bit is definitely armv6 since it says it is compatible
with all versions of the pi. Pretty sure they have never done armv7,
since that would just be what Debian already provides and would break
the Pi 0 and Pi 1, although the 2 would be happier. It should
happily run on a kernel that supports armv7 but it does mean user space
certainly isn't fully taking advantage of what a pi 2 or newer offers.

I certainly only see 2 flavours on the page. The original armel I don't
think has been made for quite a few years at this point and proper armv7
armhf they have certainly never done either. So they have 2 flavours:
armv6 and armv8.
--
Len Sorensen
Arnd Bergmann
2022-07-15 19:50:01 UTC
Permalink
Post by Wookey
On the other hand, if the armhf kernel does work on RPi4 with a few
config options, and there is an actual use case, then the question is
what is the downside of enabling the config options?
Does this only work for the RPi4, or does it enable/prevent 32-bit
kernels on other 64-bit machines?
The bug report was only about a missing driver for the Raspberry Pi 4,
enabling the driver has no effect on other machines besides making the
kernel slightly bigger, or adding another kernel module if it can be
modular (most PCI host drivers are traditionally built-in, though that is
not strictly necessary).
Post by Wookey
Do i386 kernels work on amd64 machines?
Sounds like something that might be worth discussion at debconf next week. I'll mention it in the talk.
It is generally possible to run 32-bit kernels on 64-bit hardware on x86,
some armv8 and mips, but there are a lot of downsides. On powerpc,
sparc, riscv, and newer armv8/v9, one has to run a 64-bit kernel.

Traditionally you'd only have a 64-bit kernel but 32-bit user space, at
least on powerpc, pa-risc and sparc.

I think x86 and arm are the odd ones out here, because Debian has
never shipped a 64-bit kernel packaged as a 32-bit .deb file here,
though at the moment mipsel is the only one that ships with
64-bit kernel by default.

Arnd
Lennart Sorensen
2022-07-15 20:00:01 UTC
Permalink
Post by Arnd Bergmann
It is generally possible to run 32-bit kernels on 64-bit hardware on x86,
some armv8 and mips, but there are a lot of downsides. On powerpc,
sparc, riscv, and newer armv8/v9, one has to run a 64-bit kernel.
Traditionally you'd only have a 64-bit kernel but 32-bit user space, at
least on powerpc, pa-risc and sparc.
I think x86 and arm are the odd ones out here, because Debian has
never shipped a 64-bit kernel packaged as a 32-bit .deb file here,
though at the moment mipsel is the only one that ships with
64-bit kernel by default.
http://snapshot.debian.org/archive/debian/20050312T000000Z/pool/main/k/kernel-image-2.6.9-amd64/kernel-image-2.6.9-9-amd64-generic_2.6.9-4_i386.deb

Debian did plenty of them. It was very commonly used.
--
Len Sorensen
Lennart Sorensen
2022-07-15 20:00:01 UTC
Permalink
Post by Wookey
Ah thanks Paul. I was wondering why we were being accused of 'Debian
abandoning armhf' when it was news to me, and I'm just writing the
'ARM ports status' talk for Debconf next week.
Clearly one normally does not run foreign-arch kernels on hardware so
we don't have to support it, and Ben is right to say 'this is not a
bug'.
On the other hand, if the armhf kernel does work on RPi4 with a few
config options, and there is an actual use case, then the question is
what is the downside of enabling the config options?
Does this only work for the RPi4, or does it enable/prevent 32-bit kernels on other 64-bit machines?
Certainly people have been running 32 bit kernels on the Pi 3 and 4 and
it works fine. Some high end aarch64 CPUs don't support 32 bit mode,
but that is certainly not the case for the Pi's CPU.
Post by Wookey
Do i386 kernels work on amd64 machines?
Yes, but... They certainly don't work with more than 3.5GB or so of ram
unless you use the PAE version of the kernel, in which case you can have some more.
There have been issues in some cases with systems that had too much ram
where rather than just ignoring it the kernel would fail to boot.

Of course it was very common 15 or 20 years ago to run Debian i386 with
an amd64 kernel and was fully supported by debian including the installer
which as far as I remember even recommended that kernel if supported by
the host. Quite a bit of user space code still had issues with 64 bit,
but you got to run a kernel that could take full advantage of your ram
and other cpu features, while running 32 bit user space (since the amd64
kernel of course can run i386 binaries just fine).

For example this was very much in Debian:
http://snapshot.debian.org/archive/debian/20050312T000000Z/pool/main/k/kernel-image-2.6.9-amd64/kernel-image-2.6.9-9-amd64-generic_2.6.9-4_i386.deb

So an amd64 kernel in the i386 archive.
Post by Wookey
Sounds like something that might be worth discussion at debconf next week. I'll mention it in the talk.
Well it would essentially mean treating arm like i386 used to be treated.
It is certainly not a thing Debian hasn't supported before.
--
Len Sorensen
Arnd Bergmann
2022-07-15 20:50:01 UTC
Permalink
On Fri, Jul 15, 2022 at 9:55 PM Lennart Sorensen
Post by Lennart Sorensen
Post by Wookey
Ah thanks Paul. I was wondering why we were being accused of 'Debian
abandoning armhf' when it was news to me, and I'm just writing the
'ARM ports status' talk for Debconf next week.
Clearly one normally does not run foreign-arch kernels on hardware so
we don't have to support it, and Ben is right to say 'this is not a
bug'.
On the other hand, if the armhf kernel does work on RPi4 with a few
config options, and there is an actual use case, then the question is
what is the downside of enabling the config options?
Does this only work for the RPi4, or does it enable/prevent 32-bit kernels on other 64-bit machines?
Certainly people have been running 32 bit kernels on the Pi 3 and 4 and
it works fine. Some high end aarch64 CPUs don't support 32 bit mode,
but that is certainly not the case for the Pi's CPU.
Post by Wookey
Do i386 kernels work on amd64 machines?
Yes, but... They certainly don't work with more than 3.5GB or so of ram
unless you use the PAE version of the kernel, in which case you can have some more.
There have been issues in some cases with systems that had too much ram
where rather than just ignoring it the kernel would fail to boot.
This is exactly the same situation on arm as on x86: without an (L)PAE-enabled
kernel, only the low (device specific) few GB are accessible, anything at
a higher physical address than 4GB disappears completely.
I don't know how much of the memory this affects on bcm2711, but at least
for the 8GB Raspberry Pi there is no point running with the default (non-LPAE)
kernel.
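
To make the arithmetic behind the 4GB limit concrete, here is a minimal C sketch. It assumes a single contiguous RAM bank starting at physical address 0, which is a simplification and not the actual bcm2711 memory map; `addressable_bytes()` and `usable_ram()` are hypothetical helper names, not kernel functions.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes addressable with `bits` physical address bits. */
static uint64_t addressable_bytes(unsigned bits)
{
    /* Shifting by 64 or more would be undefined, so saturate instead. */
    return bits >= 64 ? UINT64_MAX : (uint64_t)1 << bits;
}

/*
 * RAM that stays reachable when only physical addresses below `limit`
 * can be mapped. `ram_base` and `ram_size` describe one contiguous bank
 * (an assumption made for illustration only).
 */
static uint64_t usable_ram(uint64_t ram_base, uint64_t ram_size, uint64_t limit)
{
    assert(ram_size <= UINT64_MAX - ram_base); /* the bank must not wrap around */

    if (ram_base >= limit)
        return 0; /* the whole bank sits above the limit */

    uint64_t end = ram_base + ram_size;
    return (end > limit ? limit : end) - ram_base;
}
```

Under these assumptions, `usable_ram(0, 8ULL << 30, addressable_bytes(32))` gives 4GB for an 8GB bank, while the 40 physical address bits that LPAE provides make the full 8GB reachable.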

The other issues with running 32-bit kernels are also similar to x86:

- many new features are only added on 64-bit kernels or are only available
in 64-bit mode, so running a 32-bit kernel is less secure

- errata workarounds for newer CPU cores are often missing, and
the 32-bit kernel doesn't even officially support armv8 hardware,
though it mostly works in practice.

- virtual address space is somewhat more limited, with usually 3GB
(sometimes less) in 32-bit kernels, but the full 4GB for 32-bit processes
on 64-bit kernels.

- running with highmem is generally no fun, so anything above 768MB
of physical RAM can not easily be used. highmem support will eventually
go away. We have plans to still support up to 4GB of physical memory
using a new memory layout at a performance penalty for extra page table
switches.
Post by Lennart Sorensen
Of course it was very common 15 or 20 years ago to run Debian i386 with
an amd64 kernel and was fully supported by debian including the installer
which as far as I remember even recommended that kernel if supported by
the host. Quite a bit of user space code still had issues with 64 bit,
but you got to run a kernel that could take full advantage of your ram
and other cpu features, while running 32 bit user space (since the amd64
kernel of course can run i386 binaries just fine).
http://snapshot.debian.org/archive/debian/20050312T000000Z/pool/main/k/kernel-image-2.6.9-amd64/kernel-image-2.6.9-9-amd64-generic_2.6.9-4_i386.deb
Ah right, I forgot about that and only remembered the MIPS ones that
still do this.
Post by Lennart Sorensen
So an amd64 kernel in the i386 archive.
Post by Wookey
Sounds like something that might be worth discussion at debconf next week. I'll mention it in the talk.
Well it would essentially mean treating arm like i386 used to be treated.
It is certainly not a thing Debian hasn't supported before.
This probably also requires a 64-bit grub in addition to the kernel, but
should otherwise not be too difficult. In particular, I would hope that
the build infrastructure for the kernel package can be adapted so it builds
a near-identical image to the normal arm64 kernel package and not
require extra regression testing.

Arnd
LinAdmin
2022-07-18 17:00:01 UTC
Permalink
I won't care about Debian Arm anymore because Ubuntu jammy
2204 LTS 32 bit runs like a charm.
LinAdmin
Post by Arnd Bergmann
Post by LinAdmin
Pi 4 has much more throughput in 32-bit modes but the so
called experts of Debian decided to abandon it :-(
Please stop the name calling, and the spreading of misinformation on this list.
...
Arnd
Tobias Frost
2022-07-19 18:20:01 UTC
Permalink
LinAdmin,

the way you interact on this mailing list is highly inappropriate.

You've been called out for name calling and instead of apologizing you're
doubling down in your response.

Such kind of messages will not be tolerated on Debian mailing lists, and you
have been told that before.

To be frank, until you want to constructively interact with the Debian
community you are _NOT_ welcome.
--
tobi
Post by LinAdmin
I won't care about Debian Arm anymore because Ubuntu jammy
2204 LTS 32 bit runs like a charm.
LinAdmin
Post by Arnd Bergmann
Post by LinAdmin
Pi 4 has much more throughput in 32-bit modes but the so
called experts of Debian decided to abandon it :-(
Please stop the name calling, and the spreading of misinformation on this list.
...
Arnd
LinAdmin
2022-07-21 07:10:01 UTC
Permalink
Tobias!

My posting did not contain any name and if somebody felt
addressed that explains many issues :p

Did you ever hear about the Streisand effect or do you have
any good excuses why you warmed up this thread after my
message that I have switched to Ubuntu 22.04 LTS which
works as expected on Arm 32?

And btw, I had made detailed and constructive suggestions
what few patches would be needed to get Debian running on
Pi4 and this was turned down by the so called experts
without any good arguments.

To be frank, your text reads as a paid troll post.
LinAdmin
Post by Tobias Frost
LinAdmin,
the way you interact on this mailing list is highly inappropriate.
You've been called out for name calling and instead of apologizing you're
doubling down in your response.
Such kind of messages will not be tolerated on Debian mailing lists, and you
have been told that before.
To be frank, until you want to constructively interact with the Debian
community you are _NOT_ welcome.
Andrew M.A. Cater
2022-07-15 15:10:01 UTC
Permalink
Post by gene heskett
Post by LinAdmin
Pi 4 has much more throughput in 32-bit modes but the so
called experts of Debian decided to abandon it :-(
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
I agree. So far, raspios is still available in armhf flavor, and for running
heavy machinery with just a few microseconds to respond to an IRQ,
armhf builds are a given. LinuxCNC is such an application.
LinAdmin,

Just a thought: you might want to check which "so called experts" you are
calling out here. Wookey has been an expert at ARM for the last 15 years or
more - he knows what he's talking about.

Debian has *not* abandoned armhf - Debian is one of the last Linux
distributions actively supporting 32 bit for ARM or Intel architectures.

Gene,

Raspberry Pi OS is **not** Debian. Strictly, it's very much on its own as a
fork of Raspbian from Peter Green.
For historical reasons, Raspbian and Raspberry Pi OS are the odd ones out with
their version of armhf - arm v6 plus hardware floating point originally,
where everyone else had settled on arm v7 plus hardware floating point.

LinuxCNC is probably supported by neither Debian nor Raspberry Pi OS.

Finally, of course, it's useful for everyone to remember to be polite
and considerate - Code of Conduct applies here as everywhere else on
Debian's mailing lists.

With every good wish, as ever,

Andy Cater
Alan Corey
2022-07-15 16:40:01 UTC
Permalink
Debian ARM actually splits 3 ways: https://www.debian.org/ports/arm/
for armel, armhf, arm64. Raspbian still uses one version I think.

I had been using Raspbian for years until somebody there decided to
drop the LXDE/Openbox desktops with Bullseye. And they seem to be
using Debian now(?). I actually use a RPI ZeroW for doing audio
recording powered by a 18650 cell, but at the other end I have a few
RPI 3Bs, no need for arm64. So I took the path less trodden. Mostly
not bad (Debian armhf) but a few oddities like menu colors in Gimp are
mostly unreadable with the default theme.
Post by Andrew M.A. Cater
Post by gene heskett
Post by LinAdmin
Pi 4 has much more throughput in 32-bit modes but the so
called experts of Debian decided to abandon it :-(
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
I agree. So far, raspios is still available in armhf flavor, and for running
heavy machinery with just a few microseconds to respond to an IRQ,
armhf builds are a given. LinuxCNC is such an application.
LinAdmin,
Just a thought: you might want to check which "so called experts" you are
calling out here. Wookey has been an expert at ARM for the last 15 years or
more - he knows what he's talking about.
Debian has *not* abandoned armhf - Debian is one of the last Linux
distributions actively supporting 32 bit for ARM or Intel architectures.
Gene,
Raspberry Pi OS is **not** Debian. Strictly, it's very much on its own as a
fork of Raspbian from Peter Green.
For historical reasons, Raspbian and Raspberry Pi OS are the odd ones out with
their version of armhf - arm v6 plus hardware floating point originally,
where everyone else had settled on arm v7 plus hardware floating point.
LinuxCNC is probably supported by neither Debian nor Raspberry Pi OS.
Finally, of course, it's useful for everyone to remember to be polite
and considerate - Code of Conduct applies here as everywhere else on
Debian's mailing lists.
With every good wish, as ever,
Andy Cater
--
-------------
Education is contagious.
gene heskett
2022-07-15 17:30:01 UTC
Permalink
Post by Andrew M.A. Cater
Post by gene heskett
Post by LinAdmin
Pi 4 has much more throughput in 32-bit modes but the so
called experts of Debian decided to abandon it :-(
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
I agree. So far, raspios is still available in armhf flavor, and for running
heavy machinery with just a few microseconds to respond to an IRQ,
armhf builds are a given. LinuxCNC is such an application.
LinAdmin,
Just a thought: you might want to check which "so called experts" you are
calling out here. Wookey has been an expert at ARM for the last 15 years or
more - he knows what he's talking about.
Debian has *not* abandoned armhf - Debian is one of the last Linux
distributions actively supporting 32 bit for ARM or Intel architectures.
Gene,
Raspberry Pi OS is **not** Debian. Strictly, it's very much on its own as a
fork of Raspbian from Peter Green.
For historical reasons, Raspbian and Raspberry Pi OS are the odd ones out with
their version of armhf - arm v6 plus hardware floating point originally,
where everyone else had settled on arm v7 plus hardware floating point.
Au contraire, LinuxCNC is at this moment and has been since the
rpi was a 3b, running on the armhf version quite well in fact.  With
this kernel:
Linux rpi4.coyote.den 4.19.71-rt24-v7l+ #1 SMP PREEMPT RT Thu Feb 6
07:09:18 EST 2020 armv7l GNU/Linux
Post by Andrew M.A. Cater
LinuxCNC is probably supported by neither Debian nor Raspberry Pi OS.
Finally, of course, it's useful for everyone to remember to be polite
and considerate - Code of Conduct applies here as everywhere else on
Debian's mailing lists.
With every good wish, as ever,
Andy Cater
.
Cheers, Gene Heskett.
--
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author, 1940)
If we desire respect for the law, we must first make the law respectable.
- Louis D. Brandeis
Genes Web page <http://geneslinuxbox.net:6309/>
Aurelien Jarno
2022-08-16 19:30:01 UTC
Permalink
Hi,
Post by Wookey
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
Note to cc'ees: if this is something you would like to see merged,
please indicate so. This stuff is unlikely to get in if there are no
users.
Decent 32-bit arm hardware is thin on the ground these days. Debian
still has some but it's getting old and flaky. Being able to build
reliably on 64-bit hardware is important and useful. Unaligned
accesses are much less of a problem than they used to be, but they can
still happen, so having these fixups available is definitely a good
thing.
Debian runs its 32-bit buildds with alignment fixups turned on. It
looks like the boxes still hit about 1 per day.
We also do 32 bit builds on 64-bit kernels (in 32-bit userspaces) and
it mostly works. We do have packages that fail on 64-bit kernels and
have to be built on real 32-bit hardware, but I don't know how much of
that would be fixed by this patch. Some, presumably.
So yes, cheers for this. It is helpful in the real world (or at least
it should be).
I confirm that this would be very helpful to Debian, so that 32-bit
binaries behave the same with a 32-bit or a 64-bit kernel. Otherwise we
need to keep running (old) 32-bit hardware.

What's the status of those patches?

Thanks
Aurelien
--
Aurelien Jarno GPG: 4096R/1DDD8C9B
***@aurel32.net http://www.aurel32.net
Ard Biesheuvel
2022-08-16 21:00:02 UTC
Permalink
Post by Aurelien Jarno
Hi,
Post by Wookey
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
Note to cc'ees: if this is something you would like to see merged,
please indicate so. This stuff is unlikely to get in if there are no
users.
Decent 32-bit arm hardware is thin on the ground these days. Debian
still has some but it's getting old and flaky. Being able to build
reliably on 64-bit hardware is important and useful. Unaligned
accesses are much less of a problem than they used to be, but they can
still happen, so having these fixups available is definitely a good
thing.
Debian runs its 32-bit buildds with alignment fixups turned on. It
looks like the boxes still hit about 1 per day.
We also do 32 bit builds on 64-bit kernels (in 32-bit userspaces) and
it mostly works. We do have packages that fail on 64-bit kernels and
have to be built on real 32-bit hardware, but I don't know how much of
that would be fixed by this patch. Some, presumably.
So yes, cheers for this. It is helpful in the real world (or at least
it should be).
I confirm that this would be very helpful to Debian, so that 32-bit
binaries behave the same with a 32-bit or a 64-bit kernel. Otherwise we
need to keep running (old) 32-bit hardware.
What's the status of those patches?
Thanks for chiming in.

At this point, it is really up to the maintainers to decide whether
the maintenance burden is worth it. The code itself seems pretty
uncontroversial afaict.

Might other distros be in a similar situation? Or is this specific to Debian?
Arnd Bergmann
2022-08-17 09:50:01 UTC
Permalink
Post by Ard Biesheuvel
Thanks for chiming in.
At this point, it is really up to the maintainers to decide whether
the maintenance burden is worth it. The code itself seems pretty
uncontroversial afaict.
Might other distros be in a similar situation? Or is this specific to Debian?
My guess is that this is the most prominent on Debian: Many others
have discontinued or reduced support for 32-bit builds across architectures:
Ubuntu only supports "Core" with fewer packages on Raspberry Pi 2 but
not desktop or server, openSUSE Leap and Tumbleweed both distribute a
lot of board specific images but you have to know where to look as the main
page only advertises amd64/i686/arm64/ppc64le/s390x, Fedora stopped
entirely.

Android may be an interesting distro here: there are still a lot of phones
running a pure 32-bit userland on Cortex-A53/A55 CPUs, and there are a
large number of applications built for this. As far as I can tell, they tend to
run 32-bit kernels as well, but that is not going to work on newer processors
starting with Cortex-A76 big cores or Cortex-A510 little cores.

archlinuxarm supports 32-bit and 64-bit machines equally, but they
apparently avoid the build service problem by using distcc with
x86-to-arm cross compilers, and they don't seem to support
their 32-bit images on 64-bit hardware/kernel.

https://hub.docker.com/search?q=&source=verified&type=image&architecture=arm&image_filter=official
lists 98 "official" arm32 images plus countless ones in other categories.
I think these are popular in memory-constrained cloud hosting
setups on arm64, so the Alpine based images are probably the most
interesting ones because of their size, but they would run under
someone else's kernel.

Arnd
LinAdmin
2022-08-18 15:30:01 UTC
Permalink
I do know that you do not like my comment that 32bit on Pi4
is much more efficient than 64 Bit ...
Linadmin
Post by Arnd Bergmann
Post by Ard Biesheuvel
Thanks for chiming in.
At this point, it is really up to the maintainers to decide whether
the maintenance burden is worth it. The code itself seems pretty
uncontroversial afaict.
Might other distros be in a similar situation? Or is this specific to Debian?
My guess is that this is the most prominent on Debian: Many others
have discontinued or reduced support for 32-bit builds across architectures:
Ubuntu only supports "Core" with fewer packages on Raspberry Pi 2 but
not desktop or server, openSUSE Leap and Tumbleweed both distribute a
lot of board specific images but you have to know where to look as the main
page only advertises amd64/i686/arm64/ppc64le/s390x, Fedora stopped
entirely.
Android may be an interesting distro here: there are still a lot of phones
running a pure 32-bit userland on Cortex-A53/A55 CPUs, and there are a
large number of applications built for this. As far as I can tell, they tend to
run 32-bit kernels as well, but that is not going to work on newer processors
starting with Cortex-A76 big cores or Cortex-A510 little cores.
archlinuxarm supports 32-bit and 64-bit machines equally, but they
apparently avoid the build service problem by using distcc with
x86-to-arm cross compilers, and they don't seem to support
their 32-bit images on 64-bit hardware/kernel.
https://hub.docker.com/search?q=&source=verified&type=image&architecture=arm&image_filter=official
lists 98 "official" arm32 images plus countless ones in other categories.
I think these are popular in memory-constrained cloud hosting
setups on arm64, so the Alpine based images are probably the most
interesting ones because of their size, but they would run under
someone else's kernel.
Arnd
Andrew M.A. Cater
2022-08-18 17:00:02 UTC
Permalink
Post by LinAdmin
I do know that you do not like my comment that 32bit on Pi4
is much more efficient than 64 Bit ...
Linadmin
Good afternoon, LinAdmin

It does appear to me that this comment is not directly relevant to this
message and might not be helpful. Nobody mentioned this in this thread
today apart from you

It might be worth looking at the Debian Code of Conduct: please be considerate,
think about how you come across and be constructive. Working with people is
more useful than provoking them.

This list - as all Debian lists - is subject to the Debian Code of Conduct to
allow us to work better together. This is a reminder rather than a warning.

With every good wish, as ever,

Andy Cater

[For the Debian Community Team.]
LinAdmin
2022-08-19 13:20:01 UTC
Permalink
Good night Andy

Is it possible you never have heard of the Streisand effect?

Regards
LinAdmin
Post by Andrew M.A. Cater
Post by LinAdmin
I do know that you do not like my comment that 32bit on Pi4
is much more efficient than 64 Bit ...
Linadmin
Good afternoon, LinAdmin
It does appear to me that this comment is not directly relevant to this
message and might not be helpful. Nobody mentioned this in this thread
today apart from you
It might be worth looking at the Debian Code of Conduct: please be considerate,
think about how you come across and be constructive. Working with people is
more useful than provoking them.
This list - as all Debian lists - is subject to the Debian Code of Conduct to
allow us to work better together. This is a reminder rather than a warning.
With every good wish, as ever,
Andy Cater
[For the Debian Community Team.]
Andrew M.A. Cater
2022-08-19 14:10:01 UTC
Permalink
Post by LinAdmin
Good night Andy
Is it possible you never have heard of the Streisand effect?
Regards
LinAdmin
Hi LinAdmin,

As one of the people who wrote the FAQ for the debian-user mailing list,
not only have I heard of it, I specifically referenced it. If you are trying
to get some random posting removed from a Debian mailing list, this is
difficult because it may be archived anywhere in the world. Complaining
too much about it risks the Streisand effect. See, for example, the latest
https://lists.debian.org/debian-user/2022/08/msg00004.html

My reply to you yesterday was to ask you to be considerate, think of
others on the mailing list and to specifically consider how your words
may come across. Working constructively with people, adding to what
they are doing, offering help is more useful than offering criticism
in general.

This is a community: you are working with / being read by people you may
never meet and it helps to work out who you are dealing with and what
ability they may have to help you.

In an earlier message in the thread:
<https://lists.debian.org/debian-arm/2022/07/msg00039.html>
you suggested that you didn't care about Debian because Ubuntu
worked well.

We do care about what you've said, some of the people following
the list are people who might be able to help evaluate what we can do to
improve Debian but the way that you are making your point is not the most
helpful to what you are trying to say. There may be better ways to explain as
you wish to without putting down other people's work.

The Code of Conduct (https://www.debian.org/code_of_conduct) and Debian
Community Team notes (https://www.debian.org/code_of_conduct_interpretation)
refer.

I'd suggest particularly that section 2 - Assume good faith is very relevant in this situation.
Post by LinAdmin
Post by Andrew M.A. Cater
Post by LinAdmin
I do know that you do not like my comment that 32bit on Pi4
is much more efficient than 64 Bit ...
Linadmin
Good afternoon, LinAdmin
It does appear to me that this comment is not directly relevant to this
message and might not be helpful. Nobody mentioned this in this thread
today apart from you
With every good wish, as ever,

Andy Cater

[For the Debian Community Team].
Catalin Marinas
2022-08-31 17:40:01 UTC
Permalink
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned. This is not something that is supported by the architecture,
but was done anyway to increase compatibility with user space software,
which mostly targeted x86 at the time and did not care about aligned
accesses.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
Note that the intent is to implement the exact same handling of
misaligned multi-word loads and stores as the 32-bit kernel does,
including what appears to be missing support for user space programs
that rely on SETEND to switch to a different byte order and back. Also,
like the 32-bit ARM version, we rely on the faulting address reported by
the CPU to infer the memory address, instead of decoding the instruction
fully to obtain this information.
This implementation is taken from the 32-bit ARM tree, with all pieces
removed that deal with instructions other than LDRD/STRD and LDM/STM, or
that deal with alignment exceptions taken in kernel mode.
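
As an illustration of the idea only, and not the code from this patch, here is a minimal user-space C sketch of what "fixing up" a misaligned multi-word load amounts to: given the faulting address and the register list, each 32-bit word is read byte by byte, so alignment no longer matters. The names `load_le32_unaligned()` and `emulate_ldm()` are hypothetical, the data is assumed to be little endian (so, like the description above, a SETEND byte-order switch is not handled), and user-memory faults, base register writeback and program counter handling are not modeled at all.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read one little-endian 32-bit word byte by byte; works at any alignment. */
static uint32_t load_le32_unaligned(const uint8_t *p)
{
    return (uint32_t)p[0] |
           ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) |
           ((uint32_t)p[3] << 24);
}

/*
 * Sketch of an increment-after multi-word load: bit n of `reglist`
 * selects register rn, and the selected registers are filled from
 * consecutive words starting at `addr`, lowest register first.
 * Returns the number of registers loaded.
 */
static unsigned emulate_ldm(uint32_t regs[16], uint16_t reglist,
                            const uint8_t *addr)
{
    unsigned loaded = 0;

    assert(regs != NULL && addr != NULL);

    for (unsigned r = 0; r < 16; r++) {
        if (reglist & (1u << r)) {
            regs[r] = load_le32_unaligned(addr + 4 * (size_t)loaded);
            loaded++;
        }
    }
    return loaded;
}
```

Calling `emulate_ldm(regs, 0x0005, buf + 1)` on a byte buffer fills r0 and r2 from the two words starting at the odd address `buf + 1`, which is the kind of access that would otherwise raise an alignment fault.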
---
Note to cc'ees: if this is something you would like to see merged,
please indicate so. This stuff is unlikely to get in if there are no
users.
v2: - drop some obsolete comments
- emit a perf alignment-fault event for every handled instruction
- use arm64_skip_faulting_instruction() to get the correct behavior
wrt IT state and single step
- use types with correct endianness annotation (instructions are
always little endian on v7/v8+)
It looks like that's a fairly popular request from people running 32-bit
user on AArch64 kernels, so happy to queue it for 6.1 (if it still
applies cleanly). I'm not too keen on code duplication but it's a lot
more hassle to create a common decoding/emulation library to share with
arch/arm, especially as such code is not going to change in the future.
+config COMPAT_ALIGNMENT_FIXUPS
+ bool "Fix up misaligned multi-word loads and stores in user space"
+ default y
For consistency with ARMV8_DEPRECATED, I think we should keep this as
default n.

Thanks.
--
Catalin
Ard Biesheuvel
2022-09-05 10:30:01 UTC
Permalink
Post by Catalin Marinas
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned. This is not something that is supported by the architecture,
but was done anyway to increase compatibility with user space software,
which mostly targeted x86 at the time and did not care about aligned
accesses.
This feature is one of the remaining impediments to being able to switch
to 64-bit kernels on 64-bit capable hardware running 32-bit user space,
so let's implement it for the arm64 compat layer as well.
Note that the intent is to implement the exact same handling of
misaligned multi-word loads and stores as the 32-bit kernel does,
including what appears to be missing support for user space programs
that rely on SETEND to switch to a different byte order and back. Also,
like the 32-bit ARM version, we rely on the faulting address reported by
the CPU to infer the memory address, instead of decoding the instruction
fully to obtain this information.
This implementation is taken from the 32-bit ARM tree, with all pieces
removed that deal with instructions other than LDRD/STRD and LDM/STM, or
that deal with alignment exceptions taken in kernel mode.
---
Note to cc'ees: if this is something you would like to see merged,
please indicate so. This stuff is unlikely to get in if there are no
users.
v2: - drop some obsolete comments
- emit a perf alignment-fault event for every handled instruction
- use arm64_skip_faulting_instruction() to get the correct behavior
wrt IT state and single step
- use types with correct endianness annotation (instructions are
always little endian on v7/v8+)
It looks like that's a fairly popular request from people running 32-bit
user on AArch64 kernels, so happy to queue it for 6.1 (if it still
applies cleanly). I'm not too keen on code duplication but it's a lot
more hassle to create a common decoding/emulation library to share with
arch/arm, especially as such code is not going to change in the future.
+config COMPAT_ALIGNMENT_FIXUPS
+ bool "Fix up misaligned multi-word loads and stores in user space"
+ default y
For consistency with ARMV8_DEPRECATED, I think we should keep this as
default n.
Fair enough. I take it you can fix this up while applying?
Catalin Marinas
2022-09-05 12:00:01 UTC
Permalink
Post by Ard Biesheuvel
Post by Catalin Marinas
+config COMPAT_ALIGNMENT_FIXUPS
+ bool "Fix up misaligned multi-word loads and stores in user space"
+ default y
For consistency with ARMV8_DEPRECATED, I think we should keep this as
default n.
Fair enough. I take it you can fix this up while applying?
Yes.
--
Catalin
Catalin Marinas
2022-09-06 18:10:01 UTC
Permalink
The 32-bit ARM kernel implements fixups on behalf of user space when
using LDM/STM or LDRD/STRD instructions on addresses that are not 32-bit
aligned. This is not something that is supported by the architecture,
but was done anyway to increase compatibility with user space software,
which mostly targeted x86 at the time and did not care about aligned
accesses.
[...]
Applied to arm64 (for-next/misc), thanks!

[1/1] arm64: compat: Implement misalignment fixups for multiword loads
https://git.kernel.org/arm64/c/3fc24ef32d3b
--
Catalin