scikit-learn in unstable FTBFS on arm64, armel, armhf, i386, ppc64el and s390x
John Paul Adrian Glaubitz
2022-02-16 11:00:01 UTC
Hello!
Is anyone able to help with the bus error on armhf please?
Bus errors are normally easy to spot. Just run the code in question through
GDB and see where it crashes. Then look at the backtrace with the debug
symbols installed.

Usually it's the result of bad pointer arithmetic, which should definitely be
fixed, as such operations usually violate the C/C++ standards.
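
When the crash happens inside a Python extension module, as with scikit-learn here, a Python-level traceback can complement the GDB backtrace described above. The following is only a minimal sketch, assuming the standard-library `faulthandler` module is available in the test environment; the helper name `enable_crash_tracebacks` is hypothetical and not part of scikit-learn or its test suite.

```python
import faulthandler
import sys


def enable_crash_tracebacks(stream=sys.stderr):
    """Ask CPython to dump Python tracebacks when a fatal signal arrives.

    faulthandler installs handlers for SIGSEGV, SIGFPE, SIGABRT, SIGBUS
    and SIGILL, so a bus error prints the Python frames that were active.
    The stream must be a real file (it needs a file descriptor).
    """
    faulthandler.enable(file=stream, all_threads=True)
    return faulthandler.is_enabled()
```

Calling the helper at the start of a failing test run, or starting the interpreter with `python3 -X faulthandler`, shows which Python call led into the crashing native code. It does not replace GDB for locating the faulting instruction.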

I can have a quick look.

Adrian
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - ***@debian.org
`. `' Freie Universitaet Berlin - ***@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
John Paul Adrian Glaubitz
2022-02-16 11:10:02 UTC
Hello!
Post by John Paul Adrian Glaubitz
Is anyone able to help with the bus error on armhf please?
Bus errors are normally easy to spot. Just run the code in question through
GDB and see where it crashes. Then look at the backtrace with the debug
symbols installed.
Usually it's the result of bad pointer arithmetic, which should definitely be
fixed, as such operations usually violate the C/C++ standards.
So, I have skimmed over the build logs and one of the main issues is the use of
-march flags to enforce a certain baseline [1]:

powerpc64le-linux-gnu-gcc: error: unrecognized command-line option ‘-march=native’; did you mean ‘-mcpu=native’?

This is a policy violation and must be fixed in any case. Blacklisting architectures
is not enough in this case as forcing the baseline of the buildds can lead to code
that won't run on the user's machines.

Adrian
[1] https://buildd.debian.org/status/fetch.php?pkg=scikit-learn&arch=ppc64el&ver=1.0.2-1&stamp=1644956229&raw=0
--
.''`. John Paul Adrian Glaubitz
: :' : Debian Developer - ***@debian.org
`. `' Freie Universitaet Berlin - ***@physik.fu-berlin.de
`- GPG: 62FF 8A75 84E0 2956 9546 0006 7426 3B37 F5B5 F913
Christian Kastner
2022-02-16 12:30:01 UTC
Hi,
Post by John Paul Adrian Glaubitz
Hello!
Is anyone able to help with the bus error on armhf please?
Bus errors are normally easy to spot. Just run the code in question through
GDB and see where it crashes. Then look at the backtrace with the debug
symbols installed.
Usually it's the result of bad pointer arithmetic, which should definitely be
fixed, as such operations usually violate the C/C++ standards.
I can have a quick look.
one of these errors has been reported in the past, and I already did
some analysis way back then:

https://github.com/scikit-learn/scikit-learn/issues/16443

Check the last comment. The relevant Cython code doesn't look wrong, so
I guess the problem is with the binary result produced during build, as
you point out.

Best,
Christian
Graham Inggs
2022-08-25 11:30:01 UTC
Hi Adrian

On Wed, 16 Feb 2022 at 13:36, John Paul Adrian Glaubitz
Post by Christian Kastner
Post by John Paul Adrian Glaubitz
Bus errors are normally easy to spot. Just run the code in question through
GDB and see where it crashes. Then look at the backtrace with the debug
symbols installed.
Usually it's the result of bad pointer arithmetic, which should definitely be
fixed, as such operations usually violate the C/C++ standards.
I can have a quick look.
one of these errors has been reported in the past, and I already did
https://github.com/scikit-learn/scikit-learn/issues/16443
Check the last comment. The relevant Cython code doesn't look wrong, so
I guess the problem is with the binary result produced during build, as
you point out.
I'm happy to look at this issue but first the baseline issue must be fixed
as this is a Debian Policy violation.
Gard Spreemann pointed out [1] that it seems superficially plausible that the
'-march=native' invocations are just instances of the compiler being probed.
I have also had a look and cannot see that '-march=native' is used in
the actual builds on any of the architectures.
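
A check like this can be partly automated by listing every build-log line that mentions a native CPU flag and then reviewing each hit by hand. The sketch below is an illustration only: the function name `find_native_flags` is hypothetical, the log is assumed to be available as plain text, and the script cannot tell a compiler probe from a real compile step on its own.

```python
import re

# Flags that tie generated code to the CPU of the build machine.
NATIVE_FLAG = re.compile(r"-m(?:arch|cpu|tune)=native\b")


def find_native_flags(log_text):
    """Return (line_number, line) pairs for log lines mentioning a native flag."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if NATIVE_FLAG.search(line):
            hits.append((number, line.strip()))
    return hits
```

Each reported line still needs to be read in context: a hit inside a failed feature test is harmless, whereas a hit on a command that compiles a shipped object file would be a baseline problem.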

It would be very much appreciated if the arm porters could take a look
at this issue, as it still plagues the scikit-learn autopkgtests on
armhf [2], and currently prevents quite a number of packages from
being part of testing. It appears that armel [3] has the same error,
so hopefully one fix could resolve both.

Regards
Graham


[1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1003165#86
[2] https://ci.debian.net/packages/s/scikit-learn/testing/armhf/
[3] https://ci.debian.net/packages/s/scikit-learn/testing/armel/
Vagrant Cascadian
2022-08-25 16:20:01 UTC
Post by Graham Inggs
On Wed, 16 Feb 2022 at 13:36, John Paul Adrian Glaubitz
it seems superficially plausible that the march=native
invocations are just instances of the compiler being probed.
I have also had a look and cannot see that '-march=native' is used in
the actual builds on any of the architectures.
It would be very much appreciated if the arm porters could take a look
at this issue, as it still plagues the scikit-learn autopkgtests on
armhf [2], and currently prevents quite a number of packages from
being part of testing. It appears that armel [1] has the same error,
so hopefully one fix could resolve both.
I pretty much think of myself, at best, as half an armhf/arm64 porter,
but this is a little bit outside of the scope of what I offered to look
after in the porter roll calls...

Apparently I am the only porter for armhf and arm64? I had assumed there
would be someone else to fill the gaps in my skillset, but I guess
not.

Help?

live well,
vagrant
Steve McIntyre
2022-08-25 16:40:01 UTC
Post by Vagrant Cascadian
Post by Graham Inggs
On Wed, 16 Feb 2022 at 13:36, John Paul Adrian Glaubitz
it seems superficially plausible that the march=native
invocations are just instances of the compiler being probed.
I have also had a look and cannot see that '-march=native' is used in
the actual builds on any of the architectures.
It would be very much appreciated if the arm porters could take a look
at this issue, as it still plagues the scikit-learn autopkgtests on
armhf [2], and currently prevents quite a number of packages from
being part of testing. It appears that armel [1] has the same error,
so hopefully one fix could resolve both.
I pretty much think of myself, at best, as half an armhf/arm64 porter,
but this is a little bit outside of the scope of what I offered to look
after in the porter roll calls...
Apparently I am the only porter for armhf and arm64? I had assumed there
would be someone else to fill the gaps in my skillset, but I guess
not.
Argh. I used to do this, but I don't have the time or the inclination
to step up any more. I'm very surprised not to see Wookey list
himself, tbh.
--
Steve McIntyre, Cambridge, UK. ***@einval.com
'There is some grim amusement in watching Pence try to run the typical
"politician in the middle of a natural disaster" playbook, however
incompetently, while Trump scribbles all over it in crayon and eats some
of the pages.' -- Russ Allbery
Wookey
2022-08-25 20:20:01 UTC
Post by Steve McIntyre
Post by Vagrant Cascadian
Apparently I am the only porter for armhf and arm64? I had assumed there
would be someone else to fill the gaps in my skillset, but I guess
not.
Argh. I used to do this, but I don't have the time or the inclination
to step up any more. I'm very surprised not to see Wookey list
himself, tbh.
Yeah I should be on the list, but it looks like I wrote a reply to the
'call for porters' back in December, but stopped to look something
up, got distracted and never actually sent it.

Wookey
--
Principal hats: Debian, Wookware, ARM
http://wookware.org/
Graham Inggs
2022-08-27 10:30:01 UTC
Post by Wookey
Yeah I should be on the list, but it looks like I wrote a reply to the
'call for porters' back in December, but stopped to look something
up, got distracted and never actually sent it.
I'd be happy to add you to the list, if you sent that mail now. ;-)