Re: Signal Driven SocketCAN

Posted: Tue 31. Jul 2018, 15:37
by martinkepplinger
S.Grosjean wrote: Hi,

The question is: does the last frame really arrive before the previous one, or is the last timestamp wrong?
Can you check the sequence of the frames with the content of the data byte?

— Stéphane
Hi Stéphane,

Yes, I can. I know the timestamps aren't wrong. I can also send the (known) frames sequentially from an STM32's CAN interface, and I always send the exact same frames you see, in order: the upper (correct) half of what I've printed.

thanks,
martin

Re: Signal Driven SocketCAN

Posted: Wed 1. Aug 2018, 17:50
by martinkepplinger
FWIW, here is another example that I can trigger pretty quickly, after about half an hour at 125 kbit/s (125000 bit/s). I now have a larger random (known) response and send it from an STM32 using only one of the driver's "mailboxes", really ensuring that I put frames on the bus _sequentially_.

Again, I paste the last good tx (00 00 00) / rx (the rest) sequence, followed by the faulty one, as recorded by candump.

 (1533138257.113354)  can0  000   [3]  00 00 00
 (1533138257.115425)  can0  000   [8]  00 5E 71 71 77 65 72 74
 (1533138257.116335)  can0  000   [8]  7A 75 69 6F 70 FC 2B 61
 (1533138257.117307)  can0  000   [8]  73 64 66 67 68 6A 6B 6C
 (1533138257.118205)  can0  000   [8]  F6 E4 23 3C 79 78 63 76
 (1533138257.119125)  can0  000   [8]  62 6E 6D 2C 2E 2D 51 57
 (1533138257.120024)  can0  000   [8]  45 52 54 5A 55 49 4F 50
 (1533138257.120972)  can0  000   [8]  DC 2A 41 53 44 46 47 48
 (1533138257.121881)  can0  000   [8]  4A 4B 4C D6 C4 59 58 43
 (1533138257.122833)  can0  000   [8]  56 42 4E 4D 3A 5F 31 32
 (1533138257.123758)  can0  000   [8]  33 34 35 36 37 38 39 30
 (1533138257.124673)  can0  000   [8]  DF A7 24 25 26 2F 28 29
 (1533138257.125649)  can0  000   [8]  3D 3F 2D 30 2E 32 2B 00
 (1533138257.126102)  can0  000   [1]  D3
 (1533138257.127064)  can0  000   [3]  00 00 00
 (1533138257.130052)  can0  000   [8]  7A 75 69 6F 70 FC 2B 61
 (1533138257.129114)  can0  000   [8]  00 5E 71 71 77 65 72 74
 (1533138257.130991)  can0  000   [8]  73 64 66 67 68 6A 6B 6C
 (1533138257.131891)  can0  000   [8]  F6 E4 23 3C 79 78 63 76
 (1533138257.132823)  can0  000   [8]  62 6E 6D 2C 2E 2D 51 57
 (1533138257.133735)  can0  000   [8]  45 52 54 5A 55 49 4F 50
 (1533138257.134680)  can0  000   [8]  DC 2A 41 53 44 46 47 48
 (1533138257.135589)  can0  000   [8]  4A 4B 4C D6 C4 59 58 43
 (1533138257.136486)  can0  000   [8]  56 42 4E 4D 3A 5F 31 32
 (1533138257.137445)  can0  000   [8]  33 34 35 36 37 38 39 30
 (1533138257.138387)  can0  000   [8]  DF A7 24 25 26 2F 28 29
 (1533138257.139306)  can0  000   [8]  3D 3F 2D 30 2E 32 2B 00
 (1533138257.139853)  can0  000   [1]  D3
Again, the frame stamped .130052 is later than the one stamped .129114, but it was read first!

Re: Signal Driven SocketCAN

Posted: Thu 2. Aug 2018, 11:36
by S.Grosjean
Hi,

FYI, we have been running a test for 2 days now with a modified candump that stops as soon as it reads a timestamp older than the previous one. So far we haven't been able to reproduce this issue. In the meantime, could you please check whether a BIOS upgrade exists for your PC? This issue might also come from your USB chipset.

Could you please tell us more about your Linux platform, i.e. the output of cat /proc/cpuinfo /proc/version and dmesg?
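
The modified candump itself is not shown in this thread. As a rough offline equivalent, the same check (stop at the first timestamp older than the previous one) can be sketched in Python against a saved log in the format pasted above. The names `parse_line` and `first_regression` are made up for this illustration, and the regular expression assumes the absolute-timestamp layout seen in this thread.

```python
import re

# Matches lines like " (1533138257.113354)  can0  000   [3]  00 00 00"
# (assumed layout: the absolute-timestamp output pasted in this thread).
LINE_RE = re.compile(
    r"\((\d+)\.(\d{6})\)\s+(\S+)\s+([0-9A-Fa-f]+)\s+\[(\d+)\]\s*(.*)"
)


def parse_line(line):
    """Return (timestamp_us, interface, can_id, data) or None if no match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    sec, usec, iface, can_id, _dlc, data = m.groups()
    # Integer microseconds avoid floating-point rounding of the timestamp.
    return int(sec) * 1_000_000 + int(usec), iface, can_id, data.strip()


def first_regression(lines):
    """Return (previous_frame, offending_frame) for the first frame whose
    timestamp is older than that of the frame read before it, else None."""
    prev = None
    for line in lines:
        frame = parse_line(line)
        if frame is None:
            continue  # skip anything that is not a frame line
        if prev is not None and frame[0] < prev[0]:
            return prev, frame
        prev = frame
    return None
```

Feeding it the lines of a saved log (for example `first_regression(open("can.log"))`, with a hypothetical file name) returns the pair of frames around the first regression, or `None` if all timestamps are non-decreasing.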

Re: Signal Driven SocketCAN

Posted: Thu 2. Aug 2018, 12:25
by martinkepplinger
S.Grosjean wrote: Hi,

FYI, we have been running a test for 2 days now with a modified candump that stops as soon as it reads a timestamp older than the previous one. So far we haven't been able to reproduce this issue. In the meantime, could you please check whether a BIOS upgrade exists for your PC? This issue might also come from your USB chipset.
There is indeed one for the current machine. I'll keep that in mind and will try to test on a different workstation that I know is up to date.
S.Grosjean wrote: Could you please tell us more about your Linux platform, i.e. the output of cat /proc/cpuinfo /proc/version and dmesg?
Sure. I only shortened dmesg because it's massive and includes tons of unrelated output. But the PEAK-related output looks interesting: getting the serial number fails. Anything else?

Re: Signal Driven SocketCAN

Posted: Thu 2. Aug 2018, 16:25
by S.Grosjean
Hi,

I have finally found it again:
https://marc.info/?l=linux-can&m=148000256801316&w=2

This thread between two linux-can members is about SMP and USB interfaces with regard to the socket layer. They complain about receiving out-of-order CAN frames. I contacted them again, and they confirm that receiving out-of-order CAN frames is possible on SMP systems (and yours is an 8-core one) when reading from a CAN socket.

So your code should take that issue into account. Or you should use a non-USB PC-CAN interface (from PEAK-System, of course ;-) ).

Note that this issue may well have been fixed in the network layer since 2016: v4.9 is 20 months old and, as said before, we're running v4.15 and haven't seen the issue yet.

Regards,

Re: Signal Driven SocketCAN

Posted: Fri 3. Aug 2018, 12:20
by martinkepplinger
Thanks, that's interesting. Have you tried to reproduce it on Linux 4.9?

Re: Signal Driven SocketCAN

Posted: Fri 3. Aug 2018, 13:42
by S.Grosjean
We're currently running the testbed on Debian 9 with linux-4.9.30 and a 12-core i7 CPU. Waiting for the issue to occur...

Re: Signal Driven SocketCAN

Posted: Mon 6. Aug 2018, 08:32
by martinkepplinger
We've been running the same test on Linux 4.17 over the weekend and couldn't reproduce the error. It really seems to be fixed. It'd be interesting to see what the fix looks like, but we're working around the issue by sorting frames at the receiver for now.

thanks for your support!
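
The thread does not show how the sorting at the receiver was done. One possible approach, assuming the receive timestamps can be trusted (as stated earlier in the thread), is a small reorder buffer that holds frames back for a short window and releases them in timestamp order. The function name `reorder` and the 5 ms default window are assumptions chosen for illustration; the swap shown above spans roughly 1 ms.

```python
import heapq


def reorder(frames, window_us=5000):
    """Yield (timestamp_us, data) tuples in timestamp order.

    A frame is held back until a frame at least `window_us` newer has been
    seen, so frames displaced by less than the window come out sorted.
    `window_us` is a hypothetical tuning value, not taken from the thread.
    """
    heap = []      # min-heap of (timestamp_us, arrival_index, data)
    newest = None  # newest timestamp seen so far
    arrival = 0    # tie-breaker: keeps arrival order for equal timestamps
    for ts, data in frames:
        heapq.heappush(heap, (ts, arrival, data))
        arrival += 1
        newest = ts if newest is None else max(newest, ts)
        # Release everything that is old enough to be considered settled.
        while heap and heap[0][0] <= newest - window_us:
            t, _, d = heapq.heappop(heap)
            yield t, d
    # Input exhausted: flush the remaining frames in order.
    while heap:
        t, _, d = heapq.heappop(heap)
        yield t, d
```

The trade-off is added latency of up to `window_us` per frame, and a frame displaced by more than the window would still come out late, so a sequence counter inside the payload is a more robust alternative where the protocol allows it.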