Hi,
I seem to have almost exactly the same problem. I need frames to be delivered
in FIFO order, but (after running a simple tx/response at 115000 kbit/s for
some time) I see an out-of-order error, see below. Please note that "hours" may
not be enough for it to trigger, but we can trigger it repeatedly. Run it for
at least 1-2 days.
Our application uses the serialbus module of Qt 5.9.5, which uses Linux
socketcan on the current Debian 4.9 kernel (4.9.0-7-amd64 #1 SMP Debian
4.9.110-1 (2018-07-05) x86_64 GNU/Linux) and its mainline peak_usb driver.
What you see is the last correct communication (first 4 lines) and the
out-of-order error (last 4 lines). Again, the CAN frames' timestamps show the
_correct_ order of the frames. Look at the last 3 lines. Linux reads them
in the wrong order though! (The frame stamped 415732 us is read before the
frame stamped 415213 us.)
read-time [s.ms]  frame-time [s us]  frame-data
24.07.18-15:24:43.399 - CAN TX 0 0 : " 000 [3] 00 00 FF"
24.07.18-15:24:43.401 - CAN RX 1532438683 401589 : " 000 [8] 00 0F 6C 69 6E 72 70 63"
24.07.18-15:24:43.402 - CAN RX 1532438683 402493 : " 000 [8] 64 32 2D 31 38 2E 30 34"
24.07.18-15:24:43.403 - CAN RX 1532438683 403033 : " 000 [2] 00 9E"
24.07.18-15:24:43.411 - CAN TX 0 0 : " 000 [3] 00 00 FF"
24.07.18-15:24:43.414 - CAN RX 1532438683 414264 : " 000 [8] 00 0F 6C 69 6E 72 70 63"
24.07.18-15:24:43.415 - CAN RX 1532438683 415732 : " 000 [2] 00 9E"
24.07.18-15:24:43.417 - CAN RX 1532438683 415213 : " 000 [8] 64 32 2D 31 38 2E 30 34"
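For anyone who wants to run the same check on their own logs, here is a minimal sketch of the timestamp comparison in Python. It is not our application code (that is Qt/C++); the function name `first_out_of_order` is made up for illustration, and the timestamps are simply the RX frame-time values from the log above. TX lines are left out because they carry no frame timestamp (0 0).

```python
from typing import List, Optional, Tuple

# (seconds, microseconds) frame-time of the RX frames, in the order
# they were read, copied from the log above.
RX_STAMPS: List[Tuple[int, int]] = [
    (1532438683, 401589),
    (1532438683, 402493),
    (1532438683, 403033),
    (1532438683, 414264),
    (1532438683, 415732),
    (1532438683, 415213),
]


def first_out_of_order(stamps: List[Tuple[int, int]]) -> Optional[int]:
    """Return the index of the first frame whose timestamp is older than
    that of the frame read just before it, or None if the order is fine.

    Hypothetical helper for illustration. Tuples compare element by
    element, so seconds are compared first, then microseconds.
    """
    for i in range(1, len(stamps)):
        if stamps[i] < stamps[i - 1]:
            return i
    return None


index = first_out_of_order(RX_STAMPS)
if index is not None:
    print("frame", index, "was read after a frame with a newer timestamp")
```

With the values above, the check flags the last frame, which matches what we see by eye in the log.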
We run candump simultaneously for verification. It sees exactly the same
misordering (we trigger an exit in our own application, but watching the
timestamps like you did works equally well):
can0 000 [3] 00 00 FF
can0 000 [8] 00 0F 6C 69 6E 72 70 63
can0 000 [8] 64 32 2D 31 38 2E 30 34
can0 000 [2] 00 9E
can0 000 [3] 00 00 FF
can0 000 [8] 00 0F 6C 69 6E 72 70 63
can0 000 [2] 00 9E
can0 000 [8] 64 32 2D 31 38 2E 30 34
There are no error frames arriving.
We use the IPEH-002021 USB device, ser.no. 61680 (firmware version 2.8). Might
there be a firmware issue you know of?
We are further investigating, but it seems like there is a real problem.
Do you have further questions? Thanks for any advice,
Martin