Dropped CAN-Frames

lstutz
Posts: 8
Joined: Fri 19. Jun 2015, 16:07

Dropped CAN-Frames

Post by lstutz » Fri 19. Jun 2015, 16:52

Hello,

I'm currently developing an application that uses PEAK USB CAN dongles. The dongles are attached to different machines running the latest Ubuntu (kernel 3.19.0-15) and Manjaro (kernel 4.1.0). SocketCAN is used to interface with the dongles.
The environment I'm facing will occasionally produce more than 90% busload at 250 kbit/s. I noticed that when I use the machine (moving or resizing the IDE window) while it is receiving messages, a lot of frames can be dropped. I attached the source for a demo that demonstrates the issue. Setting the CONFIG_RX or CONFIG_TX define builds a receiver or a transmitter demo, respectively. The TX creates a monotonically increasing sequence and the RX checks whether there are any gaps in the transmission. The busload can be adjusted by changing the usleep time in the TX program. In my setup, around 50% busload is sufficient to produce the issue. Start the RX first and then the TX, then try moving or resizing some windows on your machine. At some point the RX will exit and show the number of dropped frames.
My question is whether the fault is somewhere in my demo or inside the PEAK driver, and how I can make sure that no frames are dropped during reception.
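
For readers without the attachment, the receiver-side gap check described above could look roughly like the sketch below. This is not the code from canTest.c. It assumes the transmitter stores a 32-bit counter, least significant byte first, in the first four data bytes of each frame, and the helper names `seq_from_payload` and `missing_frames` are made up for illustration.

```c
#include <stdint.h>

/* Assumption: the sender stores a 32-bit counter, least significant byte
 * first, in data bytes 0..3 of each CAN frame. The real canTest.c may use
 * a different layout. */
uint32_t seq_from_payload(const uint8_t *data)
{
    return (uint32_t)data[0] |
           ((uint32_t)data[1] << 8) |
           ((uint32_t)data[2] << 16) |
           ((uint32_t)data[3] << 24);
}

/* Number of frames missing between two consecutively received sequence
 * numbers. Unsigned arithmetic keeps the result correct when the counter
 * wraps from 0xFFFFFFFF to 0. A repeated frame (seq == last_seq) shows up
 * as a very large value and should be treated as an error by the caller. */
uint32_t missing_frames(uint32_t last_seq, uint32_t seq)
{
    return seq - last_seq - 1u;
}
```

The receiver would call `missing_frames()` for every frame after the first one and add up the results; a non-zero total means frames were lost somewhere between the bus and the application.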
Attachment: canTest.c (2.47 KiB)
Best regards,
Leonhard

M.Maidhof
Support
Posts: 1753
Joined: Wed 22. Sep 2010, 14:00

Re: Dropped CAN-Frames

Post by M.Maidhof » Mon 22. Jun 2015, 09:18

Hi,

Do you use the kernel drivers, or did you install the peak-linux-driver with the SocketCAN option on your system? If you are using the peak-linux-driver, please post the output of cat /proc/pcan.

regards

Michael

S.Grosjean
Software Development
Posts: 357
Joined: Wed 4. Jul 2012, 17:02

Re: Dropped CAN-Frames

Post by S.Grosjean » Mon 22. Jun 2015, 09:23

Hello!

Same question from my side: do you run your application over the mainline "peak-usb" driver or over our "pcan" driver?

In both cases, if the hardware loses any frames, overrun information is passed to the upper layers.

Moreover, when using the socket layer, you must be sure that all of the frames you write are really sent to the driver! I don't see any mechanism in your code (in the TX part) that handles the case where the outgoing buffer of the socket layer is full (nbytes < 0 and errno=ENOBUFS).
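
As a rough illustration of such a mechanism, the sketch below retries a write when the transmit queue is full. It is only one possible approach, not the way canTest.c or any PEAK tool does it: the function name `send_frame_retry`, the 1 ms pause and the retry limit are arbitrary choices, and the socket is assumed to be an already opened and bound CAN_RAW socket.

```c
#define _POSIX_C_SOURCE 200809L /* for nanosleep() */
#include <errno.h>
#include <time.h>
#include <unistd.h>
#include <linux/can.h>

/* Write one CAN frame to fd. Returns 0 on success and -1 on error; errno
 * is left as set by write(), or set to EIO after a short write.
 * If the transmit queue is full (ENOBUFS), wait 1 ms and try again, at
 * most max_retries times. */
int send_frame_retry(int fd, const struct can_frame *frame, int max_retries)
{
    const struct timespec pause = { .tv_sec = 0, .tv_nsec = 1000000L };

    for (int attempt = 0; ; attempt++) {
        ssize_t nbytes = write(fd, frame, sizeof(*frame));

        if (nbytes == (ssize_t)sizeof(*frame))
            return 0;                  /* frame accepted by the queue */
        if (nbytes >= 0) {             /* short write, not expected here */
            errno = EIO;
            return -1;
        }
        if (errno != ENOBUFS && errno != EAGAIN && errno != EINTR)
            return -1;                 /* unrelated error, report it */
        if (attempt >= max_retries)
            return -1;                 /* queue still full, give up */
        nanosleep(&pause, NULL);       /* let the queue drain a little */
    }
}
```

With this in place the transmitter can tell a frame that was really queued from one that was silently discarded, so a gap seen by the receiver cannot be blamed on an unchecked write.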

Finally, you could try increasing the system's socket Rx buffer size to give the system more space to store incoming frames while you're resizing your window.
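
A per-socket way to request a larger receive buffer is the standard SO_RCVBUF option. The sketch below is a minimal example under the assumption that `fd` is the application's CAN_RAW socket; the function name `set_rx_buffer` is made up for illustration.

```c
#include <sys/socket.h>

/* Ask the kernel for a receive buffer of `bytes` bytes on socket fd and
 * return the size actually in effect, or -1 on error (errno is set).
 * Linux doubles the requested value to account for bookkeeping overhead
 * and silently caps the request at /proc/sys/net/core/rmem_max. */
int set_rx_buffer(int fd, int bytes)
{
    int effective = 0;
    socklen_t len = sizeof(effective);

    if (setsockopt(fd, SOL_SOCKET, SO_RCVBUF, &bytes, sizeof(bytes)) < 0)
        return -1;
    if (getsockopt(fd, SOL_SOCKET, SO_RCVBUF, &effective, &len) < 0)
        return -1;
    return effective;
}
```

If the returned size is smaller than hoped, rmem_max is the limit. It can be raised by root through the net.core.rmem_max sysctl, or bypassed with SO_RCVBUFFORCE by a process that has CAP_NET_ADMIN. Note that a bigger socket buffer can only help with frames dropped between the driver and the application, not with frames already lost in the adapter.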

Regards,

Stéphane

lstutz
Posts: 8
Joined: Fri 19. Jun 2015, 16:07

Re: Dropped CAN-Frames

Post by lstutz » Mon 22. Jun 2015, 11:05

Hi,

I'm using the peak-usb kernel drivers.

S.Grosjean wrote: In both cases, if the hardware loses any frames, overrun information is passed to the upper layers.

I don't get an overrun or error indication on the RX side ('ip -details -statistics link show can0' gives no error indication).

S.Grosjean wrote: Moreover, when using the socket layer, you must be sure that all of the frames you write are really sent to the driver! I don't see any mechanism in your code (in the TX part) that handles the case where the outgoing buffer of the socket layer is full (nbytes < 0 and errno=ENOBUFS).

I'm using can-utils (canbusload) to make sure the load is below 90%. I'm also running a cantrace on a third device which registers all frames, so I can say for sure that all frames get transmitted over the bus and that the problem is on the receiving side.

S.Grosjean wrote: Finally, you could try increasing the system's socket Rx buffer size to give the system more space to store incoming frames while you're resizing your window.

I checked my /proc/sys/net/core/rmem_default and it's set to 212992, which is the same as rmem_max. So I don't think I can set it any higher?

Regards,

Leonhard

S.Grosjean
Software Development
Posts: 357
Joined: Wed 4. Jul 2012, 17:02

Re: Dropped CAN-Frames

Post by S.Grosjean » Mon 22. Jun 2015, 11:48

Hi,

OK, thanks for your tests. To check whether the loss of frames really comes from the socket layer, could you please check the error counters located in your sysfs?

For example (supposing your PCAN-USB is connected as "can0"):

Code:

for c in /sys/class/net/can0/statistics/*; do echo -n "$(basename "$c"): "; cat "$c"; done

Thanks and regards,

Stéphane

lstutz
Posts: 8
Joined: Fri 19. Jun 2015, 16:07

Re: Dropped CAN-Frames

Post by lstutz » Mon 22. Jun 2015, 12:09

Thanks so far for your help!
Now it shows the Overrun errors:

Code:

for c in /sys/class/net/can0/statistics/*; do echo -n "'basename $c': "; cat /$c; done
'basename /sys/class/net/can0/statistics/collisions': 0
'basename /sys/class/net/can0/statistics/multicast': 0
'basename /sys/class/net/can0/statistics/rx_bytes': 81648
'basename /sys/class/net/can0/statistics/rx_compressed': 0
'basename /sys/class/net/can0/statistics/rx_crc_errors': 0
'basename /sys/class/net/can0/statistics/rx_dropped': 0
'basename /sys/class/net/can0/statistics/rx_errors': 1
'basename /sys/class/net/can0/statistics/rx_fifo_errors': 0
'basename /sys/class/net/can0/statistics/rx_frame_errors': 0
'basename /sys/class/net/can0/statistics/rx_length_errors': 0
'basename /sys/class/net/can0/statistics/rx_missed_errors': 0
'basename /sys/class/net/can0/statistics/rx_over_errors': 1
'basename /sys/class/net/can0/statistics/rx_packets': 10232
'basename /sys/class/net/can0/statistics/tx_aborted_errors': 0
'basename /sys/class/net/can0/statistics/tx_bytes': 0
'basename /sys/class/net/can0/statistics/tx_carrier_errors': 0
'basename /sys/class/net/can0/statistics/tx_compressed': 0
'basename /sys/class/net/can0/statistics/tx_dropped': 0
'basename /sys/class/net/can0/statistics/tx_errors': 0
'basename /sys/class/net/can0/statistics/tx_fifo_errors': 0
'basename /sys/class/net/can0/statistics/tx_heartbeat_errors': 0
'basename /sys/class/net/can0/statistics/tx_packets': 0
'basename /sys/class/net/can0/statistics/tx_window_errors': 0
And ip link shows them as well:

Code:

ip -details -statistics link show can0
22: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-ACTIVE restart-ms 0 
	  bitrate 250000 sample-point 0.875 
	  tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
	  pcan_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
	  clock 8000000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          0          0          0         
    RX: bytes  packets  errors  dropped overrun mcast   
    228776     28597    1       0       1       0      
    TX: bytes  packets  errors  dropped carrier collsns 
    0          0        0       0       0       0
So how can I make sure that the hardware buffer is always read before it overruns?

Regards,
Leonhard

S.Grosjean
Software Development
Posts: 357
Joined: Wed 4. Jul 2012, 17:02

Re: Dropped CAN-Frames

Post by S.Grosjean » Mon 22. Jun 2015, 17:08

Hi,

OK, your stats show that an overrun has occurred in the PCAN-USB. Could you please confirm that each time you lose some CAN frames (when you're resizing the window), "rx_over_errors" increases? In other words, we want to be sure that the frames you're losing are always lost because of the PCAN-USB overrun.
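
One way to check this correlation from inside the receiving program is to sample the counter whenever a gap is detected. The helper below is only a sketch; the name `read_counter` is made up, and the path in the comment assumes the interface is called can0.

```c
#include <stdio.h>

/* Read one decimal counter from a sysfs statistics file, for example
 * /sys/class/net/can0/statistics/rx_over_errors.
 * Returns 0 and stores the value in *value on success, -1 on failure. */
int read_counter(const char *path, unsigned long long *value)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return -1;

    int matched = fscanf(f, "%llu", value);
    fclose(f);
    return matched == 1 ? 0 : -1;
}
```

Reading the counter once at start-up and again each time the receiver sees a gap shows whether every loss is accompanied by an increase of rx_over_errors.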

Thanks and regards,

Stéphane

lstutz
Posts: 8
Joined: Fri 19. Jun 2015, 16:07

Re: Dropped CAN-Frames

Post by lstutz » Mon 22. Jun 2015, 18:00

Hi,
S.Grosjean wrote: OK, your stats show that an overrun has occurred in the PCAN-USB. Could you please confirm that each time you lose some CAN frames (when you're resizing the window), "rx_over_errors" increases? In other words, we want to be sure that the frames you're losing are always lost because of the PCAN-USB overrun.

That is somewhat inconsistent. On my Ubuntu machine, the overrun counter always increments when frames are lost. So after 4 lost frames I get:

Code:

ip -details -statistics link show can0
5: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 10
    link/can  promiscuity 0 
    can state ERROR-ACTIVE restart-ms 0 
	  bitrate 250000 sample-point 0.875 
	  tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
	  pcan_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
	  clock 8000000
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          0          0          0         
    RX: bytes  packets  errors  dropped overrun mcast   
    3047392    380924   4       0       4       0      
    TX: bytes  packets  errors  dropped carrier collsns 
    803896     100487   0       0       0       0 
On my Manjaro machine, losing frames does not result in reported overruns at all (the error reporting seems to be broken with the latest kernel driver version?). Here is a screen capture I did. It shows that I sometimes don't need to do anything in order to have dropped frames. Here the 0x073E frame is missing. This was at 58% busload. Is there some way to increase the internal buffer of the USB dongle via a driver parameter? I really need this to work reliably, and I never had issues with the dongles running on Windows machines.

Regards,
Leonhard

S.Grosjean
Software Development
Posts: 357
Joined: Wed 4. Jul 2012, 17:02

Re: Dropped CAN-Frames

Post by S.Grosjean » Tue 23. Jun 2015, 10:12

Hello!

Thanks for your tests. Your Manjaro system is running a *very* recent kernel, and it seems that some CAN frame loss issues have been discovered in these recent kernels (see https://lkml.org/lkml/2015/6/21/115). This could explain your issue on this system...

To help you work around this, I suggest using our "pcan" driver in netdev mode instead of the peak-usb driver. I personally did some high busload tests with the dongle at 500 kbit/s and didn't encounter any loss of frames as you did. This could be a good way of finding where the loss comes from.

So first, download the latest version of the pcan driver from http://www.peak-system.com/fileadmin/me ... .14.tar.gz. Then untar it and build it. Be sure to rmmod the peak-usb module before insmod'ing pcan.ko, otherwise the PCAN-USB CAN channel won't be visible to pcan.

Your socket-CAN application should not be impacted by this change.

Tell us what happens please.

Regards,

Stéphane

lstutz
Posts: 8
Joined: Fri 19. Jun 2015, 16:07

Re: Dropped CAN-Frames

Post by lstutz » Tue 23. Jun 2015, 14:52

Hello Stéphane,
S.Grosjean wrote: Your Manjaro system is running a *very* recent kernel, and it seems that some CAN frame loss issues have been discovered in these recent kernels (see https://lkml.org/lkml/2015/6/21/115). This could explain your issue on this system...

I applied the patch and recompiled my kernel, but the issue is still present.

S.Grosjean wrote: To help you work around this, I suggest using our "pcan" driver in netdev mode instead of the peak-usb driver.

I downloaded and built the driver according to the instructions. Unfortunately, they changed the asm stuff in the recent 4.1 kernel, so I was only able to test this on my Ubuntu machine running 3.19. The driver is loaded and set to 250 kbit/s.

Code:

lsmod | grep pcan
pcan                   94208  0 
pcmcia                 65536  1 pcan
i2c_algo_bit           16384  1 pcan
parport                45056  4 lp,pcan,ppdev,parport_pc
'lsmod | grep peak' returns no result.

Code:

cat /proc/pcan 

*------------- PEAK-System CAN interfaces (www.peak-system.com) -------------
*------------- Release_20141219_n (7.14.0) Jun 23 2015 11:30:01 --------------
*------------- [mod] [isa] [pci] [dng] [par] [usb] [pcc] [net] --------------
*--------------------- 1 interfaces @ major 250 found -----------------------
*n -type- -ndev- --base-- irq --btr- --read-- --write- --irqs-- -errors- status
32    usb   can0 ffffffff 225 0x011c 00000000 00000000 00000002 00000000 0x0000
But the issue remains the same: I'm still losing frames. Any further ideas?

Edit: When using the current peak-driver the missed frames also do not produce an overrun error.

Regards,
Leonhard
