Dropped CAN-Frames
Hello,
I'm currently developing an application using USB PEAK-CAN dongles. The dongles are attached to different machines running the latest Ubuntu (kernel 3.19.0-15) and Manjaro (kernel 4.1.0). SocketCAN is used to interface with the dongles.
My environment occasionally produces >90% busload at 250 kbit/s. I noticed that when I use the machine (moving or resizing the IDE window) while receiving messages, a lot of frames can be dropped. I attached the source for a demo that demonstrates the issue. By setting the CONFIG_RX or CONFIG_TX define, a receiver or a transmitter demo can be built. The TX sends a monotonically increasing sequence and the RX checks whether there are any gaps in the transmission. The busload can be adjusted by changing the usleep time in the TX program. In my setup, around 50% busload is sufficient to reproduce the issue. Start the RX first and then the TX, then try moving or resizing some windows on your machine. At some point the RX will exit and show the number of dropped frames.
My question is whether the fault is somewhere in my demo or inside the PEAK driver, and how I can make sure that no frames are dropped during reception.
Best regards,
Leonhard
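A minimal sketch of the kind of gap check the RX demo is described as doing. The attached demo source is not reproduced in this thread, so the 16-bit counter width and the name count_dropped are assumptions made only for illustration.

```python
def count_dropped(seq_numbers, modulus=1 << 16):
    """Count missing values in a stream of wrapping sequence numbers.

    Assumes the sender increments by exactly 1 per frame and that fewer
    than `modulus` frames are lost in a row (otherwise the gap aliases).
    """
    dropped = 0
    expected = None
    for seq in seq_numbers:
        if expected is not None:
            # Distance from the expected value, modulo the counter width.
            dropped += (seq - expected) % modulus
        expected = (seq + 1) % modulus
    return dropped


# One value is skipped between the second and third entry,
# so one frame is reported missing.
print(count_dropped([0x0010, 0x0011, 0x0013, 0x0014]))
```

Counting the distance modulo the counter width keeps the check correct across a wrap from 0xFFFF to 0x0000, which a plain "seq != last + 1" comparison would report as a loss.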
Re: Dropped CAN-Frames
Hi,
Do you use the kernel drivers, or did you install the peak-linux-driver with the SocketCAN option on your system? If you use the peak-linux-driver, please post the output of cat /proc/pcan.
Regards,
Michael
- S.Grosjean
- Software Development
- Posts: 357
- Joined: Wed 4. Jul 2012, 17:02
Re: Dropped CAN-Frames
Hello!
Same question: do you run your application on top of the mainline "peak-usb" driver or on top of our "pcan" driver?
In both cases, if the hardware loses any frames, overrun information is passed to the upper layers.
Moreover, when using the socket layer, you must make sure that all of the frames you write are really sent to the driver! I don't see any mechanism in the TX part of your code that handles the case where the outgoing buffer of the socket layer is full (nbytes < 0 and errno=ENOBUFS).
Finally, you could try increasing the system's socket Rx buffer size to give the system more room to store incoming frames while you're resizing your window.
Regards,
Stéphane
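A rough sketch of the ENOBUFS handling mentioned above, written in Python rather than in the C of the original demo. The helper names pack_frame and send_with_retry, the retry count and the back-off time are all assumptions; the socket is passed in, so nothing here opens a CAN interface by itself.

```python
import errno
import struct
import time

# Layout of struct can_frame: can_id (u32), dlc (u8), 3 pad bytes, 8 data bytes.
CAN_FRAME_FMT = "=IB3x8s"


def pack_frame(can_id, data):
    """Build a 16-byte classic CAN frame for a raw SocketCAN socket."""
    if len(data) > 8:
        raise ValueError("classic CAN carries at most 8 data bytes")
    return struct.pack(CAN_FRAME_FMT, can_id, len(data), data.ljust(8, b"\x00"))


def send_with_retry(sock, frame, retries=100, backoff_s=0.001):
    """Send one frame, retrying while the TX queue reports ENOBUFS.

    Returns the number of retries that were needed. Any other error is
    re-raised, and so is ENOBUFS once `retries` is exhausted.
    """
    for attempt in range(retries + 1):
        try:
            sock.send(frame)
            return attempt
        except OSError as exc:
            if exc.errno != errno.ENOBUFS or attempt == retries:
                raise
            time.sleep(backoff_s)  # give the driver time to drain its queue
```

With a real interface the socket would be created with socket.socket(socket.AF_CAN, socket.SOCK_RAW, socket.CAN_RAW) and bound with sock.bind(("can0",)). Without such a retry, a frame that hits ENOBUFS never reaches the bus and shows up at the receiver as a gap, even though nothing was lost on the receiving side.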
Re: Dropped CAN-Frames
Hi,
I'm using the peak-usb kernel drivers.
S.Grosjean wrote:
> In both cases, if the hardware loses any frames, overrun information is passed to the upper layers.
I don't get an overrun or error indication on the RX side ('ip -details -statistics link show can0' gives no error indication).
S.Grosjean wrote:
> Moreover, when using the socket layer, you must make sure that all of the frames you write are really sent to the driver! I don't see any mechanism in the TX part of your code that handles the case where the outgoing buffer of the socket layer is full (nbytes < 0 and errno=ENOBUFS).
I'm using can-utils (canbusload) to make sure the load is below 90%. I'm also running a cantrace on a third device which registers all frames, so I can say for sure that all frames get transmitted over the bus and that the problem is on the receiving side.
S.Grosjean wrote:
> Finally, you could try increasing the system's socket Rx buffer size to give the system more room to store incoming frames while you're resizing your window.
I checked my /proc/sys/net/core/rmem_default and it's set to 212992, which is the same as rmem_max. So I don't think I can set it any higher?
Regards,
Leonhard
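On the buffer question: rmem_max is only the cap for what an unprivileged setsockopt(SO_RCVBUF) may request, and root can raise it (for example with sysctl net.core.rmem_max). The sketch below shows how an application could read the cap and request a larger per-socket buffer. The function names are made up for illustration, and whether a bigger socket buffer helps in this case is not established by the thread.

```python
import socket


def read_rmem_max(path="/proc/sys/net/core/rmem_max"):
    """Return the system-wide cap for SO_RCVBUF in bytes, or None if unreadable."""
    try:
        with open(path) as fh:
            return int(fh.read().strip())
    except (OSError, ValueError):
        return None


def request_rcvbuf(sock, nbytes):
    """Ask for a receive buffer of `nbytes` and return what the kernel reports.

    On Linux the request is capped at rmem_max, and the reported value is
    twice the accepted size because the kernel reserves bookkeeping space.
    """
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, nbytes)
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
```

Calling request_rcvbuf on the CAN_RAW socket right after bind and comparing the result with read_rmem_max() shows whether the request was silently truncated.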
Re: Dropped CAN-Frames
Hi,
Ok, thanks for your tests. To check whether the loss of frames really comes from the socket layer, could you please check the error counters in your sysfs?
For example (supposing your PCAN-USB is connected as "can0"):
Code: Select all
for c in /sys/class/net/can0/statistics/*; do echo -n "`basename $c`: "; cat $c; done
Thanks and regards,
Stéphane
Re: Dropped CAN-Frames
Thanks so far for your help!
Now it shows the overrun errors, both in the sysfs statistics and in the ip link output (the two listings are below).
So how can I make sure that the hardware buffer is always read before it overruns?
Code: Select all
for c in /sys/class/net/can0/statistics/*; do echo -n "'basename $c': "; cat /$c; done
'basename /sys/class/net/can0/statistics/collisions': 0
'basename /sys/class/net/can0/statistics/multicast': 0
'basename /sys/class/net/can0/statistics/rx_bytes': 81648
'basename /sys/class/net/can0/statistics/rx_compressed': 0
'basename /sys/class/net/can0/statistics/rx_crc_errors': 0
'basename /sys/class/net/can0/statistics/rx_dropped': 0
'basename /sys/class/net/can0/statistics/rx_errors': 1
'basename /sys/class/net/can0/statistics/rx_fifo_errors': 0
'basename /sys/class/net/can0/statistics/rx_frame_errors': 0
'basename /sys/class/net/can0/statistics/rx_length_errors': 0
'basename /sys/class/net/can0/statistics/rx_missed_errors': 0
'basename /sys/class/net/can0/statistics/rx_over_errors': 1
'basename /sys/class/net/can0/statistics/rx_packets': 10232
'basename /sys/class/net/can0/statistics/tx_aborted_errors': 0
'basename /sys/class/net/can0/statistics/tx_bytes': 0
'basename /sys/class/net/can0/statistics/tx_carrier_errors': 0
'basename /sys/class/net/can0/statistics/tx_compressed': 0
'basename /sys/class/net/can0/statistics/tx_dropped': 0
'basename /sys/class/net/can0/statistics/tx_errors': 0
'basename /sys/class/net/can0/statistics/tx_fifo_errors': 0
'basename /sys/class/net/can0/statistics/tx_heartbeat_errors': 0
'basename /sys/class/net/can0/statistics/tx_packets': 0
'basename /sys/class/net/can0/statistics/tx_window_errors': 0
Code: Select all
ip -details -statistics link show can0
22: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 10
link/can promiscuity 0
can state ERROR-ACTIVE restart-ms 0
bitrate 250000 sample-point 0.875
tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
pcan_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
clock 8000000
re-started bus-errors arbit-lost error-warn error-pass bus-off
0 0 0 0 0 0
RX: bytes packets errors dropped overrun mcast
228776 28597 1 0 1 0
TX: bytes packets errors dropped carrier collsns
0 0 0 0 0 0
Regards,
Leonhard
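As a cross-check of the interface configuration shown in the ip output, the bit timing values can be recomputed by hand. The sketch below does that arithmetic; the function name bit_timing is an assumption, while the input numbers are taken from the listing.

```python
def bit_timing(tq_ns, prop_seg, phase_seg1, phase_seg2, clock_hz):
    """Derive bitrate, sample point and prescaler from `ip -details` values."""
    sync_seg = 1  # the sync segment is always one time quantum
    tq_per_bit = sync_seg + prop_seg + phase_seg1 + phase_seg2
    bitrate = round(1e9 / (tq_per_bit * tq_ns))
    sample_point = (sync_seg + prop_seg + phase_seg1) / tq_per_bit
    brp = round(tq_ns * clock_hz / 1e9)
    return bitrate, sample_point, brp


# tq 250, prop-seg 6, phase-seg1 7, phase-seg2 2, clock 8000000
print(bit_timing(250, 6, 7, 2, 8_000_000))  # (250000, 0.875, 2)
```

The result matches the reported bitrate 250000 and sample-point 0.875, and the prescaler of 2 lies inside the listed brp range of 1..64, so the timing setup itself looks consistent.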
Re: Dropped CAN-Frames
Hi,
Ok, your stats show that an overrun has occurred in the PCAN-USB. Could you please confirm that each time you lose some CAN frames (when you're resizing the window), "rx_over_errors" increases? In other words, we want to be sure that the frames you're losing are always lost because of a PCAN-USB overrun.
Thanks and regards,
Stéphane
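One way to make that comparison less manual is to snapshot the counters before and after a test run and look only at what changed. This is a sketch under the assumption that every file in the statistics directory holds a single integer; the function names are hypothetical.

```python
import os


def read_counters(stats_dir):
    """Read every counter file in a netdev statistics directory into a dict."""
    counters = {}
    for name in sorted(os.listdir(stats_dir)):
        with open(os.path.join(stats_dir, name)) as fh:
            counters[name] = int(fh.read().strip())
    return counters


def counter_delta(before, after):
    """Return only the counters whose value changed between two snapshots."""
    return {
        name: value - before.get(name, 0)
        for name, value in after.items()
        if value != before.get(name, 0)
    }
```

Taking read_counters("/sys/class/net/can0/statistics") before starting the TX and again after the RX exits, the "rx_over_errors" entry of counter_delta can be compared with the number of gaps the RX demo reports.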
Re: Dropped CAN-Frames
Hi,
S.Grosjean wrote:
> Ok, your stats show that an overrun has occurred in the PCAN-USB. Could you please confirm that each time you lose some CAN frames (when you're resizing the window), "rx_over_errors" increases? In other words, we want to be sure that the frames you're losing are always lost because of a PCAN-USB overrun.
That is somewhat inconsistent. On my Manjaro machine, losing frames does not result in reported overruns at all (the error reporting seems to be broken with the latest kernel driver version?). Here is a screen capture I did. It shows that I sometimes don't need to do anything in order to have dropped frames. Here the 0x073E frame is missing. This was at 58% busload. Is there some way to increase the internal buffer of the USB dongle via a driver parameter? I really need this to work reliably, and I never had issues with the dongles running on Windows machines.
On my Ubuntu machine, the overrun counter always increments when frames are lost. So after 4 lost frames I get:
Code: Select all
ip -details -statistics link show can0
5: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 16 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 10
link/can promiscuity 0
can state ERROR-ACTIVE restart-ms 0
bitrate 250000 sample-point 0.875
tq 250 prop-seg 6 phase-seg1 7 phase-seg2 2 sjw 1
pcan_usb: tseg1 1..16 tseg2 1..8 sjw 1..4 brp 1..64 brp-inc 1
clock 8000000
re-started bus-errors arbit-lost error-warn error-pass bus-off
0 0 0 0 0 0
RX: bytes packets errors dropped overrun mcast
3047392 380924 4 0 4 0
TX: bytes packets errors dropped carrier collsns
803896 100487 0 0 0 0
Regards,
Leonhard
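To put the busload figures into frames per second, the arithmetic can be sketched as follows. It assumes classic CAN frames with 11-bit identifiers and 8 data bytes (the RX byte and packet counters above work out to 8 bytes per frame); the identifier width is not stated in the thread, and the function names are made up.

```python
def frame_bits(data_len, stuffed=False):
    """Bits on the wire for a classic CAN frame with an 11-bit identifier.

    Includes the 3-bit interframe space. With stuffed=True the worst-case
    number of stuff bits is added; real frames fall between both values.
    """
    bits = 47 + 8 * data_len  # fixed fields plus payload
    if stuffed:
        bits += (34 + 8 * data_len - 1) // 4
    return bits


def frames_per_second(bitrate, busload, data_len=8, stuffed=False):
    """Frame rate that produces the given busload (0.0 .. 1.0)."""
    return bitrate * busload / frame_bits(data_len, stuffed)


# 58% busload at 250 kbit/s, without and with worst-case bit stuffing.
print(frames_per_second(250_000, 0.58))
print(frames_per_second(250_000, 0.58, stuffed=True))
```

Under these assumptions 58% busload is somewhere above a thousand frames per second, so a receive path that stalls for even a few milliseconds has several frames queued up behind it.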
Re: Dropped CAN-Frames
Hello!
Thanks for your tests. Your Manjaro system is running a *very* recent kernel, and it seems that some CAN frame loss issues have been discovered in these recent kernels (see https://lkml.org/lkml/2015/6/21/115). This could explain your issue on this system...
To help you work around this, I would suggest using our "pcan" driver in netdev mode instead of the peak-usb driver. I personally did some high-busload tests with the dongle at 500 kbit/s and didn't encounter any frame loss like you did. This could be a good way of finding out where the loss comes from.
So first, download the latest version of the pcan driver from http://www.peak-system.com/fileadmin/me ... .14.tar.gz. Then untar it and run make. Be sure to rmmod the peak-usb module before insmod'ing pcan.ko, otherwise the PCAN-USB CAN channel won't be visible to pcan.
Your socket-CAN application should not be impacted by this change.
Tell us what happens please.
Regards,
Stéphane
Re: Dropped CAN-Frames
Hello Stéphane,
S.Grosjean wrote:
> Your Manjaro system is running a *very* recent kernel, and it seems that some CAN frame loss issues have been discovered in these recent kernels (see https://lkml.org/lkml/2015/6/21/115). This could explain your issue on this system...
I applied the patch and recompiled my kernel, but the issue is still present.
S.Grosjean wrote:
> To help you work around this, I would suggest using our "pcan" driver in netdev mode instead of the peak-usb driver.
I downloaded and built the driver according to the instructions. Unfortunately, they changed the asm stuff in the recent 4.1 kernel, so I was only able to test this on my Ubuntu machine running 3.19. The driver is loaded and set to 250 kbit/s, and 'lsmod | grep peak' returns no result.
But the issue remains the same: I'm still losing frames. Any further ideas?
For reference, here is the output of lsmod and /proc/pcan:
Code: Select all
lsmod | grep pcan
pcan 94208 0
pcmcia 65536 1 pcan
i2c_algo_bit 16384 1 pcan
parport 45056 4 lp,pcan,ppdev,parport_pc
Code: Select all
cat /proc/pcan
*------------- PEAK-System CAN interfaces (www.peak-system.com) -------------
*------------- Release_20141219_n (7.14.0) Jun 23 2015 11:30:01 --------------
*------------- [mod] [isa] [pci] [dng] [par] [usb] [pcc] [net] --------------
*--------------------- 1 interfaces @ major 250 found -----------------------
*n -type- -ndev- --base-- irq --btr- --read-- --write- --irqs-- -errors- status
32 usb can0 ffffffff 225 0x011c 00000000 00000000 00000002 00000000 0x0000
Edit: When using the current peak-driver the missed frames also do not produce an overrun error.
Regards,
Leonhard
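The driver check done above with lsmod can also be scripted, which helps when switching back and forth between the two drivers during tests. This sketch parses the text of /proc/modules; the function names are hypothetical, and "peak_usb" is the name under which the mainline module is listed.

```python
def loaded_modules(proc_modules_text):
    """Return the set of module names from the contents of /proc/modules."""
    names = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields:
            names.add(fields[0])  # first column is the module name
    return names


def peak_drivers_loaded(modules):
    """List which of the two PEAK driver modules are currently loaded."""
    return sorted(m for m in ("peak_usb", "pcan") if m in modules)
```

Reading /proc/modules and passing its contents through both functions should return a list with exactly one entry; if both modules show up, the rmmod step was skipped and pcan may not see the PCAN-USB channel.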