netdev with PCAN-USB FD, wrong error frames

ckeydel
Posts: 26
Joined: Thu 4. Nov 2010, 16:06

Post by ckeydel » Fri 5. Mar 2021, 15:14

Hi Guys,

I have an Ubuntu Linux (x64) system with kernel 5.4.0-66-generic and a PCAN-USB FD attached, which I configured for CAN FD as follows:

Code:

ip link set can0 up type can bitrate 500000 sample-point 0.750 dbitrate 2000000 dsample-point 0.800 fd on

In my application this works well, but the issue is how to get error information. I have enabled reception of error frames (and also turned off loopback and enabled FD) as follows:

Code:

  /* set socket options */
  {
    int value;

    value = 0;
    // disable loopback
    setsockopt(mCANhandle[CAN_PORT_INDEX], SOL_CAN_RAW, CAN_RAW_LOOPBACK,
           &value, sizeof(value));
    // enable error filter, note that actual error frame support depends on kernel driver
    value = CAN_ERR_MASK;
    setsockopt(mCANhandle[CAN_PORT_INDEX], SOL_CAN_RAW, CAN_RAW_ERR_FILTER,
           &value, sizeof(value));
    // enable can fd
    value = 1;
    setsockopt(mCANhandle[CAN_PORT_INDEX], SOL_CAN_RAW, CAN_RAW_FD_FRAMES,
           &value, sizeof(value));
  }

My application is currently only sending (periodically, 2-4 messages per second). There is no other sending node on the bus, only another PCAN-USB Pro FD to acknowledge and capture messages. When I pull the PCAN-USB FD from the bus (LED turns red) and reconnect it, I receive the following error frames:

Error (16 bytes, ID, data0-4): 0x00000004 0x00 0x04 0x00 0x00 0x00
Error (16 bytes, ID, data0-4): 0x00000004 0x00 0x10 0x00 0x00 0x00
Error (16 bytes, ID, data0-4): 0x00000004 0x00 0x01 0x00 0x00 0x00
Error (16 bytes, ID, data0-4): 0x00000004 0x00 0x04 0x00 0x00 0x00
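
Frames like the ones listed above can be told apart from ordinary traffic by checking CAN_ERR_FLAG in can_id. The following is a minimal sketch, not the application's actual code: it assumes a CAN_RAW socket with CAN_RAW_FD_FRAMES enabled, so that read() returns either CAN_MTU bytes (16, as in the log above) or CANFD_MTU bytes. The names frame_kind, classify_frame and error_class are made up for illustration. The printed ID 0x00000004 presumably already has CAN_ERR_FLAG masked off.

```c
#include <linux/can.h>
#include <linux/can/error.h>
#include <stddef.h>

/* Hypothetical result type for this sketch; not part of SocketCAN. */
enum frame_kind { KIND_INVALID, KIND_ERROR, KIND_CLASSIC, KIND_FD };

/*
 * Classify what read() returned from a CAN_RAW socket that has
 * CAN_RAW_FD_FRAMES enabled. 'nbytes' is the return value of read().
 */
static enum frame_kind classify_frame(const struct canfd_frame *f, size_t nbytes)
{
    if (nbytes == CAN_MTU) {
        /* 16-byte frame: a classic data frame or an error message frame */
        return (f->can_id & CAN_ERR_FLAG) ? KIND_ERROR : KIND_CLASSIC;
    }
    if (nbytes == CANFD_MTU)
        return KIND_FD;
    return KIND_INVALID;
}

/* Error class bits (e.g. CAN_ERR_CRTL == 0x4) with the flag bits stripped. */
static canid_t error_class(const struct canfd_frame *f)
{
    return f->can_id & CAN_ERR_MASK;
}
```

For a frame classified as KIND_ERROR whose error_class() contains CAN_ERR_CRTL, data[1] carries the controller status discussed in this thread.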

According to linux/can/error.h, ID=0x4 means "controller problems", ok. The important information is in data byte 1:

Code:

/* error status of CAN-controller / data[1] */
#define CAN_ERR_CRTL_UNSPEC      0x00 /* unspecified */
#define CAN_ERR_CRTL_RX_OVERFLOW 0x01 /* RX buffer overflow */
#define CAN_ERR_CRTL_TX_OVERFLOW 0x02 /* TX buffer overflow */
#define CAN_ERR_CRTL_RX_WARNING  0x04 /* reached warning level for RX errors */
#define CAN_ERR_CRTL_TX_WARNING  0x08 /* reached warning level for TX errors */
#define CAN_ERR_CRTL_RX_PASSIVE  0x10 /* reached error passive status RX */
#define CAN_ERR_CRTL_TX_PASSIVE  0x20 /* reached error passive status TX */
				      /* (at least one error counter exceeds */
				      /* the protocol-defined level of 127)  */
#define CAN_ERR_CRTL_ACTIVE      0x40 /* recovered to error active state */

So the errors I received were RX warning level, RX error passive and RX buffer overflow. These are all receive related, even though my application was only transmitting. This correlates with the device statistics:

Code:

$ ip -details -statistics link show can0
4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
    link/can  promiscuity 0 minmtu 0 maxmtu 0 
    can <FD> state ERROR-PASSIVE (berr-counter tx 0 rx 135) restart-ms 0 
	  bitrate 500000 sample-point 0.750 
	  tq 12 prop-seg 59 phase-seg1 60 phase-seg2 40 sjw 1
	  pcan_usb_fd: tseg1 1..256 tseg2 1..128 sjw 1..128 brp 1..1024 brp-inc 1
	  dbitrate 2000000 dsample-point 0.800 
	  dtq 12 dprop-seg 15 dphase-seg1 16 dphase-seg2 8 dsjw 1
	  pcan_usb_fd: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..1024 dbrp-inc 1
	  clock 80000000 
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          2          2          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    RX: bytes  packets  errors  dropped overrun mcast   
    13300      1939     2       0       2       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    150003     102478   0       0       0       0       

So here, too, only receive errors are shown. After I plug the device back in and transmit from the other side, the device eventually becomes error active again and the LED turns green. The statistics then:

Code:

$ ip -details -statistics link show can0
4: can0: <NOARP,UP,LOWER_UP,ECHO> mtu 72 qdisc pfifo_fast state UP mode DEFAULT group default qlen 10
    link/can  promiscuity 0 minmtu 0 maxmtu 0 
    can <FD> state ERROR-ACTIVE (berr-counter tx 0 rx 83) restart-ms 0 
	  bitrate 500000 sample-point 0.750 
	  tq 12 prop-seg 59 phase-seg1 60 phase-seg2 40 sjw 1
	  pcan_usb_fd: tseg1 1..256 tseg2 1..128 sjw 1..128 brp 1..1024 brp-inc 1
	  dbitrate 2000000 dsample-point 0.800 
	  dtq 12 dprop-seg 15 dphase-seg1 16 dphase-seg2 8 dsjw 1
	  pcan_usb_fd: dtseg1 1..32 dtseg2 1..16 dsjw 1..16 dbrp 1..1024 dbrp-inc 1
	  clock 80000000 
	  re-started bus-errors arbit-lost error-warn error-pass bus-off
	  0          0          0          2          2          0         numtxqueues 1 numrxqueues 1 gso_max_size 65536 gso_max_segs 65535 
    RX: bytes  packets  errors  dropped overrun mcast   
    13578      1985     2       0       2       0       
    TX: bytes  packets  errors  dropped carrier collsns 
    151749     102847   0       0       0       0       

Two issues:
  • Why do the error frames and the device status show receive errors even though my application only transmits?
  • Why is no error frame with status CAN_ERR_CRTL_ACTIVE generated after recovery to error active?

Any light you can shed on this is appreciated.

Cheers,
Chris

S.Grosjean
Software Development
Posts: 302
Joined: Wed 4. Jul 2012, 17:02

Post by S.Grosjean » Mon 8. Mar 2021, 14:50

Hello,

Just a clarification: when you reconnect the PCAN-USB FD, your application is still periodically writing, right?

I mean: you disconnect the PCAN-USB FD. The driver then detects the disconnection and closes the CAN interface. Therefore, in my opinion, the CAN interface first has to be configured again before you start writing.

Regards,
— Stéphane

M.Heidemann
Sales & Support
Posts: 673
Joined: Fri 20. Sep 2019, 13:31

Post by M.Heidemann » Mon 8. Mar 2021, 15:43

Hello Chris,

Additional info regarding this:

Do you also disconnect the termination when disconnecting the device from the bus?

If so, this will cause Rx errors, as the CAN controller will be unable to recognize the frame it sent at all (stuff bit errors).

When the node is disconnected with the termination still attached to it, Tx errors will be generated instead (no acknowledgment in the ACK slot).
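
A rough model of the terminated case may help show why a lone transmitter ends up error passive rather than bus-off. The sketch below is not read from the PCAN-USB FD or its driver. It assumes the standard CAN fault-confinement rules: the transmit error counter rises by 8 per unacknowledged transmission, the warning level is 96, error passive starts at 128, and ACK errors no longer raise the counter once the transmitter is error passive. All names are hypothetical.

```c
/* Hypothetical names; a simplified model of the transmit error counter (TEC). */
enum tec_level { TEC_ACTIVE, TEC_WARNING, TEC_PASSIVE };

/* Map a counter value to the state a controller would report. */
static enum tec_level level_from_counter(unsigned tec)
{
    if (tec >= 128)
        return TEC_PASSIVE;   /* counter exceeds 127 */
    if (tec >= 96)
        return TEC_WARNING;   /* warning level reached */
    return TEC_ACTIVE;
}

/* Apply one transmission attempt that nobody acknowledged. */
static unsigned tec_after_ack_error(unsigned tec)
{
    /* Assumed exception: an error-passive transmitter that only sees
     * missing ACKs does not increase its counter any further. */
    if (tec >= 128)
        return tec;
    return tec + 8;
}
```

Under these assumptions, 12 unacknowledged attempts reach the warning level and 16 reach error passive, after which the counter holds at 128 and bus-off is never reached.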

Please let us know.

Best Regards

Marvin

ckeydel
Posts: 26
Joined: Thu 4. Nov 2010, 16:06

Post by ckeydel » Fri 12. Mar 2021, 16:19

Hello Marvin,

Thank you for your answer. That is an excellent explanation and I've just tried it. This is what I get when I pull the PCAN-USB FD off the bus, leaving the termination on:

Error (ID, data0-4): 0x00000004 0x00 0x08 0x00 0x00 0x00
Error (ID, data0-4): 0x00000004 0x00 0x20 0x00 0x00 0x00

I checked and can0 becomes error passive. Then after quite some time:

Error (ID, data0-4): 0x00000004 0x00 0x01 0x00 0x00 0x00
Error (ID, data0-4): 0x00000004 0x00 0x01 0x00 0x00 0x00
Error (ID, data0-4): 0x00000004 0x00 0x01 0x00 0x00 0x00
Error (ID, data0-4): 0x00000004 0x00 0x01 0x00 0x00 0x00
Error (ID, data0-4): 0x00000004 0x00 0x01 0x00 0x00 0x00
Error (ID, data0-4): 0x00000004 0x00 0x01 0x00 0x00 0x00

After I reconnect the bus:

Error (ID, data0-4): 0x00000004 0x00 0x08 0x00 0x00 0x00

can0 becomes error-active again.

So yes, you're right and now it sees the transmit errors. But the transition to error active is not indicated.
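
The data[1] values in these logs can be decoded mechanically with the CAN_ERR_CRTL_* bits from linux/can/error.h. This is only a sketch of one way to do it; ctrl_status_str is a made-up helper name, and the short strings are simply the macro names with the CAN_ERR_CRTL_ prefix dropped.

```c
#include <linux/can/error.h>
#include <stdio.h>
#include <string.h>

/*
 * Write the names of all controller status bits set in 'status'
 * (data[1] of a CAN_ERR_CRTL error frame) into 'buf', separated by '|'.
 * Returns 'buf' for convenience.
 */
static const char *ctrl_status_str(unsigned char status, char *buf, size_t len)
{
    static const struct {
        unsigned char bit;
        const char *name;
    } bits[] = {
        { CAN_ERR_CRTL_RX_OVERFLOW, "RX_OVERFLOW" },
        { CAN_ERR_CRTL_TX_OVERFLOW, "TX_OVERFLOW" },
        { CAN_ERR_CRTL_RX_WARNING,  "RX_WARNING"  },
        { CAN_ERR_CRTL_TX_WARNING,  "TX_WARNING"  },
        { CAN_ERR_CRTL_RX_PASSIVE,  "RX_PASSIVE"  },
        { CAN_ERR_CRTL_TX_PASSIVE,  "TX_PASSIVE"  },
        { CAN_ERR_CRTL_ACTIVE,      "ACTIVE"      },
    };

    if (len == 0)
        return buf;
    buf[0] = '\0';
    if (status == CAN_ERR_CRTL_UNSPEC) {
        snprintf(buf, len, "UNSPEC");
        return buf;
    }
    for (size_t i = 0; i < sizeof(bits) / sizeof(bits[0]); i++) {
        if (!(status & bits[i].bit))
            continue;
        size_t used = strlen(buf);
        /* snprintf always NUL-terminates and truncates if buf is too small */
        snprintf(buf + used, len - used, "%s%s", used ? "|" : "", bits[i].name);
    }
    return buf;
}
```

For the frames in this post, 0x08 decodes to TX_WARNING, 0x20 to TX_PASSIVE and 0x01 to RX_OVERFLOW.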

Cheers,
Chris

ckeydel
Posts: 26
Joined: Thu 4. Nov 2010, 16:06

Post by ckeydel » Fri 12. Mar 2021, 16:21

S.Grosjean wrote:
Mon 8. Mar 2021, 14:50
Just a clarification: when you reconnect the PCAN-USB FD, your application is still periodically writing, right?

I mean: you disconnect the PCAN-USB FD. The driver then detects the disconnection and closes the CAN interface. Therefore, in my opinion, the CAN interface first has to be configured again before you start writing.

Hello Stéphane,

This was a misunderstanding: I disconnect the PCAN-USB FD from the CAN FD bus. I don't disconnect it from the USB port.

Cheers,
Chris

M.Heidemann
Sales & Support
Posts: 673
Joined: Fri 20. Sep 2019, 13:31

Post by M.Heidemann » Wed 17. Mar 2021, 15:28

Hello Chris,

Sorry for the late reply. I missed the question regarding the state indication, pardon me.

AFAIK SocketCAN does not relay "Error-Active" to the application, as it does not consider it an error.
The message that informs the SocketCAN user of a bus state change is an "ERROR" message, which also carries the Rx and Tx error counters. As this isn't the case with "Error-Active", it is not relayed to the application.

Best Regards

Marvin

ckeydel
Posts: 26
Joined: Thu 4. Nov 2010, 16:06

Post by ckeydel » Wed 17. Mar 2021, 15:44

M.Heidemann wrote:
Wed 17. Mar 2021, 15:28
AFAIK SocketCAN does not relay "Error-Active" to the application, as it does not consider it an error.
The message that informs the SocketCAN user of a bus state change is an "ERROR" message, which also carries the Rx and Tx error counters. As this isn't the case with "Error-Active", it is not relayed to the application.

Hello Marvin,

Thank you for your reply. In the meantime I had the opportunity to try the same test program on a Raspberry Pi with an (SPI-connected) CAN FD shield. There, the "error" message with CAN_ERR_CRTL_ACTIVE in byte 1 is generated when the controller becomes error active after having been error passive. That makes sense: why else would it be defined in linux/can/error.h?

The value of this message is that an application can reset its internal error status variables/flags when this notification arrives. This information is difficult to obtain by other means.
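
To illustrate what such a reset could look like, here is a minimal sketch of an application-side state tracker driven only by CAN_ERR_CRTL error frames. It is not the test program used in this thread; app_level and app_next_level are hypothetical names, and bus-off handling is deliberately left out.

```c
#include <linux/can.h>
#include <linux/can/error.h>

/* Hypothetical application-side view of the controller state. */
enum app_level { LEVEL_ACTIVE, LEVEL_WARNING, LEVEL_PASSIVE };

/*
 * Return the new level after seeing frame 'f'. Frames that are not
 * controller-status error frames leave the current level unchanged.
 */
static enum app_level app_next_level(enum app_level cur, const struct can_frame *f)
{
    if (!(f->can_id & CAN_ERR_FLAG) || !(f->can_id & CAN_ERR_CRTL))
        return cur;

    unsigned char status = f->data[1];

    if (status & (CAN_ERR_CRTL_RX_PASSIVE | CAN_ERR_CRTL_TX_PASSIVE))
        return LEVEL_PASSIVE;
    if (status & (CAN_ERR_CRTL_RX_WARNING | CAN_ERR_CRTL_TX_WARNING))
        return LEVEL_WARNING;
    if (status & CAN_ERR_CRTL_ACTIVE)
        return LEVEL_ACTIVE;   /* the notification missing on the PCAN-USB FD */
    return cur;                /* e.g. RX overflow only: no state change */
}
```

Without the CAN_ERR_CRTL_ACTIVE frame, a tracker like this stays at LEVEL_WARNING after recovery, which is exactly the gap described here.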

So it seems this is not a problem with SocketCAN but with the kernel driver for the CAN interface. IOW, this is something the person in charge of the PEAK kernel drivers should probably look into.

Cheers,
Chris

S.Grosjean
Software Development
Posts: 302
Joined: Wed 4. Jul 2012, 17:02

Post by S.Grosjean » Wed 17. Mar 2021, 16:00

Hi,

Could you please tell us exactly which other SocketCAN CAN FD driver you are running on the RPi?

It would be interesting to understand why two drivers don't behave the same way when they should. We would like to compare them and, if it turns out that there is indeed a difference, report it to the linux-can maintainers if necessary.

Thank you for your feedback,

Regards,
— Stéphane

ckeydel
Posts: 26
Joined: Thu 4. Nov 2010, 16:06

Post by ckeydel » Wed 17. Mar 2021, 16:38

S.Grosjean wrote:
Wed 17. Mar 2021, 16:00
Could you please tell us exactly which other SocketCAN CAN FD driver you are running on the RPi?

It would be interesting to understand why two drivers don't behave the same way when they should. We would like to compare them and, if it turns out that there is indeed a difference, report it to the linux-can maintainers if necessary.

Hi Stéphane,

Yes, no problem. I am running the latest Raspberry Pi OS Lite from https://www.raspberrypi.org/software/operating-systems/

I have an MCP2518FD click module (https://www.mikroe.com/mcp2518fd-click) connected to SPI0 (MISO/MOSI/SCK/CS) and in /boot/config.txt I configure:

dtoverlay=mcp251xfd,spi0-0,interrupt=25

That's all that's needed to make it detect the CAN controller during boot (check dmesg). Then I configure can0 exactly the same as in my first post and I build and run the same example. While it is running, I disconnect the CAN bus (controller goes into error passive) and then re-connect it and send a few CAN FD messages from the other side (controller goes into error active again).

The error messages are very much the same between the PCAN-USB FD on Ubuntu and the MCP2518FD on Raspberry Pi OS, except for the one sent on the transition to error active, which is missing with the PCAN-USB FD.

Cheers,
Chris

ckeydel
Posts: 26
Joined: Thu 4. Nov 2010, 16:06

Post by ckeydel » Wed 17. Mar 2021, 16:46

Another piece of information:

The Ubuntu machine with the PCAN-USB FD runs kernel 5.4.0-66-generic.

The Raspberry Pi OS runs kernel 5.10.11+.

So the Ubuntu kernel is a few versions "behind". Not by much, but I don't know when the CAN_ERR_CRTL_ACTIVE status was added. Could that be it?

Cheers,
Chris
