CAN messages silently dropped by controller

Universal Plug-in Module with I/O and CAN FD Interface
Adrian
Posts: 5
Joined: Fri 19. Apr 2024, 14:42

Post by Adrian » Mon 31. Mar 2025, 17:30

Hello everyone,

I'm encountering an issue with a custom firmware for the PCAN-MicroMod FD. Occasionally, some CAN messages are not being sent on the CAN bus, despite no error frames being present.

The CAN bus configuration is as follows: ISO CAN FD, 80 MHz, nominal 500 kbit/s, data 2 Mbit/s. The bitrate settings remain unchanged from the example code.

The firmware is based on the provided examples, with SingleShot explicitly disabled during CAN initialization.
The software queue does not fill up, as the CAN_Write function consistently returns CAN_ERR_OK (I verified this by checking the return value to trigger an LED, which remains off).
I have increased CAN1_TX_QUEUE_SIZE to 45. In any case, I would expect every queued message to be sent on the bus eventually.
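
An LED shows whether a write ever failed, but not how often or with which code. A minimal sketch of a wrapper that records this is shown below. It is not part of the PEAK examples: `tx_write_fn`, `tx_stats_t`, `tx_write_checked` and `demo_write` are hypothetical names, and a return value of 0 is assumed to mean success (as CAN_ERR_OK is used in this thread).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical signature standing in for a queue-write call such as
 * CAN_Write(); a return value of 0 is assumed to mean success. */
typedef uint32_t (*tx_write_fn)(void *ctx, const void *msg);

typedef struct {
    uint32_t attempts;    /* total write attempts */
    uint32_t failures;    /* attempts that returned non-zero */
    uint32_t last_error;  /* most recent non-zero return code */
} tx_stats_t;

/* Forwards the write and records the outcome in *stats. */
static uint32_t tx_write_checked(tx_stats_t *stats, tx_write_fn write,
                                 void *ctx, const void *msg)
{
    assert(stats != NULL && write != NULL);
    uint32_t ret = write(ctx, msg);
    stats->attempts++;
    if (ret != 0u) {
        stats->failures++;
        stats->last_error = ret;
    }
    return ret;
}

/* Stand-in writer for demonstration only: every third call fails.
 * ctx points to a uint32_t call counter. */
static uint32_t demo_write(void *ctx, const void *msg)
{
    uint32_t *calls = ctx;
    (void)msg;
    (*calls)++;
    return (*calls % 3u == 0u) ? 1u : 0u;
}
```

On the target, the statistics could be sent periodically in a status message. If `failures` stays at 0 while messages are still missing on the bus, the loss is not caused by a full software queue.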

Any suggestions on where to investigate this issue would be greatly appreciated.

Thank you!

G.Bohlen
Hardware Development
Posts: 66
Joined: Wed 22. Sep 2010, 21:38

Re: CAN messages silently dropped by controller

Post by G.Bohlen » Tue 1. Apr 2025, 17:34

Hello,

For a test, I created a function that sends "number" CAN messages and counts how often a write attempt to the TX queue was unsuccessful. A 32-bit number is incremented with each message. If fewer messages are received than expected, a trace file can be checked to see which messages are missing.

Code:

static void txTest(uint32_t number)
{
	CANTxMsg_t  Msg;
	static uint32_t counter = 0;      // running message number, kept across calls
	static uint32_t retry_count = 0;  // failed writes to the TX queue, kept across calls
	uint32_t ret;

	Msg.bufftype = CAN_BUFFER_TX_MSG;
	Msg.dlc      = CAN_LEN8_DLC;
	Msg.msgtype  = CAN_MSGTYPE_FDF | CAN_MSGTYPE_BRS;  // CAN FD frame with bit rate switch
	Msg.id       = 0x124;

	for (uint32_t n = 0; n < number; n++) {
		// Payload: message number and the retry count so far
		Msg.data32[0] = counter;
		Msg.data32[1] = retry_count;
		// Retry until the TX queue accepts the message
		do {
			ret = CAN_Write(CAN_BUS1, &Msg);
			if (ret != CAN_ERR_OK) retry_count++;
		} while (ret != CAN_ERR_OK);
		counter++;
	}
}
The function is called when CAN ID 0x100 is received. The number of messages to send is a 32-bit value located in the first 4 data bytes of that message:

Code:

// main loop
	while ( 1)
	{
		CANRxMsg_t RxMsg;
		if(CAN_UserRead(CAN_BUS1, &RxMsg)==CAN_ERR_OK)	// read the next CAN message from the receive queue
		{
			// message received from CAN
			switch(RxMsg.id)
			{
				case 0x7e7:
					mCAN_EvalInitialCmd(&RxMsg);
					break;

				case 0x100:	
				{
					uint32_t count=RxMsg.data32[0];
					txTest(count);
				}break;

				default:
					break;

			}
			...
		}
	}
I ran the test several times with number = 0x100000 (about 1 million messages) and did not see any missing messages at the receiver. I use PCAN-View to receive the messages.

How often do you see missing messages?
Maybe you can repeat this test with your device.
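
The trace check described above can be automated once the counter values have been extracted from the trace. The following is a minimal sketch under stated assumptions: the values from the first 4 data bytes are already available as an array in arrival order, and the function name `find_missing_counters` is hypothetical. Parsing the trace file itself is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Scans counter values in arrival order and returns the total number of
 * missing values. The first max_missing missing values are stored in
 * missing[] (may be NULL if only the count is needed).
 * Values lost after the last received message cannot be detected here. */
static size_t find_missing_counters(const uint32_t *received, size_t n,
                                    uint32_t first_expected,
                                    uint32_t *missing, size_t max_missing)
{
    assert(received != NULL || n == 0);
    size_t total_missing = 0;
    uint32_t expected = first_expected;

    for (size_t i = 0; i < n; i++) {
        /* Unsigned subtraction stays correct across a 32-bit wrap. */
        uint32_t gap = received[i] - expected;

        /* A very large gap means a duplicate or out-of-order value; skip it. */
        if (gap > 0x7FFFFFFFu) {
            continue;
        }
        for (uint32_t k = 0; k < gap; k++) {
            if (missing != NULL && total_missing < max_missing) {
                missing[total_missing] = expected + k;
            }
            total_missing++;
        }
        expected = received[i] + 1u;
    }
    return total_missing;
}
```

For example, the sequence 0, 1, 2, 4, 5, 8 with a first expected value of 0 reports three missing values: 3, 6 and 7.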

Adrian
Posts: 5
Joined: Fri 19. Apr 2024, 14:42

Re: CAN messages silently dropped by controller

Post by Adrian » Wed 2. Apr 2025, 10:06

Actually it was easy to spot when the message was missing since it contains an alive counter.
The problem occurred sporadically, ranging from 1 to 4 times within a 24-hour period.
Unlike your test, my implementation uses different cycle times, and occasionally, requests are sent in bursts.
However, after looking at the source code (it is unchanged in all provided examples), I suspect there might be a problem in the function CAN_TxQueueWriteNext:

Code:

	if (( pBus->TxQueueFree == pBus->TxQueueSize) && ((CAN_Channel->TXBRP&1)==0))	  // queue empty & hardware free? Send directly!
	{
		can_frame_t      *pMsg;											// pointer to message
	
		pMsg = pBus->pTxQueueWrite;				        				// fetch message pointer

		/* use message buffer 0 */
		mbtxfer.mbIdx = 0;
		mbtxfer.frame = pMsg;//;&txmsg;
		CAN_TransferSendNonBlocking(CAN_Channel, &s_handle, &mbtxfer);
		
If the queue is empty and the hardware is idle, the message is copied directly to the TX buffer, without any update to the queue or to any other flag.
However, if CAN_Write() is called again quickly enough, before the TX buffer has been picked up for transmission, the original message is overwritten in the TX buffer.
To test this, I always sent a dummy message immediately before my message of interest. After more than 48 hours of running, the message with the alive counter was never missing, while the dummy message was missing 10 times (monitored with PCAN-View).
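
The suspected mechanism can be reproduced in a toy model on a PC. The sketch below is not the firmware code and makes no claim about the real controller timing. It simply assumes that the pending bit becomes visible to software one "tick" after the transmit request, and that the direct-send path checks only that bit. All names (`tx_model_t`, `tx_write`, `hw_tick`, `run_burst`) are hypothetical. The `guarded` variant additionally checks a software flag that is set as soon as the buffer is loaded, which is one possible way to close such a window.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define QUEUE_SIZE 4u
#define LOG_SIZE   8u

typedef struct {
    uint32_t buffer;             /* the single TX buffer */
    bool     request_issued;     /* software has requested transmission */
    bool     pending_visible;    /* modelled pending bit, set one tick later */
    uint32_t queue[QUEUE_SIZE];  /* software TX queue (FIFO) */
    size_t   queue_count;
    uint32_t sent[LOG_SIZE];     /* what actually reached the bus */
    size_t   sent_count;
} tx_model_t;

static void load_buffer(tx_model_t *m, uint32_t value)
{
    m->buffer = value;
    m->request_issued = true;
}

/* Direct-send if the queue is empty and the hardware looks idle,
 * otherwise enqueue. Unguarded: only the pending bit is checked. */
static bool tx_write(tx_model_t *m, uint32_t value, bool guarded)
{
    bool hw_idle = !m->pending_visible && !(guarded && m->request_issued);
    if (m->queue_count == 0 && hw_idle) {
        load_buffer(m, value);
        return true;
    }
    if (m->queue_count == QUEUE_SIZE) {
        return false;  /* queue full */
    }
    m->queue[m->queue_count++] = value;
    return true;
}

/* One hardware step: first the request becomes visible, then the frame is
 * sent and the next queued message (if any) is loaded. */
static void hw_tick(tx_model_t *m)
{
    if (m->request_issued && !m->pending_visible) {
        m->pending_visible = true;
        return;
    }
    if (m->pending_visible) {
        assert(m->sent_count < LOG_SIZE);
        m->sent[m->sent_count++] = m->buffer;
        m->request_issued = false;
        m->pending_visible = false;
        if (m->queue_count > 0) {
            uint32_t next = m->queue[0];
            for (size_t i = 1; i < m->queue_count; i++) {
                m->queue[i - 1] = m->queue[i];
            }
            m->queue_count--;
            load_buffer(m, next);
        }
    }
}

/* Writes messages 1 and 2 back to back, then lets the hardware run.
 * Returns the number of frames sent; *first_sent is the first payload. */
static size_t run_burst(bool guarded, uint32_t *first_sent)
{
    tx_model_t m = {0};
    tx_write(&m, 1u, guarded);
    tx_write(&m, 2u, guarded);
    for (int i = 0; i < 8; i++) {
        hw_tick(&m);
    }
    *first_sent = m.sent[0];
    return m.sent_count;
}
```

In this model, `run_burst(false, ...)` sends a single frame carrying payload 2, so message 1 is lost, which matches the observation that the first message of a back-to-back pair goes missing. `run_burst(true, ...)` sends both frames in order. Whether the real hardware has a comparable window would need to be confirmed against the controller documentation.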
