Team-BHP - I authored an RTOS (Real Time Operating System) for an Indian Auto OEM

Prelude

I know that not all BHPians come from a software background. But I know for sure that most are interested in how stuff works when it comes to automotive. With that in mind, I will try to keep things simple for all you fellow enthusiasts while explaining what it takes to create a Real Time Operating System (RTOS) for Electronic Control Units (ECU).

Considering that this is a public forum, I of course have some limitations on what I can share. Also, there are many implementation complexities which are beyond the scope of this thread. I'll try my best, nonetheless.

Background

Me:
I work for an Indian 2-wheeler Auto OEM. I have been designing and developing Software for Embedded Systems* for a decade and a half. My personal interest lies in Firmware* design and development. In my current role, I am responsible for developing the RTOS and the Bootloader*.

We:
We are an Embedded Software team working mainly on building Software Infrastructure, upon which Application Software* can be built. Our Software Infrastructure consists of the RTOS, input/output interface software (called drivers), configuration tools (that help auto generate the code for the RTOS and the drivers to run correctly), debugging and programming tools that support the development of the ECUs, etc.

Sections in this thread:
  1. Real Time Operating System
  2. Internals of OS
  3. Considerations for Auto RTOS
  4. Existing RTOS
  5. New RTOS

Notes:

I do not claim to be an expert on the subject. Whatever I know, I've learnt through experience. Hence I request you to pardon any mistakes I may have made in this thread (and do point them out, so that wrong information is not propagated).

Akshay

A lot of information is already out there if you just search, but I'll still try to compile the basic concepts to explain what an RTOS is.

Operating System vs Real Time Operating System

First things first: unlike a general purpose Operating System, a Real Time Operating System guarantees a timely response to any event. (It is hard to explain to millennials, but) do you remember the times when the pointer on your Desktop PC would stop responding to mouse movements, and you'd press the NumLock button on the keyboard to find out whether the PC had hung? Well, that's the difference between an Operating System and a Real Time Operating System.

Hard Real Time OS vs Soft Real Time OS

Within RTOS, there are 2 different branches - Hard and Soft RTOS. A Hard RTOS guarantees immediate response to an event. A Soft RTOS, on the other hand, guarantees response within a realistically acceptable time.

That means, if an Operating System guarantees a response to an event within a well defined, stipulated, realistic and predictable time period, every single time, then it can be termed a Soft Real Time Operating System. For example, if a switch state change is responded to within 1 millisecond in the worst case, it's Soft Real Time.

Whereas, if an Operating System guarantees a response to an event at the exact same time from the occurrence of the event, it is a Hard Real Time Operating System. For example, if a switch state change is responded to exactly 100 microseconds after the change, every single time, then it's Hard Real Time.

Need for an Operating System

In a simpler system, one may just monitor the inputs to drive the outputs in a one-by-one fashion. Let's take the example of a manual AC in a car. You read the state of inputs like the AC on/off switch, temperature knob, fan speed, etc. one by one. Then you do some calculations to decide the on/off state of the compressor and heater, the fan motor speed, the heated/cooled air mixture ratio, etc. Then you output the calculated values to drive the mechanical components so that heated/cooled air can be blown.
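To make the one-by-one approach concrete, here is a minimal sketch in Python (used only for readability; real ECU code would typically be written in C). The field names, the knob scales and the 0.5 threshold are assumptions made up for illustration and do not come from any real AC controller.

```python
from dataclasses import dataclass


@dataclass
class Inputs:
    ac_switch_on: bool
    temp_knob: float   # 0.0 = coldest, 1.0 = hottest (assumed scale)
    fan_knob: int      # 0..4 fan positions (assumed)


@dataclass
class Outputs:
    compressor_on: bool
    heater_on: bool
    fan_duty_percent: int
    hot_air_ratio: float


def compute_outputs(inp: Inputs) -> Outputs:
    """The 'calculation' step: map switch/knob states to actuator commands."""
    return Outputs(
        compressor_on=inp.ac_switch_on and inp.temp_knob < 0.5,
        heater_on=inp.temp_knob > 0.5,
        fan_duty_percent=inp.fan_knob * 25,
        hot_air_ratio=inp.temp_knob,
    )


def super_loop(read_inputs, write_outputs, cycles: int) -> None:
    """Read -> calculate -> write, one after another, with no OS involved."""
    for _ in range(cycles):          # a real ECU would loop forever
        inp = read_inputs()
        out = compute_outputs(inp)
        write_outputs(out)
```

Everything happens in a fixed order inside one loop, which is fine as long as nothing in the loop needs its own timing.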

Now, if you want to move to automatic temperature control, the scope has increased. Now, you don't just have to read inputs to drive outputs, but run some closed loop control of:
  1. cabin temperature - which is function of all the below variables
  2. heater temperature to get warm air for optimum energy consumption and to make above control loop 1 easy
  3. fan speed to change rate of cooling/heating
  4. compressor on/off control to maintain optimum cooling, again, for optimum energy consumption and to make above control loop 1 easy
  5. etc.

Now, considering that all these closed loop controls have to run in parallel, it becomes difficult to build the software using the one-by-one approach. This is where an Operating System comes in. An OS runs multiple loops (typically termed tasks) concurrently. While running these tasks, it makes sure that they do not interfere with each other. The advantage for the tasks is that they never have to worry about the other running tasks in the system; they just mind their own business and are done.
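The idea of several loops taking turns can be sketched as below. This is a deliberately simplified, cooperative round-robin built on Python generators, not how an RTOS is actually implemented: each task gives up the CPU voluntarily, whereas a real OS can also take it away. The task names are hypothetical.

```python
from collections import deque


def control_loop(name, log):
    """One closed-loop 'task': does one step of work, then yields the CPU."""
    step = 0
    while True:
        log.append((name, step))   # stand-in for read sensor / compute / drive actuator
        step += 1
        yield                      # hand control back to the scheduler


def run_round_robin(tasks, total_steps):
    """Give each task one turn in order, over and over."""
    ready = deque(tasks)
    for _ in range(total_steps):
        task = ready.popleft()
        next(task)                 # run the task until its next yield
        ready.append(task)


log = []
tasks = [control_loop(n, log) for n in ("cabin", "heater", "fan")]
run_round_robin(tasks, 6)
```

Each loop keeps its own `step` counter between turns, so from the task's point of view it simply runs on its own.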

Need for Real Time Operating System

The above example shows the need for an Operating System, but it does not demand Real Time responsiveness. Even if the temperature changes over a few hundred milliseconds, no one will complain - or even notice.

So, let's take the example of ABS. ABS needs to monitor the applied brake pressure and push the calipers proportionally. Then it reads the speed of each wheel individually and compares the rate of speed drop against the brake command and against realistic deceleration under load. Based on this rate of deceleration, it does some math to decide whether the brake caliper pressure is to be maintained, dropped or raised. All the while, it continues monitoring any brake pedal position change from the operator, to make sure the primary command is always obeyed during ABS correction.

This operation cannot rely on an Operating System that does not guarantee a predictable, realistic and timely response. The required corrections demand instantaneous response to all the inputs (brake pressure, speed change, etc.). Hence the need for an RTOS.

Need for Hard Real Time Operating System

ABS demands a timely response, but it can still be within a range of a few milliseconds. The fact is that the braking system itself has a sluggish response time, which runs into some milliseconds. As long as the RTOS responds within that period of time, the ABS runs smoothly.

So let's take the example of an Airbag. For correct operation of an Airbag, the system needs to read arrays of impact sensors. It needs to detect the intensity of the impact. And it needs to be absolutely sure of the severity of an impact. The system must not deploy the airbag in a car running at 100 kmph just because a pebble hit one of the sensors. Such a false airbag deployment would itself be a cause for an accident.

Hence, all these complex calculations need to be completed within a few milliseconds of an impact, and the system needs to run multiple iterations to be absolutely sure of an actual hazard. Once the deploy signal is sent, there is no turning back. You don't need me to tell you that this system would require a Hard Real Time Operating System for reliable operation.
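The "multiple iterations to be absolutely sure" idea can be illustrated with a toy confirmation filter. This is only a sketch and certainly not a real airbag algorithm: the 40 g threshold and the count of 3 consecutive samples are invented numbers, and `should_deploy` is a hypothetical name.

```python
def should_deploy(samples_g, threshold_g=40.0, required_consecutive=3):
    """Return True only if the deceleration stays at or above the threshold
    for `required_consecutive` samples in a row."""
    streak = 0
    for g in samples_g:
        streak = streak + 1 if g >= threshold_g else 0
        if streak >= required_consecutive:
            return True
    return False
```

A single spike (the pebble) resets the streak on the very next sample, while a sustained impact passes. The catch is that every extra confirmation sample costs time, which is exactly why the timing of each iteration has to be predictable.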

Note:
Although I have selected the examples thoughtfully, there is no hard-and-fast rule. Meaning, Auto AC can still be run without an Operating System. Of course, its performance will be lame. On top of that, if you avoid using an Operating System, task synchronization becomes a headache. Then the designer/developer of the system has to spend most of the time debugging/testing/understanding the system from the software point of view, rather than the functional point of view.

An Operating System takes a lot of load away from the application developer, so that (s)he can focus on functionality rather than software engineering.

Well, someone has to take the effort to engineer the software - so it then becomes the responsibility of the one who creates the OS (yeah, that's me... :D)

Similarly, Auto AC can be built on a Hard RTOS, but it's overkill. It's just too much to ask.

Operating System Internals

After all, an RTOS is still an OS. So internally they are more or less the same - conceptually.

Any OS can be broken down into 3 major parts :
  1. Task Scheduling (the Scheduler)
  2. Inter-Task Communication/Synchronization (the ITC)
  3. Drivers

Scheduler

Nowadays Operating Systems are essentially multi-tasking (that was not always the case. Remember DOS?). When an OS allows multiple tasks to be invoked, it is responsible for giving each task time to run. This can be achieved in multiple ways, so there exist different types of Schedulers. I'll only list the names of some such methods here.
The bottom 3 are used in RTOS as they are of the preemptive type. Preemption is a method in which, if a task of higher priority becomes ready to run, the OS stops the lower priority task from running and passes control to the higher priority task. This is how Real Time responsiveness is achieved. (This is a very vast topic to discuss, so I'll just leave it here.)
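Preemption can be shown with a small tick-by-tick simulation. This is a sketch under simple assumptions (time moves in whole ticks, a higher number means higher priority, switching costs nothing); the task names and numbers are made up.

```python
def simulate_preemptive(tasks, total_ticks):
    """tasks: name -> (priority, release_tick, work_ticks).
    Returns the name of the task that ran in each tick (None when idle)."""
    remaining = {name: work for name, (_, _, work) in tasks.items()}
    timeline = []
    for tick in range(total_ticks):
        ready = [name for name, (_, release, _) in tasks.items()
                 if release <= tick and remaining[name] > 0]
        if not ready:
            timeline.append(None)
            continue
        # The highest-priority ready task always wins the CPU.
        current = max(ready, key=lambda n: tasks[n][0])
        remaining[current] -= 1
        timeline.append(current)
    return timeline


timeline = simulate_preemptive({"low": (1, 0, 4), "high": (5, 2, 2)}, 7)
# timeline == ["low", "low", "high", "high", "low", "low", None]
```

The "low" task is pushed aside the moment "high" becomes ready at tick 2, and it picks up its remaining work once "high" is done.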

Inter-Task Communication

When the entire system's operation is divided into tasks, the tasks depend upon data served by other tasks. For example, in Auto AC, the heater task's set-point is decided by the cabin temperature control task.

Because the tasks run independently of each other, the OS has to provide a means to pass this data among tasks and to synchronize their execution. This objective is achieved through the Inter-Task Communication modules of the OS.
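As a rough illustration of the heater set-point example, the sketch below passes a value from one task to another through a mailbox, using Python's standard `queue` module as a stand-in for an ITC object. The control law (50 plus 10 per degree of error, clamped to 0..100) is an arbitrary assumption, as are the function names.

```python
import queue


def cabin_temperature_task(target_c, measured_c, mailbox):
    """Producer: decides the heater set-point and posts it for the heater task."""
    error = target_c - measured_c
    heater_setpoint = max(0.0, min(100.0, 50.0 + 10.0 * error))  # made-up control law
    mailbox.put(heater_setpoint)


def heater_task(mailbox):
    """Consumer: picks up the next posted set-point, or None if nothing is waiting."""
    try:
        return mailbox.get_nowait()
    except queue.Empty:
        return None


mailbox = queue.Queue()
```

Neither task calls the other directly; they only agree on the mailbox, so either one can be changed or rescheduled without touching the other.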

Drivers

Drivers are the blocks of software that provide the application software an interface to the underlying hardware. For example, in ABS, a dedicated driver (mostly a frequency input, aka encoder, driver) will be sensing the wheel speeds.
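What such a frequency input driver hands to the application can be sketched as a simple conversion from pulse timing to speed. The function name and parameters are hypothetical; only the arithmetic (one pulse per tooth, circumference = pi x diameter, 1 m/s = 3.6 km/h) is fixed.

```python
import math


def wheel_speed_kmph(pulse_period_s, teeth_per_rev, wheel_diameter_m):
    """Convert the time between two encoder pulses into wheel speed."""
    if pulse_period_s <= 0:
        raise ValueError("pulse period must be positive")
    revs_per_second = 1.0 / (pulse_period_s * teeth_per_rev)
    metres_per_second = revs_per_second * math.pi * wheel_diameter_m
    return metres_per_second * 3.6
```

For example, a 48-tooth ring on a 0.6 m wheel producing 480 pulses per second means 10 revolutions per second, which works out to roughly 68 km/h. The application only ever sees the speed, not the pulses.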

Considerations for Auto RTOS

One major risk for any software put in an automobile is that human life depends upon it. A small malfunction may cost a life. On top of that, there is no user interface for most of the components. Even if something has failed, the system must work around it. There is no option of a (well, very famous Windows) Blue Screen. Think about it: what could you (as the driver of a vehicle) do if the handling of one wheel's speed sensor becomes erratic in the ABS controller? What could you do if the auto transmission's clutch actuator suddenly stops responding on a highway?

As a designer of ECU software, one has to consider these things. And the severity grows exponentially as you go to lower levels of software. In the case of an ABS sensor handling failure or an auto transmission clutch operation failure, the application programmer already knows the severity of the failure and designs the system to take corrective action, because (s)he knows the system inside-out. But the RTOS designer has no idea where the RTOS would be used. There, such a work-around is not possible. (S)he has to consider the worst of the worst scenarios and provide support/interfaces to the application software, so that care can be taken in the application software.

One handy thing an RTOS can do is provide maximum information to the application software during the development phase, so that design errors can be communicated back to the application developer. Secondly, an RTOS can also implement monitoring software that keeps track of misbehaving tasks and builds and provides sufficient information for the developer to fix the issue. Additionally, an RTOS may implement a feature that blocks tasks from accessing each other's data, to avoid misbehavior. Well, in short, there are numerous ways, but realizing and implementing them is easier said than done.
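One possible shape for such a monitor is sketched below, assuming that "misbehaving" means a task running longer than its allowed time budget. The class name, the microsecond budgets and the record format are all assumptions for illustration, not a description of any particular RTOS.

```python
class TaskMonitor:
    """Records how long each task ran and flags budget overruns."""

    def __init__(self, budgets_us):
        self.budgets_us = dict(budgets_us)   # task name -> allowed run time
        self.overruns = []                   # (task, measured_us, budget_us)

    def report(self, task, measured_us):
        """Called after a task finishes a run; returns True if it stayed in budget."""
        budget = self.budgets_us[task]
        if measured_us > budget:
            self.overruns.append((task, measured_us, budget))
            return False
        return True
```

The `overruns` list is the kind of information that could be handed back to the developer: which task overran, by how much, and against which limit.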

Prescript

From here onward, I'll be using many technical terms which are not so easy to elaborate in notes. Hence, I'll be marking such terms with [*] and adding details in the last post of the thread - https://www.team-bhp.com/forum/assem...ml#post4736517.

Existing RTOS

Yes, we already have a proprietary RTOS (let's call it ERT for short) running our latest fleet of vehicles. It is a capable one. It has been built by my boss and I'm proud to say that this is the first time our company has released a product in the market with end-to-end software developed in-house, built using this RTOS.

ERT can be categorized as Soft-Real-Time. It uses Rate Monotonic Scheduling[*] and has a guaranteed response time of 1.25 milliseconds (why such an odd number? It is an integral divisor of 5 and 10, and our Rate Monotonic Scheduler depends on that).
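The arithmetic behind the odd number is easy to check: 5 / 1.25 = 4 and 10 / 1.25 = 8, so both divide into a whole number of ticks. The sketch below assumes (this is a guess, the post does not say so explicitly) that 5 and 10 are task periods in milliseconds and that 1.25 ms is the base tick of the scheduler. All names are hypothetical.

```python
from fractions import Fraction

BASE_TICK_MS = Fraction(5, 4)   # 1.25 ms, kept exact to avoid float rounding


def ticks_per_period(period_ms):
    """How many base ticks make up one task period; must divide evenly."""
    ticks = Fraction(period_ms) / BASE_TICK_MS
    if ticks.denominator != 1:
        raise ValueError(f"{period_ms} ms is not a multiple of the base tick")
    return int(ticks)


def tasks_due(tick_count, periods_ms):
    """Names of the tasks whose period boundary falls on this tick."""
    return [name for name, p in periods_ms.items()
            if tick_count % ticks_per_period(p) == 0]
```

With periods of 5 ms and 10 ms, the fast task is due on every 4th tick and the slow one on every 8th, so the two line up on a common grid. A period such as 2 ms would not fit and is rejected.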

As I said earlier in one of the notes, an application is not necessarily restricted by the type of RTOS. Even though this is a Soft Real Time Operating System, it can very well handle even an Engine Control Unit. ERT provides ways for the Application Developer to achieve better response times in special cases. By taking advantage of such methods, very time critical implementations are possible with ERT.

Software Structure of ERT

ERT uses an AUTOSAR[*] (AUTomotive Open System ARchitecture) like division of software parts. These are MCAL (Micro-Controller Abstraction Layer), BSW (Basic Soft-Ware), LiL (Link Layer) and ASW (Application Soft-Ware). Except for ASW, all the components are part of ERT.

MCAL provides the core hardware interface. BSW provides a hardware independent, common interface. LiL provides the OS to Application interface. The advantage of having these sections separated is that it keeps the application software away from, and independent of, the hardware. A change in the hardware does not demand a change in the application software (provided it's not a functional change).

Additionally, MCAL more or less remains the same. Because the driver (in the conventional sense) is separated into 3 parts - MCAL, BSW and LiL - adding a feature to a driver becomes simpler. For example, if one wants to add a high level protocol on CAN, in this architecture you just add another BSW module and support its interface through LiL. The CAN MCAL remains just the same. (see note^)
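The layering can be pictured with three small classes. This is a toy model and not ERT's actual design: the class names are invented, the "hardware" is just a list, and the added protocol is a naive splitter that cuts a long message into chunks of 8 bytes (the data size limit of a classic CAN frame), without the headers a real transport protocol would need.

```python
class CanMcal:
    """Lowest layer: talks to the (here simulated) CAN hardware."""

    def __init__(self):
        self.sent_frames = []

    def transmit(self, can_id, payload: bytes):
        if len(payload) > 8:
            raise ValueError("classic CAN frames carry at most 8 data bytes")
        self.sent_frames.append((can_id, payload))


class SegmentingProtocolBsw:
    """Added BSW module: splits long messages into 8-byte frames."""

    def __init__(self, mcal):
        self.mcal = mcal

    def send(self, can_id, message: bytes):
        for start in range(0, len(message), 8):
            self.mcal.transmit(can_id, message[start:start + 8])


class LinkLayer:
    """LiL: the only interface the application software sees."""

    def __init__(self, protocol):
        self.protocol = protocol

    def send_message(self, can_id, message: bytes):
        self.protocol.send(can_id, message)
```

Adding the new protocol meant writing one new middle class; `CanMcal` was not touched, and the application keeps calling `send_message`.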

Notes:

The ERT architecture, although similar on the division of software parts, has nothing to do with AUTOSAR beyond that.

^This is not the approach everyone takes while developing Embedded Software. In many cases you won't find the famous 7 layers of protocol developed separately in software. Embedded Software is always short on resources, especially memory. So one always takes all the shortcuts possible.

So, why create a new RTOS?

Now I'll come to the point. The Micro-controllers (uC - pronounced myu-c), for which ERT has been built, are destined to be discontinued. ERT has been designed around these uC. The Interrupt[*] structure of this uC family is unique, and the Rate Monotonic Scheduler of ERT makes use of this structure for Task Switch[*]. This means ERT cannot be ported to another uC; it would require a complete redesign.

While the market is moving towards EVs, we need to be future ready. The processing power needed for Motor Control is higher*, and it is difficult for the existing uC to keep up. Anyway, since the current uC family is retiring, it makes business sense to have a uC that can be used in all future ECUs, including the Motor Controller.

That's where the new RTOS comes into being.

New RTOS

The new RTOS (let's call it NRT for short) is a generic, preemptive, multitasking, Hard Real Time Operating System. It guarantees instantaneous response (within a micro-second) to events. NRT can be ported to any uC family.

Scheduler

NRT uses priority based task scheduling. Task priorities are defined beforehand, based on the severity of the events handled by each task. Upon occurrence of an event, the Interrupt Handler (typically called ISR[*]) passes the event data to the task for further processing. If this task's priority is higher than that of the currently running task, control is instantaneously handed over to it.

While event triggered tasks are handled in this fashion, NRT continues to support the Rate Monotonic Scheduling of ERT on top of the priority based Scheduler. This ensures that existing tasks and (the BSW/LiL of) drivers can be ported into NRT with minimum modification.

The task switching is done in assembly[*] code, which gives full control of the operation and results in a reliable, predictable, precisely timed switch.

Inter-Task Communication

NRT provides many ways of ITC. The ITC provides a means for data communication as well as synchronization. As mentioned above (in the scheduler section of NRT), the data from an ISR is passed to a task through an ITC object. These ITC objects schedule/synchronize the readiness of the task to run.
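How an ITC object can both carry data and control readiness is sketched below. This is a guess at the general pattern, not NRT's implementation: `EventQueue`, `post_from_isr` and `pick_next` are hypothetical names, and the set of ready tasks is modelled as a plain Python set.

```python
from collections import deque


class EventQueue:
    """ITC object: an ISR posts data here, which makes the owning task ready."""

    def __init__(self, owner_task):
        self.owner_task = owner_task
        self.items = deque()

    def post_from_isr(self, data, ready_set):
        self.items.append(data)
        ready_set.add(self.owner_task)

    def take(self, ready_set):
        data = self.items.popleft()
        if not self.items:
            ready_set.discard(self.owner_task)   # nothing left: task waits again
        return data


def pick_next(ready_set, priorities):
    """Highest-priority ready task, or None if all tasks are waiting."""
    return max(ready_set, key=priorities.__getitem__, default=None)
```

Posting to the queue is what wakes the task up, and emptying the queue is what puts it back to waiting, so the scheduler never needs to know what the data means.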

The overall synchronization of the running tasks with each other and with the events is achieved through the ITC objects.

Notes:

Additional information and details

I have written this entire thread in the simplest possible (to me) language. I may have touched upon some concepts without writing enough for them to be understood. On the other hand, there might be some things that I have not given much detail on, because of IP reasons. Still, I'd definitely try my best to clarify the doubts of my fellow BHPians, if requested.

I hope you enjoy reading it as much as I have enjoyed writing it.

Rate Monotonic Scheduling

This task scheduling method ties priority to the rate of activity: the more frequently a task has to run, the higher its priority, and vice-versa. The period of each task is predefined, and the Scheduler keeps calling these tasks on time.
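In the textbook form of rate monotonic scheduling, priority follows the rate: the shorter the period, the higher the priority. A minimal sketch of that assignment rule, with made-up task names and periods:

```python
def assign_rate_monotonic_priorities(periods_ms):
    """Shorter period -> higher priority. Returns name -> rank (0 = highest)."""
    ordered = sorted(periods_ms, key=lambda name: periods_ms[name])
    return {name: rank for rank, name in enumerate(ordered)}
```

For instance, tasks with periods of 1.25 ms, 5 ms and 100 ms would get ranks 0, 1 and 2 respectively.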

AUTOSAR

It is an architecture defined by an Automotive consortium. It uses a standard approach to build a software system. It is a very vast subject to discuss here, but in short I can say it's a service based implementation supported by many Auto Tier-3 and silicon manufacturer giants.

Interrupt

It is an expected asynchronous event. An ECU may want to read some external parameters, but can't just stop doing everything else until they are available. For example, to sense wheel speed, an encoder (a rotating toothed wheel) will send on/off pulses to the ECU. The ECU then has to measure the on/off time to figure out the speed. This being a slow changing parameter (in the software's frame of time), the ECU can't keep waiting to see when the input went off. So, the software configures an Interrupt that pokes it when the state of this input changes.
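The "poke" can be modelled as a callback that the hardware invokes on a state change. The sketch below simulates this in Python; `InputPin`, `attach_interrupt` and `wheel_pulse_isr` are hypothetical names, and on a real uC the hardware, not a method call, would trigger the handler.

```python
class InputPin:
    """Simulated digital input that calls a handler when its state changes."""

    def __init__(self):
        self.state = 0
        self.handler = None

    def attach_interrupt(self, handler):
        self.handler = handler

    def hardware_sets(self, new_state, timestamp_us):
        changed = new_state != self.state
        self.state = new_state
        if changed and self.handler is not None:
            self.handler(new_state, timestamp_us)   # the "poke"


edge_times_us = []


def wheel_pulse_isr(state, timestamp_us):
    """Keep the handler short: only record when a rising edge happened."""
    if state == 1:
        edge_times_us.append(timestamp_us)
```

The rest of the software never polls the pin. It only looks at the recorded timestamps whenever it gets around to computing the speed.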

Task Switch

aka Context Switch, is the operation which pauses a running task and starts a different task. Since this is also done asynchronously, the running task cannot handle it by itself. It is then the Scheduler's job to pause the running task and, when the other high priority tasks are done doing their work, resume the paused task as if it was never interrupted.

The "as if never interrupted" part calls for preserving the exact state of the software in order to resume it. This operation of preserve and resume is termed Context Switch.
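Preserve and resume can be illustrated with a toy CPU that has only three registers. This is a conceptual sketch: a real context switch saves the actual register set and stack pointer of the uC and, as noted in this thread, is typically written in assembly. The names below are made up.

```python
class Cpu:
    """Toy CPU: just a program counter and two registers."""

    def __init__(self):
        self.regs = {"pc": 0, "r0": 0, "r1": 0}


def context_switch(cpu, saved_contexts, outgoing, incoming):
    """Save the outgoing task's registers, then load the incoming task's."""
    saved_contexts[outgoing] = dict(cpu.regs)    # preserve
    cpu.regs = dict(saved_contexts[incoming])    # resume
```

After switching away from a task and back again, its registers hold exactly the values they had when it was paused, which is what lets it continue as if nothing had happened.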

ISR

Interrupt Service Routine is a piece of software that handles the asynchronous event called an Interrupt.

Assembly

Assembly language is a lower level language, which is a direct representation of machine language in readable form. Every micro-controller/micro-processor family has a dedicated language that works only with it. We typically write programs in high/mid level languages, and compilers (again, software) then convert the logic of the program to machine level language.

For very core operations, it is not possible to represent the logic in a high level language. In such cases one has to fall back to assembly language.

Thread moved out from the Assembly Line. Thanks for sharing!

A good read.

I am working for a similar industry, working on unmanned platforms now, for application software development.

Excellent read! I have been working on many general purpose operating systems for a while. I have worked on file systems, the I/O path and drivers. I have not looked into RTOS much. An Auto RTOS would have its own challenges and I'm sure you would love to solve them! Now the considerations will be even more challenging, as the automotive space itself is moving to a different paradigm, with self driving cars, etc.

Bit confused about difference between hard and soft RTOSs.

Regards
Sutripta

Quote:

Originally Posted by akshye (Post 4732498)
Prelude
.
Bootloader : is a software that takes care of reprogramming of the ECU, among other things.
.
.
Akshay

Interesting topic.
I believe the bootloader serves a slightly different purpose.
For further reading, please refer:
https://www.microcontrollertips.com/...ootloader-faq/

Quote:

Originally Posted by Sutripta (Post 4736970)
Bit confused about difference between hard and soft RTOSs.

Regards
Sutripta

In case of a soft RTOS, the stipulated response time isn't necessarily met on all occasions.

^^^
Thanks.
So soft - not a true deterministic system. Nothing really to do with response time.

Regards
Sutripta

Quote:

Originally Posted by Sutripta (Post 4737010)
^^^
Thanks.
So soft - not a true deterministic system. Nothing really to do with response time.

Regards
Sutripta

True to an extent, but a soft RTOS cannot afford to miss all response deadlines. Else, it won't qualify as an RTOS.

Excellent thread and very informative! I always had a liking for embedded systems, and the information you have provided is of great help in understanding them in a simpler way!
Thank you for all your time in putting this information together.

