Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ISR-safe operating system functions #138

Open
sy2002 opened this issue Sep 17, 2020 · 17 comments
Open

ISR-safe operating system functions #138

sy2002 opened this issue Sep 17, 2020 · 17 comments
Assignees
Labels

Comments

@sy2002
Copy link
Owner

sy2002 commented Sep 17, 2020

Make all operating system functions that are working with hardware registers ISR-safe. For context, read #109 (comment).

isr_mulu
isr_muls
isr_divu
isr_div
isr_vga_copyfont
... and more ...

Make sure that we do this in a cautious way for not wasting more ROM space than necessary.

One idea would be to have a generic "save registers" Monitor function that just starts at R8 and iterates through a given amount of registers (R9). Having such a function, the ISR versions could be as easy and memory saving as this (pseudo-code):

INCRB
MOVE IO$EAE_OPERAND_0, R0     ; do not use R8 .. R12 as these are the parameters of the caller
MOVE 5, R1
RBRA SAVE_REGISTERS, 1
RBRA MULU, 1                                ; R8/R9 input and R10/R11 output
RBRA RESTORE_REGISTERS, 1      ; R0 and R1 are not modified by functions, so still pointing correctly
DECRB

and then for the end-user in the ISR it would be SYSCALL(isr_mulu, 1) instead of SYSCALL(mulu, 1).

The necessity for ISR writers to use the ISR-safe versions of the OS functions is something that we should document in our programming best practices.

@sy2002
Copy link
Owner Author

sy2002 commented Sep 17, 2020

Here is one argument packed in two examples why we indeed do need ISR-safe Monitor functions for selected functions (obviously not for all of them):

  • The whole point of "operating system" functions such as the Monitor is, that if somebody calls SYSCALL(mulu, 1), he does not need to be aware, that under the hood the EAE is used. He just wants a multiplication, and he gets it.

  • If you write in C an ISR and inside this ISR write c = a * b; then you are also not aware, that this is using the EAE in the background.

So we need to look through the monitor functions and decice which ones are worth the effort. Obviously we also need to make sure that the C system works stable:

  • We need something like this: #define __isr __interrupt __norbank because we do not want an INCRB/DECRB in an ISR any more

  • Already today VBCC is smart enough to end a __interrupt tagged function with a RTI instead of a RET

  • TODO is then: Use ISR-safe (slower) math routines, when being inside __interrupt

The whole VBCC part needs an extra issue. This issue is about the Monitor part.

@bernd-ulmann
Copy link
Collaborator

When we will have user/kernel mode then systhing like SYSCALL(mulu, 1) will eventually issue an interrupt and switch to kernel mode since this will be the only way to directly address physical memory addresses. Thus only the kernel (which itself is not interruptable) can access external devices and we won't have any problem with states of IO devices. :-)

@sy2002
Copy link
Owner Author

sy2002 commented Sep 17, 2020

Just had a phone call:

Bernd came up with a better and more elegant solution to make our operating system functions ISR-safe:

We need the ability to have a "clear interrupt processing" / "enable interrupt procecssing".

And then we "just" need to update all relevant Monitor functions to do a "clear interrupts" before the relevant parts (for example when accessing IO devices such as EAE) and re-enable it afterwards.

Advantages:

  • We do not need to write a lot of redundant functions (Monitor size, etc.)
  • Easier for the programmer: Nobody needs to distinguish between for example isr_mulu and mulu.
  • Also VBCC does not need to distinguish anything.

So this issue is now all about adding this capabilities to to the Monitor and about documenting everything.

It is blocked by issue #139, where the mechanism itself is developed.

@sy2002
Copy link
Owner Author

sy2002 commented Sep 19, 2020

@MJoergen and @bernd-ulmann Now that #139 is done, I stumbled over one more thing: In the operating system functions such as mulu that we want to make ISR-safe, we need to perform these operations every time in every function that we want to make ISR-safe:

  1. Check if the Interrupts are on or off
  2. Remember that
  3. Switch them off
  4. Depending on (2): Switch them on again

This costs precious performance and demands writing a lot of template code.

One solution would be to have a second bit in the CSR of the interrupt controller that performs "pause/resume": When interrupts are off then pause/resume does nothing. When interrupts are on, then it pauses and then resumes. This way we only need two instructions per function that we want to make ISR-safe and we do not need to remember and restore anything.

@MJoergen
Copy link
Collaborator

I like the idea of pause/resume. But that still requires two instructions in the beginning of a Monitor function:

MOVE IC$CSR, R0
OR 0x0002, @R0    ; Set bit 1 means pause all interrupt acceptance.

and then another one or two instructions at the end

MOVE IC$CSR, R0
AND 0xFFFD, @R0    ; Clear bit 1 means resume interrupt acceptance.

So that is a total of (approximately) 16 clock cycles, just for handling a potentially rare race condition. If the programmer is not using interrupts, then these clock cycles are unnecessary and a complete waste.

@bernd-ulmann : You have several times mentioned adding support for user mode / kernel mode. That sounds like the way to go. Maybe we just implement that ?

What I would like is a special SYSCALL instruction that does almost the same as a ASUB, but which also sets an internal bit in the CPU indicating that all interrupts are to be paused (just like Mirko's idea, except inside the CPU). To return from a SYSCALL would require a new instruction RTS that does the same as RET, except it clears the internal pause/resume bit. That way, calls to the Monitor are just as fast as today, and will be free from race conditions. I don't think this pause/resume bit should be accessible to the user, i.e. it should not be part of R14.

@bernd-ulmann
Copy link
Collaborator

Thinking of user/kernel mode I would like to suggest the following:

The only way from user to kernel mode is via an interrupt (either hardware i.e. external or software - INT instruction). A system call would then be executed, as usual, by setting some parameters for the ISR in registers or the like and then issue an INT instruction.

Thus we won't need any additional instructions. We would get everything else for free:

Since the switch to kernel mode is done with an INT instruction, interrupts are automatically disabled when in kernel mode.

Reanabling interrupts is just done by the existing RTI instruction.

What do you think? :-)

@MJoergen
Copy link
Collaborator

@bernd-ulmann That's exactly what I'm looking for! We can discuss this tonight, but it seems to me this is the best solution.

However, there is one thing: We will want to call Monitor functions from an ISR (e.g. mulu). Currently, the CPU HALT's if an INT instruction is called from the ISR. So the ISR needs another way to call mulu.

@sy2002
Copy link
Owner Author

sy2002 commented Sep 19, 2020

For me, user/mode kernel mode is a very big project and not just something we would do just like a small thing. Always think in terms of an overall stable project including all documentation, demos, thinking through all details, etc. Therefore I would opt for something big like that in V1.8 as V1.7 is already more than large enough.

And just as a reminder for our discussion tonight:

I personally love to use QNICE a lot for recreational programming: Writing a game like Q-TRIS, fiddling with something like fancy or PacNice or something like this. This needs device access and therefore does a lot of register manipulation everywhere. The whole argument "we loose 16 clock cycles on deactivating/reactivating interrupts in Monitor functions" is multiplied by 100 when for each pixel you want to set, each sprite register you want to manipulate or each sound effect you want to create, you need to cross between user mode and kernel mode.

For sure: This is solveable. But this is why I call this a large project.

And of course this is how modern PCs work, but then we need to start writing smart device drivers, need to have DMA, and a lot of other things and mechanisms to still be able to have a high performance way to write recreational stuff like a game.

So would then the solution be: Doing this kind of recreational programming stuff (games) in kernel mode?

If yes, how would the ISR problem be solved then when the game and the ISR run in kernel mode and then the game calls a monitor function? I think we would be back in square one.

Maybe for V1.7 we should opt for just stopping and restarting the interrupts using our new controller and a to be added pause/resume in the monitor function and accept the 16 clock cycles, just copied the code from Michaels #138 (comment)

Start of the Monitor function:

MOVE IC$CSR, R0
OR 0x0002, @R0    ; Set bit 1 means pause all interrupt acceptance.

And then another one or two instructions at the end:

MOVE IC$CSR, R0
AND 0xFFFD, @R0    ; Clear bit 1 means resume interrupt acceptance.

@sy2002
Copy link
Owner Author

sy2002 commented Sep 19, 2020

P.S. Another solution for V1.7 could also be to make the programmers of ISRs responsible for either not using Monitor functions at all or for exactly knowhing what they are doing and saving the device's registers inside the ISR. This would save us the 16 clock cycles in general and put more load onto the shoulders of ISR programmers - which would be absolutely acceptable for me (at least for V1.7) because writing ISRs has always been a form of art...

@sy2002
Copy link
Owner Author

sy2002 commented Sep 19, 2020

Solution for V1.7

  1. We introduce in monitor/sysdef.asm a BLOCK bit in the CSR of the interrupt controller: bit 1. Bit 0 stays the interrupt enable bit.

  2. We agree that for V1.7 not all Monitor functions are ISR-safe but guarantee that the arithmetics functions are not affected by side-effects. This has the nice effect that we still can write ISRs in C.

  3. We are setting up the convention - TO BE DOCUMENTED - that an ISR is either not allowed to call a Monitor function - or the programmer exactly knows what he is doing and for example saves the device's registers inside his ISR

@sy2002
Copy link
Owner Author

sy2002 commented Sep 19, 2020

@MJoergen Can you take over this issue and finalize it? The following tasks are left to be done and then this issue can be closed:

  • Update the documentation
  • Proof-read the updated documentation. Also check: Is it complete? (Or did I forget anything important around this topic?)
  • Enhance the hardware
  • Enhance the following Monitor functions to use the new block/unblock feature:
    mulu
    muls
    divu
    divs
    mulu32
    divu32
    
  • Test if everything works on hardware
  • Test if everything works on Bernd's enhanced Emulator

@MJoergen
Copy link
Collaborator

I think we still have a problem.

When a regular program uses the EAE (e.g. by a multiply operation) that operation gets translated directly to accesses to the EAE, and not via a Monitor call.

So I think we would need to change this, so that accesses to the EAE are guarded by the IC$BLOCK_INTERRUPTS. This should be done in c/qnice/compiler-backend/machine.c, where there are references to e.g. IO_EAE_OPERAND_0.

MJoergen added a commit that referenced this issue Oct 28, 2020
See Issues #77 and #138.
@MJoergen
Copy link
Collaborator

As mentioned above the current C-compiler does not support protecting the EAE. Currently, the programmer therefore has to do this manually.

I've written a proof-of-concept program in test_isr.c. This program has a main loop where the background color is updated continuously (using the EAE) which gives rise to a steady pattern of horizontal lines. At the same time, a timer interrupt is running at 10 kHz that uses the EAE too.

The crucial lines in the program are lines 50 and 52 in the main loop, where interrupts are temporarily blocked and unblocked, while using the EAE. Without this blocking/unblocking, the program will still generate the same numerical output, but there will be a lot of flickering on the screen, as the calculation of the background color gets corrupted.

@sy2002
Copy link
Owner Author

sy2002 commented Oct 28, 2020

@MJoergen great observations. Thank you! It is cool that there exists now a test program which can be used to "visually test", if everything works.

So I think we would need to change this, so that accesses to the EAE are guarded by the IC$BLOCK_INTERRUPTS. This should be done in c/qnice/compiler-backend/machine.c, where there are references to e.g. IO_EAE_OPERAND_0

There are actually two places where this needs to be changed:

  1. On the one hand, for 16-bit math, c/qnice/compiler-backend/machine.c needs to be changed. You might want to approach Volker for that.

  2. On the other hand, for 32bit integer math, you need to change c/qnice/vclib/machines/qnice/libsrc/_lmul.s and add the block/unblock code here, too. You could do that, Volker is not needed for that. Please be aware, that the syntax of the assembler code used there is not "our" assembler syntax but Volker's VASM syntax. And don't forget to recompile the compiler (just being paranoid) and the monitor- and (mandatorily) vclib after your change as described here.

@sy2002
Copy link
Owner Author

sy2002 commented Oct 28, 2020

@MJoergen One more thought / addition to my comment above: We might want to change to code emitter, so that C uses the monitor functions mulu and muls, divu and divs instead of fiddling directly with the EAE.

This would have the advantage, that there is exactly one place in our code-base that needs to be 100% correct and "isr-hardened": The Monitor "Operating System".

Less redundancy means less error prone: As long as the monitor functions mulu, muls, divu and divs work correctly.

The price we would need to have to pay for that is that from then on, each 16bit multiplication in C means a subroutine call, which is less performant than today's case where direct EAE code is being emitted.

On the other hand, if we would implement this thought consequently, then also the 32-bit version _lmul.s would not be allowed to fiddle with EAE any more directly like it does today. Instead, it would also call mulu from the Monitor. This would mean, that each 32-bit multiplication consist of two sub-routine calls in future instead of one today, because _lmul.s is implementing the 32-bit multiplication as one multiplication and two additions.

(We need to distinguish always two cases: The code emitter in machine.c emits code for 16-bit cases and calls specialized functions such as _lmul.s for 32-bit cases.)

Your thoughts/preferences?

@MJoergen
Copy link
Collaborator

I like the idea of not duplicating code, and instead have 16-bit operations call the monitor functions.

I've written another test program test_math_isr.c that specifically tests the three different ways to do arithmetic:

  • 16 bit operations, which currently generates inline assembly.
  • 32 bit operations, which call ___mulint32.
  • Explicit calls to qmon_mulu.

I looked briefly at c/qnice/compiler-backend/machine.c, but it is beyond my ability to change that, so it calls the monitor function. So I think we need the assistance from Volker.

@sy2002
Copy link
Owner Author

sy2002 commented Oct 29, 2020

Cool stuff! There is a challenge: Volker is not using GitHub and his local copy of VBCC must not have dependencies but needs to be stand alone. I think we need to brainstorm with Volker, how we can solve that and do a video call.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants