Skip to content

optimisation: compare uint16_t against uint16_t in FinishHomingAndPlanMoveToParkPos #335

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Dec 24, 2024

Conversation

gudnimg
Copy link
Collaborator

@gudnimg gudnimg commented Dec 22, 2024

AxisDistance returns uint16_t type and is currently compared with long double axis length. The axis lengths fit easily into uint16_t:

selectorLimits.lenght = 75.0
idlerLimits.lenght = 225.0

Change in memory:
Flash: -122 bytes
SRAM: 0 bytes


Below is the changes to the assembly output for the selector limits. This optimisation removes a call to __floatunsisf and __cmpsf2

Before

if (AxisDistance(mm::axisUnitToTruncatedUnit<config::U_mm>(mm::motion.CurPosition<mm::Selector>())) < (config::selectorLimits.lenght.v - 3)) { //@@TODO is 3mm ok?
    2174:	ce 01       	movw	r24, r28
    2176:	0e 94 79 07 	call	0xef2	; 0xef2 <modules::motion::MovableBase::AxisDistance(long) const>
    217a:	bc 01       	movw	r22, r24
    217c:	90 e0       	ldi	r25, 0x00	; 0
    217e:	80 e0       	ldi	r24, 0x00	; 0
    2180:	0e 94 93 34 	call	0x6926	; 0x6926 <__floatunsisf>
    2184:	20 e0       	ldi	r18, 0x00	; 0
    2186:	30 e0       	ldi	r19, 0x00	; 0
    2188:	40 e9       	ldi	r20, 0x90	; 144
    218a:	52 e4       	ldi	r21, 0x42	; 66
    218c:	0e 94 58 34 	call	0x68b0	; 0x68b0 <__cmpsf2>
    2190:	87 fd       	sbrc	r24, 7
    2192:	1a c0       	rjmp	.+52     	; 0x21c8 <modules::selector::Selector::FinishHomingAndPlanMoveToParkPos()+0x70>
    2194:	88 e9       	ldi	r24, 0x98	; 152
    2196:	9a e3       	ldi	r25, 0x3A	; 58
    2198:	a0 e0       	ldi	r26, 0x00	; 0
    219a:	b0 e0       	ldi	r27, 0x00	; 0
    219c:	80 93 8b 05 	sts	0x058B, r24	; 0x80058b <modules::motion::motion+0x142>
    21a0:	90 93 8c 05 	sts	0x058C, r25	; 0x80058c <modules::motion::motion+0x143>
    21a4:	a0 93 8d 05 	sts	0x058D, r26	; 0x80058d <modules::motion::motion+0x144>
    21a8:	b0 93 8e 05 	sts	0x058E, r27	; 0x80058e <modules::motion::motion+0x145>

After:

if (AxisDistance(mm::axisUnitToTruncatedUnit<config::U_mm>(mm::motion.CurPosition<mm::Selector>())) < uint16_t(config::selectorLimits.lenght.v - 3)) { //@@TODO is 3mm ok?
    2174:	ce 01       	movw	r24, r28
    2176:	0e 94 79 07 	call	0xef2	; 0xef2 <modules::motion::MovableBase::AxisDistance(long) const>
    217a:	88 34       	cpi	r24, 0x48	; 72
    217c:	91 05       	cpc	r25, r1
    217e:	d0 f0       	brcs	.+52     	; 0x21b4 <modules::selector::Selector::FinishHomingAndPlanMoveToParkPos()+0x5c>
    2180:	88 e9       	ldi	r24, 0x98	; 152
    2182:	9a e3       	ldi	r25, 0x3A	; 58
    2184:	a0 e0       	ldi	r26, 0x00	; 0
    2186:	b0 e0       	ldi	r27, 0x00	; 0
    2188:	80 93 8b 05 	sts	0x058B, r24	; 0x80058b <modules::motion::motion+0x142>
    218c:	90 93 8c 05 	sts	0x058C, r25	; 0x80058c <modules::motion::motion+0x143>
    2190:	a0 93 8d 05 	sts	0x058D, r26	; 0x80058d <modules::motion::motion+0x144>
    2194:	b0 93 8e 05 	sts	0x058E, r27	; 0x80058e <modules::motion::motion+0x145>

Copy link

github-actions bot commented Dec 22, 2024

All values in bytes. Δ Delta to base

ΔFlash ΔSRAM Used Flash Used SRAM Free Flash Free SRAM
-122 0 28132 1667 540 893

@gudnimg
Copy link
Collaborator Author

gudnimg commented Dec 22, 2024

To add to the PR description... you will notice that applying this optimisation on only one of the if statements does not save a lot of resources. However, once you apply it to both if statements, __cmpsf2 is entirely removed from the firmware image. I suspect majority of the savings come from that.

For reference (https://gcc.gnu.org/onlinedocs/gccint/Soft-float-library-routines.html), what does __cmpsf2 do? It essentially compares two floating point values. We try to eliminate all floating point operations in the MMU firmware.

image

Copy link
Collaborator

@DRracer DRracer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch 👍 , please carry on 🙏 ... the compiler is not very intelligent

Copy link

Automated Test Code Coverage Report

View details...

File Lines Exec Cover
src/application.cpp 169 14 8%
src/application.h 3 3 100%
src/hal/circular_buffer.h 52 52 100%
src/hal/eeprom.h 1 0 0%
src/hal/gpio.h 18 7 38%
src/hal/progmem.h 6 6 100%
src/hal/tmc2130.cpp 113 9 7%
src/hal/tmc2130.h 31 27 87%
src/logic/command_base.cpp 139 118 84%
src/logic/command_base.h 4 3 75%
src/logic/cut_filament.cpp 117 80 68%
src/logic/eject_filament.cpp 77 60 77%
src/logic/feed_to_bondtech.cpp 60 57 95%
src/logic/feed_to_bondtech.h 1 1 100%
src/logic/feed_to_finda.cpp 48 45 93%
src/logic/feed_to_finda.h 1 1 100%
src/logic/home.cpp 18 12 66%
src/logic/load_filament.cpp 85 77 90%
src/logic/load_filament.h 1 0 0%
src/logic/move_selector.cpp 21 0 0%
src/logic/no_command.h 2 1 50%
src/logic/retract_from_finda.cpp 27 21 77%
src/logic/retract_from_finda.h 1 1 100%
src/logic/set_mode.cpp 5 0 0%
src/logic/set_mode.h 1 0 0%
src/logic/start_up.cpp 38 26 68%
src/logic/start_up.h 4 4 100%
src/logic/tool_change.cpp 108 82 75%
src/logic/unload_filament.cpp 76 69 90%
src/logic/unload_to_finda.cpp 40 39 97%
src/logic/unload_to_finda.h 1 1 100%
src/modules/axisunit.h 21 21 100%
src/modules/buttons.cpp 11 11 100%
src/modules/buttons.h 7 7 100%
src/modules/crc.h 13 13 100%
src/modules/debouncer.cpp 28 24 85%
src/modules/debouncer.h 7 7 100%
src/modules/finda.cpp 7 3 42%
src/modules/finda.h 2 2 100%
src/modules/fsensor.cpp 6 6 100%
src/modules/fsensor.h 3 3 100%
src/modules/globals.cpp 47 42 89%
src/modules/globals.h 34 24 70%
src/modules/idler.cpp 89 82 92%
src/modules/idler.h 12 12 100%
src/modules/leds.cpp 44 42 95%
src/modules/leds.h 16 15 93%
src/modules/math.h 6 6 100%
src/modules/motion.cpp 59 40 67%
src/modules/motion.h 66 64 96%
src/modules/movable_base.cpp 73 70 95%
src/modules/movable_base.h 16 16 100%
src/modules/permanent_storage.cpp 144 89 61%
src/modules/protocol.cpp 216 184 85%
src/modules/protocol.h 72 70 97%
src/modules/pulley.cpp 33 25 75%
src/modules/pulley.h 8 5 62%
src/modules/pulse_gen.cpp 95 89 93%
src/modules/pulse_gen.h 53 51 96%
src/modules/selector.cpp 69 62 89%
src/modules/selector.h 5 5 100%
src/modules/speed_table.h 26 24 92%
src/modules/user_input.cpp 39 39 100%
src/modules/user_input.h 12 12 100%
src/modules/voltage.cpp 4 0 0%
src/registers.cpp 104 37 35%
src/unit.h 12 12 100%
TOTAL 2727 2030 74%

TOTAL: 2727 lines of code, 2030 lines executed, 74% covered.

AxisDistance returns uint16_t type and is currently compared with long double axis length. The axis lengths fit easily into uint16_t:

selectorLimits.lenght = 75
idlerLimits.lenght = 225

Change in memory:
Flash: -122 bytes
SRAM: 0 bytes
@gudnimg
Copy link
Collaborator Author

gudnimg commented Dec 24, 2024

Rebased to sync with main. Will merge if all CI checks pass again.

@gudnimg gudnimg merged commit 4ab07d6 into prusa3d:main Dec 24, 2024
3 checks passed
@gudnimg gudnimg deleted the opt-idea-gudni branch December 24, 2024 13:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants