Wrote a combined patcher and new loader. It loads at $9000 and copies the new loader over the old one at $9C6F:
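         org $9000
         clc
         xce              ; native mode
         rep #$30         ; 16-bit A and X/Y
         ldx #$0000
relo     lda patch,x      ; copy the new loader, one word at a time,
         sta $9C6F,x      ; from where it was loaded to $9C6F
         inx
         inx
         cpx #pend-patch2 ; length of the new loader in bytes
         bcc relo
* TEST area to insert other stuff
         nop
         nop
         nop
         nop
         nop
         nop
         nop
         nop
* TEST end
         jmp $9478        ; $9478, not $9400: skips the toolbox vector rewrite
patch    = *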
* new regular smartport loader to replace Load_Segment
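         org $9C6F
patch2
         pha              ; save A, used below as a pointer to the load table
         sec
         xce              ; emulation mode
         sep #$30         ; 8-bit A and X/Y
         lda #$41
         sta $C029        ; NEWVIDEO: super hi-res off so text is visible
         lda #$F0
         sta $C022        ; TBCOLOR: white text on black
         clc
         xce              ; back to native mode
         rep #$30         ; 16-bit A and X/Y
         pla
         sta $00          ; $00/$01 = pointer into the load table
         lda ($00)
         sta blknum       ; first word of the table = starting block
]loadlp  inc $00
         inc $00
         lda ($00)
         sta buffer       ; buffer address, low word
         inc $00
         inc $00
         lda ($00)
         sta buffer+2     ; buffer address, high word
         inc $00
         inc $00
         lda ($00)        ; count for this entry
         beq ret          ; a zero count ends the table
         lsr              ; halve it to get the number of blocks
         tay
         inc $00
         inc $00
]blklp   phy              ; save the block counter across the calls
         sec
         xce              ; emulation mode for the monitor and SmartPort calls
         sep #$30
* TEST show info about the block we're about to read
         lda blknum+1
         jsr $FDDA        ; PRBYTE: print A as two hex digits
         lda blknum
         jsr $FDDA
         lda #$A0         ; space
         jsr $FDED        ; COUT: print the character in A
         lda #$C0         ; "@"
         jsr $FDED
         lda #$A0
         jsr $FDED
         lda buffer+2     ; buffer address, bank byte first
         jsr $FDDA
         lda buffer+1
         jsr $FDDA
         lda buffer
         jsr $FDDA
         lda #$A0
         jsr $FDED
         lda $01          ; current table pointer
         jsr $FDDA
         lda $00
         jsr $FDDA
         lda #$8D         ; carriage return
         jsr $FDED
* TEST end
         jsr $C50D        ; SmartPort dispatcher, slot 5 (hard-coded)
         db $41           ; extended READ BLOCK command
         adrl params      ; 4-byte pointer to the parameter list
         clc
         xce              ; back to native mode
         rep #$30
         inc buffer+1     ; two 16-bit increments at buffer+1
         inc buffer+1     ; advance the buffer by $200 (one block)
         inc blknum       ; next block
         ply
         dey
         cpy #$00         ; redundant after DEY, kept for clarity
         bne ]blklp
         jmp ]loadlp
ret      rts
params   db $03           ; parameter count
         db $01           ; unit number
buffer   adrl $0          ; data buffer pointer
blknum   adrl $0          ; block number
pend     = *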
The compares with zero are redundant (DEY already sets the zero flag) and are there only for clarity. The interesting part is that I jump to $9478 instead of $9400, which skips the code that rewrites the toolbox vectors. If I jump straight to $9400, the loader crashes about halfway through on my ROM 3 with the CFFA3000, although it runs fine in an emulator. Does the CFFA3000's extended SmartPort call invoke tools? Maybe; I'm too lazy to find out right now. Also, I don't derive $C50D properly, which is fine in this case because the target interface is known (the CFFA card in slot 5).
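For reference, deriving the entry point instead of hard-coding $C50D could look roughly like the sketch below. It assumes the usual SmartPort convention: the byte at $CnFF is the low byte of the card's ProDOS block-device entry, and the SmartPort dispatcher sits three bytes after it. The labels getsp, spcall and nosp are hypothetical; spcall would have to be added to the existing "jsr $C50D" line. The slot is still fixed at 5, and I have not run this on the CFFA3000. If the card follows the convention, the patched address should come out as $C50D again.

```
* Sketch only: derive the slot 5 SmartPort entry at run time.
* Call once in emulation mode, before the first block read.
* getsp, spcall and nosp are hypothetical labels; spcall is
* assumed to label the "jsr $C50D" instruction in the loader.
         mx %11           ; tell Merlin that A and X/Y are 8 bits here
getsp    lda $C507        ; last SmartPort ID byte, $00 on a SmartPort card
         bne nosp         ; (a full check also tests $C501, $C503, $C505)
         cld              ; make sure ADC works in binary
         lda $C5FF        ; low byte of the ProDOS block-device entry
         clc
         adc #$03         ; SmartPort dispatcher is three bytes past it
         sta spcall+1     ; patch the JSR operand, low byte
         lda #$C5
         sta spcall+2     ; high byte: slot 5 firmware page
         clc              ; carry clear = entry found and patched
         rts
nosp     sec              ; carry set = no SmartPort firmware in slot 5
         rts
```

Patching the JSR operand keeps the inline command byte and parameter pointer exactly where the dispatcher expects them, right after the call.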
Used Block Warden to Follow the assembled Merlin output, then Change to slot 5, drive 1, then Write it to block $0600 of the Modulae disk.