New Release: Doom
Moderator: Moderators
- Kweepa
- Vic 20 Scientist
- Posts: 1314
- Joined: Fri Jan 04, 2008 5:11 pm
- Location: Austin, Texas
- Occupation: Game maker
I procrastinated on fixing up the AI by converting the map code to assembly. It's now quite smooth. The entire automap is 1.25k. The C version was about 1.8k.
As usual you can see the code and the latest version (d64) at:
https://github.com/Kweepa/vicdoom
In xvic, set the memory to Full (blocks 0/1/2/3/5).
I have to be perfectly honest with you...I have in my possession a 3D game programming book that's nearly 1200 pages long that I've pretty much read cover to cover.
The math alone puts me to shame, but the ability to get 3 FPS out of the VIC with a 2048-pixel screen just blows me away.
I'm trying to wrap my head around it and it seems that there shouldn't be enough cycles to render anything more than a single frame per second.
Incredible!
Learning all the time...
- Kweepa
- Vic 20 Scientist
- Posts: 1314
- Joined: Fri Jan 04, 2008 5:11 pm
- Location: Austin, Texas
- Occupation: Game maker
Well, it's a bit of a cheat since the doom engine renders walls in vertical strips. The inner loop of the wall renderer runs at 48 cycles/pixel, so the theoretical max framerate is 10FPS when pressed up against a wall, and 20FPS when there's 50% coverage. Of course there's a lot of overhead in transforming sectors to screen space and setting up the wall drawing. And so on.
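The 10FPS and 20FPS figures can be checked with a quick back-of-the-envelope calculation. The sketch below assumes the 2048-pixel window mentioned earlier in the thread and the commonly cited VIC-20 CPU clocks for NTSC and PAL machines; neither clock value comes from the post itself, and the function name is just for illustration.

```python
# Back-of-the-envelope framerate ceiling for the wall inner loop.
# Assumed (not from the thread): CPU clock values for NTSC and PAL VIC-20s.
CPU_HZ = {"NTSC": 1_022_727, "PAL": 1_108_405}
SCREEN_PIXELS = 2048        # window size mentioned earlier in the thread
CYCLES_PER_PIXEL = 48       # inner loop cost quoted in the post


def max_fps(cpu_hz, coverage=1.0, cycles_per_pixel=CYCLES_PER_PIXEL):
    """Upper bound on frames/second if only the inner loop cost anything."""
    cycles_per_frame = SCREEN_PIXELS * coverage * cycles_per_pixel
    return cpu_hz / cycles_per_frame


for system, hz in CPU_HZ.items():
    print(f"{system}: {max_fps(hz):.1f} fps at 100% coverage, "
          f"{max_fps(hz, 0.5):.1f} fps at 50% coverage")
```

On the assumed NTSC clock this gives a little over 10 fps at full coverage and twice that at 50% coverage, in line with the numbers quoted above. Setup and transform overhead are deliberately left out.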
Here's the inner loop, by the way:
Code:
loop:
texAddr:
    lda textures        ; self modified texture address
    ; shift into position for the screen
    ; modify these as required (asl/lsr/nop)
shiftcodegoeshere:
    lsr
    lsr
    lsr
    lsr
tmask:
    and #0              ; self modified immediate operand
scrbuf:
    ora buffer,x        ; self modified screen addr
scrbuf2:
    sta buffer,x        ; self modified screen addr
    dex
    bmi end             ; leave the loop once x goes negative
    tya                 ; y holds the low (fractional) byte of the texture position
    clc
stepLo:
    adc #0              ; self modified immediate operand
    tay
    lda texAddr+1       ; low byte of the texture address operand
stepHi:
    adc #0              ; self modified immediate operand
    sta texAddr+1
    jmp loop
end:                    ; target of "bmi end"; missing from the posted listing
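As a sanity check on the 48 cycles/pixel figure, the instructions in the listing can be tallied using the standard documented 6502 timings. This is only a tally sketch: it assumes no page-crossing penalty on the indexed accesses and counts the non-exiting path through the loop.

```python
# Cycle tally for one pass through the inner loop as posted.
# Timings are the standard documented 6502 values; page-crossing
# penalties on the indexed accesses are ignored (an assumption).
INNER_LOOP = [
    ("lda abs", 4),            # lda textures
    ("lsr a", 2),
    ("lsr a", 2),
    ("lsr a", 2),
    ("lsr a", 2),
    ("and #imm", 2),
    ("ora abs,x", 4),
    ("sta abs,x", 5),
    ("dex", 2),
    ("bmi (not taken)", 2),
    ("tya", 2),
    ("clc", 2),
    ("adc #imm", 2),           # stepLo
    ("tay", 2),
    ("lda abs", 4),            # lda texAddr+1
    ("adc #imm", 2),           # stepHi
    ("sta abs", 4),            # sta texAddr+1
    ("jmp abs", 3),
]

total = sum(cycles for _, cycles in INNER_LOOP)
print(total)  # prints 48
```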
Just wondering, did you see this C64 (and C128)-demo?
http://noname.c64.org/csdb/release/?id=81157
Of special interest is the "Doom Workstage" included in it:
http://noname.c64.org/csdb/release/down ... ?id=102609
Looks quite nice on the C64 already, and even better on the C128. Maybe you can adapt some of the tricks they used.
- Kweepa
- Vic 20 Scientist
- Posts: 1314
- Joined: Fri Jan 04, 2008 5:11 pm
- Location: Austin, Texas
- Occupation: Game maker
Kweepa wrote: Here's the inner loop, by the way:
TNT wrote: How about masking before shifting, that will eliminate the need for "clc".
Hmm, would that still work when the "shifting" is all NOPs?
I guess I need to determine if anything can set the carry flag in the inner loop.
TNT wrote: Also, "dex; bpl" will save two cycles, the cost being that you need to "jmp loop" instead of dropping into it as texture address increment must be moved just before the loop.
Thanks! I will roll that in!
tokra wrote: Just wondering, did you see this C64 (and C128)-demo?
Looks quite nice on the C64 already, and even better on the C128. Maybe you can adapt some of the tricks they used.
That's a beautiful demo!
I don't know what I can get from it though. I'm way too far in another direction.
Oh wow, I just noticed you can walk around with joy 2! I figured it was "just" rendering some precomputed scenes. Jaw dropping. I feel inadequate now. Thanks.
TNT wrote: How about masking before shifting, that will eliminate the need for "clc".
Kweepa wrote: Hmm, would that still work when the "shifting" is all NOPs?
Yes, if you clear carry before you enter the loop and if you don't read textures backwards (step is always positive). I doubt you wrap the address from $ffff to $0000 on purpose.
You can combine NOPs to undocumented "nop #imm / nop $abs" (or use "bit $zp / bit $abs" if you don't like undocumented opcodes) to save a couple of cycles.
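The carry argument can be checked with a small model of the relevant flag behaviour. The sketch below is not the actual vicdoom loop: it is a hypothetical mask-then-shift variant with the "clc" removed, modelling only LSR and binary-mode ADC, and all function and parameter names are made up for illustration.

```python
def lsr(a):
    """6502 LSR A: returns (result, carry), carry being the bit shifted out."""
    return a >> 1, a & 1


def adc(a, operand, carry):
    """6502 ADC in binary mode: returns (8-bit result, carry out)."""
    total = a + operand + carry
    return total & 0xFF, total >> 8


def step(tex_byte, mask, shifts, frac, addr_lo, step_lo, step_hi, carry):
    """One pass of a mask-then-shift loop variant with no CLC.

    Returns (pixel, frac, addr_lo, carry). All names are hypothetical.
    """
    a = tex_byte & mask                  # AND leaves carry untouched
    for _ in range(shifts):
        a, carry = lsr(a)                # masked-off bits shift out as 0
    frac, carry = adc(frac, step_lo, carry)        # no CLC in front
    addr_lo, carry = adc(addr_lo, step_hi, carry)  # carries into the address byte
    return a, frac, addr_lo, carry


# "Shifting is all NOPs" case: carry is cleared once, before the loop,
# and stays clear as long as the address byte never wraps.
carry, frac, addr_lo = 0, 0, 0
for _ in range(100):
    _, frac, addr_lo, carry = step(0xFF, 0xC0, 0, frac, addr_lo, 0x80, 0x01, carry)
    assert carry == 0
```

With real shifts in place, the last LSR shifts out a masked-off zero bit and so clears carry regardless of what came before. With no shifts, carry only stays clear while the second ADC does not overflow, which is the "step is always positive, no wrap" condition described above.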
- Mike
- Herr VC
- Posts: 4816
- Joined: Wed Dec 01, 2004 1:57 pm
- Location: Munich, Germany
- Occupation: electrical engineer
Kweepa wrote: Oh wow, I just noticed you can walk around with joy 2! I figured it was "just" rendering some precomputed scenes. Jaw dropping. I feel inadequate now. Thanks.
The renderer in the Andropolis Doom stage "just" needs to draw the top and bottom lines of all walls. Then there's a completely unrolled EOR filler thrown against the bitmap. The halftones are realised by handling even and odd rasters separately.
But this renderer won't do textures.
Anyway, over the years I have made the observation that the speed a "core" routine might deliver nearly always gets gobbled up by overhead of the surrounding program in the order of ~50%. I.e. if you have a line routine available which does ~30000 pixels/second if driven directly (I do have), you will only get around ~15000 pixels/second in a "real world" application.
So, something around 45 cycles/pixel surely isn't bad for a constant-Z texture-mapper, but then you also need to clear the screen, exchange display buffers somehow, calculate slopes, clip lines, etc. And then 4-5 fps for that window size is in order.
- Kweepa
- Vic 20 Scientist
- Posts: 1314
- Joined: Fri Jan 04, 2008 5:11 pm
- Location: Austin, Texas
- Occupation: Game maker
Mike wrote: So, something around 45 cycles/pixel surely isn't bad for a constant-Z texture-mapper, but then you also need to clear the screen, exchange display buffers somehow, calculate slopes, clip lines, etc. And then 4-5 fps for that window size is in order.
Totally agreed.
Clear the screen - not much you can do about that. With a bit of loop unrolling, about 40 scan lines.
Exchanging display buffers can be done in two instructions - just set the character map address.
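To put these numbers side by side, here is a rough budget sketch. It combines Mike's round 45 cycles/pixel and ~50% overhead rule of thumb with the 40 scan line clear estimate. The NTSC clock and the 65 cycles per raster line are assumptions not stated in the thread, and the function names are hypothetical.

```python
# Rough per-frame estimate. Figures are either quoted from the thread or
# labelled assumptions; this is an estimate, not a measurement.
CPU_HZ = 1_022_727            # assumed: NTSC VIC-20 CPU clock
CYCLES_PER_SCANLINE = 65      # assumed: NTSC VIC-20 raster line length
SCREEN_PIXELS = 2048          # window size mentioned in the thread
CYCLES_PER_PIXEL = 45         # Mike's round figure for the texture mapper
CLEAR_SCANLINES = 40          # estimate for an unrolled screen clear
CORE_EFFICIENCY = 0.5         # Mike's rule of thumb: ~50% lost to overhead


def core_only_fps(coverage=1.0):
    """Framerate if nothing but the texture-mapping inner loop ran."""
    return CPU_HZ / (SCREEN_PIXELS * coverage * CYCLES_PER_PIXEL)


def real_world_fps(coverage=1.0):
    """Core-only framerate scaled by the ~50% rule of thumb."""
    return core_only_fps(coverage) * CORE_EFFICIENCY


clear_cycles = CLEAR_SCANLINES * CYCLES_PER_SCANLINE
print(f"core only: {core_only_fps():.1f} fps, "
      f"with overhead: {real_world_fps():.1f} fps, "
      f"screen clear: {clear_cycles} cycles per frame")
```

Under these assumptions the full-coverage estimate comes out between 5 and 6 fps, slightly above the 4-5 fps quoted, which seems plausible given how rough the 50% figure is. The screen clear is a small share of the frame compared with the wall drawing itself.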
- Kweepa
- Vic 20 Scientist
- Posts: 1314
- Joined: Fri Jan 04, 2008 5:11 pm
- Location: Austin, Texas
- Occupation: Game maker
I got the inner loop down to 43 cycles average with the suggestions in this thread - thanks all!
Worked on the AI some more. They can now follow the player from sector to sector, shoot, react to being shot, and (sort of) die. Getting there! Some tuning, clean up, and optimization to go.
I don't think I'll be getting this finished for Christmas... sorry, Denial, I don't have a Christmas present for you. I still have to lay out the other levels, which will take a few days. I might also convert a few more songs.
Kweepa wrote: sorry, Denial, I don't have a Christmas present for you. I still have to lay out the other levels, which will take a few days. I might also convert a few more songs.
No problem, we waited nearly 30 years for that, so it makes no difference to wait another couple of weeks or a month ...
But it is really great and I enjoy seeing this project grow!
Have a nice Christmas all you Denial members!!
- Pedro Lambrini
- Vic 20 Scientist
- Posts: 1132
- Joined: Mon Dec 01, 2008 11:36 am