Shorten text on a cc65 program for Vic20?

HarryP2
Vic 20 Dabbler
Posts: 85
Joined: Sat Sep 26, 2015 8:40 am
Location: New York, U.S.A.

Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

Hi! I have a program called AdvSkelVic. It is a source-code skeleton around which one can build one's own text adventure. It uses CBMSimpleIO and some Assembler, so it's pretty efficient. I have a cc65 function for cc65/CBM targets called printtok() that replaces tokens in a string with the text they stand for. It is re-entrant, so you can embed tokens in other tokens. I will reveal the URLs on request. Other than tokens, is there any way to shorten strings with little effort and little extra code?
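For readers unfamiliar with the idea, here is a minimal sketch of nested token expansion in C. It is not the actual printtok() source, which is not shown in this thread; the token table, the 0x80 base value and the name expand_tokens are all assumptions made for illustration. It writes into a buffer so it can be checked on any compiler; on the VIC-20 one would more likely print each character as it is produced.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical token table: bytes 0x80..0x83 stand for these strings.
   A token's text may itself contain tokens (0x83 embeds 0x80). */
#define TOKEN_BASE 0x80u
static const char *const token_table[] = {
    "the ",          /* 0x80 */
    "you ",          /* 0x81 */
    "door",          /* 0x82 */
    "\x80" "key"     /* 0x83, expands to "the key" */
};
#define TOKEN_COUNT (sizeof token_table / sizeof token_table[0])

/* Expand s into out, starting at write position pos (pass 0 at top level).
   cap is the buffer size including the terminating NUL.
   Returns the new write position, i.e. the length written so far.
   Output that does not fit is silently dropped. */
static size_t expand_tokens(const char *s, char *out, size_t cap, size_t pos)
{
    assert(cap > 0);
    for (; *s != '\0'; ++s) {
        unsigned char c = (unsigned char)*s;
        if (c >= TOKEN_BASE && c < TOKEN_BASE + TOKEN_COUNT) {
            /* Recurse so that tokens inside tokens are expanded too. */
            pos = expand_tokens(token_table[c - TOKEN_BASE], out, cap, pos);
        } else if (pos + 1 < cap) {
            out[pos++] = (char)c;
        }
    }
    out[pos] = '\0';
    return pos;
}
```

With this table, the 9-byte string "\x81" "open " "\x80" "\x82" expands to "you open the door". The recursion depth equals the deepest token nesting, which matters on a 6502 with its small stack.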
DarwinNE
Vic 20 Devotee
Posts: 231
Joined: Tue Sep 04, 2018 2:40 am
Website: http://davbucci.chez-alice.fr
Location: Grenoble - France

Re: Shorten text on a cc65 program for Vic20?

Post by DarwinNE »

Tokenized text seems like a nice approach. In my adventures, I implemented Huffman compression. It achieves a compression ratio of about 45-50%. Decompression is reasonably quick with a binary tree and is done on the fly.
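As an illustration of decoding with a binary tree, here is a small sketch in C. It is not the decoder that AWS2C generates; the four-symbol tree, the node layout and the name huff_decode are assumptions chosen to keep the example short. Bits are read most significant bit first, and each symbol is found by walking from the root until a leaf is reached.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One tree node. In a leaf both children are -1 and symbol is valid. */
typedef struct {
    signed char child[2];   /* next node index for bit 0 and bit 1 */
    char symbol;
} HuffNode;

/* Hypothetical hand-built tree: 'e' = 0, 't' = 10, ' ' = 110, 'a' = 111 */
static const HuffNode tree[] = {
    { { 1,  2 }, 0   },   /* 0: root */
    { { -1, -1 }, 'e' },  /* 1 */
    { { 3,  4 }, 0   },   /* 2 */
    { { -1, -1 }, 't' },  /* 3 */
    { { 5,  6 }, 0   },   /* 4 */
    { { -1, -1 }, ' ' },  /* 5 */
    { { -1, -1 }, 'a' }   /* 6 */
};

/* Decode count symbols from the packed bit stream into out.
   out must have room for count + 1 characters. */
static void huff_decode(const unsigned char *bits, size_t count, char *out)
{
    size_t bitpos = 0;
    size_t i;
    assert(bits != NULL && out != NULL);
    for (i = 0; i < count; ++i) {
        int node = 0;
        while (tree[node].child[0] >= 0) {          /* until a leaf */
            int bit = (bits[bitpos >> 3] >> (7 - (bitpos & 7))) & 1;
            node = tree[node].child[bit];
            ++bitpos;
        }
        out[i] = tree[node].symbol;
    }
    out[count] = '\0';
}
```

With this tree, the two bytes 0x7B 0x4E decode to the seven characters "eat tea". The tree itself also takes up memory, so the net saving on a real text depends on the alphabet size as well as the symbol frequencies.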
HarryP2

Re: Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

Thank you. :)
HarryP2

Re: Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

I have code that compresses with Adaptive Huffman codes, but it's not ready for use yet, as I'm still trying to get big numbers from my compression techniques. Could you kindly post your compression code here, please?
DarwinNE

Re: Shorten text on a cc65 program for Vic20?

Post by DarwinNE »

Yes, of course. Here it is:

https://github.com/DarwinNE/aws2c/blob/ ... compress.c

Probably the compression code is not very elegant or efficient, but it is meant to work on a modern computer. The decompression code is generated by the output_decoder function:

https://github.com/DarwinNE/aws2c/blob/ ... ess.c#L207

The whole thing is part of AWS2C, which generates the C source code of an adventure from a sort of meta-language called AWS.
HarryP2

Re: Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

Again, thank you. Does anybody here have any other ideas?
HarryP2

Re: Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

DarwinNE: I ask you to add tokens to your text adventure creator and mention they're from me. :)
HarryP2

Re: Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

I ask you to support cc65 directly and use my Cubbyhole technique and other optimizations mentioned at https://sourceforge.net/projects/cc65extra/files/. The Cubbyhole technique puts some code and data in low memory that is otherwise unused while a cc65 program runs: the first 1k on a Vic20 and C64, 2k on the Plus4 and 7k on a C128. My MadLib* and AdvSkel*65 programs demonstrate its use.
Mike
Herr VC
Posts: 4841
Joined: Wed Dec 01, 2004 1:57 pm
Location: Munich, Germany
Occupation: electrical engineer

Re: Shorten text on a cc65 program for Vic20?

Post by Mike »

HarryP2 wrote: [...] Does anybody here have any other ideas?

I used a text compression method in "Bible Series, part II: Pentateuch" that encodes three consecutive characters of a fixed 5-bit alphabet (all lowercase letters, blank, comma, period, semicolon, hyphen and single quote) into two bytes.

Chuck Guzis had this algorithm published in Dr. Dobb's Journal #207 (November 1993), and I made some small enhancements to ensure binary transparency. With this algorithm, most texts compress by ~30%, which was sufficient in this case to fit the whole Torah on one disk for the 1581.
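The basic packing step can be sketched in C as follows. This is only the plain 3-into-2 idea under assumed details: the ordering of the 32-symbol alphabet, the bit layout and the names pack3 and unpack3 are illustrative choices, and the binary-transparency enhancements mentioned above are not reproduced. Three 5-bit codes occupy 15 bits, which leaves the top bit of each 16-bit pair unused here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 26 letters plus blank , . ; - ' gives exactly 32 symbols (5 bits each).
   The ordering is an assumption made for this sketch. */
static const char alphabet[33] = "abcdefghijklmnopqrstuvwxyz ,.;-'";

/* Map a character to its 5-bit code; it must be in the alphabet. */
static unsigned code_of(char c)
{
    const char *p = (c != '\0') ? strchr(alphabet, c) : NULL;
    assert(p != NULL);
    return (unsigned)(p - alphabet);
}

/* Pack len characters (len must be a multiple of 3) into out.
   Returns the number of bytes written, which is len / 3 * 2. */
static size_t pack3(const char *text, size_t len, unsigned char *out)
{
    size_t i, n = 0;
    assert(len % 3 == 0);
    for (i = 0; i < len; i += 3) {
        unsigned v = (code_of(text[i]) << 10)
                   | (code_of(text[i + 1]) << 5)
                   |  code_of(text[i + 2]);
        out[n++] = (unsigned char)(v >> 8);     /* high byte first */
        out[n++] = (unsigned char)(v & 0xFF);
    }
    return n;
}

/* Unpack nbytes (a multiple of 2) back into text; out needs
   nbytes / 2 * 3 + 1 characters. */
static void unpack3(const unsigned char *in, size_t nbytes, char *out)
{
    size_t i, n = 0;
    assert(nbytes % 2 == 0);
    for (i = 0; i < nbytes; i += 2) {
        unsigned v = ((unsigned)in[i] << 8) | in[i + 1];
        out[n++] = alphabet[(v >> 10) & 31];
        out[n++] = alphabet[(v >> 5) & 31];
        out[n++] = alphabet[v & 31];
    }
    out[n] = '\0';
}
```

An 18-character string such as "let there be light" packs into 12 bytes. Two bytes for every three characters is a saving of one third before any padding or escape overhead for characters outside the alphabet.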

Before you "ask" me about something - said application and its support algorithms work fine as they are.
HarryP2

Re: Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

I can't use your algorithm as you stated, as I have a need for 7 bits per entry. But, I could compress to 7 bits per entry, including tokens. What do you think?

BTW, on some versions of my text adventure code, I store most strings in a system's extra memory. I could create a function to decompress and access the text on-the-fly. Thank you for the info! :)
HarryP2

Re: Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

I have a technique called POBasic, short for Placement Offset Basic. It shortens some literals to an offset to the previous occurrence of the same literal, sort of like LZ77 for single bytes. The problem with both this and Adaptive Huffman is that the decoder needs to know the values that came before the current string. I don't think POBasic would be better than Huffman, but the two combined seem to help. :)

BTW, if you use this method, all I ask is that you state in your docs that you use this technique and e-mail me that you did so, with the name and URL of the program. If you have any questions, just reply here, PM me or ask me for my e-mail address.
HarryP2

Re: Shorten text on a cc65 program for Vic20?

Post by HarryP2 »

One more idea: if a literal's Huffman code would be 8 bits or longer, don't compress it. Rather, write the byte directly. Unfortunately, you need an extra bit per literal to tell whether it is compressed or not, but my tests show that this works. Then again, one bit at the start of each block can say whether the block uses my idea at all, so a block without such long codes pays only that single bit. Try it out!
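The trade-off can be checked with simple arithmetic before writing any encoder. The sketch below, in C, compares the total bit count of a block with and without the per-literal flag. The function names and the idea of passing code lengths and occurrence counts as arrays are assumptions for illustration, and the numbers in the usage note are made up, not measurements.

```c
#include <assert.h>
#include <stddef.h>

/* Plain Huffman: each occurrence costs its code length in bits. */
static unsigned long plain_bits(const unsigned char *len,
                                const unsigned *count, size_t n)
{
    unsigned long total = 0;
    size_t i;
    assert(len != NULL && count != NULL);
    for (i = 0; i < n; ++i)
        total += (unsigned long)len[i] * count[i];
    return total;
}

/* With a flag bit per literal: 1 + code length when compressed,
   1 + 8 when the code would be 8 bits or longer and the raw byte
   is written instead. */
static unsigned long escaped_bits(const unsigned char *len,
                                  const unsigned *count, size_t n)
{
    unsigned long total = 0;
    size_t i;
    assert(len != NULL && count != NULL);
    for (i = 0; i < n; ++i) {
        unsigned cost = 1u + (len[i] >= 8 ? 8u : len[i]);
        total += (unsigned long)cost * count[i];
    }
    return total;
}

/* Value for the per-block flag: nonzero if the escape scheme is smaller. */
static int block_should_escape(const unsigned char *len,
                               const unsigned *count, size_t n)
{
    return escaped_bits(len, count, n) < plain_bits(len, count, n);
}
```

For example, with code lengths 2, 3 and 12 occurring 100, 50 and 10 times, plain coding needs 470 bits and the escaped form 590, so the block flag should be off. With lengths 2, 3 and 14 occurring 10, 10 and 40 times, which can happen when the code table was built from different text than the block, plain coding needs 610 bits and the escaped form 430, so the flag should be on.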

BTW, is there an easy way to compress text other than tokenization? I'm currently not ready for Huffman. :(
HarryP2

Exchange of compression ideas

Post by HarryP2 »

Hi! I am offering an exchange of compression techniques. In exchange, I would like ideas that are not listed on Wikipedia, or that extend something listed there. My ideas:

* Tokenization: often-repeated text is shortened to a one-byte token. An example of this is my cc65 printtok() function at https://sourceforge.net/projec.../files/ui/.
* POBasic: If a lit is the same as a nearby previous lit, shorten it to an offset to the previous occurrence.
* Last16: If an LZ77-compressed block is a repeat of one of the last 32 LZ77-compressed blocks, compress it to a number indicating which previous block is used.
* If a Huffman Code-compressed literal compresses to more than 8 bits, don't compress it. Rather, copy it as an 8-bit literal. This requires an extra bit per literal, but my experiments show that this usually helps.

You may use these ideas in your programs provided you mention me as the source of the ideas in them and tell me via PM or e-mail you're using these ideas. Right now, I'm especially looking for something which requires little work.
chysn
Vic 20 Scientist
Posts: 1205
Joined: Tue Oct 22, 2019 12:36 pm
Website: http://www.beigemaze.com
Location: Michigan, USA
Occupation: Software Dev Manager

Re: Exchange of compression ideas

Post by chysn »

It doesn't seem like a one-byte token would provide a big enough lexicon for a reasonably-complex text adventure. I'd probably use an extendible tokenization system (where $ff indicates that there's a second byte that switches to a second table, and so on), or maybe a straight 12-bit token.
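A rough sketch of the extendible scheme in C follows. The table contents, the 0x80 base for token bytes and the name read_token are assumptions for illustration; each leading $ff byte simply selects the next table, so more tables can be added without changing the format.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOKEN_BASE 0x80u   /* first token byte value (assumed) */
#define TOKEN_EXT  0xFFu   /* prefix: switch to the next table */

/* Hypothetical word tables. Table 0 is reached with one byte,
   table 1 with $ff followed by one byte. */
static const char *const table0[] = { "north", "south" };
static const char *const table1[] = { "lantern", "troll" };
static const char *const *const tables[] = { table0, table1 };
static const size_t table_len[] = { 2, 2 };
#define TABLE_COUNT (sizeof tables / sizeof tables[0])

/* Decode the token at *pp and advance *pp past it.
   Returns the token text, or NULL (leaving *pp unchanged) if the
   bytes at *pp do not form a valid token. */
static const char *read_token(const unsigned char **pp)
{
    const unsigned char *p;
    size_t table = 0;
    size_t idx;
    assert(pp != NULL && *pp != NULL);
    p = *pp;
    while (*p == TOKEN_EXT) {   /* each $ff moves one table further */
        ++table;
        ++p;
    }
    if (table >= TABLE_COUNT || *p < TOKEN_BASE)
        return NULL;
    idx = (size_t)(*p - TOKEN_BASE);
    if (idx >= table_len[table])
        return NULL;
    *pp = p + 1;
    return tables[table][idx];
}
```

Frequent words belong in table 0, where they cost one byte; rarer words in later tables cost one extra byte per table level. A straight 12-bit token would instead give 4096 entries at a fixed cost of a byte and a half each.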
HarryP2

Re: Exchange of compression ideas

Post by HarryP2 »

Thank you. You're right about the tokens. I managed to compress the text of one of my text adventures by 13.6%, and I still have room for a few more tokens.