Programming language implementation ...

Basic and Machine Language

Moderator: Moderators

Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

It wouldn't let me attach a .c file, but here is a .zip file as you suggested.
Attachments
parse.zip
(1.62 KiB) Downloaded 52 times
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

This version builds with cc65 (but I haven't tested the resulting code on 6502.)
I had to comment out the power function to get it to build with cc65.
This is very quick and dirty code, needless to say.
Attachments
parse65.zip
(1.65 KiB) Downloaded 57 times
User avatar
Mike
Herr VC
Posts: 4816
Joined: Wed Dec 01, 2004 1:57 pm
Location: Munich, Germany
Occupation: electrical engineer

Re: Programming language implementation ...

Post by Mike »

One quick stab at it:

The operators "+" and "-" are supposed to be on the same precedence level, as are "*" and "/". Two consecutive operators of same precedence level are evaluated left to right.

Your parser places "+" above "-" and "/" above "*", which is wrong.

Both expressions 1 + 2 - 3 and 1 - 3 + 2 should evaluate to 0, however the parser evaluates these as (1 + 2) - 3 = 0 != -4 = 1 - (3 + 2).

:(
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

Like I say it is quick and dirty code - just my tinkering from this afternoon.

It is based on the URL referenced in the source, and there may be some shortcomings to his algorithm implementation. I suspect it is possible to do it without the recursion which would be more stack friendly (important on 6502!) I am not sure if there is an easy way for two operators to have the same precedence with this method.

Incidentally I got the second version to compile with cc65 and can get as far as the gets() call under VICE, but it doesn't evaluate. I am looking to see where it chokes.

(EDIT: I remember learning in school 'BODMAS' Brackets Of (meaning division) Multiplication Addition and Subtraction ... so there is some idea that division has higher precedence than multiplication and addition over subtraction.)
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

I think the shortcoming is that it uses the token value as the precedence but these need to be independent to allow two different tokens to have the same precedence ...
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

I think this tweak fixes the problem you noticed Mike. It makes precedence independent of the token value. The implementation is very quick and dirty just to prove the point, but it seems to do the trick.

Thank you for the alpha testing!

Code: Select all

Enter expression: 
1-3+2
0
Enter expression: 
1+2-3
0
Attachments
parse.zip
(1.69 KiB) Downloaded 62 times
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

Yay! I got the expression parser working on VICE. (Will try it on the real VIC once I get access to the TV!)

There were a couple of lurking bugs in the code which compiling with cc65 exposed.
snapshot.png
snapshot.png (2.48 KiB) Viewed 1437 times
I'll send the updated C source once I clean it up a bit!

I still want to hand code this in 6502 though. I may take a look at the cc65 assembly to see how good a job it did. I haven't tried optimization yet. It's nearly 6K which seems a bit excessive, but I expect a lot of that is the C runtime. (How big is a null program compiled with cc65 I wonder ... ?)
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

The attached debugged C source builds with cc65 and runs on VICE.
Let me know if you find any more bugs! :roll:

(BTW this workload runs 7800 times faster on my PC than on the VIC!)
Attachments
parse.zip
(1.94 KiB) Downloaded 59 times
User avatar
Mike
Herr VC
Posts: 4816
Joined: Wed Dec 01, 2004 1:57 pm
Location: Munich, Germany
Occupation: electrical engineer

Re: Programming language implementation ...

Post by Mike »

Now this looks more like it. :)

Another quick session this morning didn't reveal any obvious bugs.
Bobbi wrote:I remember learning in school 'BODMAS' Brackets Of (meaning division) Multiplication Addition and Subtraction ... so there is some idea that division has higher precedence than multiplication and addition over subtraction.
The concepts taught in school are geared to be easy to grasp, and hopefully correct in most circumstances.

"BODMAS" however fails in several cases (No doubt: a bracketed subexpression must be evaluated in full before the result is used):

1. A division actually has *less* precedence when written as horizontal divider - both numerator and denominator are implicitly bracketed:

Code: Select all

    3 + 5     8
    -----  = --- = 4 = (3 + 5)/(1 + 1) != 3 + 5 / 1 + 1 = 9
    1 + 1     2
Note: Both sides of the inequality are *correct*. This example just shows the interpretation changes between fractional notation and writing as "linear" expression.

2. use of "shortened" multiplication can lead to ambiguity:

=> what is "1/2x"? (1/2)*x or 1/(2*x)? The answer depends on your maths professor. Once again, fractional notation clears this up, the expressions:

Code: Select all

   1       1               1        1
  --- x = --- * x  and  ------- = -----
   2       2             2 * x     2 x
can clearly be distinguished. In linear notation, multiplication and division have the same precedence and are evaluated left-to-right, i.e. 1 / 2 * x = (1 / 2) * x. If what you really wanted was 1 / (2 * x), you have to write it this way.

3. There's no precedence of addition over subtraction:

=> a running sum of positive and negative constants can be evaluated in any order without changing the result. Likewise it is permissible to reorder the expression in that case:

4 + (-3) + 2 + (-1) = 4 - 3 + 2 - 1 = 2 ... = 4 + 2 - 3 - 1 = 2 - 1 - 3 + 4 = -3 + 2 - 1 + 4 ... you get the idea.

Of course when a bracketed expression is preceded by a minus operator, and we want to solve the bracket, *all* terms inside need to be negated:

3 + 4 - (2 - 1) = 3 + 4 - 2 + 1

As you see it's quite common to change between addition and subtraction when negative constants or bracketed subexpressions are involved, it makes no sense to give addition a higher priority over subtraction.

It's however a good style to put negative constants into brackets. Not only makes this the precedence level clear in case of exponentiation ( -3^2 = -9 != 9 = (-3)^2 ), but it also prevents reading errors: when multiplication is written as single point in the middle of the line (a notation quite common) the case "2 ·- 3" could easily be mistaken for "2 - 3", when really "2 * (- 3)" was meant.


...


Just some random thoughts. :)

P.S. http://sleepingelephant.com/ipw-web/bul ... 3&start=34 ;)
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

When coding I tend to use lots of brackets, rather than remember all the precedence rules. Makes the code easier to read, especially for those of us that use lots of different languages with slightly different rules!

I will have to take a look at how much work it is to rewrite this little parser in assembly.
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

I am still playing with this with the intention of first putting together a minimal language (think Tiny Basic) and then seeing if I can extend it into something a little more useful.
snapshot.png
snapshot.png (3.27 KiB) Viewed 1403 times
So far all that is possible is to declare integer variables (not update them yet, that is coming soon!) and to use those variables in expressions.

Code: Select all

int name = expression
print expression
quit
# comment
It's still buggy, but getting somewhere I think.
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

Incidentally, I am still working on this project - developing a simple interpreted language with the VIC-20 as the target. Mostly I have been pursuing this as a learning exercise, but I hope the resulting language is somewhat useful!

It is in reasonable shape now, and I am going to have to stop adding features soon because the available memory is dropping fast! I need to do a bit more testing and improve the error handling and then I think I can make an alpha release.

Right now it is useful as a programmable calculator and text editor combo :) I also tried to make it a nice language for poking memory and bit twiddling. It's good for doing stuff like this:

Code: Select all

int vic <- $900f
int bord <- 3
int scrn <- 4
*vic <- (bord<<4)#(scrn&$07)
(Where <- is assignment and # is bitwise or, because VIC doesn't have |).

Features:
- Integrated line editor which is also usable for plain PETSCII files (provided they fit in available memory). Feels a bit like one of the UNIX line editors, which is OK for me but I expect others will hate it!
- Supports loading and saving programs (or random text files) to disc (tested with SD2IEC and VICE, we'll see if it works with real drives!)
- Language supports the following:
- Signed 16 bit integer variables ("int")
- Named labels ("lbl")
- Unconditional jump ("jump")
- Conditional if ("if")
- Named subroutines ("sub" / "return") (no parameters though)
- C style expressions including logical, bitwise and shift operators.
- Memory access using C-style prefix * for dereference.
- Displays values in decimal or hex.
- Constants can be input in decimal or hex.

Minuses:
- It is written in C and compiled with cc65. C is not a great language for 6502 / cc65 is not a great compiler. The code is pretty bloated and I have to stop adding functions before I have nothing left from my 32K expansion! It would be better (prob half the size) directly written in assembler and I may do that as some point. However it is easier to play with the code in C.
- It does not tokenize - it just executes the raw program text. I may change this to make it more economical with memory and a little faster but for now I prefer it as it is because I can use the line editor to edit plain PETSCII files too.
- Related to the two points above, it is slow. It seems to be about half the speed of CBM BASIC (and CBM BASIC is using floats, not ints.)
snapshot.png
snapshot.png (1.64 KiB) Viewed 1379 times
Sample program (<- is the CBM left arrow key which I use for assignment):

Code: Select all

call setscrcol
int i <- 99
lbl mainloop
  i <- i - 1
  print i
  if i>1 prstr " bottles of beer"
  if i==1 prstr " bottle of beer"
if i>0 jump mainloop

# Set screen color
# to my favorite combo
sub setscrcol
  *($900f) <- 254
return
User avatar
Mike
Herr VC
Posts: 4816
Joined: Wed Dec 01, 2004 1:57 pm
Location: Munich, Germany
Occupation: electrical engineer

Re: Programming language implementation ...

Post by Mike »

Bobbi wrote:<- is the CBM left arrow key which I use for assignment
Using the PETSCII left-arrow symbol for assignments is just plain lovely. <3
It is in reasonable shape now, and I am going to have to stop adding features soon because the available memory is dropping fast! [...] C is not a great language for 6502 / cc65 is not a great compiler. The code is pretty bloated and I have to stop adding functions before I have nothing left from my 32K expansion!
One major aspect what "kills" the code size for C on the 65xx is the parameter passing of functions. The cc65 C runtime has to simulate the C stack in all its glory, and that housekeeping blows up the function entry, exit and variable access. You really can't blame this on cc65.

If you declare all variables as extern and all functions as "void fn(void)", you can expect the code size to drop by a good amount. Of course you're then letting go all the goodies from the C language and just use it as portable assembler, but that's the way it is.
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

I don't have too many locals anyhow for the reasons you describe. However I have been focusing on getting things working and there is probably some slimming down that can be done. There is common code that can be factored out, for example.
Bobbi
Vic 20 Afficionado
Posts: 355
Joined: Thu Oct 13, 2016 11:35 am
Location: Toronto
Occupation: Programmer

Re: Programming language implementation ...

Post by Bobbi »

Incidentally, didn't the APL programming language use the left arrow for assignment?

I know notations like Register Transfer Language (RTL) use left arrow this way, and I also believe that the := used in Pascal type languages is meant to look like a left arrow. I did think of using other PETSCII characters for operators, but the code gets really crazy looking really fast!
Post Reply