Filename Character Filter

Basic and Machine Language

chysn
Vic 20 Scientist
Posts: 1205
Joined: Tue Oct 22, 2019 12:36 pm
Website: http://www.beigemaze.com
Location: Michigan, USA
Occupation: Software Dev Manager

Post by chysn »

If I'm building a filename input box that validates filenames in real time, which characters should I absolutely not allow, to maintain compatibility with original 154x drives, SD2IEC, and VICE file systems? I've found that filenames with * don't get written in VICE, so that's an obvious BEQ FORGETABOUTIT. I'm also filtering out / because I don't want to deal with directories on more modern systems.

Anything else that just shouldn't be permitted?
srowe
Vic 20 Scientist
Posts: 1340
Joined: Mon Jun 16, 2014 3:19 pm

Post by srowe »

chysn wrote: Sun Dec 31, 2023 11:08 am Anything else that just shouldn't be permitted?
Definitely commas, and probably colons.
Mike
Herr VC
Posts: 4841
Joined: Wed Dec 01, 2004 1:57 pm
Location: Munich, Germany
Occupation: electrical engineer

Post by Mike »

Good question.

The file requesters for load and save picture in MINIPAINT are somewhat strict: besides all digits, lower- and uppercase letters, they only allow for the following 15 symbols:

! % ' ( ) + - . / ; < > [ ] ^

... and space. This list is a common subset of PETSCII and 7-bit ASCII that disc drives cannot mistake for commands, or parts thereof. The text display in MP is actually based on ASCII, but the filenames are sent to the drives as PETSCII (where the lower case letters span the codes 65..90 and the upper case letters span the codes 193..218). I left in space and "/" because they are quite common in CBM DOS file names; you might want to exclude these two as well.
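
The case mapping described above can be sketched on a host machine as follows. This is a rough illustration in Python, not MINIPAINT's code: `ascii_to_petscii` is a hypothetical helper, and it assumes every non-letter in the name already has the same code in both character sets, which is what the symbol list is meant to guarantee.

```python
def ascii_to_petscii(name: str) -> bytes:
    """Map an ASCII filename to PETSCII codes using the ranges quoted above."""
    out = bytearray()
    for ch in name:
        code = ord(ch)
        if "a" <= ch <= "z":
            code = code - ord("a") + 65     # lower case -> 65..90
        elif "A" <= ch <= "Z":
            code = code - ord("A") + 193    # upper case -> 193..218
        # Assumption: every other permitted character keeps its ASCII code.
        out.append(code)
    return bytes(out)
```

The function does no filtering of its own, so it is only meaningful for names that have already passed a character check.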

The "load picture" requester additionally allows for "?" and "*" as wild cards.
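
Mike's allowlist, including the wild-card exception for loading, might look like this when written out. The names `MP_ALLOWED`, `LOAD_WILDCARDS`, and `is_allowed` are made up for this sketch, and the check works on ASCII characters, as the MP text display is said to.

```python
import string

# The 15 symbols quoted in the post, plus space, digits, and both letter cases.
MP_SYMBOLS = "!%'()+-./;<>[]^"
MP_ALLOWED = frozenset(string.ascii_letters + string.digits + MP_SYMBOLS + " ")

# Only the "load picture" requester accepts these two wild cards.
LOAD_WILDCARDS = frozenset("?*")

def is_allowed(ch: str, for_load: bool = False) -> bool:
    """True if a single typed character passes the requester's filter."""
    return ch in MP_ALLOWED or (for_load and ch in LOAD_WILDCARDS)
```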

Post by chysn »

All right, thanks! Looks like the best strategy is to plow through a disallowed character table for each keypress.

Since I'm shooting for interoperability with other systems, I'm also disallowing < and > (apparently forbidden by Windows, maybe as redirection operators?). I'm disallowing / but not space, since every modern file system is fine with spaces.
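
The per-keypress table scan could be sketched like this. It only covers the characters named as problematic up to this point in the thread, and `DISALLOWED` and `accept_keypress` are hypothetical names, not anything from chysn's actual code (which would be 6502 assembly).

```python
# Characters named as problematic so far: * and / (first post),
# comma and colon (srowe), < and > (Windows interoperability).
DISALLOWED = frozenset("*/,:<>")

def accept_keypress(buffer: str, ch: str) -> str:
    """Return the input buffer after one keypress, dropping rejected characters."""
    if ch in DISALLOWED:
        return buffer           # the "BEQ FORGETABOUTIT" branch
    return buffer + ch
```

A denylist like this accepts anything not listed, so every newly discovered problem character has to be added to the table by hand.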

Why would we disallow @, =, and # from Commodore filenames?

Post by Mike »

chysn wrote: Why would we disallow @, =, and # from Commodore filenames?
I disallow "@" so that users don't type it thinking they need it for a save-with-replace (in any case, MP issues a scratch command on the filename first).

As the file name is also used with the CBM DOS command line, I disallow "=" as that is part of some commands, and I do not want the parser to choke on that one.

Finally, "#" is reserved for buffers in the drive RAM (for use with the sector read/write commands, for example), and does not work as a regular file name.
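
For a manual or an error message, the three reasons above could be kept next to the characters they apply to. This is only a sketch; `REJECT_REASONS` and `explain_rejections` are invented names, and the wording of each reason paraphrases the post.

```python
# Reasons paraphrased from the post above.
REJECT_REASONS = {
    "@": "could be mistaken for a save-with-replace prefix",
    "=": "is part of some CBM DOS commands",
    "#": "is reserved for buffers in drive RAM",
}

def explain_rejections(name: str) -> list:
    """List (character, reason) pairs for each rejected character in name."""
    return [(ch, REJECT_REASONS[ch]) for ch in name if ch in REJECT_REASONS]
```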

Post by chysn »

Gotcha. Well, I switched the code over to an allowlist, which makes things easier when writing the manual. I'm using your list, Mike, except I'm not including < and > (for whatever reasons Windows has; I don't have any Windows machines around to test with), nor the forward slash.

Not that you allow it, but I'm leaving out the GBP sign because (1) it's the stupidest waste of a key ever, and (2) its code, $5C, is the backslash escape character in Linux, which is just asking for trouble.

I am going to add the left-arrow, which corresponds to underscore in other systems and is commonly used in filenames. I'm actually changing left-arrow to an ersatz underscore in the interface, because that's how it'll show in other filesystems.
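
Written out, the resulting allowlist would be roughly the following. This is a reconstruction from the description in this post, not the actual code: the names are hypothetical, and it assumes letters of both cases, digits, and space are kept as in Mike's list. PETSCII left-arrow has code $5F, the same code as the ASCII underscore, so on the host side it is represented by "_".

```python
import string

MIKE_SYMBOLS = "!%'()+-./;<>[]^"

# Drop <, > and / as described; add underscore, standing in for left-arrow.
FINAL_SYMBOLS = "".join(c for c in MIKE_SYMBOLS if c not in "<>/") + "_"
ALLOWED = frozenset(string.ascii_letters + string.digits + FINAL_SYMBOLS + " ")

def is_valid_name(name: str) -> bool:
    """True if name is non-empty and every character is on the allowlist."""
    return len(name) > 0 and all(ch in ALLOWED for ch in name)
```

The GBP sign never needs an explicit exclusion here, since an allowlist rejects anything that was not deliberately added.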

Thank you both!