Pathnames

From tango.info wiki

Non-ASCII characters

The Issue

Non-ASCII characters in the pathnames (folder names and file names) of music library tracks can cause problems due to incompatibilities with:

  • Macintosh
  • Portable music players e.g. Rockbox. These characters (perhaps only Unicode ones) can prevent playback of the file and of subsequent files.
  • FTP clients and servers (where Unicode characters often appear as pairs of garbage characters)
  • Some utility programs e.g. FLAC encoder, Beyond Compare 2
  • File-sharing programs e.g. SoulSeek v157 (also fails to preserve upper-case, converting to lower-case)
  • Librarian programs e.g.

Some of these cases may be confined to characters outside the Extended ASCII set.

Workarounds

Replace non-ASCII characters with ASCII stand-ins, e.g.:

  1. accented -> non-accented
  2. left and right quotes -> upright quote
  3. character with no ASCII lookalike -> underscore

Such replacements can be performed by Mp3tag:

  1. Action Group containing a Replace action for each known character
  2. Ditto
  3. $ansi() (which reduces to Extended ASCII) followed by $regexp() to replace all remaining non-ASCII characters with a lookalike or a substitute, e.g. underscore. $ansi() uses ? as its substitute, and ? is disallowed in Windows pathnames. For Windows-compatible pathnames, handle it either in the $regexp() or together with all other pathname-invalid characters by using $validate().
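
Outside Mp3tag, the same three replacements can be sketched in Python. This is only an illustration under stated assumptions, not the Mp3tag behaviour: the function name to_ascii and the QUOTES table are hypothetical, and the sketch does not deal with pathname-invalid characters (an upright double quote, for instance, is still disallowed in Windows pathnames).

```python
import unicodedata

# Hypothetical table for replacement 2: typographic quotes -> upright quotes.
QUOTES = {
    "\u2018": "'", "\u2019": "'",   # left/right single quote
    "\u201c": '"', "\u201d": '"',   # left/right double quote
}

def to_ascii(name: str) -> str:
    """Replace non-ASCII characters with ASCII stand-ins (illustrative only)."""
    # Replacement 2: left and right quotes -> upright quote.
    name = name.translate(str.maketrans(QUOTES))
    # Replacement 1: accented -> non-accented, by decomposing each character
    # and dropping the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    stripped = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Replacement 3: whatever is still outside ASCII has no lookalike here,
    # so it becomes an underscore.
    return "".join(c if ord(c) < 128 else "_" for c in stripped)
```

For example, to_ascii("Corazón") gives "Corazon", while a character without a decomposition, such as ø, ends up as an underscore.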

Further restrictions

One might also want to avoid the need for URL-encoding when a name appears in a URL, or to avoid names in which letter case carries meaning.

  • Restrictions beyond ASCII could be:
    • prefer usage of only A-Za-z0-9_
      • or only a-z_
    • . for file extensions; otherwise undecided, probably avoid
    • ' not decided yet; if removed, how to write D'Arienzo?
    • , avoid, at least in work titles
    • []() maybe reserve for special meaning
    • space always replaced with _
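
Since several of these points are still undecided, a check is easier to sketch than a conversion. The following Python snippet is a minimal sketch that tests whether a file name already satisfies the stricter character sets. The names STRICT, STRICTER and is_restricted are hypothetical, and allowing exactly one dot before the extension is an assumption.

```python
import re

# Assumption: a stem of A-Za-z0-9_ optionally followed by one dot and an extension.
STRICT = re.compile(r"[A-Za-z0-9_]+(\.[A-Za-z0-9]+)?")
# Stricter variant: only a-z and underscore in the stem, lower-case extension.
STRICTER = re.compile(r"[a-z_]+(\.[a-z0-9]+)?")

def is_restricted(filename: str, lowercase_only: bool = False) -> bool:
    """Return True if the whole file name fits the chosen restricted set."""
    pattern = STRICTER if lowercase_only else STRICT
    return pattern.fullmatch(filename) is not None
```

With this sketch, "la_cumparsita.flac" passes both variants, whereas "La Cumparsita.flac" (space) and "D'Arienzo.mp3" (apostrophe) fail both.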

work_name_az

There can be a lot of discussion about whether a title should include characters such as ",!?." or special characters like the Spanish inverted question mark. There can also be discussion about the correct case (upper/lower). The a-z title is a title that can be derived from many differing opinions about what the correct original is. For filenames and references in databases it could be helpful to have a-z work titles.

  • remove all diacritics
  • make all characters lower case
  • replace space, comma, dot, with _
  • reduce repeating _ to single _
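
The four steps above can be sketched in Python as follows. The function name work_name_az is borrowed from the section heading. Characters not mentioned in the list (for example ¿ or !) are left untouched here, because the steps do not say how to handle them.

```python
import re
import unicodedata

def work_name_az(title: str) -> str:
    """Derive an a-z style work name from a title (illustrative sketch)."""
    # Remove all diacritics: decompose, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    name = "".join(c for c in decomposed if not unicodedata.combining(c))
    # Make all characters lower case.
    name = name.lower()
    # Replace space, comma and dot with an underscore.
    name = re.sub(r"[ ,.]", "_", name)
    # Reduce repeated underscores to a single one.
    return re.sub(r"_+", "_", name)
```

For example, both "Adiós, Buenos Aires" and "adios buenos aires" come out as "adios_buenos_aires".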
