Pathnames: Difference between revisions
Revision as of 2008-01-19T18:58:47
Non-ASCII characters
The Issue
Non-ASCII characters in the Windows pathnames (folder names and file names) of music library tracks can cause problems due to incompatibilities with:
- Macintosh
- Portable music players (where they can prevent play of the file and of subsequent files)
- FTP clients and servers (where Unicode characters often appear as garbage character pairs)
- Some utility programs e.g. FLAC encoder, Beyond Compare 2
- File-sharing programs
- Librarian programs, e.g.
  - MediaMonkey, changelog entry "4290 Fixed: Unicode characters in Custom fields cause the field to not be stored in ID3 tags": http://www.mediamonkey.com/forum/viewtopic.php?t=15384&start=15
Workaround
Replace non-ASCII characters with e.g. the nearest ASCII equivalent:
- accented char -> non-accented equivalent
- left and right quotes -> upright quote
Replacement can be performed by Mp3tag using an Action Group containing one Replace action for each character.
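For cases where a script is preferred over an Mp3tag Action Group, the replacement above can be sketched in Python. This is only an illustration, not the Mp3tag method: the function name `to_ascii` and the `QUOTE_MAP` table are hypothetical, and mapping curly double quotes to an apostrophe is an assumption made because the straight double quote is not permitted in Windows file names.

```python
import unicodedata

# Hypothetical mapping for characters that have no Unicode decomposition.
# Assumption: curly double quotes also become an apostrophe, because the
# straight double quote (") is not allowed in Windows file names.
QUOTE_MAP = str.maketrans({
    "\u2018": "'",  # left single quote
    "\u2019": "'",  # right single quote
    "\u201c": "'",  # left double quote
    "\u201d": "'",  # right double quote
})


def to_ascii(name: str) -> str:
    """Replace quotes and strip accents; other characters are left unchanged."""
    name = name.translate(QUOTE_MAP)
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", name)
    # Dropping the combining marks leaves the non-accented equivalent.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(to_ascii("La puñalada"))        # La punalada
print(to_ascii("D\u2019Arienzo"))     # D'Arienzo
```

Characters without a decomposition (for example ø) pass through unchanged, so the result can be checked with `str.isascii()` to find names that still need a manual replacement.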
Further restrictions
One might also want to avoid URL encoding in URLs, or to avoid names in which letter case carries meaning. Restrictions beyond plain ASCII could be:
- prefer only A-Za-z0-9_ (or, more strictly, only a-z_)
- . for file extensions; otherwise undecided, probably avoid
- ' not decided yet; if removed, how should D'Arienzo be written?
- , avoid, at least in work titles
- []() maybe reserve for special meaning
- space: always replace with _
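A name can be checked against these candidate restrictions with a regular expression. The sketch below is hypothetical: the function name `is_restricted_name`, the `strict` flag, and the choice to allow exactly one alphanumeric extension after a single dot are assumptions, since the list above leaves several points undecided.

```python
import re

# Assumption: a base name, optionally followed by one dot and an extension.
RELAXED = re.compile(r"[A-Za-z0-9_]+(\.[A-Za-z0-9]+)?")
# Stricter variant: base name restricted to a-z and _.
STRICT = re.compile(r"[a-z_]+(\.[a-z0-9]+)?")


def is_restricted_name(name: str, strict: bool = False) -> bool:
    """Return True if the whole name uses only the allowed characters."""
    pattern = STRICT if strict else RELAXED
    return pattern.fullmatch(name) is not None


print(is_restricted_name("la_viruta.mp3"))               # True
print(is_restricted_name("La_Viruta.mp3", strict=True))  # False
print(is_restricted_name("D'Arienzo"))                   # False
```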
work_name_az
There can be a lot of discussion about whether a title should include characters such as ",!?." or special characters like the Spanish inverted question mark (¿). There can also be discussion about the correct case (upper/lower). The a-z title is a form that can be derived from many differing opinions about the correct spelling of the original. For filenames and for references in databases it could be helpful to have a-z work titles.
- remove all diacritics
- make all characters lower case
- replace space, comma and dot with _
- reduce repeating _ to single _
Examples:
- http://eng.tango.info/work:la_maleva - disambiguation page
- http://eng.tango.info/work:la_viruta - work page - unique match
- Un jardín de ilusión -> http://eng.tango.info/work:un_jardin_de_ilusion
- La puñalada -> http://eng.tango.info/work:la_punalada
- Qué falta que me hacés! -> http://eng.tango.info/work:que_falta_que_me_haces
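The derivation steps can be sketched in Python as follows. The function name `work_name_az` is taken from the heading, but the implementation is only an illustration. Two steps are assumptions inferred from the examples rather than stated rules: characters outside a-z and _ (such as "!" or "¿") are dropped, and leading or trailing _ is trimmed.

```python
import re
import unicodedata


def work_name_az(title: str) -> str:
    """Derive an a-z work name from a title (illustrative sketch)."""
    # 1. Remove all diacritics: decompose, then drop combining marks.
    decomposed = unicodedata.normalize("NFKD", title)
    name = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # 2. Make all characters lower case.
    name = name.lower()
    # 3. Replace space, comma and dot with _.
    name = re.sub(r"[ ,.]", "_", name)
    # Assumption: drop anything that is still outside a-z and _.
    name = re.sub(r"[^a-z_]", "", name)
    # 4. Reduce repeated _ to a single _.
    name = re.sub(r"_+", "_", name)
    # Assumption: trim _ left over at either end.
    return name.strip("_")


print(work_name_az("Qué falta que me hacés!"))  # que_falta_que_me_haces
```

Because digits are also outside a-z, this sketch removes them; a variant that keeps 0-9 would only need a different character class in the assumed clean-up step.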
External references
- Most common non-ASCII 8-bit characters: http://www.microsoft.com/GLOBALDEV/Reference/sbcs/1252.mspx