[gull] character set on MAC OS X
Daniel Cordey
dc at mjt.ch
Fri Feb 22 14:37:07 CET 2008
On Friday 22 February 2008, Félix Hauri wrote:
> Have you tried mounting it by hand, with ``-t hfs''?
Only the file and directory names are giving me trouble. The rest mounts
fine as ISO. But you should know that Apple also has some questionable
tendencies, such as
(http://en.wikipedia.org/wiki/UTF-8#Mac_OS_X):
"...The Mac OS X Operating System uses canonically decomposed Unicode, encoded
using UTF-8 for file names in the filesystem. This is sometimes referred to
as UTF-8-MAC. In canonically decomposed Unicode, the use of precomposed
characters is forbidden and combining diacritics must be used to replace
them.
A common argument is that this makes sorting far simpler, but this argument is
easily refuted: for one, sorting is language dependent (in German, the ä
character sorts just after the a character, while in Scandinavian languages ä
sorts after z). Therefore, it can be confusing for software built around the
assumption that precomposed characters are the norm and combining diacritics
are only used to form unusual combinations. This is an example of the NFD
variant of Unicode normalization—most other platforms, including Windows and
Linux, use the NFC form of Unicode normalization, which is also used by W3C
standards, so NFD data must typically be converted to NFC for use on other
platforms or the Web..."
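
To make the NFC/NFD difference in the quote above concrete, here is a minimal Python sketch. It assumes nothing beyond the standard `unicodedata` module; the letter "ä" is just an illustrative choice and the variable names are mine, not from either quoted source.

```python
import unicodedata

# "ä" as a single precomposed code point (the NFC form most platforms use).
nfc_name = "\u00e4"

# The same letter decomposed into "a" + combining diaeresis (the NFD form
# that the quote says Mac OS X uses for file names).
nfd_name = unicodedata.normalize("NFD", nfc_name)

# The two forms look identical on screen but differ as UTF-8 byte strings,
# which is why a byte-wise file name comparison fails.
nfc_bytes = nfc_name.encode("utf-8")
nfd_bytes = nfd_name.encode("utf-8")

# Normalizing the NFD form back to NFC recovers the precomposed character.
same_after_nfc = unicodedata.normalize("NFC", nfd_name) == nfc_name

print(len(nfc_name), nfc_bytes)
print(len(nfd_name), nfd_bytes)
print(same_after_nfc)
```

The point is only that both strings are valid UTF-8 for the same visible name, so a tool has to normalize before comparing or renaming.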
As well as:
http://www.j3e.de/linux/convmv/man/
"...Apple has modified UFS in a way, which makes it impossible to create
filenames in UTF-8 NFC, they will always be NFD. Also creating filenames in
other (non UTF-8) encodings is not possible. This hacks on UFS makes Darwin a
real crappy Unix..."
Great!!!
I tried lots of things with *MAC*, *adobe*, nothing worked. Naturally, the
author of the CD doesn't understand my question and tells me he did all of
this with 'Adobe Lightroom', etc.
In the end, I managed to use the following command:
convmv -f MacRoman -t utf8 --nfc /media/cdrom/*
AND that is the only form of "encoding" that is accepted. It merely
replaces a character coded '248' in Mac Roman with character 248 in UTF-8...
Yeah... :-(
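
For readers who want to see what such a conversion amounts to, here is a rough Python sketch of the idea behind `-f MacRoman -t utf8 --nfc`: treat the raw file name bytes as Mac Roman, normalize to NFC, and re-encode as UTF-8. This is an illustration only, not convmv's actual implementation; the function name and the sample file name are hypothetical. Note also that convmv, as far as I know, only prints what it would rename unless `--notest` is given.

```python
import unicodedata


def macroman_to_utf8_nfc(raw_name: bytes) -> bytes:
    """Reinterpret a Mac Roman file name as NFC-normalized UTF-8 bytes."""
    # Decode the raw bytes with Python's built-in Mac Roman codec.
    text = raw_name.decode("mac_roman")
    # Compose any base letter + combining mark pairs (NFC), then encode.
    return unicodedata.normalize("NFC", text).encode("utf-8")


# Hypothetical file name: byte 0x8E is "é" in Mac Roman.
sample = b"\x8et\x8e.jpg"
converted = macroman_to_utf8_nfc(sample)
print(converted.decode("utf-8"))
```

A real rename would pass the converted bytes to something like `os.rename`, which cannot work directly on a read-only CD mount; the files would have to be copied elsewhere first.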
dc