NTFS defines the concept of a "reparse point", an optional attribute of
files and directories which requests some sort of preprocessing before
the file or directory is accessed. For instance, reparse points can be
used to redirect access to files which have been moved to long-term
storage, so that some application can retrieve them and make them
directly accessible.
A junction point is a specific reparse point which redirects access to
a directory to another directory, which can be on the same volume or on
another volume. There are two sorts of junction points: volume
junctions, which redirect directories to a whole volume (for instance
to escape the 26 drive letters limit in Windows), and directory
junctions, which redirect directories to another directory. In both
situations the redirection target is defined by an absolute path.
The similar concept of symbolic link is also available in Windows
Vista. Symbolic links can redirect to a file or a directory defined by
an absolute or a relative path. When defined on a remote file system,
they are processed on the local system, whereas directory junctions are
processed on the file server, which makes a difference when the target
is not accessible by the file server. The symbolic links in Vista are
different from the Interix symbolic links created by ntfs-3g, which are
also interoperable with Windows XP and Vista.
Junction points were available in Windows 2000 and Windows XP, but they
were not widely used until Windows Vista used directory junctions to
redirect access to legacy directories
(such as \Documents and Settings), in order to avoid breaking older
software accessing directories for which Vista defines a new location.
Symbolic links are new to Vista and are used in paths
(such as \Users\All Users) which were not used before Vista.
The following describes how junction points and symbolic links are made
to appear in Linux as symbolic links. Junction points and symbolic
links created by Windows can thus be dereferenced, hard linked, renamed
and deleted, but new ones cannot be created.
A directory junction, as created by Windows, always defines the full
(case-insensitive) path to the target, including a drive letter.
Examples of target definitions are:
C:\Users
c:\users (this is the same as C:\Users)
d:\
C:\Users\Tom\AppData\Local
Notes:
- Windows does not accept the character '/' as a directory separator in
the target definition,
- when creating a junction, Windows translates a relative target
definition to a full target,
- only empty directories can be made directory junctions by setting
reparse data.
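To illustrate the target format described above, here is a minimal
Python sketch which splits a directory junction target into its drive
letter and its path components. The function name
split_junction_target is hypothetical and this is not the code used by
ntfs-3g; it only assumes that the target is a full path with a drive
letter and '\' as the separator.

```python
import re


def split_junction_target(target):
    """Split a target such as 'c:\\Users\\Tom' into ('C', ['Users', 'Tom']).

    Hypothetical helper for illustration, not part of ntfs-3g.
    """
    match = re.fullmatch(r"([A-Za-z]):\\(.*)", target)
    if match is None:
        raise ValueError(f"not a full path with a drive letter: {target!r}")
    letter, rest = match.groups()
    if "/" in rest:
        # Windows does not accept '/' as a separator in the target
        raise ValueError("'/' is not a valid separator in a junction target")
    # Empty components (for instance from a trailing '\') are dropped
    components = [part for part in rest.split("\\") if part]
    return letter.upper(), components
```

A target designating the root of a volume, such as d:\, yields an empty
component list, which makes that special case easy to detect later.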
In order to translate a directory junction to a Linux symbolic link,
the following points have to be addressed:
- translate the drive letter to a mount point,
- translate the case-insensitive path to a case-sensitive one,
- and, as these are not always possible, detect and signal problems.
Translating the drive letter
The drive letter is a physical address only loosely related to the
semantics of the target. A pluggable device (such as a USB key) gets
different drive letters on different computers, and on a given computer
different devices get the same drive letter if they are plugged in turn
into the same slot.
Translating drive letters to Linux paths can probably not be done
automatically, but there are two possible ways to deal with them:
recognizing directory junctions local to a device, which can be
translated to relative paths, and relying on some user-defined mapping
of drive letters to mount points.
Checking whether the drive letter designates the current volume can be
approximated by making sure the target path designates an existing
directory in the volume. After validity checks, C:\Users can be
converted to ./Users and C:\Users\Tom\AppData\Local converted to
../AppData/Local (the relative path depends on the location of the
junction). This is subject to errors, as a similar (case-insensitive)
path meant for another volume may be found on the current volume. This
would be the case for any target defined as the root of a volume, as
there would be no directory to be checked, and it is wise to always
reject such target guesses.
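The local check described above can be sketched in Python as follows.
This is only an illustration under stated assumptions, not the ntfs-3g
implementation: the name local_relative_link is hypothetical, the
junction location is given as a path relative to the volume root (such
as /Users/Tom/Documents/Local), and the names are compared exactly,
leaving the case adjustment aside.

```python
import os
import posixpath


def local_relative_link(mount_point, junction_path, components):
    """Return a relative link to the target if it exists on this volume.

    mount_point   -- where the volume is mounted in Linux
    junction_path -- junction location relative to the volume root
    components    -- target path components, without the drive letter
    Returns None when the guess has to be rejected.
    """
    if not components:
        return None  # root of a volume: there is no directory to check
    if not os.path.isdir(os.path.join(mount_point, *components)):
        return None  # no such directory on the current volume
    target = "/" + "/".join(components)
    relative = posixpath.relpath(target, posixpath.dirname(junction_path))
    # Make the link visibly relative, as in ./Users
    return relative if relative.startswith("..") else "./" + relative
```

With a volume containing Users/Tom/AppData/Local, a junction located at
/Documents and Settings with components ['Users'] gives ./Users, and a
junction located at /Users/Tom/Documents/Local gives ../AppData/Local.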
Another option is to let the user define what a drive letter should be
mapped to in Linux. Such definitions should be located in the .NTFS-3G
directory of the current file system, as symbolic links to the matching
mount point. Then C:\Users can be converted automatically to
./.NTFS-3G/C:/Users, with C: having to be defined explicitly as a
symbolic link to some mount point.
Both methods are implemented in ntfs-3g, according to the following
rules:
- if the drive letter is not defined in /.NTFS-3G, an attempt to
interpret the junction point target as a path to an existing directory
on the same volume is first made. If such a directory is found, the
path is converted to a relative symbolic link whose name is translated
to match the directory chain exactly.
- if the drive letter is defined in /.NTFS-3G or if the attempt to find
a local directory fails (even if there is no drive letter defined in
/.NTFS-3G), the junction is translated to a relative symbolic link
referring to the possible definition. The drive letter should be
defined in upper case followed by a colon, and the path should match
the characters used in the junction point definition.
Note that .NTFS-3G is a hidden directory located at the root of the
file system containing the junction point. It may have to be replicated
if there are several NTFS file systems with junction points in them.
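The two rules above can be combined in a short Python sketch. It is a
simplified illustration rather than the actual ntfs-3g code: the name
translate_junction is hypothetical, the target is assumed to be a valid
full path with a drive letter, and the local lookup compares names
exactly instead of adjusting their case.

```python
import os
import posixpath


def translate_junction(mount_point, junction_path, target):
    """Translate a junction target to a relative symbolic link.

    junction_path is the junction location relative to the volume root,
    for instance '/Users/Tom/TomData' (an assumption of this sketch).
    """
    letter, _, rest = target.partition(":\\")
    parts = [part for part in rest.split("\\") if part]
    drive = letter.upper() + ":"
    # Is the drive letter defined by the user in /.NTFS-3G ?
    defined = os.path.lexists(os.path.join(mount_point, ".NTFS-3G", drive))
    found = bool(parts) and os.path.isdir(os.path.join(mount_point, *parts))
    if not defined and found:
        # Rule 1: the directory exists on the same volume
        destination = "/" + "/".join(parts)
    else:
        # Rule 2: refer to the (possibly missing) user definition
        destination = "/" + "/".join([".NTFS-3G", drive] + parts)
    relative = posixpath.relpath(destination, posixpath.dirname(junction_path))
    return relative if relative.startswith("..") else "./" + relative
```

In this sketch a user definition takes precedence over a directory
found locally, and a link through .NTFS-3G is generated even when
nothing is defined there, as stated in the second rule.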
Translating the case-insensitive path
The target is defined in Windows as a case-insensitive path, with
characters which may have a different case from those stored in the
directory levels, but an exact case-sensitive match is required for a
symbolic link to be valid on Linux.
This leads to examining the path and adjusting the names to those
defined in the directory levels. However, walking along a
case-insensitive path may lead to ambiguities. For instance both
c:\Users and c:\users may be present and designate different
directories. Trying to solve such an ambiguity is probably useless, as
the target is supposed to have been created by Windows according to its
own rules, and Windows would not be able to make a better guess when
faced with the same ambiguity.
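Walking along a case-insensitive path can be sketched in Python as
below. This is a sketch under assumptions, not the ntfs-3g code: the
name resolve_case is hypothetical, lower() only approximates the NTFS
case-folding rules, and when several names match, an exact match is
preferred and otherwise the first one in sorted order is arbitrarily
taken, with no further attempt to solve the ambiguity.

```python
import os


def resolve_case(volume_root, components):
    """Return the components as actually spelled on disk, or None.

    Each component is matched case-insensitively against the entries of
    the directory reached so far.
    """
    current = volume_root
    resolved = []
    for name in components:
        try:
            entries = sorted(os.listdir(current))
        except OSError:
            return None  # not a readable directory
        if name in entries:
            actual = name  # exact match, nothing to adjust
        else:
            matches = [e for e in entries if e.lower() == name.lower()]
            if not matches:
                return None  # no such name, whatever the case
            actual = matches[0]  # arbitrary choice if ambiguous
        resolved.append(actual)
        current = os.path.join(current, actual)
    # The target of a junction has to be a directory
    return resolved if os.path.isdir(current) else None
```

For a volume containing Users/Tom, the components ['USERS', 'tom'] are
adjusted to ['Users', 'Tom'].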
Because of the possible ambiguities, the translation of a
case-insensitive path is only done when searching the target on the
current volume. Only the drive letter is translated (and made upper
case) when redirecting to a definition in .NTFS-3G, and user
definitions should always match the target.
Examples
Assuming the C: volume is mounted on /Vista and
/Vista/.NTFS-3G/D:/Packages is defined as a symbolic link to
/shared/packages:
- if /Vista/Documents and Settings is a directory junction to C:\USERS
and /Vista/Users exists on the same volume, it will be seen as a
symbolic link to ./Users
- if /Vista/global is defined as a directory junction to c:\Shared and
there is no directory /Vista/shared to be found whatever the letter
case, it will be seen as a symbolic link to ./.NTFS-3G/C:/Shared
- if /Vista/Users/Tom/TomData is a directory junction to
d:\shared\TomData, it will be seen as a symbolic link to
../../.NTFS-3G/D:/shared/TomData, even if there is no such directory.
Except in the first case, a second symbolic link has to be defined to
get to the target directory.
A volume junction, as created by Windows, defines a GUID to designate a
physical drive. For example, a target definition for a volume junction
would appear as:
\\?\Volume{cb71f9d2-945f-11dd-8eac-00188b73099c}\
As the GUID is related to the physical drive (or USB port), the
relation to the semantics of the data is poorly established, much like
a drive letter. The volume junction itself can apparently not be
defined on a pluggable file system, but the target can be, allowing the
usage of the same path to mean different data when the media is
changed.
The way to make a volume junction appear like a symbolic link is also
to define the volumes as symbolic links in the predefined location
.NTFS-3G of the volume in which the junction is defined.
For instance, a volume junction in C:\Users\Tom\Data defined as
\\?\Volume{[ID]}\ meaning a USB key which mounts in /media/TomData in
Linux, will be seen in /Vista/Users/Tom/Data as a symbolic link to
../../.NTFS-3G/Volume{[ID]} which is expected to be defined as a
symbolic link to /media/TomData. Of course, plugging in another USB key
with a different label can only be done if the definition is adjusted.
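The translation of a volume junction can be sketched in Python as
follows. The name volume_junction_link and the regular expression are
assumptions made for this illustration, not taken from ntfs-3g: the
target is expected to have the \\?\Volume{GUID}\ form shown above, the
GUID is kept exactly as written, and the junction location is given as
a path relative to the volume root.

```python
import posixpath
import re

# \\?\Volume{GUID}\ with an optional trailing backslash
VOLUME_TARGET = re.compile(r"\\\\\?\\(Volume\{[0-9A-Fa-f-]+\})\\?")


def volume_junction_link(junction_path, target):
    """Return the relative link to the volume definition in .NTFS-3G."""
    match = VOLUME_TARGET.fullmatch(target)
    if match is None:
        raise ValueError(f"not a volume junction target: {target!r}")
    definition = "/.NTFS-3G/" + match.group(1)
    relative = posixpath.relpath(definition, posixpath.dirname(junction_path))
    return relative if relative.startswith("..") else "./" + relative
```

For a junction located at /Users/Tom/Data the result starts with
../../.NTFS-3G/ followed by the Volume{...} name, which the user is
expected to define as a symbolic link to the mount point.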
A symbolic link, as created by Windows, is much like a directory
junction, but unlike a directory junction it can point to a file or to
a remote network file or directory. The target may be defined as a path
relative to the symbolic link position, or as an absolute path in the
current volume or in another one. Also note that symbolic links to
files are different from symbolic links to directories, and the target
must match the definition.
If the target is defined as an absolute path, it is processed like a
directory junction:
- if no drive letter is present in the target definition, an attempt is
made to translate the path to a case-sensitive one in the current
volume,
- if a drive letter is present in the target definition and not defined
in .NTFS-3G, an attempt is also made to recognize the path in the
current volume,
- if a drive letter is present in the target definition and defined in
.NTFS-3G, the path is interpreted as if it were relative to .NTFS-3G.
The path is not translated or checked, only the drive letter is
capitalized.
In the three situations a symbolic link relative to the current
location is generated.
If the target is defined as a relative path, an attempt is made to
translate the path to a case-sensitive one. The translation fails if it
leads to a loop or leads out of the current volume. If successful, a
new symbolic link with the translated path is generated.
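The check for a relative target leading out of the current volume can
be sketched in Python as below. This is a partial illustration only:
the name translate_relative is hypothetical, the path is examined
lexically, and neither the loop detection nor the case adjustment
mentioned above is shown.

```python
import posixpath


def translate_relative(link_path, target):
    """Convert a relative Windows target to a Linux relative path.

    link_path is the symbolic link location relative to the volume root.
    Returns None if the target climbs above the root of the volume.
    """
    # Directory levels from the volume root down to the link directory
    position = [p for p in posixpath.dirname(link_path).split("/") if p]
    for part in target.split("\\"):
        if part in ("", "."):
            continue
        if part == "..":
            if not position:
                return None  # would lead out of the current volume
            position.pop()
        else:
            position.append(part)
    return target.replace("\\", "/")
```

A symbolic link located at /Users/All Users with the target
..\ProgramData is accepted and becomes ../ProgramData, whereas the
target ..\..\Other is rejected.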