Class for scanning a directory for files/directories that match certain criteria.

These criteria consist of a set of include and exclude patterns. With these patterns, you can select which files you want to have included, and which files you want to have excluded.

The idea is simple. A given directory is recursively scanned for all files and directories. Each file/directory is matched against a set of include and exclude patterns. Only files/directories that match at least one pattern of the include pattern list, and don't match a pattern of the exclude pattern list will be placed in the list of files/directories found.

When no list of include patterns is supplied, "**" will be used, which means that everything will be matched. When no list of exclude patterns is supplied, an empty list is used, such that nothing will be excluded.
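The selection rule and its defaults can be sketched as follows. This is only an illustration of the logic described above, not the class's own code: `sketchIsSelected()` and the stand-in matcher `$exactOrAll` are hypothetical names, and the matcher only understands "**" and exact names.

```php
<?php
// Hypothetical helper for illustration; not part of DirectoryScanner.
// $matches is any callable taking (pattern, name) and returning a boolean.
function sketchIsSelected($name, $includes, $excludes, callable $matches)
{
    // Defaults described above: include everything, exclude nothing.
    if ($includes === null) {
        $includes = array('**');
    }
    if ($excludes === null) {
        $excludes = array();
    }

    // The name must match at least one include pattern ...
    $included = false;
    foreach ($includes as $pattern) {
        if ($matches($pattern, $name)) {
            $included = true;
            break;
        }
    }
    if (!$included) {
        return false;
    }

    // ... and must not match any exclude pattern.
    foreach ($excludes as $pattern) {
        if ($matches($pattern, $name)) {
            return false;
        }
    }
    return true;
}

// Stand-in matcher (assumption): "**" matches anything, otherwise exact comparison.
$exactOrAll = function ($pattern, $name) {
    return $pattern === '**' || $pattern === $name;
};

var_dump(sketchIsSelected('a.php', null, null, $exactOrAll));           // bool(true)
var_dump(sketchIsSelected('a.php', null, array('a.php'), $exactOrAll)); // bool(false)
```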

The pattern matching is done as follows: The name to be matched is split into path segments. A path segment is the name of a directory or file, bounded by DIRECTORY_SEPARATOR ('/' under UNIX, '\' under Windows). E.g. "abc/def/ghi/xyz.php" is split into the segments "abc", "def", "ghi" and "xyz.php". The same is done for the pattern to match against.

Then the segments of the name and the pattern will be matched against each other. When '**' is used for a path segment in the pattern, then it matches zero or more path segments of the name.

There is a special case regarding the use of DIRECTORY_SEPARATOR at the beginning of the pattern and the string to match: When a pattern starts with a DIRECTORY_SEPARATOR, the string to match must also start with a DIRECTORY_SEPARATOR. When a pattern does not start with a DIRECTORY_SEPARATOR, the string to match must not start with one either. When either rule is violated, the string does not match.
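A minimal sketch of segment-wise path matching, covering "**" and the leading-separator rule, might look like this. It is not the actual matchPath() code. The function names are hypothetical, the separator is passed in explicitly (defaulting to '/') so the example behaves the same on every platform, and PHP's built-in fnmatch() stands in for the class's own segment matcher. Note that fnmatch() also interprets '[...]' brackets, which the class description does not mention.

```php
<?php
// Illustrative sketch only; not the actual DirectoryScanner::matchPath() code.
function sketchMatchPath($pattern, $path, $sep = '/')
{
    // Leading-separator rule: both start with the separator, or neither does.
    $patternAbs = strncmp($pattern, $sep, strlen($sep)) === 0;
    $pathAbs = strncmp($path, $sep, strlen($sep)) === 0;
    if ($patternAbs !== $pathAbs) {
        return false;
    }
    return sketchMatchSegments(
        sketchSplit($pattern, $sep), 0,
        sketchSplit($path, $sep), 0
    );
}

// Splits on the separator and drops empty segments.
function sketchSplit($s, $sep)
{
    $parts = array_filter(explode($sep, $s), function ($x) {
        return $x !== '';
    });
    return array_values($parts);
}

function sketchMatchSegments(array $pat, $pi, array $name, $ni)
{
    if ($pi === count($pat)) {
        // Pattern used up: match only if the name is used up as well.
        return $ni === count($name);
    }
    if ($pat[$pi] === '**') {
        // "**" may consume zero or more name segments; try each possibility.
        for ($k = $ni; $k <= count($name); $k++) {
            if (sketchMatchSegments($pat, $pi + 1, $name, $k)) {
                return true;
            }
        }
        return false;
    }
    if ($ni === count($name)) {
        return false;
    }
    // fnmatch() is a stand-in (assumption) for the segment matcher.
    if (!fnmatch($pat[$pi], $name[$ni], FNM_NOESCAPE)) {
        return false;
    }
    return sketchMatchSegments($pat, $pi + 1, $name, $ni + 1);
}

var_dump(sketchMatchPath('**/*.php', 'abc/def/ghi/xyz.php')); // bool(true)
var_dump(sketchMatchPath('/abc', 'abc'));                     // bool(false)
```

The recursion is the simplest way to express "zero or more segments"; it favours readability over speed.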

When a name path segment is matched against a pattern path segment, the following special characters can be used: '*' matches zero or more characters, '?' matches one character.
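Matching a single segment against '*' and '?' can be done with plain string indexing, in the spirit of the regex-free approach this class describes. The sketch below is an assumption about how such a matcher could look, not the class's match() implementation: `sketchMatchSegment()` is a hypothetical name, and case-insensitive matching is approximated with strtolower().

```php
<?php
// Illustrative sketch only; not the actual DirectoryScanner::match() code.
function sketchMatchSegment($pattern, $str, $caseSensitive = true)
{
    if (!$caseSensitive) {
        // Assumption: lower-casing both sides is enough for this example.
        $pattern = strtolower($pattern);
        $str = strtolower($str);
    }
    $pLen = strlen($pattern);
    $sLen = strlen($str);
    $p = 0;          // position in the pattern
    $s = 0;          // position in the string
    $starP = -1;     // position of the most recent '*' in the pattern
    $starS = 0;      // string position where that '*' started matching

    while ($s < $sLen) {
        if ($p < $pLen && $pattern[$p] === '*') {
            // Remember the '*' and first try to let it match nothing.
            $starP = $p;
            $starS = $s;
            $p++;
        } elseif ($p < $pLen && ($pattern[$p] === '?' || $pattern[$p] === $str[$s])) {
            // '?' matches any one character; otherwise characters must be equal.
            $p++;
            $s++;
        } elseif ($starP !== -1) {
            // Mismatch: let the last '*' absorb one more character and retry.
            $p = $starP + 1;
            $starS++;
            $s = $starS;
        } else {
            return false;
        }
    }
    // Any remaining pattern characters must all be '*'.
    while ($p < $pLen && $pattern[$p] === '*') {
        $p++;
    }
    return $p === $pLen;
}

var_dump(sketchMatchSegment('a??.php', 'abc.php'));        // bool(true)
var_dump(sketchMatchSegment('*.php', 'index.PHP'));        // bool(false)
var_dump(sketchMatchSegment('*.php', 'index.PHP', false)); // bool(true)
```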

Examples:

"**\*.php" matches all .php files/dirs in a directory tree.

"test\a??.php" matches all files/dirs which start with an 'a', then two more characters and then ".php", in a directory called test.

"**" matches everything in a directory tree.

"**\test\**\XYZ*" matches all files/dirs that start with "XYZ" and where there is a parent directory called test (e.g. "abc\test\def\ghi\XYZ123").

Case sensitivity may be turned off if necessary. By default, it is turned on.

Example of usage:

    $ds = new DirectoryScanner();
    $includes = array("**\*.php");
    $excludes = array("modules\*\**");
    $ds->setIncludes($includes);
    $ds->setExcludes($excludes);
    $ds->setBasedir("test");
    $ds->setCaseSensitive(true);
    $ds->scan();

    print("FILES:");
    $files = $ds->getIncludedFiles();
    for ($i = 0; $i < count($files); $i++) {
        print($files[$i] . "\n");
    }

This scans a directory called test for .php files, but excludes all .php files in all directories under a directory called "modules".

This class is a complete, preg/ereg-free port of the Java class org.apache.tools.ant.DirectoryScanner. Even functions that use preg/ereg internally (like split()) are not used. Only the fast string functions and comparison operators (===, !==, etc.) are used for matching and tokenizing.

author Arnout J. Kuiper, ajkuiper@wxs.nl
author Magesh Umasankar, umagesh@rediffmail.com
author Andreas Aderhold, andi@binarycloud.com
version $Id: e092ad3bc1b2a28320f23b721bea34a6c89719c4 $
package phing.util

 Methods

Adds the array with default exclusions to the current exclusions set.

addDefaultExcludes() 

Gets the basedir that is used for scanning.

getBasedir() : string

This is the directory that is scanned recursively.

Returns

string the basedir that is used for scanning

Returns the names of the directories which were selected out and therefore not ultimately included.

getDeselectedDirectories() : array

The names are relative to the base directory. This involves performing a slow scan if one has not already been completed.

see slowScan()

Returns

array the names of the directories which were deselected.

Returns the names of the files which were selected out and therefore not ultimately included.

getDeselectedFiles() : array

The names are relative to the base directory. This involves performing a slow scan if one has not already been completed.

see slowScan()

Returns

array the names of the files which were deselected.

Get the names of the directories that matched at least one of the include patterns, and also matched at least one of the exclude patterns.

getExcludedDirectories() : array

The names are relative to the basedir.

Returns

array the names of the directories

Get the names of the files that matched at least one of the include patterns, and also matched at least one of the exclude patterns.

getExcludedFiles() : array

The names are relative to the basedir.

Returns

array the names of the files

Get the names of the directories that matched at least one of the include patterns, and matched none of the exclude patterns.

getIncludedDirectories() : array

The names are relative to the basedir.

Returns

array the names of the directories

Get the names of the files that matched at least one of the include patterns, and matched none of the exclude patterns.

getIncludedFiles() : array

The names are relative to the basedir.

Returns

array the names of the files

Get the names of the directories that matched none of the include patterns.

getNotIncludedDirectories() : array

The names are relative to the basedir.

Returns

array the names of the directories

Get the names of the files that matched none of the include patterns.

getNotIncludedFiles() : array

The names are relative to the basedir.

Returns

array the names of the files

Returns whether or not the scanner has included all the files or directories it has come across so far.

isEverythingIncluded() : boolean

Returns

boolean true if all files and directories which have been found so far have been included.

Lists the contents of a given directory and returns an array with the entries.

listDir(string $_dir) : array
access public
author Albert Lash, alash@plateauinnovation.com

Parameters

$_dir

string

The path of the directory to list.

Returns

array directory entries

Matches a string against a pattern.

match(string $pattern, string $str, $isCaseSensitive) : boolean

The pattern contains two special characters: '*' which means zero or more characters, '?' which means one and only one character.

access public

Parameters

$pattern

string

the (non-null) pattern to match against

$str

string

the (non-null) string that must be matched against the pattern

$isCaseSensitive

Returns

boolean true when the string matches against the pattern, false otherwise.

Matches a path against a pattern.

matchPath(string $pattern, string $str, boolean $isCaseSensitive) : boolean

Static

Parameters

$pattern

string

the (non-null) pattern to match against

$str

string

the (non-null) string (path) to match

$isCaseSensitive

boolean

whether the match must be case sensitive

Returns

boolean true when the pattern matches against the string, false otherwise.

Tests whether the path matches the start of the pattern, up to the first "**".

matchPatternStart(string $pattern, string $str, boolean $isCaseSensitive) : boolean

This is a static method and should always be called statically.

This is not a general purpose test and should only be used if you can live with false positives.

pattern=**\a and str=b will yield true.

Parameters

$pattern

string

the (non-null) pattern to match against

$str

string

the (non-null) string (path) to match

$isCaseSensitive

boolean

whether matches must be case sensitive

Returns

boolean true if matches, otherwise false
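The false positives of matchPatternStart() follow from stopping at the first "**". A rough sketch of that idea is shown below. It is not the actual implementation. `sketchMatchPatternStart()` is a hypothetical name, fnmatch() stands in for the segment matcher, and the two "used up" rules at the end are assumptions about sensible behaviour, not statements from this documentation. The leading-separator check is omitted for brevity.

```php
<?php
// Illustrative sketch only; not the actual matchPatternStart() implementation.
function sketchMatchPatternStart($pattern, $path, $sep = '/')
{
    // Splits on the separator and drops empty segments.
    $split = function ($s) use ($sep) {
        $parts = array_filter(explode($sep, $s), function ($x) {
            return $x !== '';
        });
        return array_values($parts);
    };
    $pat = $split($pattern);
    $segs = $split($path);

    $pi = 0;
    $si = 0;
    while ($pi < count($pat) && $si < count($segs)) {
        if ($pat[$pi] === '**') {
            break; // nothing after the first "**" is inspected
        }
        // fnmatch() is a stand-in (assumption) for the segment matcher.
        if (!fnmatch($pat[$pi], $segs[$si], FNM_NOESCAPE)) {
            return false;
        }
        $pi++;
        $si++;
    }
    if ($si >= count($segs)) {
        return true;  // assumption: path used up, deeper entries could still match
    }
    if ($pi >= count($pat)) {
        return false; // assumption: pattern used up, but the path has more segments
    }
    return true;      // stopped at "**"
}

var_dump(sketchMatchPatternStart('**/a', 'b'));           // bool(true), a false positive
var_dump(sketchMatchPatternStart('src/**/*.php', 'lib')); // bool(false)
```

A test like this is cheap enough to decide whether a directory is worth descending into, which is why false positives are tolerable.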

Scans the base directory for files that match at least one include pattern, and don't match any exclude patterns.

scan() 

Sets the basedir for scanning.

setBasedir(string $_basedir)

This is the directory that is scanned recursively. All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR.

Parameters

$_basedir

string

the (non-null) basedir for scanning

Sets the case sensitivity of the file system.

setCaseSensitive(boolean $_isCaseSensitive)

Parameters

$_isCaseSensitive

boolean

specifies if the filesystem is case sensitive

Sets the set of exclude patterns to use.

setExcludes(array $_excludes)

All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR, so the separator used need not match DIRECTORY_SEPARATOR.

When a pattern ends with a '/' or '\', "**" is appended.

Parameters

$_excludes

array

list of exclude patterns

Sets the set of include patterns to use.

setIncludes(array $_includes)

All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR, so the separator used need not match DIRECTORY_SEPARATOR.

When a pattern ends with a '/' or '\', "**" is appended.

Parameters

$_includes

array

list of include patterns
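The normalization that setIncludes() and setExcludes() describe can be sketched as below. This is an illustration under assumptions, not the class's code: `sketchNormalizePattern()` is a hypothetical name, and the separator is a parameter so the example gives the same result on every platform.

```php
<?php
// Illustrative sketch only; not the actual setIncludes()/setExcludes() code.
function sketchNormalizePattern($pattern, $sep = DIRECTORY_SEPARATOR)
{
    // Replace both '/' and '\' with the target separator.
    $pattern = str_replace(array('/', '\\'), $sep, $pattern);

    // A trailing separator means "everything below": append "**".
    if ($pattern !== '' && substr($pattern, -strlen($sep)) === $sep) {
        $pattern .= '**';
    }
    return $pattern;
}

var_dump(sketchNormalizePattern('modules\\sub/', '/')); // string(14) "modules/sub/**"
```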

Sets the selectors that will select the filelist.

setSelectors(array $selectors)

Parameters

$selectors

array

specifies the selectors to be invoked on a scan

Tests whether a name matches the start of at least one include pattern.

couldHoldIncluded(string $_name) : boolean

Parameters

$_name

string

the name to match

Returns

boolean true when the name matches against the start of at least one include pattern, false otherwise.

Tests whether a name matches against at least one exclude pattern.

isExcluded(string $_name) : boolean

Parameters

$_name

string

the name to match

Returns

boolean true when the name matches against at least one exclude pattern, false otherwise.

Tests whether a name matches against at least one include pattern.

isIncluded(string $_name) : boolean

Parameters

$_name

string

the name to match

Returns

boolean true when the name matches against at least one include pattern, false otherwise.

Tests whether a name should be selected.

isSelected(string $name, string $file) : boolean

Parameters

$name

string

The filename to check for selecting.

$file

string

The full file path.

Returns

boolean false when the selectors say that the file should not be selected, true otherwise.

Toplevel invocation for the scan.

slowScan() 

Returns immediately if a slow scan has already been requested.

Scans the passed dir for files and directories.

scandir(string $_rootdir, string $_vpath, $_fast)

Found files and directories are placed in their respective collections, based on the matching of includes and excludes. When a directory is found, it is scanned recursively.

access private
see $filesIncluded
see $filesNotIncluded
see $filesExcluded
see $dirsIncluded
see $dirsNotIncluded
see $dirsExcluded

Parameters

$_rootdir

string

the directory to scan

$_vpath

string

the path relative to the basedir (needed to prevent problems with an absolute path when using dir)

$_fast

 Properties

$DEFAULTEXCLUDES

$basedir

$dirsDeselected

$dirsExcluded

$dirsIncluded

$dirsNotIncluded

$everythingIncluded

$excludes

$filesDeselected

$filesExcluded

Trie object.

$filesIncluded

$filesNotIncluded

Trie

$haveSlowResults

$includes

$isCaseSensitive

$selectors