util/DirectoryScanner.php

Class for scanning a directory for files/directories that match certain criteria.

These criteria consist of a set of include and exclude patterns. With these patterns, you can select which files you want to have included, and which files you want to have excluded.

The idea is simple. A given directory is recursively scanned for all files and directories. Each file/directory is matched against a set of include and exclude patterns. Only files/directories that match at least one include pattern and match no exclude pattern are placed in the list of files/directories found.

When no list of include patterns is supplied, "**" will be used, which means that everything will be matched. When no list of exclude patterns is supplied, an empty list is used, such that nothing will be excluded.
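The selection rule and its defaults can be sketched as follows. This is a minimal illustration, not the class's actual code: isSelectedByPatterns() and $demoMatcher are hypothetical names, and the matcher is passed in as a callable so the rule can be shown on its own.

```php
<?php
// Hypothetical helper: decides whether $name ends up in the result list.
// $matches is any callable (pattern, name) => bool; a real matcher would
// implement the segment rules described in this document.
function isSelectedByPatterns(string $name, array $includes, array $excludes, callable $matches): bool
{
    // No include patterns supplied: fall back to "**", i.e. match everything.
    if (count($includes) === 0) {
        $includes = array('**');
    }

    $included = false;
    foreach ($includes as $pattern) {
        if ($matches($pattern, $name)) {
            $included = true;
            break;
        }
    }
    if (!$included) {
        return false;
    }

    // An empty exclude list excludes nothing.
    foreach ($excludes as $pattern) {
        if ($matches($pattern, $name)) {
            return false;
        }
    }
    return true;
}

// Stand-in matcher for the demo only: "**" matches everything,
// anything else must be equal to the name.
$demoMatcher = function (string $pattern, string $name): bool {
    return $pattern === '**' || $pattern === $name;
};
```

With empty include and exclude lists every name is selected; adding the name itself to the exclude list removes it again.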

The pattern matching is done as follows: The name to be matched is split into path segments. A path segment is the name of a directory or file, which is bounded by DIRECTORY_SEPARATOR ('/' under UNIX, '\' under Windows). E.g. "abc/def/ghi/xyz.php" is split into the segments "abc", "def", "ghi" and "xyz.php". The same is done for the pattern to match against.

Then the segments of the name and the pattern will be matched against each other. When '**' is used for a path segment in the pattern, then it matches zero or more path segments of the name.
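The way '**' can absorb zero or more path segments can be sketched as below. This is an illustration under stated assumptions, not the class's implementation: matchSegments() and matchLiteralPath() are hypothetical names, the separator is a parameter defaulting to '/' so the example behaves the same on every platform, and ordinary segments are compared literally (no '*' or '?').

```php
<?php
// Hypothetical sketch: match name segments against pattern segments, where a
// "**" pattern segment stands for zero or more name segments.
// Simplification: all other segments are compared literally.
function matchSegments(array $pattern, array $name): bool
{
    if (count($pattern) === 0) {
        // Pattern exhausted: match only if the name is exhausted too.
        return count($name) === 0;
    }

    $head = $pattern[0];
    $rest = array_slice($pattern, 1);

    if ($head === '**') {
        // Try letting "**" absorb 0, 1, 2, ... leading name segments.
        for ($skip = 0; $skip <= count($name); $skip++) {
            if (matchSegments($rest, array_slice($name, $skip))) {
                return true;
            }
        }
        return false;
    }

    if (count($name) === 0 || $head !== $name[0]) {
        return false;
    }
    return matchSegments($rest, array_slice($name, 1));
}

// Splits both arguments on $separator and compares them segment by segment.
function matchLiteralPath(string $pattern, string $path, string $separator = '/'): bool
{
    return matchSegments(explode($separator, $pattern), explode($separator, $path));
}
```

For example, "abc/**/xyz.php" matches both "abc/def/ghi/xyz.php" and "abc/xyz.php", because '**' may also stand for no segment at all.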

There is a special case regarding the use of DIRECTORY_SEPARATOR at the beginning of the pattern and the string to match: When a pattern starts with a DIRECTORY_SEPARATOR, the string to match must also start with a DIRECTORY_SEPARATOR. When a pattern does not start with a DIRECTORY_SEPARATOR, the string to match must not start with one either. When one of these rules is not obeyed, the string will not match.
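The leading-separator rule amounts to a check that can run before any segment is compared. A minimal sketch, assuming a hypothetical helper name leadingSeparatorsAgree():

```php
<?php
// Hypothetical pre-check for the leading-separator rule: the pattern and the
// string must either both start with the separator or both not start with it.
function leadingSeparatorsAgree(string $pattern, string $str, string $separator = DIRECTORY_SEPARATOR): bool
{
    $length = strlen($separator);
    $patternIsRooted = strncmp($pattern, $separator, $length) === 0;
    $strIsRooted = strncmp($str, $separator, $length) === 0;
    return $patternIsRooted === $strIsRooted;
}
```

If this check fails, the match can be rejected immediately without tokenizing either argument.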

When a name path segment is matched against a pattern path segment, the following special characters can be used: '*' matches zero or more characters, '?' matches one character.
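Matching a single segment with '*' and '?' can be done with plain string indexing and backtracking to the most recent '*'. The function below is a sketch of that idea; matchSegmentWildcard() is a hypothetical name and the code is not taken from the class itself.

```php
<?php
// Hypothetical sketch of single-segment matching: '*' matches zero or more
// characters, '?' matches exactly one. Only string indexing and comparison
// operators are used.
function matchSegmentWildcard(string $pattern, string $str, bool $caseSensitive = true): bool
{
    if (!$caseSensitive) {
        $pattern = strtolower($pattern);
        $str = strtolower($str);
    }

    $p = 0;        // current position in the pattern
    $s = 0;        // current position in the string
    $starP = -1;   // position of the last '*' seen in the pattern
    $starS = 0;    // string position that '*' has absorbed up to
    $pLen = strlen($pattern);
    $sLen = strlen($str);

    while ($s < $sLen) {
        if ($p < $pLen && $pattern[$p] === '*') {
            // Remember the star; first assume it absorbs nothing.
            $starP = $p;
            $starS = $s;
            $p++;
        } elseif ($p < $pLen && ($pattern[$p] === '?' || $pattern[$p] === $str[$s])) {
            $p++;
            $s++;
        } elseif ($starP !== -1) {
            // Backtrack: let the last '*' absorb one more character.
            $p = $starP + 1;
            $starS++;
            $s = $starS;
        } else {
            return false;
        }
    }

    // Any pattern characters left over must all be '*'.
    while ($p < $pLen && $pattern[$p] === '*') {
        $p++;
    }
    return $p === $pLen;
}
```

The third argument mirrors the case-sensitivity switch described in this document by lower-casing both inputs before comparing.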

Examples:

"**\*.php" matches all .php files/dirs in a directory tree.

"test\a??.php" matches all files/dirs which start with an 'a', then two more characters and then ".php", in a directory called test.

"**" matches everything in a directory tree.

"**\test\**\XYZ*" matches all files/dirs that start with "XYZ" and where there is a parent directory called test (e.g. "abc\test\def\ghi\XYZ123").

Case sensitivity may be turned off if necessary. By default, it is turned on.

Example of usage:

    $ds = new DirectoryScanner();
    $includes = array("**\*.php");
    $excludes = array("modules\*\**");
    $ds->setIncludes($includes);
    $ds->setExcludes($excludes);
    $ds->setBasedir("test");
    $ds->setCaseSensitive(true);
    $ds->scan();

    print("FILES:\n");
    $files = $ds->getIncludedFiles();
    for ($i = 0; $i < count($files); $i++) {
        print($files[$i] . "\n");
    }

This will scan a directory called test for .php files, but exclude all .php files in all directories under a directory called "modules".

This class is a complete preg/ereg-free port of the Java class org.apache.tools.ant.DirectoryScanner. Even functions that use preg/ereg internally (like split()) are not used. Only the fast string functions and comparison operators (===, !==, etc.) are used for matching and tokenizing.
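Tokenizing a path without regular expressions needs nothing beyond strpos() and substr(). The following is a sketch of that approach, not the class's own tokenizer; tokenizePath() is a hypothetical name, and skipping empty segments is an assumption made for this example.

```php
<?php
// Hypothetical tokenizer that uses only strpos() and substr(); no regular
// expressions are involved. Empty segments (from a leading, trailing or
// doubled separator) are skipped. $separator must not be empty.
function tokenizePath(string $path, string $separator = DIRECTORY_SEPARATOR): array
{
    $segments = array();
    $start = 0;
    $sepLen = strlen($separator);

    while (($pos = strpos($path, $separator, $start)) !== false) {
        if ($pos > $start) {
            $segments[] = substr($path, $start, $pos - $start);
        }
        $start = $pos + $sepLen;
    }
    // Whatever follows the last separator is the final segment.
    if ($start < strlen($path)) {
        $segments[] = substr($path, $start);
    }
    return $segments;
}
```

Applied to "abc/def/ghi/xyz.php" with '/' as separator, this yields the four segments "abc", "def", "ghi" and "xyz.php".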

Author
Arnout J. Kuiper, ajkuiper@wxs.nl  
Author
Magesh Umasankar, umagesh@rediffmail.com  
Author
Andreas Aderhold, andi@binarycloud.com  
Package
phing.util  
Version
$Id: 7aef4b4e372e89055248ab063660dbee92a98cc3 $  

\DirectoryScanner

Package: phing\util

Parent(s)
\SelectorScanner
Children
\PearPackageScanner
Author
Arnout J. Kuiper, ajkuiper@wxs.nl  
Author
Magesh Umasankar, umagesh@rediffmail.com  
Author
Andreas Aderhold, andi@binarycloud.com  
Version
$Id: 7aef4b4e372e89055248ab063660dbee92a98cc3 $  

Properties

Property protected $DEFAULTEXCLUDES = array( "**/*~", "**/#*#", "**/.#*", "**/%*%", "**/CVS", "**/CVS/**", "**/.cvsignore", "**/SCCS", "**/SCCS/**", "**/vssver.scc", "**/.svn", "**/.svn/**", "**/._*", "**/.DS_Store", "**/.darcs", "**/.darcs/**", "**/.git", "**/.git/**", "**/.gitattributes", "**/.gitignore", "**/.gitmodules" )

Default set of excludes.

Default value: array( "**/*~", "**/#*#", "**/.#*", "**/%*%", "**/CVS", "**/CVS/**", "**/.cvsignore", "**/SCCS", "**/SCCS/**", "**/vssver.scc", "**/.svn", "**/.svn/**", "**/._*", "**/.DS_Store", "**/.darcs", "**/.darcs/**", "**/.git", "**/.git/**", "**/.gitattributes", "**/.gitignore", "**/.gitmodules" )
Details
Type
n/a
Property protected $basedir = ''

The base directory which should be scanned.

Details
Type
n/a
Property protected $dirsDeselected = ''
Details
Type
n/a
Property protected $dirsExcluded = ''

The directories that were found and matched at least one include pattern, and also matched at least one exclude pattern.

Details
Type
n/a
Property protected $dirsIncluded = ''

The directories that were found and matched at least one include pattern, and matched no exclude patterns.

Details
Type
n/a
Property protected $dirsNotIncluded = ''

The directories that were found and did not match any include pattern.

Details
Type
n/a
Property protected $everythingIncluded = 'true'

Whether there are no deselected files.

Default value: true
Details
Type
n/a
Property protected $excludes = 'null'

The patterns for the files that should be excluded.

Default value: null
Details
Type
n/a
Property protected $expandSymbolicLinks = 'false'

Whether to expand/dereference symbolic links. The default is false.

Default value: false
Details
Type
n/a
Property protected $filesDeselected = ''
Details
Type
n/a
Property protected $filesExcluded = ''

The files that were found and matched at least one include pattern, and also matched at least one exclude pattern. Trie object.

Details
Type
n/a
Property protected $filesIncluded = ''

The files that were found and matched at least one include pattern, and matched no exclude patterns.

Details
Type
n/a
Property protected $filesNotIncluded = ''

The files that were found and did not match any include pattern. Trie object.

Details
Type
n/a
Property protected $haveSlowResults = 'false'

Have the vars holding our results been built by a slow scan?

Default value: false
Details
Type
n/a
Property protected $includes = 'null'

The patterns for the files that should be included.

Default value: null
Details
Type
n/a
Property protected $isCaseSensitive = 'true'

Should the file system be treated as a case sensitive one?

Default value: true
Details
Type
n/a
Property protected $selectors = 'null'

Selectors

Default value: null
Details
Type
n/a

Methods

method public addDefaultExcludes( ) : void

Adds the array with default exclusions to the current exclusions set.
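Appending a default list to the current excludes can be sketched as follows. This is only an illustration: withDefaultExcludes() is a hypothetical free function, and adapting the separators of each default pattern to the platform is an assumption, not confirmed behaviour of addDefaultExcludes().

```php
<?php
// Hypothetical sketch of merging a default exclusion list into the current
// excludes. $current may be null when no excludes have been set yet.
function withDefaultExcludes(?array $current, array $defaults, string $separator = DIRECTORY_SEPARATOR): array
{
    $result = $current === null ? array() : $current;
    foreach ($defaults as $pattern) {
        // Assumption: defaults are written with '/' and adapted to the platform.
        $result[] = strtr($pattern, array('/' => $separator, '\\' => $separator));
    }
    return $result;
}
```

Existing exclude patterns are kept untouched and the defaults are added after them.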

method protected couldHoldIncluded( string $_name ) : boolean

Tests whether a name matches the start of at least one include pattern.

Parameters
Name Type Description
$_name string

the name to match

Returns
Type Description
boolean true when the name matches the start of at least one include pattern, false otherwise.
method public getBasedir( ) : string

Gets the basedir that is used for scanning. This is the directory that is scanned recursively.

Returns
Type Description
string the basedir that is used for scanning
method public getDeselectedDirectories( ) : array

Returns the names of the directories which were selected out and therefore not ultimately included.

The names are relative to the base directory. This involves performing a slow scan if one has not already been completed.

Returns
Type Description
array the names of the directories which were deselected.
Details
See
\#slowScan  
method public getDeselectedFiles( ) : array

Returns the names of the files which were selected out and therefore not ultimately included.

The names are relative to the base directory. This involves performing a slow scan if one has not already been completed.

Returns
Type Description
array the names of the files which were deselected.
Details
See
\#slowScan  
method public getExcludedDirectories( ) : array

Get the names of the directories that matched at least one of the include patterns, and also matched at least one of the exclude patterns.

The names are relative to the basedir.

Returns
Type Description
array the names of the directories
method public getExcludedFiles( ) : array

Get the names of the files that matched at least one of the include patterns, and also matched at least one of the exclude patterns.

The names are relative to the basedir.

Returns
Type Description
array the names of the files
method public getIncludedDirectories( ) : array

Get the names of the directories that matched at least one of the include patterns, and matched none of the exclude patterns.

The names are relative to the basedir.

Returns
Type Description
array the names of the directories
method public getIncludedFiles( ) : array

Get the names of the files that matched at least one of the include patterns, and matched none of the exclude patterns.

The names are relative to the basedir.

Returns
Type Description
array the names of the files
method public getNotIncludedDirectories( ) : array

Get the names of the directories that matched none of the include patterns.

The names are relative to the basedir.

Returns
Type Description
array the names of the directories
method public getNotIncludedFiles( ) : array

Get the names of the files that matched none of the include patterns.

The names are relative to the basedir.

Returns
Type Description
array the names of the files
method public isEverythingIncluded( ) : boolean

Returns whether or not the scanner has included all the files or directories it has come across so far.

Returns
Type Description
boolean true if all files and directories which have been found so far have been included.
method protected isExcluded( string $_name ) : boolean

Tests whether a name matches against at least one exclude pattern.

Parameters
Name Type Description
$_name string

the name to match

Returns
Type Description
boolean true when the name matches against at least one exclude pattern, false otherwise.
method protected isIncluded( string $_name ) : boolean

Tests whether a name matches against at least one include pattern.

Parameters
Name Type Description
$_name string

the name to match

Returns
Type Description
boolean true when the name matches against at least one include pattern, false otherwise.
method protected isSelected( string $name, string $file ) : boolean

Tests whether a name should be selected.

Parameters
Name Type Description
$name string

The filename to check for selecting.

$file string

The full file path.

Returns
Type Description
boolean False when the selectors say that the file should not be selected, True otherwise.
method public listDir( string $_dir ) : array

Lists the contents of a given directory and returns an array with the entries.

Parameters
Name Type Description
$_dir string

The path of the directory to list.

Returns
Type Description
array directory entries
Details
Access
public  
Author
Albert Lash, alash@plateauinnovation.com  
method public match( string $pattern, string $str, boolean $isCaseSensitive = true ) : boolean

Matches a string against a pattern. The pattern contains two special characters: '*' which means zero or more characters, '?' which means one and only one character.

Parameters
Name Type Description
$pattern string

the (non-null) pattern to match against

$str string

the (non-null) string that must be matched against the pattern

$isCaseSensitive boolean

must a case sensitive match be done?

Returns
Type Description
boolean true when the string matches against the pattern, false otherwise.
Details
Access
public  
method public matchPath( string $pattern, string $str, boolean $isCaseSensitive = true ) : boolean

Matches a path against a pattern. This is a static method.

Parameters
Name Type Description
$pattern string

the (non-null) pattern to match against

$str string

the (non-null) string (path) to match

$isCaseSensitive boolean

must a case sensitive match be done?

Returns
Type Description
boolean true when the pattern matches against the string, false otherwise.
method public matchPatternStart( string $pattern, string $str, boolean $isCaseSensitive = true ) : boolean

Tests whether the path matches the start of this pattern, up to the first "**".

This is a static method and should always be called statically.

This is not a general purpose test and should only be used if you can live with false positives.

For example, pattern=**\a and str=b will yield true.

Parameters
Name Type Description
$pattern string

the (non-null) pattern to match against

$str string

the (non-null) string (path) to match

$isCaseSensitive boolean

must matches be case sensitive?

Returns
Type Description
boolean true if matches, otherwise false
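The prefix test described for matchPatternStart, including its false positives, can be sketched as below. This is a simplified illustration rather than the method's code: matchesPatternStart() is a hypothetical name, segments are compared literally (no '*' or '?'), '/' is assumed as separator, and the leading-separator rule is not handled separately.

```php
<?php
// Hypothetical sketch: does $str match the start of $pattern, up to the first
// "**"? Simplification: segments are compared literally.
function matchesPatternStart(string $pattern, string $str, string $separator = '/'): bool
{
    $patSegs = explode($separator, $pattern);
    $strSegs = explode($separator, $str);
    $count = min(count($patSegs), count($strSegs));

    for ($i = 0; $i < $count; $i++) {
        if ($patSegs[$i] === '**') {
            // Everything after "**" is unknown at this point, so accept.
            return true;
        }
        if ($patSegs[$i] !== $strSegs[$i]) {
            return false;
        }
    }
    // All compared segments agree. If the string still has segments left,
    // the pattern is exhausted and cannot cover them.
    return count($strSegs) <= count($patSegs);
}
```

A test like this is useful for deciding whether a directory is worth descending into: "a/b" may still lead to a match for "a/b/c", while "x" never can.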
method public scan( ) : void

Scans the base directory for files that match at least one include pattern, and don't match any exclude patterns.

method private scandir( string $_rootdir, string $_vpath, $_fast ) : void

Scans the passed dir for files and directories. Found files and directories are placed in their respective collections, based on the matching of includes and excludes. When a directory is found, it is scanned recursively.

Parameters
Name Type Description
$_rootdir string

the directory to scan

$_vpath string

the path relative to the basedir (needed to prevent problems with an absolute path when using dir)

$_fast
Details
Access
private  
See
\#filesIncluded  
See
\#filesNotIncluded  
See
\#filesExcluded  
See
\#dirsIncluded  
See
\#dirsNotIncluded  
See
\#dirsExcluded  
method public setBasedir( string $_basedir ) : void

Sets the basedir for scanning. This is the directory that is scanned recursively. All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR.

Parameters
Name Type Description
$_basedir string

the (non-null) basedir for scanning

method public setCaseSensitive( boolean $_isCaseSensitive ) : void

Sets the case sensitivity of the file system.

Parameters
Name Type Description
$_isCaseSensitive boolean

specifies if the filesystem is case sensitive

method public setExcludes( array $_excludes = array() ) : void

Sets the set of exclude patterns to use. All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR. So the separator used need not match DIRECTORY_SEPARATOR.

When a pattern ends with a '/' or '\', "**" is appended.

Parameters
Name Type Description
$_excludes array

list of exclude patterns

method public setExpandSymbolicLinks( boolean $expandSymbolicLinks ) : void

Sets whether to expand/dereference symbolic links.

Parameters
Name Type Description
$expandSymbolicLinks boolean

whether to expand/dereference symbolic links

method public setIncludes( array $_includes = array() ) : void

Sets the set of include patterns to use. All '/' and '\' characters are replaced by DIRECTORY_SEPARATOR. So the separator used need not match DIRECTORY_SEPARATOR.

When a pattern ends with a '/' or '\', "**" is appended.

Parameters
Name Type Description
$_includes array

list of include patterns
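The two normalisation steps described for include and exclude patterns (separator replacement and appending "**" after a trailing separator) can be sketched like this. normalizePattern() is a hypothetical helper used for illustration only; the class performs this work inside its setters.

```php
<?php
// Hypothetical normaliser for include/exclude patterns: both '/' and '\'
// become $separator, and a pattern ending in a separator gets "**" appended.
function normalizePattern(string $pattern, string $separator = DIRECTORY_SEPARATOR): string
{
    $pattern = strtr($pattern, array('/' => $separator, '\\' => $separator));
    if ($pattern !== '' && substr($pattern, -strlen($separator)) === $separator) {
        $pattern .= '**';
    }
    return $pattern;
}
```

With '/' as separator, a pattern written as modules\ therefore becomes "modules/**", which covers everything below the modules directory.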

method public setSelectors( array $selectors ) : void

Sets the selectors that will select the filelist.

Parameters
Name Type Description
$selectors array

specifies the selectors to be invoked on a scan

method protected slowScan( ) : void

Toplevel invocation for the scan.

Returns immediately if a slow scan has already been requested.

Documentation was generated by DocBlox 0.18.1.