App-cloc
directory a version of the file which has blank
and commented lines removed (in-line comments
persist). The name of each stripped file is the
original file name with .<ext> appended to it.
It is written to the current directory unless
--original-dir is on.
--sum-reports Input arguments are report files previously
created with the --report-file option. Makes
a cumulative set of results containing the
sum of data from the individual report files.
--processes=NUM [Available only on systems with a recent version
of the Parallel::ForkManager module. Not
available on Windows.] Sets the maximum number of
cores that cloc uses. The default value of 0
disables multiprocessing.
--unix Override the operating system autodetection
logic and run in UNIX mode. See also
--windows, --show-os.
--use-sloccount If SLOCCount is installed, use its compiled
executables c_count, java_count, pascal_count,
php_count, and xml_count instead of cloc's
counters. SLOCCount's compiled counters are
substantially faster than cloc's and may give
a performance improvement when counting projects
with large files. However, these cloc-specific
features will not be available: --diff,
--count-and-diff, --strip-comments, --unicode.
--windows Override the operating system autodetection
logic and run in Microsoft Windows mode.
See also --unix, --show-os.
${BB}Filter Options${NN}
--exclude-dir=<D1>[,D2,] Exclude the given comma separated directories
D1, D2, D3, et cetera, from being scanned. For
example --exclude-dir=.cache,test will skip
all files and subdirectories that have /.cache/
or /test/ as their parent directory.
Directories named .bzr, .cvs, .hg, .git, .svn,
and .snapshot are always excluded.
This option only works with individual directory
names so including file path separators is not
allowed. Use --fullpath and --not-match-d=<regex>
to supply a regex matching multiple subdirectories.
--exclude-ext=<ext1>[,<ext2>[...]]
Do not count files having the given file name
extensions.
--exclude-lang=<L1>[,L2,] Exclude the given comma separated languages
L1, L2, L3, et cetera, from being counted.
--exclude-list-file=<file> Ignore files and/or directories whose names
appear in <file>. <file> should have one file
name per line. Only exact matches are ignored;
relative path names will be resolved starting from
the directory where cloc is invoked.
See also --list-file.
--fullpath Modifies the behavior of --match-f, --not-match-f,
and --not-match-d to include the file's path
in the regex, not just the file's basename.
(This does not expand each file to include its
absolute path, instead it uses as much of
the path as is passed in to cloc.)
Note: --match-d always looks at the full
path and therefore is unaffected by --fullpath.
--include-lang=<L1>[,L2,] Count only the given comma separated languages
L1, L2, L3, et cetera.
--match-d=<regex> Only count files in directories matching the Perl
regex. For example
--match-d='/(src|include)/'
only counts files in directories containing
/src/ or /include/. Unlike --not-match-d,
--match-f, and --not-match-f, --match-d always
compares the fully qualified path against the
regex.
--not-match-d=<regex> Count all files except those in directories
matching the Perl regex. Only the trailing
directory name is compared, for example, when
counting in /usr/local/lib, only 'lib' is
compared to the regex.
Add --fullpath to compare parent directories to
the regex.
Do not include file path separators at the
beginning or end of the regex.
--match-f=<regex> Only count files whose basenames match the Perl
regex. For example
--match-f='^[Ww]idget'
only counts files that start with Widget or widget.
Add --fullpath to include parent directories
in the regex instead of just the basename.
--not-match-f=<regex> Count all files except those whose basenames
match the Perl regex. Add --fullpath to include
parent directories in the regex instead of just
the basename.
--skip-archive=<regex> Ignore files that end with the given Perl regular
expression. For example, if given
--skip-archive='(zip|tar(\.(gz|Z|bz2|xz|7z))?)'
the code will skip files that end with .zip,
.tar, .tar.gz, .tar.Z, .tar.bz2, .tar.xz, and
.tar.7z.
--skip-win-hidden On Windows, ignore hidden files.
${BB}Debug Options${NN}
--categorized=<file> Save names of categorized files to <file>.
--counted=<file> Save names of processed source files to <file>.
--diff-alignment=<file> Write to <file> a list of files and file pairs
showing which files were added, removed, and/or
compared during a run with --diff. This switch
forces the --diff mode on.
--explain=<lang> Print the filters used to remove comments for
language <lang> and exit. In some cases the
filters refer to Perl subroutines rather than
regular expressions. An examination of the
source code may be needed for further explanation.
--help Print this usage information and exit.
--found=<file> Save names of every file found to <file>.
--ignored=<file> Save names of ignored files and the reason they
were ignored to <file>.
--print-filter-stages Print processed source code before and after
each filter is applied.
--show-ext[=<ext>] Print information about all known (or just the
given) file extensions and exit.
--show-lang[=<lang>] Print information about all known (or just the
given) languages and exit.
}
}
}
}
} # 1}}}
sub print_language_filters { # {{{1
my ($language,) = @_;
if (!$Filters_by_Language{$language} or
!@{$Filters_by_Language{$language}}) {
warn "Unknown language: $language\n";
warn "Use --show-lang to list all defined languages.\n";
return;
}
printf "%s\n", $language;
foreach my $filter (@{$Filters_by_Language{$language}}) {
printf " filter %s", $filter->[0];
printf " %s", $filter->[1] if defined $filter->[1];
printf " %s", $filter->[2] if defined $filter->[2];
print "\n";
}
print_language_info($language, " extensions:");
} # 1}}}
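# Illustrative sketch, not part of cloc: the loop above implies that each
# entry of %Filters_by_Language is a list of array refs shaped like
# [filter_name, optional_arg1, optional_arg2]. sketch_format_filters is a
# hypothetical helper that builds the same lines but returns them instead
# of printing; the 'Demo' language and its filter entries in the usage
# note are made-up sample data, and the spacing is an assumption.
sub sketch_format_filters {
    my ($language, $rh_filters) = @_;
    my @lines = ($language);
    foreach my $filter (@{$rh_filters->{$language} || []}) {
        my $line = "    filter $filter->[0]";
        $line .= "  $filter->[1]" if defined $filter->[1];
        $line .= "  $filter->[2]" if defined $filter->[2];
        push @lines, $line;
    }
    return @lines;
}
# Possible usage, with sample data only:
#   my %sample = ('Demo' => [ ['remove_matches', '^\s*#'],
#                             ['remove_inline',  '#.*$' ] ]);
#   print "$_\n" for sketch_format_filters('Demo', \%sample);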
sub replace_git_hash_with_tarfile { # {{{1
my ($ra_arg_list,) = @_; # in: file names, directory names, and/or git commit hashes to examine
# replace git hashes in $ra_arg_list with tar files
# Diff mode and count mode behave differently:
# Diff:
# file git_hash
# Extract file from the git repo and only compare to it.
# git_hash1 git_hash2
# Get listings of all files in git_hash1 and git_hash2.
# Next, get listings of files that changed with git_hash1
# and git_hash2. Finally, make two tar files of
# git repos1 and 2 where the file listing is the union
# of changes.
# Regular count:
# Simply make a tar file of all files in the git repo.
#print "ra_arg_list 1: @{$ra_arg_list}\n";
my $hash_regex = qr/^([a-f\d]{5,40}|master|HEAD)$/;
my %replacement_arg_list = ();
# early exit if none of the inputs look like git hashes
my %git_hash = ();
my $i = 0;
foreach my $file_or_dir (@{$ra_arg_list}) {
++$i;
if (-r $file_or_dir) { # readable file or dir; not a git hash
$replacement_arg_list{$i} = $file_or_dir;
next;
} elsif ($opt_force_git or $file_or_dir =~ m/$hash_regex/) {
$git_hash{$file_or_dir} = $i;
} # else the input can't be understood; ignore for now
}
return unless %git_hash;
my $have_tar_git = external_utility_exists($ON_WINDOWS ? "unzip" : "tar --version") &&
external_utility_exists("git --version");
if (!$have_tar_git) {
warn "One or more inputs looks like a git hash but " .
"either git or tar is unavailable.\n";
return;
}
my %repo_listing = (); # $repo_listing{hash}{files} = 1;
foreach my $hash (sort keys %git_hash) {
my $git_list_cmd = "git ls-tree --name-only -r $hash";
print "$git_list_cmd\n" if $opt_v;
foreach my $file (`$git_list_cmd`) {
$file =~ s/\s+$//;
$repo_listing{$hash}{$file} = 1;
}
}
# logic for each of the modes
if ($opt_diff) {
#print "A DIFF\n";
# is it git to git, or git to file/dir ?
my ($Left, $Right) = @{$ra_arg_list};
#use Data::Dumper;
#print "diff_listing= "; print Dumper(\%diff_listing);
#print "git_hash= "; print Dumper(\%git_hash);
#print "repo_listing= "; print Dumper(\%repo_listing);
if ($git_hash{$Left} and $git_hash{$Right}) {
#print "A DIFF git-to-git\n";
# git to git
# first make a union of all files that have changed in both commits
my %files_union = ();
my $git_list_cmd = "git diff-tree -r --no-commit-id --name-only $Left $Right";
print "$git_list_cmd\n" if $opt_v;
foreach my $file (`$git_list_cmd`) {
chomp($file);
$files_union{$file} = 1;
}
# then make truncated tar files of those union files which
# actually exist in each repo
my @left_files = ();
my @right_files = ();
foreach my $file (sort keys %files_union) {
push @left_files , $file if $repo_listing{$Left }{$file};
push @right_files, $file if $repo_listing{$Right}{$file};
}
# backslash whitespace within file names (#257)
my $files = join(" ", map {$_ =~ s/(\s)/\\$1/g; $_} @left_files);
$replacement_arg_list{$git_hash{$Left}} = git_archive("$Left $files");
$files = join(" ", map {$_ =~ s/(\s)/\\$1/g; $_} @right_files);
$replacement_arg_list{$git_hash{$Right}} = git_archive("$Right $files");
} else {
#print "A DIFF git-to-file or file-to-git Left=$Left Right=$Right\n";
# git to file/dir or file/dir to git
if ($git_hash{$Left} and $repo_listing{$Left}{$Right} ) {
#print "A DIFF 1\n";
# $Left is a git hash and $Right is a file
$replacement_arg_list{$git_hash{$Left}} = git_archive("$Left $Right");
} elsif ($git_hash{$Right} and $repo_listing{$Right}{$Left}) {
#print "A DIFF 2\n";
# $Left is a file and $Right is a git hash
<table>
<thead>
<tr><th colspan="5">Removed</th>
</tr>
<tr>
<th>Language</th>
<th>Files</th>
<th>Blank</th>
<th>Comment</th>
<th>Code</th>
</tr>
</thead>
<tbody>
<xsl:for-each select="diff_results/removed/language">
<tr>
<th><xsl:value-of select="@name"/></th>
<td><xsl:value-of select="@files_count"/></td>
<td><xsl:value-of select="@blank"/></td>
<td><xsl:value-of select="@comment"/></td>
<td><xsl:value-of select="@code"/></td>
</tr>
</xsl:for-each>
</tbody>
</table>
EO_DIFF_XSL
# 2}}}
}
$XSL_DIFF.= <<'EO_DIFF_XSL'; # {{{2
</body>
</html>
</xsl:template>
</xsl:stylesheet>
EO_DIFF_XSL
# 2}}}
if ($opt_diff) {
print $OUT $XSL_DIFF;
} else {
print $OUT $XSL;
}
$OUT->close();
} # 1}}}
sub normalize_file_names { # {{{1
my (@files, ) = @_;
# Returns a hash of file names reduced to a canonical form
# (fully qualified file names, all path separators changed to /,
# Windows file names lowercased). Hash values are the original
# file name.
my %normalized = ();
foreach my $F (@files) {
my $F_norm = $F;
if ($ON_WINDOWS) {
$F_norm = lc $F_norm; # for case insensitive file name comparisons
$F_norm =~ s{\\}{/}g; # Windows directory separators to Unix
$F_norm =~ s{^\./}{}g; # remove leading ./
if (($F_norm !~ m{^/}) and ($F_norm !~ m{^\w:/})) {
# looks like a relative path; prefix with cwd
$F_norm = lc "$cwd/$F_norm";
}
} else {
$F_norm =~ s{^\./}{}g; # remove leading ./
if ($F_norm !~ m{^/}) {
# looks like a relative path; prefix with cwd
$F_norm = "$cwd/$F_norm"; # keep case; only Windows names are lowercased
}
}
# Remove trailing / so it does not interfere with further regex code
# that does not expect it
$F_norm =~ s{/+$}{};
$normalized{ $F_norm } = $F;
}
return %normalized;
} # 1}}}
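# Illustrative sketch, not part of cloc: the non-Windows normalization
# steps described above, with the working directory passed in explicitly
# so the example does not depend on cloc's globals. It assumes names are
# not lowercased on non-Windows systems, as the header comment of
# normalize_file_names suggests. sketch_normalize_unix and the sample
# paths are hypothetical.
sub sketch_normalize_unix {
    my ($cwd, @files) = @_;
    my %normalized = ();
    foreach my $F (@files) {
        my $F_norm = $F;
        $F_norm =~ s{^\./}{};                          # remove leading ./
        $F_norm = "$cwd/$F_norm" if $F_norm !~ m{^/};  # relative -> absolute
        $F_norm =~ s{/+$}{};                           # remove trailing /
        $normalized{$F_norm} = $F;                     # canonical => original
    }
    return %normalized;
}
# Possible usage:
#   my %n = sketch_normalize_unix('/home/u/proj', './src/', 'lib/A.pm', '/etc/hosts');
#   # keys are /home/u/proj/src, /home/u/proj/lib/A.pm, and /etc/hosts;
#   # values are the original strings './src/', 'lib/A.pm', and '/etc/hosts'.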
sub combine_diffs { # {{{1
# subroutine by Andy (awalshe@sf.net)
# https://sourceforge.net/tracker/?func=detail&aid=3261017&group_id=174787&atid=870625
my ($ra_files) = @_;
my $res = "$URL v $VERSION\n";
my $dl = '-';
my $width = 79;
# columns are in this order
my @cols = ('files', 'blank', 'comment', 'code');
my %HoH = ();
foreach my $file (@{$ra_files}) {
my $IN = IO::File->new($file, "r");
if (!defined $IN) {
warn "Unable to read $file; ignoring.\n";
next;
}
my $sec;
while (<$IN>) {
chomp;
s/\cM$//;
next if /^(http|Language|-----)/;
if (/^[A-Za-z0-9]+/) { # section title
$sec = $_;
chomp($sec);
$HoH{$sec} = {} if ! exists $HoH{$sec};
next;
}
if (/^\s(same|modified|added|removed)/) { # calculated totals row
my @ar = grep { $_ ne '' } split(/ /, $_);
chomp(@ar);
my $ttl = shift @ar;
my $i = 0;
foreach(@ar) {
my $t = "${ttl}${dl}${cols[$i]}";
$HoH{$sec}{$t} = 0 if ! exists $HoH{$sec}{$t};
$HoH{$sec}{$t} += $_;
$i++;
}
}
}
$IN->close;
}
# rows are in this order
my @rows = ('same', 'modified', 'added', 'removed');
# The heuristic is as follows: it's Expect _IF_ it:
# 1. has "load_lib" command and either "#" comments or {}.
# 2. {, }, and one of: proc, if, [...], expect
my $is_expect = 0; # Value to determine.
my $begin_brace = 0; # Lines that begin or end with an opening curly brace.
my $end_brace = 0; # Lines that begin or end with a closing curly brace.
my $load_lib = 0; # Lines with the Load_lib command.
my $found_proc = 0;
my $found_if = 0;
my $found_brackets = 0;
my $found_expect = 0;
my $found_pound = 0;
# Return cached result, if available:
if ($expect_files{$filename}) { return $expect_files{$filename};}
open(EXPECT_FILE, "<$filename") ||
die "Can't open $filename to determine if it's expect.\n";
while(<EXPECT_FILE>) {
if (m/#/) {$found_pound++; s/#.*//;}
if (m/^\s*\{/) { $begin_brace++;}
if (m/\{\s*$/) { $begin_brace++;}
if (m/^\s*\}/) { $end_brace++;}
if (m/\};?\s*$/) { $end_brace++;}
if (m/^\s*load_lib\s+\S/) { $load_lib++;}
if (m/^\s*proc\s/) { $found_proc++;}
if (m/^\s*if\s/) { $found_if++;}
if (m/\[.*\]/) { $found_brackets++;}
if (m/^\s*expect\s/) { $found_expect++;}
}
close(EXPECT_FILE);
if ($load_lib && ($found_pound || ($begin_brace && $end_brace)))
{$is_expect = 1;}
if ( $begin_brace && $end_brace &&
($found_proc || $found_if || $found_brackets || $found_expect))
{$is_expect = 1;}
$expect_files{$filename} = $is_expect; # Store result in cache.
return $is_expect;
} # 1}}}
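# Illustrative sketch, not part of cloc: the two-rule Expect heuristic
# above, restated as a pure function of the per-file counters so the
# decision can be checked without reading a file. sketch_is_expect and
# its hash keys are hypothetical names; missing keys count as zero.
sub sketch_is_expect {
    my (%n) = @_;
    my $braces = $n{begin_brace} && $n{end_brace};
    # Rule 1: load_lib plus either "#" comments or a pair of braces.
    return 1 if $n{load_lib} && ($n{pound} || $braces);
    # Rule 2: braces plus one of proc, if, [...], expect.
    return 1 if $braces &&
                ($n{proc} || $n{if} || $n{brackets} || $n{expect});
    return 0;
}
# Possible usage:
#   sketch_is_expect(load_lib => 1, pound => 3);                       # returns 1
#   sketch_is_expect(begin_brace => 2, end_brace => 2, expect => 1);   # returns 1
#   sketch_is_expect(begin_brace => 2, end_brace => 2);                # returns 0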
sub really_is_pascal { # {{{1
# Given filename, returns TRUE if its contents really are Pascal.
# This isn't as obvious as it seems.
# Many ".p" files are Perl files
# (such as /usr/src/redhat/BUILD/ispell-3.1/dicts/czech/glob.p),
# others are C extractions
# (such as /usr/src/redhat/BUILD/linux/include/linux/umsdos_fs.p
# and some files in linuxconf).
# However, test files in "p2c" really are Pascal, for example.
# Note that /usr/src/redhat/BUILD/ucd-snmp-4.1.1/ov/bitmaps/UCD.20.p
# is actually C code. The heuristics determine that they're not Pascal,
# but because it ends in ".p" it's not counted as C code either.
# I believe this is actually correct behavior, because frankly it
# looks like it's automatically generated (it's a bitmap expressed as code).
# Rather than guess otherwise, we don't include it in a list of
# source files. Let's face it, someone who creates C files ending in ".p"
# and expects them to be counted by default as C files in SLOCCount needs
# their head examined. I suggest examining their head
# with a sucker rod (see syslogd(8) for more on sucker rods).
# This heuristic counts as Pascal files such as:
# /usr/src/redhat/BUILD/teTeX-1.0/texk/web2c/tangleboot.p
# Which is hand-generated. We don't count woven documents now anyway,
# so this is justifiable.
my $filename = shift;
chomp($filename);
# The heuristic is as follows: it's Pascal _IF_ it has all of the following
# (ignoring {...} and (*...*) comments):
# 1. "^..program NAME" or "^..unit NAME",
# 2. "procedure", "function", "^..interface", or "^..implementation",
# 3. a "begin", and
# 4. it ends with "end.",
#
# Or it has all of the following:
# 1. "^..module NAME" and
# 2. it ends with "end.".
#
# Or it has all of the following:
# 1. "^..program NAME",
# 2. a "begin", and
# 3. it ends with "end.".
#
# The "end." requirements in particular filter out non-Pascal.
#
# Note (jgb): this does not detect Pascal main files in fpc, like
# fpc-1.0.4/api/test/testterminfo.pas, which does not have "program" in
# it
my $is_pascal = 0; # Value to determine.
my $has_program = 0;
my $has_unit = 0;
my $has_module = 0;
my $has_procedure_or_function = 0;
my $found_begin = 0;
my $found_terminating_end = 0;
my $has_begin = 0;
open(PASCAL_FILE, "<$filename") ||
die "Can't open $filename to determine if it's pascal.\n";
while(<PASCAL_FILE>) {
s/\{.*?\}//g; # Ignore {...} comments on this line; imperfect, but effective.
s/\(\*.*?\*\)//g; # Ignore (*...*) comments on this line; imperfect, but effective.
if (m/\bprogram\s+[A-Za-z]/i) {$has_program=1;}
if (m/\bunit\s+[A-Za-z]/i) {$has_unit=1;}
if (m/\bmodule\s+[A-Za-z]/i) {$has_module=1;}
if (m/\bprocedure\b/i) { $has_procedure_or_function = 1; }
if (m/\bfunction\b/i) { $has_procedure_or_function = 1; }
if (m/^\s*interface\s+/i) { $has_procedure_or_function = 1; }
if (m/^\s*implementation\s+/i) { $has_procedure_or_function = 1; }
if (m/\bbegin\b/i) { $has_begin = 1; }
# Originally I said: