[gdb/contrib] Add -c option to words.sh script

The words.sh script in its current form extracts C comments from files and
then transforms them into a list of words.

To use the script on the documentation (as I did for commit 6b92c0d353
"[gdb/doc] Fix typos"), I needed to disable the "extract C comments" part.

Add a -c option that enables extracting C comments; it is off by default.
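
The pattern behind the option can be sketched as follows. This is a minimal
illustration, not the actual words.sh: the function name `words` is
hypothetical, and a one-line sed expression stands in for the script's awk
comment extractor (it only handles single-line /* ... */ comments).

```shell
#!/bin/sh
# Hypothetical sketch of the -c pattern; 'words' is not part of words.sh.
words ()
{
    c=false
    while [ $# -gt 0 ]; do
        case "$1" in
            -c)
                c=true
                shift
                ;;
            *)
                break
                ;;
        esac
    done

    # The whole if/fi compound command acts as the first pipeline stage,
    # so only the input source changes; the word splitting is shared.
    if $c; then
        # Stand-in for the awk program: keep the body of a single-line
        # /* ... */ comment and drop everything else.
        sed -n 's|.*/\*\(.*\)\*/.*|\1|p' "$@"
    else
        cat "$@"
    fi \
        | tr -cs '[:alpha:]' '\n' \
        | tr '[:upper:]' '[:lower:]' \
        | sed '/^$/d' \
        | LC_ALL=C sort -u
}
```

With this sketch, `words file.c` lists the words of the entire file, while
`words -c file.c` lists only the words found in comments.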

gdb/ChangeLog:

2019-11-25  Tom de Vries  <tdevries@suse.de>

	* contrib/words.sh: Add -c option.

Change-Id: Ifa34d435b3c41b3ff845dc07ae4b0d9f02d92a2d
Tom de Vries 2019-11-25 23:00:03 +01:00
parent 5b89c67adb
commit 3cf2f2377e
2 changed files with 25 additions and 8 deletions

gdb/ChangeLog

@@ -1,3 +1,7 @@
+2019-11-25  Tom de Vries  <tdevries@suse.de>
+
+	* contrib/words.sh: Add -c option.
+
 2019-11-25  Christian Biesinger  <cbiesinger@google.com>
 
 	* solib.c (solib_find_1): Change int to bool.

gdb/contrib/words.sh

@@ -14,17 +14,20 @@
 # You should have received a copy of the GNU General Public License
 # along with this program. If not, see <http://www.gnu.org/licenses/>.
-# This script intends to facilitate spell checking of comments in C sources.
+# This script intends to facilitate spell checking of source/doc files.
 # It:
-# - extracts comments from C files
-# - transforms the comments into a list of lowercase words
+# - transforms the files into a list of lowercase words
 # - prefixes each word with the frequency
 # - filters out words within a frequency range
 # - sorts the words, longest first
 #
+# If '-c' is passed as option, it operates on the C comments only, rather than
+# on the entire file.
+#
 # For:
 # ...
-# $ ./gdb/contrib/words.sh $(find gdb -type f -name "*.c" -o -name "*.h")
+# $ files=$(find gdb -type f -name "*.c" -o -name "*.h")
+# $ ./gdb/contrib/words.sh -c $files
 # ...
 # it generates a list of ~15000 words prefixed with frequency.
 #
@@ -36,7 +39,8 @@
 #
 # And for:
 # ...
-# $ ./gdb/contrib/words.sh -f 1 $(find gdb -type f -name "*.c" -o -name "*.h")
+# $ files=$(find gdb -type f -name "*.c" -o -name "*.h")
+# $ ./gdb/contrib/words.sh -c -f 1 $files
 # ...
 # it generates a list of ~5000 words with frequency 1.
 #
@@ -45,8 +49,13 @@
 minfreq=
 maxfreq=
+c=false
 while [ $# -gt 0 ]; do
     case "$1" in
+        -c)
+            c=true
+            shift
+            ;;
         --freq|-f)
             minfreq=$2
             maxfreq=$2
@@ -111,9 +120,13 @@ EOF
 # Stabilize sort.
 export LC_ALL=C
-awk \
-    -f "$awkfile" \
-    -- "$@" \
+if $c; then
+    awk \
+        -f "$awkfile" \
+        -- "$@"
+else
+    cat "$@"
+fi \
     | sed \
         -e 's/[!"?;:%^$~#{}`&=@,. \t\/_()|<>\+\*-]/\n/g' \
         -e 's/\[/\n/g' \