CONCORDANCE(1)                                                  CONCORDANCE(1)

NAME
       concordance - an application for making concordances of words in text
       files.

SYNOPSIS
       concordance [-p[q][n:num]][-l[q][n:num]][-s[q]] filename [outputFilename]

DESCRIPTION
       concordance is intended to be a non-destructive maker of concordances
       for text files. It reads the file 'filename' and creates a concordance
       of the words used in the file and their locations in the file. It
       creates 2 files: 'filename.wds' and 'filename.abc' (or, if an
       outputFilename was given on the command line, 'outputFilename.wds'
       and 'outputFilename.abc').

       The file 'filename.wds' ['outputFilename.wds'] contains a list of the
       words used in 'filename', each word's length, the number of times it
       was used, and a list of the locations of its use in 'filename'.

       How the list of locations is generated depends on the options used on
       the command line. Locations can be listed by page, by line, or by
       stanza (if the file contains poetry and its stanzas are numbered
       according to the convention concordance recognizes; see option -s
       below for the stanza numbering convention).

       NOTE that only one switch can be used at a time. You cannot, for
       example, concordance a file by BOTH page and line number in one run.
       You can, however, concordance the same file in 2 runs: by page on the
       first run, then by line on the second, saving the second run in a
       differently named output file.

       So that a set of concordances can be made for a document that comes
       in several consecutive parts, concordance lets the user set the
       beginning page or line number from which the locations of words in
       the file are counted. See the auxiliary option n:num below.
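       The word list described above can be pictured as a table keyed by
       word. The Python sketch below is illustrative only and is not
       concordance's actual implementation. The function name
       build_concordance, the word pattern, the lowercasing, and the choice
       to record one location per occurrence are all assumptions; this page
       does not say how concordance defines a word. The 'start' parameter
       plays the role of the beginning page or line number.

```python
import re

# Assumption: a "word" is a run of letters and apostrophes, compared in
# lowercase. The real program may define words differently.
WORD_RE = re.compile(r"[A-Za-z']+")


def build_concordance(units, start=1):
    """Map each word to its length, use count, and list of locations.

    `units` is a list of text chunks (pages or lines) that have already
    been split; chunk i is numbered start + i.
    """
    entries = {}
    for number, text in enumerate(units, start=start):
        for word in WORD_RE.findall(text.lower()):
            entry = entries.setdefault(
                word, {"length": len(word), "count": 0, "locations": []}
            )
            entry["count"] += 1
            # Assumption: a location is recorded once per occurrence.
            entry["locations"].append(number)
    return entries


# Hypothetical input: two pages of a file whose first page is page 21.
pages = ["the cat sat", "the dog"]
table = build_concordance(pages, start=21)
for word in sorted(table):
    entry = table[word]
    print(word, entry["length"], entry["count"], entry["locations"])
```

       Keeping the count and the location list in one entry per word means
       both can be updated in a single pass over the text.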
       In addition to the word list of sizes, usage of words, and locations,
       the concordance file 'filename.abc' ['outputFilename.abc'] contains a
       list of the characters used in 'filename', the number of times each
       character was used, its overall percentage of use, and a graph of all
       the characters used by their percentages.

                        Formatted: December 26, 2024

OPTIONS
       Options set the manner in which words in a file are counted: by page
       in the file, by line in the file, or by stanza. The beginning page or
       line can also be set to something other than 1, the default, using
       the auxiliary option n:num. concordance can also be told to output
       nothing to stdout but its copyright message by using the auxiliary
       option q.

       concordance, run on a file 'filename' with no options set, defaults
       to identifying word locations by page and considers the first page of
       the file to be page 1.

       -p     The option -p tells concordance to determine the location of
              words in the file by their page location. If the option
              'n:num' is not appended to -p, concordance assumes the first
              page in the file to be page 1. concordance considers page
              breaks to be marked by the appearance of the standard ASCII
              formfeed character, FF (ASCII code 12 (decimal) or 0x0C
              (hexadecimal)), and increments the page location counter each
              time a formfeed character is passed. If a file contains no
              formfeed characters, the entire file, no matter how long it
              is, will be considered to consist of page 1! To change the
              beginning page number, see the auxiliary option n:num below.

              Note that concordance's default behavior is to concordance
              files by page number, so in that sense a mere -p on the
              command line is redundant. But -p must be used if you want to
              set a different beginning page number using the auxiliary
              switch 'n:num'.

       -l     The option -l tells concordance to determine the location of
              words in the file by their line location.
              It determines line location by the presence of the linefeed
              character, LF (ASCII code 10 (decimal) or 0x0A (hexadecimal)).
              Each time a linefeed character is passed in reading the file,
              if word location is being determined by lines, concordance
              increments the line-number counter. If the additional option
              n:num is not appended to -l, the first line in the file will
              be considered to be line number 1 and the rest of the lines
              incremented accordingly. To change the beginning line number,
              see the auxiliary option 'n:num' below.

       -s     The option -s tells concordance to determine the location of
              words in the file by their stanza location. Stanza locations
              are determined by the manual insertion of stanza numbering in
              the file to be concordanced, in the form '1>' '2>' ... 'n>',
              where the stanza numbering indicator 'num>' immediately
              precedes the stanza of that number. NOTE that when stanza
              locations are used to make a concordance of the words in a
              file, the number placed in the stanza numbering indicator
              will be the number given as a word's location. NOTE that the
              auxiliary option 'n:num' does nothing when combined with
              stanza counting.

       (-p|-l)n:num
              The auxiliary option n:num appended to either -p or -l (e.g.
              -pn:5), in which num is a decimal number, causes concordance
              to consider the first page/line to be numbered 'num' and to
              increment page/line numbers from there. This is so that if a
              set of files making up one large document is to be
              concordanced, and, say, the first page of the 2nd file of the
              set is actually page 21 of the entire set when all are
              combined, the second document can be concordanced with the
              option -pn:21 to cause concordance to consider words on the
              first page of the 2nd document to be actually on page 21, the
              words on the 2nd page of the 2nd document to be on page 22,
              and so forth.
              Thus a multi-file document can be concordanced with successive
              page/line numbers set correctly. n:num does nothing when word
              locations are being determined by stanza, as the number of the
              stanza depends on the stanza number indicators 'num>' in the
              file being concordanced.

       (-p|-l|-s)q
              The auxiliary option 'q', appended to the count type switch or
              placed between the count type switch and the other auxiliary
              switch, n:num, causes concordance to run in 'quiet' mode. In
              normal mode, concordance outputs to stdout, in addition to its
              initial copyright and self-identification message, a couple of
              messages giving the filenames of the .wds and .abc files, and
              indicates when it has completed its work. In quiet mode, it
              outputs only its initial copyright-ID message.

RESOURCES
       concordance creates 2 files on disk: one with a file extension of
       .wds and one with an extension of .abc.

       In creating the concordance, concordance allocates memory for each
       new word and each new word location in the list of words. Longer
       files require greater amounts of memory. It is conceivable that on a
       very long file, one might run out of memory, and might need to break
       a very long file into shorter ones. But as this program was
       originally written for another operating system that didn't use
       memory efficiently, the problem is probably generic to that system,
       not UNIX/Linux.

DIAGNOSTICS
       The program will output error messages if something that it
       recognizes goes wrong, such as trying to create a concordance for a
       non-existent text file.

SEE ALSO
       Come to think of it, I haven't seen any programs similar to this, but
       I have a sneaking suspicion you might be able to accomplish the same
       thing using awk, or for those really into the zen of UN*X, ed - the
       real man's editor!

COPYRIGHT
       concordance is Copyright (c) 1996 Ralph L. Meyer

       This program is free software; you can redistribute it and/or modify
       it under the terms of the GNU General Public License as published by
       the Free Software Foundation; either version 2 of the License, or (at
       your option) any later version.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License
       along with this program; if not, write to the Free Software
       Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.

BUGS
       Several small bugs, and some cosmetic things, were fixed before
       upload. Testing seems to indicate that most bugs have been
       anesthetized into oblivion. But there is no guarantee of that. If you
       find any bugs lurking in the program's more arcane (or less arcane)
       corners, then send a bug report to:

              Ralph Meyer
              39 Nelson Avenue
              Spotswood, NJ 08884

       or E-mail meyer@princeton.edu

AUTHOR
       Ralph Meyer
       39 Nelson Avenue
       Spotswood, NJ 08884
       E-Mail: meyer@princeton.edu
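
EXAMPLES
       The location rules described under OPTIONS can be sketched in Python
       as follows. This is a hedged illustration, not the program's own
       code: the function names locate_by_break and locate_by_stanza are
       hypothetical, and the stanza sketch assumes that text before the
       first 'num>' indicator is ignored, which this page does not specify.
       Pages are split on the formfeed character (ASCII 12), lines on the
       linefeed character (ASCII 10), and 'start' stands in for n:num.

```python
import re

FORM_FEED = "\x0c"  # ASCII 12, the page break used by -p
LINE_FEED = "\n"    # ASCII 10, the line break used by -l

# Stanza numbering indicator of the form 'num>' used by -s.
STANZA_MARK = re.compile(r"(\d+)>")


def locate_by_break(text, break_char, start=1):
    """Return (location, chunk) pairs; the counter advances at each break.

    A text with no break character yields a single chunk numbered `start`.
    """
    return list(enumerate(text.split(break_char), start=start))


def locate_by_stanza(text):
    """Return (stanza_number, stanza_text) pairs taken from 'num>' marks.

    The number comes from the indicator itself, so there is no `start`.
    Assumption: anything before the first indicator is dropped.
    """
    parts = STANZA_MARK.split(text)
    # parts is [prefix, num1, body1, num2, body2, ...]
    return [(int(parts[i]), parts[i + 1]) for i in range(1, len(parts), 2)]


# Second file of a set whose first page is page 21, as with -pn:21.
print(locate_by_break("first page\x0csecond page", FORM_FEED, start=21))
# A file without formfeeds is treated as one page.
print(locate_by_break("no page breaks here", FORM_FEED))
# Stanza numbers are read from the indicators, whatever their values.
print(locate_by_stanza("1>roses are red 7>violets are blue"))
```

       Each chunk returned by these helpers could then be scanned for words,
       with the chunk's number recorded as the location of every word found
       in it.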