packages icon
 This manual page documents version __VERSION__ of the command.  tests  each
 argument  in  an  attempt  to  classify it.  There are three sets of tests,
 performed in this order: filesystem tests, magic tests, and language tests.
 The  test  that  succeeds  causes  the  file  type to be printed.  The type
 printed will usually contain one of  the  words  (the  file  contains  only
 printing  characters  and  a  few common control characters and is probably
 safe to read on an terminal), (the file contains the result of compiling  a
 program  in  a  form  understandable to some kernel or another), or meaning
 anything else (data is usually or  non-printable).   Exceptions  are  well-
 known  file  formats  (core  files, tar archives) that are known to contain
 binary data.  When modifying magic files or the program itself,  make  sure
 to  Users depend on knowing that all the readable files in a directory have
 the word printed.  Don't do as Berkeley did and change  to  The  filesystem
 tests  are  based  on examining the return from a system call.  The program
 checks to see if the file is empty, or if it's some sort of  special  file.
 Any known file types appropriate to the system you are running on (sockets,
 symbolic links, or named pipes (FIFOs)  on  those  systems  that  implement
 them)  are intuited if they are defined in the system header file The magic
 tests are used to check for files with data in  particular  fixed  formats.
 The  canonical  example  of  this is a binary executable (compiled program)
 file, whose format is defined in  and  possibly  in  the  standard  include
 directory.   These  files  have  a  stored  in  a particular place near the
 beginning of the file that tells the operating system that the  file  is  a
 binary  executable,  and  which of several types thereof.  The concept of a
 has been applied by extension to data files.  Any file with some  invariant
 identifier  at  a small fixed offset into the file can usually be described
 in this way.  The information identifying these  files  is  read  from  the
 compiled magic file or the files in the directory if the compiled file does
 not exist.  In addition, if or exists, it will be used in preference to the
 system  magic  files.   If  a file does not match any of the entries in the
 magic file, it is examined to see if it seems to be a  text  file.   ASCII,
 ISO-8859-x, non-ISO 8-bit extended-ASCII character sets (such as those used
 on Macintosh and IBM PC  systems),  UTF-8-encoded  Unicode,  UTF-16-encoded
 Unicode,  and  EBCDIC  character sets can be distinguished by the different
 ranges and sequences of bytes that constitute printable text in  each  set.
 If a file passes any of these tests, its character set is reported.  ASCII,
 ISO-8859-x, UTF-8, and extended-ASCII files are identified as because  they
 will  be mostly readable on nearly any terminal; UTF-16 and EBCDIC are only
 because, while they contain text, it is text that will require  translation
 before  it  can  be  read.   In  addition,  will attempt to determine other
 characteristics of text-type files.  If the lines of a file are  terminated
 by  CR,  CRLF,  or  NEL,  instead  of  the  Unix-standard  LF, this will be
 reported.  Files that contain embedded  escape  sequences  or  overstriking
 will  also  be identified.  Once has determined the character set used in a
 text-type file, it will attempt to determine in what language the  file  is
 written.   The  language  tests  look for particular strings (cf.  that can
 appear anywhere in the first few  blocks  of  a  file.   For  example,  the
 keyword  indicates  that  the file is most likely a input file, just as the
 keyword indicates a C program.  These tests  are  less  reliable  than  the
 previous  two  groups,  so  they  are  performed  last.   The language test
 routines also test for some miscellany (such as archives, JSON files).  Any
 file  that  cannot  be  identified  as  having  been  written in any of the
 character sets listed above is simply said to  be  Causes  the  command  to
 output the file type and creator code as used by older MacOS versions.  The
 code consists of eight letters, the first describing  the  file  type,  the
 latter  the creator.  This option works properly only for file formats that
 have the apple-style output defined.  Do not prepend  filenames  to  output
 lines (brief mode).  Write a output file that contains a pre-parsed version
 of the magic file or directory.  Cause a checking printout  of  the  parsed
 form  of  the  magic  file.   This  is usually used in conjunction with the
 option to debug a new magic file before  installing  it.   Prints  internal
 debugging  information  to  stderr.   On  filesystem errors (file not found
 etc), instead of handling the error as regular output as POSIX mandates and
 keep  going,  issue  an  error message and exit.  Exclude the test named in
 from the list of tests made to determine the file type.  Valid  test  names
 are:  application  type  (only  on EMX).  Various types of text files (this
 test will try to guess the text encoding, irrespective of  the  setting  of
 the  option).   Different text encodings for soft magic tests.  Ignored for
 backwards  compatibility.   Prints  details  of  Compound  Document  Files.
 Checks  for,  and  looks  inside, compressed files.  Checks Comma Separated
 Value files.  Prints ELF  file  details,  provided  soft  magic  tests  are
 enabled  and  the  elf  magic  is found.  Examines JSON (RFC-7159) files by
 parsing them for compliance.  Consults magic  files.   Examines  SIMH  tape
 files.   Examines  tar  files by verifying the checksum of the 512 byte tar
 header.  Excluding this test can provide more detailed content  description
 by  using  the soft magic method.  A synonym for Like but ignore tests that
 does not know  about.   This  is  intended  for  compatibility  with  older
 versions  of  Print a slash-separated list of valid extensions for the file
 type found.  Use the specified string as the separator between the filename
 and  the  file result returned.  Defaults to Read the names of the files to
 be examined from (one per line) before the argument  list.   Either  or  at
 least  one  filename  argument must be present; to test the standard input,
 use as a filename argument.  Please note that is unwrapped and the enclosed
 filenames  are  processed  when  this  option is encountered and before any
 further options processing is done.  This allows one  to  process  multiple
 lists   of  files  with  different  command  line  arguments  on  the  same
 invocation.  Thus if you want to set the  delimiter,  you  need  to  do  it
 before  you specify the list of files, like: instead of: This option causes
 symlinks not to be followed (on systems that support symbolic links).  This
 is  the  default  if  the  environment variable is not defined.  Causes the
 command to output mime type strings rather than the more traditional  human
 readable  ones.   Thus  it  may  say  rather  than  Like but print only the
 specified  element(s).   Don't  stop  at  the  first  match,  keep   going.
 Subsequent  matches  will  be  have  the  string prepended.  (If you want a
 newline, see the option.) The magic pattern with the highest strength  (see
 the  option)  comes  first.   Shows  a  list of patterns and their strength
 sorted descending by strength which is used for the matching (see also  the
 option).   This  option  causes  symlinks to be followed, as the like-named
 option in (on systems that support symbolic links).  This is the default if
 the  environment  variable  is defined.  Specify an alternate list of files
 and directories containing magic.  This can be a single item, or  a  colon-
 separated  list.   If  a  compiled  magic file is found alongside a file or
 directory, it will be used instead.  Don't pad filenames so that they align
 in  the output.  Force stdout to be flushed after checking each file.  This
 is only useful if checking a list of files.  It is intended to be  used  by
 programs that want filetype output from a pipe.  On systems that support or
 attempt to preserve the access time of  files  analyzed,  to  pretend  that
 never   read   them.    Set  various  parameter  limits.   Don't  translate
 unprintable characters to \ooo.  Normally translates unprintable characters
 to  their  octal  representation.   Normally,  only  attempts  to  read and
 determine the type of argument files  which  reports  are  ordinary  files.
 This  prevents  problems,  because  reading special files may have peculiar
 consequences.  Specifying the option causes to  also  read  argument  files
 which are block or character special files.  This is useful for determining
 the filesystem types of the data in raw disk partitions,  which  are  block
 special  files.   This  option  also  causes  to disregard the file size as
 reported by since on some systems it reports  a  zero  size  for  raw  disk
 partitions.   On systems where libseccomp is available, the option disables
 sandboxing which is enabled by default.   This  option  is  needed  for  to
 execute  external decompressing programs, i.e. when the option is specified
 and the  built-in  decompressors  are  not  available.   On  systems  where
 sandboxing  is not available, this option has no effect.  Print the version
 of the program and exit.  Try to look inside compressed files.  Try to look
 inside compressed files, but report information about the contents only not
 the compression.  Output a null character after the end  of  the  filename.
 Nice  to  the  output.   This does not affect the separator, which is still
 printed.  If this option is repeated more than once, then prints  just  the
 filename  followed  by  a  NUL followed by the description (or ERROR: text)
 followed by a second NUL for each entry.  Print a help  message  and  exit.
 The  environment  variable  can be used to set the default magic file name.
 If that variable is set, then will not attempt to open adds to the value of
 this  variable  as  appropriate.   The  environment  variable  controls (on
 systems that support  symbolic  links),  whether  will  attempt  to  follow
 symlinks  or  not.   If  set,  then follows symlink, otherwise it does not.
 This is also controlled by the  and  options.   Default  compiled  list  of
 magic.   Directory  containing  default magic files.  will exit with if the
 operation was successful or if an error  was  encountered.   The  following
 errors  cause  diagnostic  messages, but don't affect the program exit code
 (as POSIX requires), unless is specified: A file cannot be found  There  is
 no  permission  to  read  a  file The file type cannot be determined $ file
 file.c file /dev/{wd0a,hda} file.c:   C program text file:      ELF  32-bit
 LSB executable, Intel 80386, version 1 (SYSV),           dynamically linked
 (uses shared libs), stripped /dev/wd0a: block special (0/0) /dev/hda: block
 special (3/0)

 $ file -s /dev/wd0{b,d} /dev/wd0b: data /dev/wd0d: x86 boot sector

 $ file  -s  /dev/hda{,1,2,3,4,5,6,7,8,9,10}  /dev/hda:    x86  boot  sector
 /dev/hda1:    Linux/i386   ext2  filesystem  /dev/hda2:   x86  boot  sector
 /dev/hda3:   x86  boot  sector,   extended   partition   table   /dev/hda4:
 Linux/i386  ext2  filesystem  /dev/hda5:   Linux/i386  swap file /dev/hda6:
 Linux/i386  swap  file  /dev/hda7:    Linux/i386   swap   file   /dev/hda8:
 Linux/i386 swap file /dev/hda9:  empty /dev/hda10: empty

 $ file  -i  file.c  file  /dev/{wd0a,hda}  file.c:       text/x-c  file:
 application/x-executable     /dev/hda:       application/x-not-regular-file
 /dev/wd0a:   application/x-not-regular-file

 This program is believed to exceed the System  V  Interface  Definition  of
 FILE(CMD),  as  near as one can determine from the vague language contained
 therein.  Its behavior is mostly compatible with the System  V  program  of
 the  same name.  This version knows more magic, however, so it will produce
 different (albeit more accurate) output in many cases.  The one significant
 difference  between  this  version and System V is that this version treats
 any white space as a delimiter, so that spaces in pattern strings  must  be
 escaped.  For example, Gt]10   string  language impress(imPRESS data) in an
 existing magic file would have to be changed  to  Gt]10   string  language\
 impress       (imPRESS  data)  In  addition,  in this version, if a pattern
 string  contains  a  backslash,  it   must   be   escaped.    For   example
 0       string          \begindata      Andrew   Toolkit   document  in  an
 existing    magic    file    would    have     to     be     changed     to
 0       string          \\begindata     Andrew   Toolkit   document   SunOS
 releases 3.2 and later from Sun Microsystems include a command derived from
 the  System  V  one,  but  with some extensions.  This version differs from
 Sun's only in minor ways.  It includes the extension of the operator,  used
 as,    for   example,   Gt]16   longAm]0x7fffffff       Gt]0            not
 stripped On systems where libseccomp is  available,  is  enforces  limiting
 system  calls  to only the ones necessary for the operation of the program.
 This enforcement does not provide any security benefit  when  is  asked  to
 decompress  input  files  running  external  programs  with the option.  To
 enable execution of external decompressors, one needs to disable sandboxing
 using  the option.  The magic file entries have been collected from various
 sources, mainly USENET,  and  contributed  by  various  authors.   Christos
 Zoulas  (address  below)  will  collect  additional or corrected magic file
 entries.  A  consolidation  of  magic  file  entries  will  be  distributed
 periodically.   The  order  of  entries  in  the magic file is significant.
 Depending on what system you  are  using,  the  order  that  they  are  put
 together may be incorrect.  If your old command uses a magic file, keep the
 old magic file around for comparison purposes (rename it to There has  been
 a  command  in every (man page dated November, 1973).  The System V version
 introduced one significant major change: the external list of magic  types.
 This  slowed  the  program  down  slightly but made it a lot more flexible.
 This program, based on the System V version,  was  written  by  Ian  Darwin
 without  looking  at  anybody else's source code.  John Gilmore revised the
 code extensively, making it better than the first version.   Geoff  Collyer
 found   several   inadequacies   and  provided  some  magic  file  entries.
 Contributions of the operator by Rob McMahon, 1989.  Guy Harris, made  many
 changes from 1993 to the present.  Primary development and maintenance from
 1990 to the present by Christos Zoulas Altered by Chris Lowth 2000:  handle
 the option to output mime type strings, using an alternative magic file and
 internal logic.  Altered by Eric Fischer July, 2000, to identify  character
 codes and attempt to identify the languages of non-ASCII files.  Altered by
 Reuben Thomas 2007-2011, to improve MIME support, merge MIME  and  non-MIME
 magic, support directories as well as files of magic, apply many bug fixes,
 update and fix a lot of  magic,  improve  the  build  system,  improve  the
 documentation, and rewrite the Python bindings in pure Python.  The list of
 contributors to the directory (magic files) is too long  to  include  here.
 You  know  who  you  are;  thank  you.  Many contributors are listed in the
 source files.  Copyright (c) Ian F.  Darwin,  Toronto,  Canada,  1986-1999.
 Covered  by  the standard Berkeley Software Distribution copyright; see the
 file COPYING in the source distribution.  The files  and  were  written  by
 John  Gilmore  from  his  public-domain program, and are not covered by the
 above license.  Please report bugs and send patches to the bug  tracker  at
 or  the  mailing  list  at  (visit first to subscribe).  Fix output so that
 tests for MIME and APPLE flags are not  needed  all  over  the  place,  and
 actual output is only done in one place.  This needs a design.  Suggestion:
 push possible outputs on  to  a  list,  then  pick  the  last-pushed  (most
 specific,  one  hopes)  value  at  the end, or use a default if the list is
 empty.  This should not slow down evaluation.  The handling of and printing
 \012-  between  entries is clumsy and complicated; refactor and centralize.
 Some of the encoding logic is hard-coded in encoding.c and can be moved  to
 the  magic  files if we had a !:charset annotation.  Continue to squash all
 magic bugs.  See Debian BTS for a  good  source.   Store  arbitrarily  long
 strings,  for  example  for  %s  patterns, so that they can be printed out.
 Fixes Debian bug #271672.  This can be done  by  allocating  strings  in  a
 string  pool,  storing  the  string  pool  at the end of the magic file and
 converting all the string pointers to  relative  offsets  from  the  string
 pool.   Add  syntax  for  relative  offsets after current level (Debian bug
 #466037).  Make file -ki work, i.e. give multiple MIME types.   Add  a  zip
 library  so  we  can peek inside Office2007 documents to print more details
 about their contents.  Add an option to print URLs for the sources  of  the
 file descriptions.  Combine script searches and add a way to map executable
 names to MIME types (e.g. have a magic value for !:mime  which  causes  the
 resulting  string to be looked up in a table).  This would avoid adding the
 same magic repeatedly for each new  hash-bang  interpreter.   When  a  file
 descriptor  is  available, we can skip and adjust the buffer instead of the
 hacky buffer management we do now.  Fix and to  check  for  consistency  at
 compile  time (duplicate pointing to undefined ).  Make / more efficient by
 keeping a sorted list of names.  Special-case ^ to flip endianness  in  the
 parser  so  that  it  does not have to be escaped, and document it.  If the
 offsets specified internally in the file exceed the buffer size (  variable
 in file.h), then we don't seek to that offset, but we give up.  It would be
 better if buffer managements was done when the file descriptor is available
 so  we  can  seek around the file.  One must be careful though because this
 has performance and thus security considerations, because one can slow down
 things  by  repeatedly  seeking.  There is support now for keeping separate
 buffers and having offsets from the end  of  the  file,  but  the  internal
 buffer  management  still  needs  an overhaul.  You can obtain the original
 author's latest version by anonymous FTP on in the directory