sgrep

A text search tool for structured documents:

Overview

sgrep is a tool for searching the contents of text files, similar to grep. Unlike grep, sgrep works on regions, which are defined either by offsets from the start of the file or by text patterns that are matched either exactly or case-insensitively. This makes it possible to search for text occurring within given delimiters, e.g. to search for a certain string in the header but not the body of an HTML document. It can also extract regions within delimiters that overlap or exclude one another. A special feature is the nearness condition, which finds text occurring within a certain offset of a starting region: a search for "Romeo" and "Juliet" with fewer than 30 characters between them may be more useful than a search for files that simply contain both words.
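
To make the nearness idea concrete, here is a rough approximation using plain grep rather than sgrep. It is only a sketch for comparison: it is not sgrep query syntax, it only sees matches that fit on a single line with the first word before the second, and the function name near_words is made up for this illustration.

```shell
#!/bin/sh
# Rough approximation of a nearness search with plain grep (NOT sgrep syntax).
# near_words FIRST SECOND MAX FILE
# Prints every stretch of text in FILE where SECOND follows FIRST on the
# same line with at most MAX characters in between.
# near_words is a hypothetical helper name used only for this illustration.
near_words() {
    grep -E -o "$1.{0,$3}$2" "$4"
}
```

For example, near_words Romeo Juliet 29 play.txt lists the places where fewer than 30 characters separate the two names (play.txt is a placeholder file name). sgrep itself works on regions rather than lines, so it is not restricted in this way.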

sgrep is very useful for SGML/XML/HTML files, but is by no means limited to them. It can be used to query program source code as well as email or Usenet news. The tool comes in very handy if you need to assemble SGML or HTML files from parts of other files.

sgrep was written by Jani Jaakkola and Pekka Kilpeläinen at the University of Helsinki. The CygWin version is based on sgrep-1.92a.

Modifications

First run ./configure, which generates the Makefile. Then edit the Makefile as follows:

Change the line:

bin_PROGRAMS = sgrep
    

to:

bin_PROGRAMS = sgrep.exe
    

Change the line:

sgrep: $(sgrep_OBJECTS) $(sgrep_DEPENDENCIES)
    

to:

sgrep.exe: $(sgrep_OBJECTS) $(sgrep_DEPENDENCIES)
    

Finally, add the -s flag to CFLAGS, so that symbol information is stripped from the executable.

Then make and make install run without problems as long as a Unix-like directory structure (/usr/local/bin, /usr/local/man, /usr/local/share) exists.
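
The manual edits above could also be scripted. The following is a minimal sketch with sed, not part of the original port. It assumes the generated Makefile contains the two lines exactly as quoted and a line beginning with "CFLAGS = "; the function name patch_makefile is hypothetical.

```shell
#!/bin/sh
# Sketch: apply the three Makefile edits described above with sed.
# patch_makefile is a hypothetical helper name. It assumes the Makefile
# produced by ./configure contains the lines exactly as quoted in the text
# and a line starting with "CFLAGS = " (an assumption, check your Makefile).
patch_makefile() {
    mf=${1:-Makefile}
    # Keep the unmodified file as Makefile.orig so the edit can be undone.
    cp "$mf" "$mf.orig" || return 1
    sed -e 's/^bin_PROGRAMS = sgrep$/bin_PROGRAMS = sgrep.exe/' \
        -e 's/^sgrep: \$(sgrep_OBJECTS)/sgrep.exe: $(sgrep_OBJECTS)/' \
        -e '/^CFLAGS = /s/$/ -s/' \
        "$mf.orig" > "$mf"
}
```

Run patch_makefile in the source directory after ./configure, then compare Makefile with Makefile.orig before running make.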

System requirements

The following setup worked in my case:

  • Windows NT 4.0 SP3 Workstation
  • CygWinB20.1
  • A directory /usr/local/bin and symlinks in /usr/local called man and share, pointing to the cygwin-b20/man and cygwin-b20/share directories, respectively. Equivalent mounts will work as well.
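
The directory layout from the last item can be created along these lines. This is a sketch only: the function name setup_layout is hypothetical, and the location of the B20 installation has to be passed in, since it differs between machines.

```shell
#!/bin/sh
# Sketch: create the Unix-like layout described in the list above.
# setup_layout CYGROOT [PREFIX]
#   CYGROOT  directory of the Cygwin B20 installation that holds man and
#            share (its location is an assumption, pass your own path)
#   PREFIX   installation prefix, /usr/local by default
# setup_layout is a hypothetical helper name.
setup_layout() {
    cygroot=$1
    prefix=${2:-/usr/local}
    mkdir -p "$prefix/bin" || return 1
    # Create each symlink only if nothing with that name exists yet.
    [ -e "$prefix/man" ]   || ln -s "$cygroot/man"   "$prefix/man"   || return 1
    [ -e "$prefix/share" ] || ln -s "$cygroot/share" "$prefix/share" || return 1
}
```

Call it from the Cygwin shell with the B20 directory as the first argument before running make install.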

Download information

Get the original sgrep sources here.

The precompiled binary for CygwinB20, together with the data files, is shipped as cygwinb20-sgrep-1.92a.tar.gz (69 KB).