chirp_stream_files(1)

NAME

chirp_stream_files - move data to/from chirp servers in parallel

SYNOPSIS

chirp_stream_files [options] <copy|split|join> <localfile> { <hostname[:port]> <remotefile> } ...

DESCRIPTION

chirp_stream_files is a tool for moving data between one machine and many machines, with the option to split or join the file along the way. It is useful for constructing scatter-gather applications on top of Chirp.

chirp_stream_files copy duplicates a single file to multiple hosts. The <localfile> argument names a file on the local filesystem. The command opens a connection to each host in the list that follows and streams the file to all of them simultaneously.
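The fan-out idea behind copy can be sketched in a few lines of Python. This is only an illustration, not the tool's implementation: the function name fan_out is hypothetical, in-memory buffers stand in for Chirp connections, and the blocks are written to each destination in turn rather than truly in parallel. The default block size mirrors the documented --block-size default.

```python
import io

BLOCK_SIZE = 1048576  # same value as the documented --block-size default


def fan_out(source, sinks, block_size=BLOCK_SIZE):
    """Read source one block at a time and write each block to every sink.

    Hypothetical helper: the sinks are plain file-like objects standing in
    for remote files. Returns the number of bytes read from the source.
    """
    total = 0
    while True:
        block = source.read(block_size)
        if not block:
            break  # end of input
        for sink in sinks:
            sink.write(block)
        total += len(block)
    return total


if __name__ == "__main__":
    src = io.BytesIO(b"abc" * 1000)
    outs = [io.BytesIO() for _ in range(3)]
    copied = fan_out(src, outs, block_size=256)
    print(copied, all(o.getvalue() == src.getvalue() for o in outs))
```

Because only one block is held in memory at a time, the same loop works for inputs much larger than available memory.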

chirp_stream_files split divides an ASCII file up among multiple hosts. The first line of <localfile> is sent to the first host, the second line to the second, and so on, round-robin.
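The round-robin distribution described for split can be modeled as follows. This is a minimal sketch under stated assumptions: split_round_robin is a hypothetical name, and in-memory text buffers stand in for the remote files.

```python
import io


def split_round_robin(source, sinks):
    """Write line i of source (counting from 0) to sinks[i % len(sinks)].

    Hypothetical helper that models the distribution order only.
    Returns the number of lines distributed.
    """
    count = 0
    for index, line in enumerate(source):
        sinks[index % len(sinks)].write(line)
        count = index + 1
    return count


if __name__ == "__main__":
    data = io.StringIO("a\nb\nc\nd\ne\n")
    parts = [io.StringIO() for _ in range(3)]
    split_round_robin(data, parts)
    for number, part in enumerate(parts, start=1):
        print(f"part{number}: {part.getvalue()!r}")
```

With five input lines and three destinations, the first two destinations receive two lines each and the third receives one.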

chirp_stream_files join collects multiple remote files into one. The argument <localfile> is opened for writing, and the remote files for reading. The remote files are read line-by-line and assembled round-robin into the local file.
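The round-robin assembly described for join can be sketched the same way. The name join_round_robin is hypothetical and in-memory buffers stand in for the remote files. How the real tool behaves when the remote files have different line counts is not stated here; this sketch assumes that an exhausted source is simply skipped.

```python
import io


def join_round_robin(sources, sink):
    """Take one line from each source in turn and append it to sink.

    Assumption: a source that has reached end of file is dropped from the
    rotation, and the loop ends when every source is exhausted.
    Returns the number of lines written.
    """
    active = list(sources)
    count = 0
    while active:
        still_open = []
        for src in active:
            line = src.readline()
            if line:
                sink.write(line)
                count += 1
                still_open.append(src)
        active = still_open
    return count


if __name__ == "__main__":
    parts = [io.StringIO("a\nd\n"), io.StringIO("b\ne\n"), io.StringIO("c\n")]
    merged = io.StringIO()
    join_round_robin(parts, merged)
    print(repr(merged.getvalue()))
```

Under that assumption, joining the pieces produced by a round-robin split over the same number of destinations restores the original line order.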

In all cases, files are accessed in a streaming manner, making this particularly efficient for processing large files. A local file name of - indicates standard input or standard output, so that the command can be used in a pipeline.

OPTIONS

  • -a,--auth=<flag>
    Require this authentication mode.
  • -b,--block-size=<size>
    Set transfer buffer size. (default is 1048576 bytes)
  • -d,--debug=<flag>
    Enable debugging for this subsystem.
  • -i,--tickets=<files>
    Comma-delimited list of tickets to use for authentication.
  • -t,--timeout=<time>
    Timeout for failure. (default is 3600s)
  • -v,--version
    Show program version.
  • -h,--help
    Show help text.

EXIT STATUS

On success, returns zero. On failure, returns non-zero.
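A calling script can rely on this convention to detect failures. The sketch below is one possible way to do that from Python; build_argv and succeeded are hypothetical helper names, and the argument order follows the SYNOPSIS above. The demonstration runs a stand-in command so that it works without a Chirp installation.

```python
import subprocess
import sys


def build_argv(mode, localfile, targets):
    """Assemble an argument list: mode, local file, then host/file pairs.

    Hypothetical helper; targets is a list of (hostname, remotefile) tuples.
    """
    if mode not in ("copy", "split", "join"):
        raise ValueError(f"unknown mode: {mode}")
    argv = ["chirp_stream_files", mode, localfile]
    for host, remotefile in targets:
        argv += [host, remotefile]
    return argv


def succeeded(argv):
    """Run argv and report whether the exit status was zero."""
    return subprocess.run(argv).returncode == 0


if __name__ == "__main__":
    print(build_argv("copy", "mydata", [("server1.somewhere.edu", "/mydata")]))
    # Stand-in command in place of chirp_stream_files, exiting with status 3.
    print(succeeded([sys.executable, "-c", "raise SystemExit(3)"]))
```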

EXAMPLES

To copy the file mydata to three locations:

% chirp_stream_files copy mydata server1.somewhere.edu /mydata \
                                 server2.somewhere.edu /mydata \
                                 server3.somewhere.edu /mydata

To split the file mydata into subsets at three locations:

% chirp_stream_files split mydata server1.somewhere.edu /part1 \
                                  server2.somewhere.edu /part2 \
                                  server2.somewhere.edu /part3

To join three remote files back into one called newdata:

% chirp_stream_files join newdata server1.somewhere.edu /part1 \
                                  server2.somewhere.edu /part2 \
                                  server2.somewhere.edu /part3

COPYRIGHT

The Cooperative Computing Tools are Copyright (C) 2022 The University of Notre Dame. This software is distributed under the GNU General Public License. See the file COPYING for details.

SEE ALSO

CCTools