#!/usr/bin/perl -w

#
# antlink
#
# Copyright (C) 2015-2016 the University of Southern California
#
#    This program is free software; you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation; either version 2 of the License.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License along
#    with this program; if not, write to the Free Software Foundation, Inc.,
#    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
# 



=head1 NAME

antlink - support funky symlinks to manage a tree of git or other VC repositories

=head1 SYNOPSIS

antlink SUBCOMMAND AN_ANT_SYMLINK

=head1 DESCRIPTION

Antlink handles trees of repositories,
where a meta-repository can point to other sub-repositories,
each of which may or may not be checked out ("cloned").
Repositories stored at the same place can live in a "site",
simplifying their discovery.
Antlink's goal is to make groups of git (or other) repositories
discoverable without requiring everyone to check out everything.

A sub-repository is an "antlink", a funny symlink.

Sub-repositories are in two states: 
cloned or not.
When cloned, they have a checked out copy on the local system.
When not cloned, the symlink is dangling.
If a clone is no longer needed, it can be uncloned.

Sites share the same access method (file or ssh),
version control system (git or svn),
hostname, and perhaps a common path at that host.

=head2 WORKFLOW: REGULAR USE

Typical workflow is to go to the meta-repository
and see if anything needs to be updated via

    cd META
    antlink pull .

To work on a repository that is not yet local, clone it:

    antlink clone subrepo
    cd subrepo
    # edit away

To look for things that are not checked in:

    cd META
    antlink status .

To start a new sub-repository inside the existing meta-repository:

    cd META
    antlink init newsubrepo

One can organize things in the default site:

    cd META
    mkdir PAPERS
    cd PAPERS
    antlink init conference_paper_1
    antlink init journal_paper_1

And link in other sites:
[xxx: this feature is not yet implemented]

    cd META
    mkdir EXTERNAL
    antlink graft https://github.com/jekyll/jekyll.git EXTERNAL/jekyll_read_only
    antlink graft git@github.com:jekyll/jekyll.git EXTERNAL/jekyll_rw
    antlink graft --vc svn https://github.com/jekyll/jekyll EXTERNAL/jekyll_via_svn

One can also omit the destination to get a default:

    antlink graft https://github.com/jekyll/jekyll.git

(will appear in "jekyll" in the current directory).


=head2 WORKFLOW: STARTING A NEW ANTLINK METAREPOSITORY

To make a brand new meta-repository on your current computer:

    cd $HOME
    antlink initmeta /home/yourid/metarepo.git

To start on your computer from an existing meta-repository:

    antlink clonemeta /home/yourid/metarepo.git

will check out "metarepo" into the current directory.

Or to pick up the metarepo from another computer:

    antlink clonemeta ssh://git.example.com/home/yourid/metarepo.git

Then look in F<metarepo>.


=head1 SUBCOMMANDS

The following sub-commands work on the given antlink:

=over

=item B<clone>   (or B<resolve>, the old term)

Check out an antlink, if not checked out.

=item B<unclone>

Discard a checked-out antlink (if without changes).

=item B<init>

Create a new antlink and its new backing repository.

=item B<graft>

Link in a new external repository.

=item B<mv>

Rename an antlink, including its local repository and the server.
However, renaming does I<not> currently
catch local copies on I<other> computers---they
will become disconnected.
Because of this risk, C<mv> requires the C<-f> option.

=item B<status> and B<push> and B<pull>

Report the status in a resolved antlink,
listing contents not yet committed or pushed.
Or push or pull across each resolved antlink.

With no argument, report across all resolved antlinks.

=item B<listsubcommands>

List all possible subcommands.
(Mainly for command line completion; humans should use C<antlink help>.)

=back

=head1 OPTIONS

=over

=item B<-f> or B<--force>

Force, allowing potentially risky behavior.

=item B<-d>

Enable debugging output.

=item B<-v>

Enable verbose output.

=item B<--help>

Show help.

=item B<--man>

Show full manual.

=back


=head1 REPOSITORY ASSUMPTIONS

We assume that, on the client, everything lives in a working directory W.
The local copy of meta repository is in W/META,
and the master is at ssh://git.example.com/path/META.git.

We assume all repositories follow a centralized model, with a central
master copy and local checked out version, and default to using git.
One can also patch in things that use other patterns and other VC
software.

Local copies of sub-repositories get checked out in W/SITE.
There's a default set of sub-repositories stored
next to the meta-repository;
they are checked out into W/META_GIT/SUB1,
with master copies in ssh://git.example.com/path/META_GIT/SUB1.git.

If META has many sub-repositories, they may live in a tree of subdirectories
in the meta repository, with master copies such as
ssh://site.example.com/path/code/SUBCODE2.git,
ssh://site.example.com/path/code/SUBCODE3.git,
ssh://site.example.com/path/www/SUBWWW4.git,
and so on.
Their working copies might be collected into
W/SITE/code/SUBCODE2,
W/SITE/code/SUBCODE3,
W/SITE/www/SUBWWW4.

If there are multiple external groups of repositories,
that list of sites accumulates in META/_antlink.yaml.
The first one is always the meta directory and the second the default site.
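
As an illustration, a freshly created meta-repository might carry a
F<_antlink.yaml> along these lines.
This is a sketch modeled on what
C<antlink initmeta /home/yourid/metarepo.git> writes;
the names and paths are placeholders.

    repos:
      - name: metarepo
        type: git
        url: "meta:"
      - name: metarepo_GIT
        type: git
        url: "parent:/home/yourid/metarepo_GIT"

The C<meta:> URL stands for the meta-repository's own origin,
and C<parent:> is replaced by the access method and host of that origin.
An entry's C<type> defaults to C<git> when omitted.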

=head1 WHY BOTHER WITH ANTLINK?

Git is great.  But git's assumptions don't cover the world of uses.
Specifically, git basically I<requires> that one check out all history
to do anything.  This approach fundamentally prevents a single repository
from scaling to cover many different projects over many years.

The git authors recognize this limitation and advise one git repository per "thing",
where thing is a program (like the Linux kernel, or git source).
This allows git to scale for that project, 
but it creates a new scaling problem: you now have many, many repositories.
(My research lab has more than 300; my personal site has a dozen.)

B<Antlink is the minimum glue needed to paste together a bunch of git repositories>
and manage them as a whole.

=head2 WHY NOT SOMETHING ELSE?

Many people have proposed similar things, but none is quite right:

=over

=item B<git-submodule>
doesn't work for us because it freezes the sub-module at a particular version.
We instead want to track the latest version of the subtrees.
(More detailed dislike: L<https://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/>, 
L<http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/>).

=item B<git-subtree>
is like android repo (described below).
It also assumes you want all subtrees, 
and it ties subtrees to specific URLs (and therefore access methods of direct file or ssh).
We require the ability to copy some specific subtrees,
and we need to access them with different methods from different places
(for example, using direct file access when on the same server as the repository).

=item B<git-annex>
is intended to track pointers to large things that are not archived by git
and may be stored off-line.
We instead want to track small things (many files) that are in turn tracked
by other gits.
(We share goals in future-proofing and the need to avoid keeping a copy of all content locally.)

=item B<Android repo>
(L<https://source.android.com/source/using-repo.html>)
this tool is really close to what we need,
but it assumes one downloads all subtrees.
We instead require the ability to select only some of the subtrees.
(In addition, its XML configuration format seems cumbersome.)

=item just use svn
This worked for quite a while, but svn has problems that git fixes. 
(Details: search for "git vs. svn".)

=back


=head2 ANTLINK DESIGN

Our goals:

=over

=item 
many repositories, so disk space can "scale down" for those who need only part of the repository, and no head-of-line blocking on big updates

=item
discoverability, so you can find out about repositories you don't know about

=item
all in one place, so repositories are not lost and are backed up

=item
the ability to paste together repositories from different places
(github, gitlab, gitfoo, gityou).

=back

=head2 ANTLINK IMPLEMENTATION

These "pointers" are symlinks that point just outside this directory
into "parallel" repositories that are checked out only when needed. By
default, you get a minimal checkout. If you need another repository
that's not yet checked out, run C<antlink clone> (formerly C<antlink resolve>) on the symlink and
it will check out the backing repository.
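
For example, an antlink at F<META/PAPERS/conference_paper_1> would be
a relative symlink to F<../../META_GIT/PAPERS/conference_paper_1>.
The following minimal sketch shows how such a target splits into its
"up", site, and "down" parts.
The path is a made-up example, and the variable names are illustrative;
the pattern mirrors the one antlink applies to the symlink contents.

    use strict;
    use warnings;

    # Hypothetical symlink target, as readlink() would return it.
    my $target = "../../META_GIT/PAPERS/conference_paper_1";

    # up:   the leading run of ".." that climbs out of the meta-repository
    # site: the site directory name (may not contain "." or "/")
    # down: the path of the sub-repository inside that site
    my ($up, $site, $down) = ($target =~ m@^([./]+)/([^./]+)/(.*)$@);

    print "up=$up site=$site down=$down\n";

Here the site is C<META_GIT> and the sub-repository path is
C<PAPERS/conference_paper_1>.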


=cut

use strict;
use Carp;
use Pod::Usage;
use Getopt::Long;
use File::Spec;
use File::Basename;
use File::Path qw(make_path);
use File::Find;
use File::Temp;
use Cwd qw(abs_path);
use IO::Pipe;
use YAML::XS;

our $VERSION = 1.5;

Getopt::Long::Configure ("bundling");
pod2usage(2) if ($#ARGV >= 0 && $ARGV[0] eq '-?');
#my(@orig_argv) = @ARGV;
my($prog) = $0;
my $debug = undef;
my $force = undef;
my $verbose = 0;
&GetOptions(
 	'help|?' => sub { pod2usage(1); },
	'man' => sub { pod2usage(-verbose => 2); },
	'd|debug+' => \$debug, 
	'f|force!' => \$force, 
        'v|verbose+' => \$verbose) or pod2usage(2);
pod2usage("$prog: no subcommand given.\n") if ($#ARGV == -1);
my $subcommand = shift @ARGV;




# sigh, poor-person OO programming.
my(%commands) = (
    'clone' => {
        'which_antlinks' => 'one',
	'action' => sub { antlink_clone(@_); },	   
    },
    'resolve' => {  # old synonym
        'which_antlinks' => 'one',
	'action' => sub { antlink_clone(@_); },	   
    },
    'unclone' => {
        'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
	'git' => sub { antlink_git_unclone(@_); },
    },
    'mv' => {
        'which_antlinks' => 'one;newdir',
	'action' => sub { antlink_mv(@_); },
    },
    'init' => {
        'which_antlinks' => 'newdir',
	'action' => sub { antlink_init(@_); },
    },
    'initmeta' => {
        'which_antlinks' => 'newdir',
	'action' => sub { antlink_initmeta(@_); },
    },
    'clonemeta' => {
        'which_antlinks' => 'onemeta',
	'action' => sub { antlink_clonemeta(@_); },
    },
    'graft' => {
        'which_antlinks' => 'onemeta',
	'action' => sub { antlink_graft(@_); },
    },
    'status' => {
	'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
	'git' => sub {
	    my($checkout_dir) = @_;
	    my($status) = system_output_nofail("git status", $checkout_dir);
	    my $branch = "unknown-branch";
	    my $parent = "unknown-parent";
	    foreach (split(/\n/, $status)) {
	        if (/^On branch (.*)$/) {
		    $branch = $1;
		    next;
	        };
	        if (/^Your branch is up[- ]to[- ]date with '(.*)'/) {
		    $parent = $1;
		    next;
	        };
	        if (/^Your branch is ahead of '(.*)' by (\d+) /) {
		    $parent = $1;
		    my $commits = $2;
		    print "\tto push: $commits to '$parent'\n"; 
		    next;
	        };
	    };
	    system_nofail("git status -s|sed 's/^/\t/'", $checkout_dir);
	},
	'svn' => sub { system_nofail("svn status | sed 's/^/\t/'", $_[0]); },
	'checkedout' => sub { },
    },
    'push' => {
	'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
	'git' => sub { system_nofail("git push | sed 's/^/\t/'", $_[0]); },
	'svn' => sub { print "svn push not supported; skipping $_[0]\n"; },
	'checkedout' => sub { },
    },
    'pull' => {
	'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
	'git' => sub { system_nofail("git pull | sed 's/^/\t/'", $_[0]); },
	'svn' => sub { system_nofail("svn update | sed 's/^/\t/'", $_[0]); },
	'checkedout' => sub { },
    },
    'pending' => {
	'which_antlinks' => 'many',
	'action' => sub { pod2usage(-msg => "antlink pending not yet implemented\n"); }
    },
    'help' => {
	'which_antlinks' => 'none',
	'action' => sub { pod2usage(1); }
    },
    'listsubcommands' => {
	'which_antlinks' => 'none',
	'action' => sub { antlink_listsubcommands(); },
    },
    'man' => {
	'which_antlinks' => 'none',
	'action' => sub { pod2usage(-verbose => 2); }
    },
);


######################################################################

sub system_nofail($;$$) {
    # chdir, then
    # run a command and abort on error
    my($cmd, $dir, $error_message) = @_;
    print "$cmd\n" if ($verbose);
    my($pid) = fork();
    if (!defined($pid)) {
	die "cannot fork\n";
    } elsif ($pid == 0) {
	# child
	if (defined($dir)) {
	    print "chdir $dir\n" if ($verbose > 1);
	    # use "or", not "||": "chdir $dir || die" never checks chdir's result
	    chdir($dir) or die "cannot chdir $dir: $!\n";
	};
	exec $cmd or die "cannot exec $cmd: $!\n";
	exit(0);
    };
    my($result) = waitpid($pid, 0);
    die "lost our child process\n" if ($result == -1);
    # $error_message is optional; avoid an undef warning under -w
    croak "unexpected failure: " . ($error_message // "(no details given)") . "\n\ton $cmd\n"
        if ($? != 0);
}

sub system_output_nofail($;$$) {
    # chdir, then
    # run a command and abort on error
    # returns output
    my($cmd, $dir, $error_message) = @_;
    print "$cmd\n" if ($verbose);
    $SIG{'PIPE'} = sub {};
    my($pipe) = IO::Pipe->new();
    my($pid) = fork();
    if (!defined($pid)) {
	die "cannot fork for $cmd\n";
    } elsif ($pid == 0) {
	# child
	$pipe->writer();
	untie *STDOUT;
	open \*STDOUT, ">&=", fileno $pipe or die "cannot reopen stdout\n";
	if (defined($dir)) {
	    # use "or", not "||": "chdir $dir || die" never checks chdir's result
	    chdir($dir) or die "cannot chdir $dir: $!\n";
	};
	exec $cmd;
	exit(1);
    };
    $pipe->reader();
    my($output) = '';
    while (my $ln = $pipe->getline) {
	$output .= $ln;
    };
    close $pipe;
    my($result) = waitpid($pid, 0);
    die "lost our child process\n" if ($result == -1);
    # $error_message is optional; avoid an undef warning under -w
    die "unexpected failure: " . ($error_message // "(no details given)") . "\n\ton $cmd\n"
        if ($? != 0);
    return $output;
}

sub ssh_system_output_nofail($$;$$) {
    # Like system_output_nofail, but on remote computer via ssh.
    # As an optimization, if the remote computer is localhost, run on the local computer.
    my($host, $cmd, $dir, $error_message) = @_;
    unless ($host eq 'localhost' || $host eq 'localhost6') {
	my $quote_for_shell = ($cmd =~ /(\&\&|\|\|)/) ? 1 : undef;
	# inside a single-quoted shell string, a quote must be written as '\''
	$cmd =~ s/'/'\\''/g if ($quote_for_shell && $cmd =~ /\'/);
	$cmd = "'$cmd'" if ($quote_for_shell);
	$cmd = "ssh " . $host . " " . $cmd;
    };
    system_output_nofail($cmd, $dir, $error_message);
}


# replace foo/bar/.. with foo
# JUST TEXTUAL, unlike Cwd::abs_path
sub offline_canonicalize_path($) {
    my($path) = @_;
    my $prior = $path;
    for (;;) {
	$path =~ s@/[^/]+/\.\./@/@;
	last if ($prior eq $path);  # iterate to fixed point
	$prior = $path;
    };
    return $path;
}

sub normalize_git_url($) {
    my($url) = @_;
    $url =~ s@^/@file:/@;             # /foo => file:/foo
    $url =~ s@^([a-z]+:/)([^/])@$1//$2@;  # file:/foo => file:///foo
    return $url;
}



######################################################################

sub split_up_repo_down($) {
    my($contents) = @_;
    my($contents_up, $repo, $contents_down) = ($contents =~ m@^([./]+)/([^./]+)/(.*)$@);
    return($contents_up, $repo, $contents_down);
}

my(%repos) = (
#    'NAME' => {
#	'type' => 'git',
#	'remote_access' => 'ssh',
#	'remote_host' => 'ant.isi.edu',
#	'remote_path' => '/home/ant/ANT_GIT',
#	'users' => '*',
#    },
);
my(@repos_order) = ();

sub bootstrap_repos($$$) {
    my($link, $contents_up, $contents_repo) = @_;

    my($link_dir) = dirname($link);
    my $checkout_root = $link_dir . "/" . $contents_up;
    
    my($meta_repo_dir) = system_output_nofail("git rev-parse --show-toplevel", $link_dir, "not in an antlink git repo");
    chomp $meta_repo_dir;
    my($meta_repo_list) = "$meta_repo_dir/_antlink.yaml";

    if (-f $meta_repo_list) {
	# found where to bootstrap
    } else {
        # in an antlink subrepo
	my($antlink_parent) = system_output_nofail("git config --get antlink.parent", $meta_repo_dir, "attempt to run antlink outside of meta or child repository (no antlink.parent record)");
	chomp($antlink_parent);
	$meta_repo_dir = $antlink_parent;
	$meta_repo_list = "$antlink_parent/_antlink.yaml";
	die "found antlink.parent at $antlink_parent, but no $meta_repo_list\n"
	    if (! -f $meta_repo_list);
    };

    die "cannot find _antlink.yaml\n"
        if (! -f $meta_repo_list);

    my($meta_repo_url) = system_output_nofail("git config --get remote.origin.url", $meta_repo_dir, "no remote.origin.url in $meta_repo_dir");
    chomp $meta_repo_url;
    $meta_repo_url = normalize_git_url($meta_repo_url);
    
    #
    # now parse it into:
    #    'NAME' => {
    #	'type' => 'git',
    #	'remote_access' => 'ssh',
    #	'remote_host' => 'ant.isi.edu',
    #	'remote_path' => '/home/ant/ANT_GIT',
    #	'users' => '*',
    #    },
    my $yaml = YAML::XS::LoadFile($meta_repo_list);
    die "no repos section of $meta_repo_list\n"
        if (!defined($yaml->{'repos'}));
    die "repos section of $meta_repo_list is not a list\n"
        if (ref($yaml->{'repos'}) ne 'ARRAY');
    foreach (@{$yaml->{'repos'}}) {
	my($name) = $_->{'name'};
	die "repo with no name\n" if (!defined($name));
	push(@repos_order, $name);
	$repos{$name}{'name'} = $name;
	$repos{$name}{'type'} = $_->{'type'} // 'git';
	if ($repos{$name}{'type'} ne 'checkedout') {
	    my $url = $_->{'url'};
	    die "repo $name has no url\n" if (!defined($url));
	    if ($url =~ /^parent:/i) {
	        my($parent_remote) = ($meta_repo_url =~ m@^([a-z]+:(//[^/]+)?)@);
		$url =~ s/^parent:/$parent_remote/;
	    } elsif ($url =~ m/^meta:$/i) {
	        $url = $meta_repo_url;
		$repos{$name}{'meta'} = 1;
	    };
	    $url =~ s@^([a-z]+):@@i; $repos{$name}{'remote_access'} = $1;
	    if ($url =~ s@^///@/@) {
		$repos{$name}{'remote_host'} = 'localhost';
	    } elsif ($url =~ s@^//([^/]+)/@/@) {
		$repos{$name}{'remote_host'} = $1;
	    } else {
		$repos{$name}{'remote_host'} = 'localhost';
	    };
	    # cannot end in /, because git clone ssh://foo.edu//bar fails
	    $url =~ s@/$@@;
	    ($repos{$name}{'remote_path'}) = $url;
	};
	$repos{$name}{'users'} = $_->{'users'} // '*';
    };

    # some more sanity checking    
    my($first_antlink_name) = $repos{$repos_order[0]}->{'name'};
    my($last_meta_repo_dir) = basename($meta_repo_dir);
    die "first entry of _antlink.yaml ($first_antlink_name) is not the meta dir ($last_meta_repo_dir)\n"
        if ($first_antlink_name ne $last_meta_repo_dir);

    return($checkout_root);
}

sub parse_antlink_meta($) {
    my($link) = @_;

    die "$link is not a symlink.\n" if (! -l $link);

    $link = File::Spec->rel2abs($link);
    my $contents = readlink($link) or die "cannot read contents of $link\n";
    die "antlinks must be relative, not absolute (but $link is absolute as $contents).\n"
        if ($contents =~ m@^/@);

    my($contents_up, $contents_repo, $contents_down) = split_up_repo_down($contents);
    die "cannot parse $contents into up and down\n" if (!defined($contents_up));
    my($checkout_root) = bootstrap_repos($link, $contents_up, $contents_repo) ||
        die "unknown repository: $contents_repo\n";

    return ($contents_up, $contents_repo, $contents_down, $checkout_root);
} 

sub create_antlink_meta($) {
    my($link) = @_;

    $link = File::Spec->rel2abs($link);
    #
    # it is now /home/user/WORKING/META/some/subpath
    # walk it back to the gitroot
    my(@dirs) = File::Spec->splitdir($link);
    my(@git_root_dirs) = @dirs;
    my($contents_up, $contents_repo, $contents_down) = ('.', undef, '.');
    for(;;) {
	my $git_root_dir = File::Spec->catdir(@git_root_dirs);
	last if (-d "$git_root_dir/.git");
	last if ($#git_root_dirs == -1);
	my($strip) = pop(@git_root_dirs);
	$contents_up .= "/..";
	$contents_down = $strip . "/" . $contents_down;
    };
    die "cannot find META's .git in $link\n"
        if ($#git_root_dirs == -1);
    $contents_up =~ s@^\./@@;
    $contents_down =~ s@/\.$@@;

#    my $checkout_root = dirname($link) . "/" . $contents_up;
    my($checkout_root) = bootstrap_repos($link, $contents_up, undef) ||
        die "unknown repository for $link\n";
    $contents_repo = $repos{$repos_order[1]}->{'name'};
    my($meta_dir) = $repos{$repos_order[0]}->{'name'};

    die "expect but cannot find meta dir $meta_dir in $checkout_root\n"
        if (! -d "$checkout_root/$meta_dir");

    return ($contents_up, $contents_repo, $contents_down, $checkout_root, $meta_dir);
} 

sub parse_antlink_checkout($$$$) {
    my($contents_up, $contents_repo, $contents_down, $checkout_root) = @_;

    my $checkout_dir = "$checkout_root/$contents_repo/$contents_down";

    my($checkout_base, $checkout_path) = fileparse($checkout_dir);
    
    return($checkout_base, $checkout_path, $checkout_dir);
} 

######################################################################

=head1 SUBCOMMANDS IN DETAIL

=cut

=head2 antlink_clone

    antlink clone PATH_TO_ANTLINK

"clones" an antlink by checking it out into the parallel tree.
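
As a rough sketch of what that checkout amounts to,
the clone URL is assembled from the site's access method, host, and path,
plus the antlink's path inside the site with C<.git> appended.
The site record, host, and directories below are hypothetical values
chosen for illustration.

    use strict;
    use warnings;

    # Hypothetical site record, shaped like an entry parsed from _antlink.yaml.
    my %site = (
        remote_access => 'ssh',
        remote_host   => 'git.example.com',
        remote_path   => '/home/yourid/metarepo_GIT',
    );
    my $down         = 'PAPERS/conference_paper_1';     # path inside the site
    my $checkout_dir = "/home/yourid/W/metarepo_GIT/$down"; # hypothetical working copy

    # access://host/path/down.git
    my $url = "$site{remote_access}://$site{remote_host}$site{remote_path}/$down.git";
    print "git clone $url $checkout_dir\n";

The sketch only prints the command; antlink itself runs it
and aborts if the clone fails.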

=cut

sub antlink_clone($;$);
sub antlink_clone($;$) {
    my($link) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);

    my $repo = $repos{$contents_repo};
    if ($repo->{'type'} eq 'checkedout') {
	print "$link points back into meta-repository; recursing\n" if ($verbose);
	# walk down
	my $recursive_link = File::Spec->catdir($checkout_root, $contents_repo);
	foreach (File::Spec->splitdir($contents_down)) {
	    $recursive_link = File::Spec->catdir($recursive_link, $_);
	    if (-l $recursive_link) {
		antlink_clone($recursive_link);
		return;
	    };
	};
	die "$link points back into meta-repository, but could not find the recursive antlink\n";
    };

    if (-d $checkout_dir) {
	print "$link is already cloned\n" if ($verbose);
	return;
    };
    die "have to check out non-git by hand\n"
	if ($repo->{'type'} ne 'git');
    make_path($checkout_path);

#    die "you set up legacy svn by hand in parallel to the checkedout copy of ANT.git\n"
#	if ($repo =~ /svn$/i);
    my $canonical_checkout_dir = offline_canonicalize_path($checkout_dir);
    my $cmd = "git clone " . $repo->{'remote_access'} . "://" . $repo->{'remote_host'} . $repo->{'remote_path'} . "/$contents_down.git $canonical_checkout_dir";
    system_nofail($cmd);
}

=head2 antlink_mv

    antlink mv [-f] PATH_TO_ANTLINK NEW_PATH

Renames an antlink, on both the local copy and server.

=cut

sub antlink_mv($$) {
    my($link, $new_link) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);
    my $repo = $repos{$contents_repo};

    die "confusing... did not find repo $contents_repo\n" if (!defined($repo));
    die "cannot mv non-git repos\n"
	if ($repo->{'type'} ne 'git');

    my($new_contents_up, $new_contents_repo, $new_contents_down, $new_checkout_root, $new_meta_dir) = create_antlink_meta($new_link);
    my($new_checkout_base, $new_checkout_path, $new_checkout_dir) = parse_antlink_checkout($new_contents_up, $new_contents_repo, $new_contents_down, $new_checkout_root);

    my $new_repo = $repos{$new_contents_repo};
    die "confusing... did not find repo $new_contents_repo\n" if (!defined($new_repo));
    die "confusing... old $link and new $new_link do not appear to be in the same repo.\n"
	if ($contents_repo ne $new_contents_repo);

    #
    # sanity check
    #
    # ($link was already verified)
    # now check new_link (-l also catches a dangling symlink)
    die "something already exists at $new_link on local copy\n"
        if (-e $new_link || -l $new_link);
    # mv does not catch copies on other computers, so insist on -f (as documented)
    die "antlink mv can disconnect copies on other computers; rerun with -f to proceed\n"
        if (!$force);

    #
    # make sure we have a clean meta
    #
    my $meta_co = "$checkout_root/$new_meta_dir";
    system_nofail("git pull", $meta_co, "failed to pull current meta-repository");

    #
    # make sure it's on the server
    #
    my($cmd) = "test -d " . $repo->{'remote_path'} . "/$contents_down.git && echo exists || echo none";
    my($result) = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);
    chomp $result;
    if ($result eq 'none') {
	die "antlink $link does not exist on server\n";
    } elsif ($result ne 'exists') {
	die "unknown response probing server with $cmd\n";
    };

    # and no new_link there, either
    $cmd = "test -d " . $new_repo->{'remote_path'} . "/$new_contents_down.git && echo exists || echo none";
    $result = ssh_system_output_nofail($new_repo->{'remote_host'}, $cmd);
    chomp $result;
    if ($result eq 'exists') {
	die "repository already exists for $new_link on server\n";
    } elsif ($result ne 'none') {
	die "unknown response probing server with $cmd\n";
    };

    #
    # now move it on server to $new_link
    #
    $cmd = "mv " . $repo->{'remote_path'} . "/$contents_down.git " .
		    $new_repo->{'remote_path'} . "/$new_contents_down.git";
    $result = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);

    #
    # move the antlink
    #
    # make the new
    $cmd = "ln -s $new_contents_up/$new_contents_repo/$new_contents_down $new_link";
    system_nofail($cmd);
    system_nofail("git add $new_link", ".", "failed to add $new_link");
    # kill the old
    system_nofail("git rm $link", ".", "failed to git rm $link");
#    unlink("$contents_up/$contents_repo/$contents_down", $link)
#	or die "cannot remove old antlink $link\n";

    #
    # commit meta
    #
    system_nofail("git commit -m 'mv $link $new_link' $link $new_link", ".", "failed to commit $link and $new_link");
    system_nofail("git push origin master", $meta_co, "failed to push");

    #
    # check local (if any) and move it
    #
    my $canonical_checkout_dir = offline_canonicalize_path($checkout_dir);
    my $new_canonical_checkout_dir = offline_canonicalize_path($new_checkout_dir);
    if (-d $canonical_checkout_dir) {
	$cmd = "mv $canonical_checkout_dir $new_canonical_checkout_dir";
	system_nofail($cmd);
    };
}

=head2 antlink_git_unclone

    antlink unclone PATH_TO_ANTLINK

"unclones" an antlink by (1) making sure no changes are pending,
(2) discarding the checked out copy.

=cut

# named antlink_git_unclone to match the 'git' handler in %commands
sub antlink_git_unclone($;$) {
    my($link, $subcommand) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);
    my $repo = $repos{$contents_repo};

    if (! -d "$checkout_dir/.") {
	print "$contents_down is not checked out.\n" if ($verbose);
	return;
    };

    my $status = system_output_nofail("git status --porcelain", $link);
    my %files_status;
    foreach (split(/\n/, $status)) {
	my($file_status, $file) = m/^(..)\s(.*)$/;
	push (@{$files_status{$file_status}}, $file);
    };
    die "xxx: not done, check to see if anything left to commit\n";
}


=head2 antlink_initmeta

    antlink initmeta GIT_REPOSITORY_DIRECTORY

Create a new meta-repository.
These are always on the local computer.
xxx: should we parse ssh:?

=cut

sub antlink_initmeta($;$) {
    my($meta_repo_dir) = @_;

    # git init
    # strip only the scheme, keeping the leading / of the path
    $meta_repo_dir =~ s@^file://@@;
    die "antlink initmeta only works with a local file system path.\n(use clonemeta later to get it to a remote system)\n"	
        if ($meta_repo_dir =~ /^[a-z]+:/);
    die "antlink initmeta requires (by convention) the path to end in .git\n"
        if ($meta_repo_dir !~ /\.git$/);
    die "antlink initmeta requires a full path (starting at the root with /)\n"
        if ($meta_repo_dir !~ /^\//);
    system_nofail("git init --bare --shared=group $meta_repo_dir", ".", "failed to git-init new meta repo in $meta_repo_dir");

    my($meta_repo_dir_no_git) = $meta_repo_dir;
    $meta_repo_dir_no_git =~ s/\.git$//;
    my($meta_base) = basename($meta_repo_dir_no_git);

    mkdir("${meta_repo_dir_no_git}_GIT") or die "cannot mkdir ${meta_repo_dir_no_git}_GIT\n";
    
    # add _antlink.yaml
    my($tempdir) = File::Temp::tempdir("./antlink_initmeta_XXXXXX", CLEANUP => 1);
#    my($tempdir) = "/tmp/ALT";
    my($meta_co) = "$tempdir/meta";
    system_nofail("git clone $meta_repo_dir $meta_co", ".", "failed to checkout a copy of $meta_repo_dir into $meta_co");
    my($al_file) = $meta_co . "/_antlink.yaml";
    open(AL, ">$al_file") or die "cannot write to $al_file\n";
    print AL "repos:\n  - name: $meta_base\n    type: git\n    url: \"meta:\"\n";
    print AL "  - name: ${meta_base}_GIT\n    type: git\n    url: \"parent:${meta_repo_dir_no_git}_GIT\"\n";
    close AL;
    system_nofail("git add _antlink.yaml", $meta_co, "failed to add $al_file");
    system_nofail("git commit -m 'initial _antlink.yaml' _antlink.yaml", $meta_co, "failed to commit $al_file");
    system_nofail("git push origin master", $meta_co, "failed to push $al_file");

    # tempdir will cleanup the checkedout copy
}

=head2 antlink_clonemeta

    antlink clonemeta GIT_REPO_URL_OR_PATH [LOCAL_DIR]

Clone a meta-repository.
Could be from a local or remote computer.
Result is always local.

=cut

sub antlink_clonemeta(@) {
    # just git clone
    system_nofail("git clone " . join(" ", @_));
}


=head2 antlink_graft

    antlink graft [--vc svn|git] GIT_REPO_URL_OR_PATH [LOCAL_DIR]

Graft in an external repository.
Could be from a local or remote computer.

=cut

sub antlink_graft(@) {
    # just git clone
    die "antlink_graft not yet implemented\n";
}


=head2 antlink_init

    antlink init PATH_TO_ANTLINK

Initialize a new antlink with some path,
creating a new git repository for it on the server,
checking that out on the local computer,
and adding the antlink to the meta-repository.

=cut

sub antlink_init($;$) {
    my($link) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root, $meta_dir) = create_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);

    my $repo = $repos{$contents_repo};
    die "confusing... did not find repo $contents_repo\n" if (!defined($repo));
    die "cannot init non-git repos\n"
	if ($repo->{'type'} ne 'git');

    #
    # sanity check
    #
    die "something already exists at $link on local copy\n"
        if (-e $link);

    #
    # make sure we have a clean meta
    #
    system_nofail("git pull", "$checkout_root/$meta_dir", "failed to pull current meta-repository");

    
    #
    # go to the server and make it
    #
    my($cmd) = "test -d " . $repo->{'remote_path'} . "/$contents_down.git && echo exists || echo none";
    my($result) = ssh_system_output_nofail($repo->{'remote_host'}, $cmd) || die "cannot ssh to test $repo->{'remote_host'}\n";
    chomp $result;
    if ($result eq 'exists') {
	die "repository already exists for $link on server\n";
    } elsif ($result ne 'none') {
	die "unknown response probing server with $cmd\n";
    };
    $result = ssh_system_output_nofail($repo->{'remote_host'}, "git init --bare --shared=group " . $repo->{'remote_path'} . "/$contents_down.git");
    if ($repo->{'init_hook'}) {
	$result = ssh_system_output_nofail($repo->{'remote_host'}, $repo->{'init_hook'} . " " . $repo->{'remote_path'} . "/$contents_down.git");
    };

    #
    # symlink alias
    #
    symlink("$contents_up/$contents_repo/$contents_down", $link)
	or die "cannot create symlink $link\n";

    #
    # get a local copy
    #
    antlink_clone($link);

    #
    # put something in it
    # and push it
    # (to avoid the special case first push)
    #
    my($gi_dir) = "$checkout_root/$contents_repo/$contents_down";
    my($gi_file) = ".gitignore";
    open(GITIGNORE, ">$gi_dir/$gi_file") || die "cannot create $gi_dir/$gi_file\n";
    print GITIGNORE "*~\n";
    close GITIGNORE;
    system_nofail("git add $gi_file", $gi_dir);
    system_nofail("git commit -m 'start gitignore' $gi_file", $gi_dir);
    system_nofail("git push origin master", $gi_dir);

    #
    # finally, commit the master symlink
    #
    system_nofail("git add $link", ".", "failed to add $link");
    system_nofail("git commit -m 'create new $gi_dir' $link", ".", "failed to commit new link");
    system_nofail("git push", "$checkout_root/$meta_dir", "failed to push new link");
}


=head2 antlink_foreach

    antlink status PATH_TO_ANTLINK
    antlink push PATH_TO_ANTLINK
    antlink pull PATH_TO_ANTLINK

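Show the status of an antlink,
or push or pull it.
The action is dispatched on the repository's type;
a subcommand with no handler for that type dies with an error.

If given a directory instead of an antlink,
perform the action on all cloned antlinks
in that directory or its children.
Antlinks that are not cloned are skipped.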

=cut

sub antlink_foreach($;$) {
    my($link, $subcommand) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);

    if (! -d $checkout_dir) {
	print "$link is not cloned\n" if ($verbose);
	return;
    };

    print "$link\n";
    my $repo = $repos{$contents_repo};
    if (!defined($commands{$subcommand}{$repo->{'type'}})) {
	die "no option for $subcommand on repository of type " . $repo->{'type'} . " on $link\n";
    };
    &{$commands{$subcommand}{$repo->{'type'}}}(@_);
}

=head2 antlink_listsubcommands

    antlink listsubcommands

Print all subcommands, sorted, on one space-separated line.
Useful for shell completion.

=cut

sub antlink_listsubcommands() {
    print join(" ", sort keys %commands), "\n";
}



######################################################################

#
# main
#

pod2usage(-msg => "unknown subcommand: $subcommand\n")
    if (!defined($commands{$subcommand}));

if ($commands{$subcommand}{'which_antlinks'} eq 'one;newdir' && $#ARGV != 1) {
    pod2usage(-msg => "$prog: mv requires both OLD and NEW antlinks.\n");
};
if ($#ARGV == -1) {
    if($commands{$subcommand}{'which_antlinks'} eq 'many') {
	push(@ARGV, ".");
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'one') {
	pod2usage(-msg => "$prog: no ANTLINK given.\n");
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'newdir') {
	pod2usage(-msg => "$prog: no new ANTLINK given.\n");
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'onemeta') {
	pod2usage(-msg => "$prog: no meta repository given.\n");
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'none') {
	# pass
    } else {
	die "$prog: internal error, unknown which_antlinks\n";
    };
};

if ($commands{$subcommand}{'which_antlinks'} eq 'onemeta' ||
    $commands{$subcommand}{'which_antlinks'} eq 'none' || 
    $commands{$subcommand}{'which_antlinks'} eq 'one;newdir') {
    &{$commands{$subcommand}{'action'}}(@ARGV);
    exit 0;
};

#
# default path, iterate over all args
#
foreach my $link (@ARGV) {
    #
    # if given a directory rather than an antlink, expand to all antlinks beneath it
    #
    my(@recursive_ARGV) = ();
    if ($commands{$subcommand}{'which_antlinks'} eq 'newdir' || -l $link) {
	&{$commands{$subcommand}{'action'}}($link, $subcommand);
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'one') {
	die "invoked antlink $subcommand on a non-symlink; this subcommand is too dangerous to recurse.\n";
    } else {
        find({
	    preprocess => sub { return sort @_; },
	    wanted => sub { -l && -d && push(@recursive_ARGV, $_) },
	    no_chdir => 1},
	     $link);
	foreach my $recurse (@recursive_ARGV) {
	    &{$commands{$subcommand}{'action'}}($recurse, $subcommand);
	};
    };
};

exit 0;

=head1 RELEASE HISTORY

The most recent version of antlink is at L<https://ant.isi.edu/software/antlink/>.

=over

=item 0.1 (2015-06-09)

Released for internal ANT project use.  Full of unportability, but functional.

=item 1.0 (2016-01-03)

Cleaned up with no ANT-specific dependencies.  A "real" release.

=item 1.1 (2016-01-04)

Better documentation and a website.

=item 1.2 (2016-01-05)

Fixes critical bug in C<antlink init> when meta is remote.

=item 1.3 (2016-06-06)

Bugfix: no more infinite loop when C<antlink init> run outside a meta repository.
(Bug reported by Calvin Ardi.)

Enhancement: C<antlink help> and C<antlink man> now work.
(Suggestion from Calvin Ardi.)

=item 1.4 (2016-12-06)

Enhancement: Added bash autocompletion, and C<antlink listsubcommands> to support it.

Enhancement: Added preliminary version of C<antlink mv> to rename antlinks.
(More work is needed, though, to handle distributed moves.)
Motivated by a rename for Lan Wei.

=item 1.5 (2016-12-06)

Bug fix: improved documentation installation to fix Fedora packaging problem.


=back

=head1 AUTHOR AND THANKS

Antlink is written by John Heidemann.

Antlink benefited from feedback and bug reports from many people (thanks!):
Yuri Pradkin,
Calvin Ardi,
Wes Hardaker.


=head1 COPYRIGHT

Copyright (C) 2015-2016 the University of Southern California.

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License,
version 2, as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

=cut
    
