#!/usr/bin/perl -w

#
# antlink
#
# Copyright (C) 2015-2023 the University of Southern California
#
#    This program is free software; you can redistribute it and/or modify
#    it under the terms of the GNU General Public License as published by
#    the Free Software Foundation; either version 2 of the License.
#
#    This program is distributed in the hope that it will be useful,
#    but WITHOUT ANY WARRANTY; without even the implied warranty of
#    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#    GNU General Public License for more details.
#
#    You should have received a copy of the GNU General Public License along
#    with this program; if not, write to the Free Software Foundation, Inc.,
#    51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.
# 



=head1 NAME

antlink - support funky symlinks to manage a tree of git (or other) repositories

=head1 SYNOPSIS

antlink SUBCOMMAND AN_ANT_SYMLINK

=head1 DESCRIPTION

Antlink's goal is to make groups of git (or other) repositories
discoverable and clonable without requiring everyone to check out everything.

Antlink handles the meta-repository, which is a tree of repositories.
A meta-repository has links, called I<antlinks>, to other regular repositories.
Each repository may or may not be cloned locally.
When cloned, a local copy can be edited.
When not cloned, they take no space and are easy to clone when desired.
(If a clone is no longer needed, it can be removed, then restored later.)

Repositories stored on the same server can be grouped into a I<site>,
simplifying their discovery.
There can be multiple sites, like github and gitlab,
or the ANT project set and the other project set.
Sites share the same access method (file or ssh),
version control system (git or svn),
hostname, and perhaps a common path at that host.

In practice, each sub-repository appears in the meta-repository as an antlink,
which is just a specially formatted symlink.

Typically one interacts with a repository with standard git commands.
Antlink commands are used only to clone, remove, or rename
repositories.  There are also antlink status commands
that, when invoked in the meta-repository, run over all 
currently cloned repositories.

In addition to git, leaf repositories can be stored by 
subversion.


=head2 WORKFLOW: REGULAR USE

(If you are starting for the first time, see 
"WORKFLOW: STARTING A NEW ANTLINK METAREPOSITORY" below.)

Typical workflow is to go to the meta-repository
and see if anything needs to be updated in any 
cloned repository via

    cd META
    antlink pull .

To work on a repository that is not yet local, clone it:

    cd META
    antlink clone subrepo
    cd subrepo
    # edit away

To look for things that are not checked in:

    cd META
    antlink status .

To start a new sub-repository inside the existing meta-repository:

    cd META
    antlink init newsubrepo

One can organize things in the default meta-repository:

    cd META
    mkdir PAPERS
    cd PAPERS
    antlink init workshop_paper_0
    antlink init conference_paper_1
    antlink init journal_paper_2
    antlink init tenure_acceptance_3

=head2 WORKFLOW: COPYING AN EXISTING GIT REPO INTO ANTLINK

One can make a copy of an existing repo
to be managed under antlink, or link in
a pointer to the other repo, as described in the next section.

For copying, make a new antlink repo and use standard git to pull
the old history:

    cd META
    antlink init copy_of_paper
    cd copy_of_paper
    git remote add upstream https://location/of/otherrepo.git
    git pull upstream main

(Replace C<main> with whatever your upstream's preferred branch is.)

To abandon the upstream:

    git remote remove upstream

Then save to the antlink copy:

    git push

=head2 WORKFLOW: LINKING IN OTHER REPOSITORIES

Antlink can tie together repos on multiple sites.
Each remote meta-repository is listed in F<_antlink.yaml>,
and the graft command puts them there, and clones the repo for you.

    cd META
    mkdir EXTERNAL
    antlink graft https://github.com/jekyll/jekyll.git EXTERNAL/jekyll_read_only
    antlink graft git@github.com:jekyll/jekyll.git EXTERNAL/jekyll_rw


    antlink graft --vc svn https://github.com/jekyll/jekyll EXTERNAL/jekyll_via_svn

One can also omit the destination to get a default:

    antlink graft https://github.com/jekyll/jekyll.git

(The clone will appear as F<jekyll> in the current directory.)


=head2 WORKFLOW: LINKING OVERLEAF

Antlink integrates with Overleaf, treating it as an external git repository.
First, create a project in Overleaf (or have your friend invite you to
their project, and join it on the website).

Then add an entry like this to your F<_antlink.yaml> file:

      - name: OVERLEAF_GIT
        type: git
        url: "https://git.overleaf.com/"

and put your userid in ~/.gitconfig:

        [credential "https://git.overleaf.com"]
            username = johnh@isi.edu

and do

    cd META
    antlink graft https://git.overleaf.com/63cb1095c6b536300dc7f02a overleaf_example_project

This command will graft in a specific Overleaf sample project.
(Note that the URL has "git", not "www"---use
the URL from the "clone with git" recommendation under
Overleaf's Sync > Git menu).
It will clone the project into the "overleaf_example_project" antlink.


=head2 WORKFLOW: STARTING A NEW ANTLINK METAREPOSITORY

To make a brand new meta-repository on your current computer:

    cd $HOME
    antlink initmeta /home/yourid/metarepo.git

To start on your computer from an existing meta-repository:

    antlink clonemeta /home/yourid/metarepo.git

will check out "metarepo" into the current directory.

Or to pick up the metarepo from another computer:

    antlink clonemeta ssh://git.example.com/home/yourid/metarepo.git

Then look in F<metarepo>.


=head2 ON-LINE AND OFF-LINE USE

Currently all interactions with the meta-repository must be done on-line,
with access to that repository.
This requirement avoids independent, conflicting operations on repositories
(for example, if two people were to create or rename
the same sub-repository).

Operations inside individual sub-repositories can
be carried out when off-line,
as with normal git.

In principle we can operate fully-offline; we did it in 1990 (see
"Implementation of the Ficus Replicated File System" by Guy et al.,
Usenix Technical Conference, 1990).
However, the current implementation does not include the infrastructure
needed for offline operation
(something recognized as a bug).


=head1 SUBCOMMANDS

The following sub-commands work on the given antlink:

=over

=item B<help>

Show basic help.  See also C<antlink --man> to show the full manual page.

=item B<clone>

Check out an antlink, if not checked out.
(An old synonym is B<resolve>.)

=item B<unclone>

Discard a checked-out antlink (assuming all changes are committed and pushed).

=item B<init>

Create a new antlink and its new backing repository on the server.

=item B<graft>

Link in a new external repository.

=item B<mv>

Rename an antlink, including its local repository and the server.
However, renaming does I<not> currently
catch local copies on I<other> computers---they
will become disconnected.
Because of this risk, C<mv> therefore requires the C<-f> option.

=item B<rm>

Remove an antlink I<and> its repository, both
the local copy and on the server.
As with renaming, remove does I<not> currently
catch local copies on I<other> computers---they
will become disconnected.
Because of this risk and because it is destructive,
C<rm> therefore requires the C<-f> option.
(Run without C<-f> but with C<-v> to show what it will do, 
if you're nervous.)

=item B<status> and B<push> and B<pull>

Report the status in a resolved antlink,
listing contents not yet committed or pushed.
Or push or pull across each resolved antlink.

With no argument, report across all cloned antlinks
under the given path (or the current directory).

=item B<listsubcommands>

List all possible subcommands.
(Mainly for command line completion; humans should use C<antlink help>.)

=back

=head1 OPTIONS

=over

=item B<-f> or B<--force>

Force, allowing potentially risky behavior.

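=item B<-d> or B<--debug>

Enable debugging output.

=item B<-v> or B<--verbose>

Enable verbose output.  Repeat the option for more detail.

=item B<--help>

Show help.

=item B<--version>

Show the program name and version.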

=item B<--man>

Show full manual, including list of subcommands.

=back


=head1 REPOSITORY ASSUMPTIONS

We assume that, on the client, everything lives in a working directory W.
The local copy of meta repository is in W/META,
with the official copy at ssh://git.example.com/path/META.git.

We assume all repositories on a site
follow a centralized model, with a central,
official copy and a local checked-out version, and default to using git.
One can also patch in things that use other patterns and other VC
software.

Local copies of sub-repositories get checked out in W/SITE.
There's a default set of sub-repositories stored
next to the meta-repository;
they are checked out into W/META_GIT/SUB1
and with official, central copies in ssh://git.example.com/path/META_GIT/SUB1.git.

If META has many sub-repositories, they may live in a tree of subdirectories
in the meta repository.  Thus the central copies might be at
ssh://site.example.com/path/code/SUBCODE2.git,
ssh://site.example.com/path/code/SUBCODE3.git,
ssh://site.example.com/path/www/SUBWWW4.git,
etc.
Their working copies might be collected into
W/SITE/code/SUBCODE2,
W/SITE/code/SUBCODE3,
W/SITE/www/SUBWWW4.

If there are multiple external groups of repositories,
that list of sites accumulates in META/_antlink.yaml.
The first one is always the meta directory and the second the default site.
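
As a rough illustration of the naming scheme above (a sketch only, not part
of antlink's interface: the helper name C<clone_url> and the site values are
hypothetical), the central URL of a sub-repository is the site's access
method, host, and path, followed by the sub-repository's path and F<.git>:

    use strict;
    use warnings;

    # Hypothetical helper: build the central URL for one sub-repository.
    sub clone_url {
        my ($site, $down) = @_;
        return "$site->{remote_access}://$site->{remote_host}"
             . "$site->{remote_path}/$down.git";
    }

    # Assumed values for a site reached over ssh.
    my %site = (
        remote_access => 'ssh',
        remote_host   => 'site.example.com',
        remote_path   => '/path',
    );
    print clone_url(\%site, 'code/SUBCODE2'), "\n";
    # prints ssh://site.example.com/path/code/SUBCODE2.git

The working copy of that sub-repository would then sit at
W/SITE/code/SUBCODE2.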


=head1 WHY BOTHER WITH ANTLINK?

Git is great.  But git's assumptions don't cover the world of uses.
Specifically, git basically I<requires> that one check out I<all> history
to do anything.  This approach fundamentally prevents a single repository
from scaling to cover many different projects over many years.

The git authors recognize this limitation and advise one git repository per "thing",
where thing is a program (like the Linux kernel, or git source).
This allows git to scale for that project, 
but it creates a new scaling problem: you now have many, many repositories.
(My research lab has more than 300; my personal site has a dozen.)

B<Antlink is the minimum glue needed to paste together a bunch of git repositories>
and manage them as a whole.

=head2 WHY NOT SOMETHING ELSE?

Many people have proposed similar things, but none is quite right:

=over

=item B<git-submodule>

doesn't work for us because it freezes the sub-module at a particular version.
We instead want to track the latest version of the subtrees.
(More detailed dislike: L<https://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/>,
L<http://blogs.atlassian.com/2013/05/alternatives-to-git-submodule-git-subtree/>).

=item B<git-subtree>

is like Android repo (described below).
It also assumes you want all subtrees,
and it ties subtrees to specific URLs (and therefore access methods of direct file or ssh).
We require the ability to copy only some specific subtrees,
and we need to access them with different methods from different places
(for example, using direct file access when on the same server as the repository).

=item B<git-annex>

is intended to track pointers to large things that are not archived by git
and may be stored off-line.
We instead want to track small things (many files) that are in turn tracked
by other gits.
(We share goals in future-proofing and the need to avoid keeping a copy of all content locally.)

=item B<Android repo>

(L<https://source.android.com/source/using-repo.html>)
This tool is really close to what we need,
but it assumes one always downloads all subtrees.
We instead require the ability to select only some of the subtrees.
(In addition, its XML configuration format seems cumbersome.)

=item just use svn

This worked for quite a while, but svn has problems that git fixes.
(Details: search for "git vs. svn".)

=item B<gr>

(L<https://github.com/mixu/gr>)
I found gr a year after I started antlink.
It seems to have roughly the same goals
(and similar design choices, basically passing git commands through).
I need to look at it more carefully.
(Its last edit seems to have been in 2017.)

=item B<mu>

(L<https://fabioz.github.io/mu-repo/>)
I found mu a year after I started antlink.
It seems to have similar goals, and
I need to look at it more carefully.

=item B<myrepos>

(L<https://myrepos.branchable.com/>)
It seems to have similar goals, and
I need to look at it more carefully.

=back


=head2 ANTLINK DESIGN

Our goals:

=over

=item *
B<many repositories>, so disk space can "scale down" for those who need only 
some of the group of repositories (the I<meta-repository>),
and to avoid head-of-line blocking on big updates

=item *
B<discoverability>, so you can find out about repositories you don't know about

=item *
B<all in one place>, so repositories are not lost and are easy to back up

=item *
B<graftability>, so one can paste together repositories from different places
(github, gitlab, gitfoo, gityou) into one meta-repository.

=back

=head2 ANTLINK IMPLEMENTATION

These antlink "pointers" are symlinks that point just outside this directory
into "parallel" repositories that are checked out only when needed. By
default, you get a minimal checkout. If you need another repository
that's not yet checked out, run C<antlink clone> on the symlink and
it will check out the backing repository.
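
The first step in following such a pointer can be sketched as below.  This is
an illustration under assumptions, not antlink's interface: the link target
shown is hypothetical, and the pattern mirrors how the script splits a link
into its "up", "repo", and "down" parts.

    use strict;
    use warnings;

    # Hypothetical symlink contents; on disk one would use readlink().
    my $target = '../../ANT_GIT/PAPERS/workshop_paper_0';

    # "up" is the run of .. components, "repo" names the site,
    # and "down" is the sub-repository's path within that site.
    my ($up, $repo, $down) = ($target =~ m{^([./]+)/([^./]+)/(.*)$});
    die "not an antlink: $target\n" if (!defined($up));

    print "up=$up repo=$repo down=$down\n";
    # prints up=../.. repo=ANT_GIT down=PAPERS/workshop_paper_0

The "repo" part is then looked up among the sites in F<_antlink.yaml>, and
the "down" part selects the repository to clone from that site.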


=head1 ANTLINK's _antlink.yaml

The root of a metarepository has a file F<_antlink.yaml>.
In the fullness of time the C<antlink graft> command will edit this file.
For now, users must edit it if they want to paste together
different repositories.

The first two entries will always be the primary metarepo:

    repos:
      - name: ANT
        type: git
        url: "meta:"
      - name: ANT_GIT
        type: git
        url: "parent:/home/ant/ANT_GIT"
        init_hook: /home/ant/githooks/configure_new_repository

But one can hook in foreign repos, like Overleaf and GitHub,
or a privately hosted GitHub-like service, or even Subversion:

      - name: OVERLEAF_GIT
        type: git
        url: "https://git.overleaf.com/"
      - name: GITHUB_GENERAL
        type: git
        url: "https://github.com/"
      - name: ANT_GITEA
        type: git
        verify_ssh: false
        url: "ssh://git@git.ant.isi.edu/"
      - name: ANT_SVN
        type: svn
        url: "parent:/home/ant/ANT_SVN"

In most cases the URL is the prefix you pass to "git clone",
typically with the https or ssh access method.

A "parent:" in the URL means inherit the access method
of the parent git repo.  Thus one can clone the ANT metarepo
with file:// on one machine with direct access to the central repo,
and with ssh:// on another machine, and it will all work.
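
A minimal sketch of that substitution follows.  The helper name
C<expand_parent_url> and the example URLs are hypothetical; the sketch only
assumes that the scheme (and host, if any) of the meta-repository's origin
URL replaces the C<parent:> prefix.

    use strict;
    use warnings;

    # Hypothetical helper: replace "parent:" with the access method
    # (and host, when present) of the meta-repository's own URL.
    sub expand_parent_url {
        my ($url, $meta_repo_url) = @_;
        my ($parent_remote) = ($meta_repo_url =~ m{^([a-z]+:(?://[^/]+)?)});
        $url =~ s/^parent:/$parent_remote/ if (defined($parent_remote));
        return $url;
    }

    print expand_parent_url('parent:/home/ant/ANT_GIT',
        'ssh://git.example.com/home/ant/ANT.git'), "\n";
    # prints ssh://git.example.com/home/ant/ANT_GIT

    print expand_parent_url('parent:/home/ant/ANT_GIT',
        'file:///home/ant/ANT.git'), "\n";
    # prints file:/home/ant/ANT_GIT

The same F<_antlink.yaml> entry therefore resolves to a different access
method depending on how the meta-repository itself was cloned.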

Subversion support is incomplete and intended for legacy use.

Antlink assumes it can ssh to the hosting computer
when the access method is ssh, unless you set C<verify_ssh: false>.


=cut

use strict;
use Carp;
use Pod::Usage;
use Getopt::Long;
use File::Spec;
use File::Basename;
use File::Path qw(make_path);
use File::Find;
use File::Temp;
use Cwd qw(abs_path);
use IO::Pipe;
use YAML::PP;

our $VERSION = '1.20';

Getopt::Long::Configure ("bundling");
pod2usage(2) if ($#ARGV >= 0 && $ARGV[0] eq '-?');
my(@orig_argv) = @ARGV;
my($prog) = $0;
my $debug = undef;
my $force = undef;
my $location = undef;
my $verbose = 0;
&GetOptions(
 	'help|?' => sub { pod2usage(1); },
	'man' => sub { pod2usage(-verbose => 2); },
 	'version' => sub { print "$prog $VERSION\n"; exit(0); },
	'd|debug+' => \$debug, 
	'f|force!' => \$force, 
        'l|local' => sub { $location = 'local'; },
    	'r|remote' => sub { $location = 'remote'; },
        'v|verbose+' => \$verbose) or pod2usage(2);
pod2usage("$prog: no subcommand given (try antlink help or antlink --man for more information).\n") if ($#ARGV == -1);
my $subcommand = shift @ARGV;


my $DEFAULT_BRANCH = 'main';   # if user doesn't give one


# sigh, poor-person OO programming.
my(%commands) = (
    'clone' => {
        'which_antlinks' => 'one',
	'action' => sub { antlink_clone(@_); },	   
    },
    'resolve' => {  # old synonym for clone
        'which_antlinks' => 'one',
	'action' => sub { antlink_clone(@_); },	   
    },
    'unclone' => {
        'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
#	'git' => sub { antlink_git_unclone(@_); },
    },
    'mv' => {
        'which_antlinks' => 'one;newdir',
	'action' => sub { antlink_mv(@_); },
    },
    'rm' => {
        'which_antlinks' => 'one',
	'action' => sub { antlink_rm(@_); },
    },
    'rename-branch-meta' => {
        'which_antlinks' => 'onemeta;old;new',
	'action' => sub { antlink_rename_branch_meta(@_); },
    },
    'rename-branch' => {
        'which_antlinks' => 'one;old;new',
	'action' => sub { antlink_rename_branch(@_); },
    },
    'init' => {
        'which_antlinks' => 'newdir',
	'action' => sub { antlink_init(@_); },
    },
    'initmeta' => {
        'which_antlinks' => 'newdir',
	'action' => sub { antlink_initmeta(@_); },
    },
    'clonemeta' => {
        'which_antlinks' => 'onemeta',
	'action' => sub { antlink_clonemeta(@_); },
    },
    'graft' => {
        'which_antlinks' => 'onemeta',
	'action' => sub { antlink_graft(@_); },
    },
    'status' => {
	'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
	'git' => sub {
	    my($checkout_dir) = @_;
	    my($status) = system_output_nofail("git status", $checkout_dir);
	    my $branch = "unknown-branch";
	    my $parent = "unknown-parent";
	    foreach (split(/\n/, $status)) {
	        if (/^On branch (.*)$/) {
		    $branch = $1;
		    next;
	        };
	        if (/^Your branch is up[- ]to[- ]date with '(.*)'/) {   # wording varies by git version
		    $parent = $1;
		    next;
	        };
	        if (/^Your branch is ahead of '(.*)' by (\d+) /) {
		    $parent = $1;
		    my $commits = $2;
		    print "\tto push: $commits to '$parent'\n"; 
		    next;
	        };
	    };
	    system_nofail("git status -s|sed 's/^/\t/'", $checkout_dir);
	},
	'svn' => sub { system_nofail("svn status | sed 's/^/\t/'", $_[0]); },
	'checkedout' => sub { },
    },
    'push' => {
	'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
	'git' => sub { system_nofail("git push | sed 's/^/\t/'", $_[0]); },
	'svn' => sub { print "svn push not supported; skipping $_[0]\n"; },
	'checkedout' => sub { },
    },
    'pull' => {
	'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
	'git' => sub { system_nofail("git pull | sed 's/^/\t/'", $_[0]); },
	'svn' => sub { system_nofail("svn update | sed 's/^/\t/'", $_[0]); },
	'checkedout' => sub { },
    },
    'show-clones' => {
	'which_antlinks' => 'many',
	'action' => sub { antlink_foreach(@_); },
	'git' => sub { },
	'svn' => sub { },
	'checkedout' => sub { },
    },
    'pending' => {
	'which_antlinks' => 'many',
	'action' => sub { pod2usage(-msg => "antlink pending not yet implemented\n"); }
    },
    'help' => {
	'which_antlinks' => 'none',
	'action' => sub { pod2usage(1); }
    },
    'listsubcommands' => {
	'which_antlinks' => 'none',
	'action' => sub { antlink_listsubcommands(); },
    },
    'man' => {
	'which_antlinks' => 'none',
	'action' => sub { pod2usage(-verbose => 2); }
    },
);


######################################################################

#
# system_nofail
# run (shell) CMD in optional DIR, terminating with optional ERROR on failure
#
sub system_nofail($;$$) {
    # chdir, then
    # run a command and abort on error
    my($cmd, $dir, $error_message) = @_;
    print "cd " . ($dir // ".") . " && $cmd\n" if ($verbose);
    my($pid) = fork();
    if (!defined($pid)) {
	die "cannot fork\n";
    } elsif ($pid == 0) {
	# child
        close STDIN;
	if (defined($dir)) {
	    print "chdir $dir\n" if ($verbose > 1);
	    chdir($dir) or die "cannot chdir $dir\n";   # "or", so the die applies to chdir's result
	};
	exec $cmd or die "cannot exec $cmd: $!\n";
	exit(0);
    };
    my($result) = waitpid($pid, 0);
    die "lost our child process\n" if ($result == -1);
    if ($? != 0) {
        $error_message //= '(no error)';
        croak "unexpected failure: $error_message\n\ton $cmd\n"
    };
}

#
# system_verbose_or_nofail:
# like system_nofail, but if verbose is on just print (don't run) the command
#
sub system_verbose_or_nofail($;$$) {
    my($cmd, $dir, $error_message) = @_;
    if ($verbose) {
        print("cd $dir && ") if (defined($dir));
        print("$cmd\n");
    } else {
        system_nofail($cmd, $dir, $error_message);
    };
}

#
# system_output_nofail:
# like system_nofail, but return the output as one big (multi-line) string
#
sub system_output_nofail($;$$) {
    # chdir, then
    # run a command and abort on error
    # returns output
    my($cmd, $dir, $error_message) = @_;
    print "cd " . ($dir // ".") . " && $cmd\n" if ($verbose);
    $SIG{'PIPE'} = sub {};
    my($pipe) = IO::Pipe->new();
    my($pid) = fork();
    if (!defined($pid)) {
	die "cannot fork for $cmd\n";
    } elsif ($pid == 0) {
	# child
	$pipe->writer();
	untie *STDOUT;
	open \*STDOUT, ">&=", fileno $pipe or die "cannot reopen stdout\n";
	if (defined($dir)) {
	    chdir($dir) or die "cannot chdir $dir\n";   # "or", so the die applies to chdir's result
	};
	exec $cmd;
	exit(1);
    };
    $pipe->reader();
    my($output) = '';
    while (my $ln = $pipe->getline) {
	$output .= $ln;
    };
    close $pipe;
    my($result) = waitpid($pid, 0);
    die "lost our child process\n" if ($result == -1);
    $error_message //= '(no error message)';   # avoid an undef warning under -w
    die "unexpected failure: $error_message\n\ton $cmd\n"
        if ($? != 0);
    return $output;
}

#
# ssh_system_output_nofail
# ssh to HOST to run shell CMD, optionally in local DIR, exiting with optional ERROR if it fails, returning output as one big string
#
sub ssh_system_output_nofail($$;$$) {
    # Like system_output_nofail, but on remote computer via ssh.
    # As an optimization, if the remote computer is localhost, run on the local computer.
    my($host, $cmd, $dir, $error_message) = @_;
    unless (!defined($host) || $host eq 'localhost' || $host eq 'localhost6') {
	my $quote_for_shell = ($cmd =~ /(\&\&|\|\|)/) ? 1 : undef;
	# inside '...' a quote must be written as '\'' (close, escaped quote, reopen)
	$cmd =~ s/'/'\\''/g if ($quote_for_shell && $cmd =~ /\'/);
	$cmd = "'$cmd'" if ($quote_for_shell);
	$cmd = "ssh -n " . $host . " " . $cmd;
    };
    return system_output_nofail($cmd, $dir, $error_message);
}

#
# ssh_verify:
# ssh to HOST to prove that PATH exists there
#
sub ssh_verify($$) {
    my($remote_host, $remote_path) = @_;
    my($cmd) = "test -d $remote_path && echo exists || echo none";
    my($result) = ssh_system_output_nofail($remote_host, $cmd);
    chomp $result;
    if ($result eq 'exists') {
        # ok
    } elsif ($result eq 'none') {
        die "assertion failed: path $remote_path does not exist on server $remote_host\n";
    } else {
        die "unknown response probing server with $cmd\n\t(Maybe you have ssh problems?)\n";
    };
}

#
# offline_canonicalize_path
# canonicalize PATH,
# replacing foo/bar/.. with foo.
# (Just textual replacement, unlike Cwd::abs_path.)
#
sub offline_canonicalize_path($) {
    my($path) = @_;
    my $prior = $path;
    for (;;) {
	$path =~ s@/[^/]+/\.\./@/@;
	last if ($prior eq $path);  # iterate to fixed point
	$prior = $path;
    };
    return $path;
}

#
# normalize_git_url
# given a URL for git, make it of form file:///foo
#
sub normalize_git_url($) {
    my($url) = @_;
    $url =~ s@^/@file:/@;             # /foo => file:/foo
    $url =~ s@^([a-z]+:/)([^/])@$1//$2@;  # file:/foo => file:///foo
    return $url;
}

#
# git_has_initialBranch:
#
# determine if the git we're on has initialBranch
sub git_has_initialBranch(;$) {
    my($host) = @_;
    # --initial-branch starts in git-2.28
    my($result) = ssh_system_output_nofail($host, 'git --version', undef, 'git --version on server');
    return undef if ($? != 0);
    my($major, $minor) = $result =~ / (\d+)\.(\d+)/;
    return undef if ($major < 2 || ($major == 2 && $minor < 28));
    return 1;
}

#
# git_default_branch:
#
# determine the correct name for the default branch,
# defaulting to main if it's not specified.
#
# Handles different git versions, including those without initialBranch.
#
sub git_default_branch(;$) {
    my($host) = @_;
    my $branch;
    my($has_initialBranch) = git_has_initialBranch($host);
    return undef if (!$has_initialBranch);
    my($result) = `git config init.defaultBranch`;
    return "main" if ($? != 0);
    chomp($result);
    return $result;
}



######################################################################

#
# split_up_repo_down
# split an ANTLINK into the "up" part (just ../../..),
# the name of the repo (the next component)
# and the down part (any remaining path components).
#
sub split_up_repo_down($) {
    my($contents) = @_;
    my($contents_up, $repo, $contents_down) = ($contents =~ m@^([./]+)/([^./]+)/(.*)$@);
    return($contents_up, $repo, $contents_down);
}

#
# repos is a parsed version of _antlink.yaml
# _antlink.yaml has
#   name
#   type (git or svn)
#   url
#
my(%repos) = (
#    'NAME' => {
#	'type' => 'git',
#	'remote_access' => 'ssh',
#	'remote_host' => 'ant.isi.edu',
#	'remote_path' => '/home/ant/ANT_GIT',
#	'users' => '*',
#	'url' => (from _antlink.yaml),
#    },
);
my(@repos_order) = ();

#
# bootstrap_repos:
# given an ANTLINK with CONTENTS_UP (dot-dot links) and REPO
# (or the antlink is in the metadir if optional LINK_IS_META is set)
# read and parse _antlink.yaml
#
# (Cannot be called on a METADIR.)
#
sub bootstrap_repos($$$;$) {
    my($link, $contents_up, $contents_repo, $link_is_meta) = @_;

    my($link_dir) = ($link_is_meta ? $link : dirname($link));
    my $checkout_root = $link_dir . "/" . $contents_up;
    
    my($meta_repo_dir) = system_output_nofail("git rev-parse --show-toplevel", $link_dir, "not in an antlink git repo");
    chomp $meta_repo_dir;
    my($meta_repo_list) = "$meta_repo_dir/_antlink.yaml";

    if (-f $meta_repo_list) {
	# found where to bootstrap
    } else {
        # in an antlink subrepo
	my($antlink_parent) = system_output_nofail("git config --get antlink.parent", $meta_repo_dir, "attempt to run antlink outside of meta or child repository (no antlink.parent record)");
	chomp($antlink_parent);
	$meta_repo_dir = $antlink_parent;
	$meta_repo_list = "$antlink_parent/_antlink.yaml";
	die "found antlink.parent at $antlink_parent, but no $meta_repo_list\n"
	    if (! -f $meta_repo_list);
    };

    die "cannot find _antlink.yaml\n"
        if (! -f $meta_repo_list);

    my($meta_repo_url) = system_output_nofail("git config --get remote.origin.url", $meta_repo_dir, "no remote.origin.url in $meta_repo_dir");
    chomp $meta_repo_url;
    $meta_repo_url = normalize_git_url($meta_repo_url);
    
    #
    # now parse it into:
    #    'NAME' => {
    #	'type' => 'git',
    #	'remote_access' => 'ssh',
    #	'remote_host' => 'ant.isi.edu',
    #	'remote_path' => '/home/ant/ANT_GIT',
    #	'users' => '*',
    #    },
    my $yaml = YAML::PP::LoadFile($meta_repo_list);
    die "no repos section of $meta_repo_list\n"
        if (!defined($yaml->{'repos'}));
    die "repos section is not a list of $meta_repo_list\n"
        if (ref($yaml->{'repos'}) ne 'ARRAY');
    foreach (@{$yaml->{'repos'}}) {
	my($name) = $_->{'name'};
	die "repo with no name\n" if (!defined($name));
	push(@repos_order, $name);
	$repos{$name}{'name'} = $name;
     	$repos{$name}{'url'} = $_->{'url'};
	$repos{$name}{'type'} = $_->{'type'} // 'git';
	$repos{$name}{'verify_ssh'} = $_->{'verify_ssh'} // 1;
	if ($repos{$name}{'type'} ne 'checkedout') {
	    my $url = $_->{'url'};
	    die "repo $name has no url\n" if (!defined($url));
	    if ($url =~ /^parent:/i) {
	        my($parent_remote) = ($meta_repo_url =~ m@^([a-z]+:(//[^/]+)?)@);
		$url =~ s/^parent:/$parent_remote/;
	    } elsif ($url =~ m/^meta:$/i) {
	        $url = $meta_repo_url;
		$repos{$name}{'meta'} = 1;
	    };
            # strip off the access protocol (ssh: or http:)
	    $url =~ s@^([a-z]+):@@i; $repos{$name}{'remote_access'} = $1;
	    if ($url =~ s@^///@/@) {
		$repos{$name}{'remote_host'} = 'localhost';
	    } elsif ($url =~ s@^//([^/]+)([/:])@$2@) {
		$repos{$name}{'remote_host'} = $1;
	    } else {
		$repos{$name}{'remote_host'} = 'localhost';
	    };
	    # cannot end in /, because git clone ssh://foo.edu//bar fails
	    $url =~ s@/$@@;
	    ($repos{$name}{'remote_path'}) = $url;
	};
	$repos{$name}{'users'} = $_->{'users'} // '*';
    };

    # some more sanity checking    
    my($first_antlink_name) = $repos{$repos_order[0]}->{'name'};
    my($last_meta_repo_dir) = basename($meta_repo_dir);
    die "first entry of _antlink.yaml ($first_antlink_name) is not the meta dir ($last_meta_repo_dir)\n"
        if (! -f $meta_repo_list);

    return($checkout_root);
}

#
# parse_antlink_meta
# given an ANTLINK (in an existing metadir)
# figure out its parts and build %repos.
#
sub parse_antlink_meta($) {
    my($link) = @_;

    die "$link is not a symlink.\n" if (! -l $link);

    $link = File::Spec->rel2abs($link);
    my $contents = readlink($link) or die "cannot read contents of $link\n";
    die "antlinks must be relative, not absolute (but $link is absolute as $contents).\n"
        if ($contents =~ m@^/@);

    my($contents_up, $contents_repo, $contents_down) = split_up_repo_down($contents);
    die "cannot parse $contents into up and down\n" if (!defined($contents_up));
    my($checkout_root) = bootstrap_repos($link, $contents_up, $contents_repo) ||
        die "unknown repository: $contents_repo\n";

    return ($contents_up, $contents_repo, $contents_down, $checkout_root);
} 

#
# create_antlink_meta
# set up (create the in-memory representation of)
# metadata based on ANTLINK or, optionally if LINK_IS_META,
# a path in the metadir
#
# (does NOT actually create things on disk)
#
sub create_antlink_meta($;$) {
    my($link, $link_is_meta) = @_;

    $link = File::Spec->rel2abs($link);
    #
    # it is now /home/user/WORKING/META/some/subpath
    # walk it back to the gitroot
    my(@dirs) = File::Spec->splitdir($link);
    my(@git_root_dirs) = @dirs;
    my($contents_up, $contents_repo, $contents_down) = ('.', undef, '.');
    for(;;) {
	my $git_root_dir = File::Spec->catdir(@git_root_dirs);
	last if (-d "$git_root_dir/.git");
	last if ($#git_root_dirs == -1);
	my($strip) = pop(@git_root_dirs);
	$contents_up .= "/..";
	$contents_down = $strip . "/" . $contents_down;
    };
    die "cannot find META's .git in $link\n\t(you should be in or under the directory with _antlink.yaml in it)\n"
        if ($#git_root_dirs == -1);
    $contents_up =~ s@^\./@@;
    $contents_down =~ s@/\.$@@;

#    my $checkout_root = dirname($link) . "/" . $contents_up;
    my($checkout_root) = bootstrap_repos($link, $contents_up, undef, $link_is_meta) ||
        die "unknown repository for $link\n";
    $contents_repo = $repos{$repos_order[1]}->{'name'};
    my($meta_dir) = $repos{$repos_order[0]}->{'name'};

    die "expect but cannot find meta dir $meta_dir in $checkout_root\n"
        if (!$link_is_meta && ! -d "$checkout_root/$meta_dir");

    return ($contents_up, $contents_repo, $contents_down, $checkout_root, $meta_dir);
} 

sub parse_antlink_checkout($$$$) {
    my($contents_up, $contents_repo, $contents_down, $checkout_root) = @_;

    my $checkout_dir = "$checkout_root/$contents_repo/$contents_down";

    my($checkout_base, $checkout_path) = fileparse($checkout_dir);
    
    return($checkout_base, $checkout_path, $checkout_dir);
} 

######################################################################

=head1 SUBCOMMANDS IN DETAIL

=cut

#=head2 _force_clone_branch_to_main
#
#Handle old git without --initial-branch in a new clone
#
#=cut
sub _force_clone_branch_to_main($$) {
    my($remote_path, $local_dir) = @_;
    if (!defined($local_dir) && defined($remote_path)) {
        $local_dir = basename($remote_path);
        $local_dir =~ s/\.git$//;
    };
    system_nofail("git switch main", $local_dir, "cannot force branch to main");
}

=head2 antlink_clone

    antlink clone PATH_TO_ANTLINK

"clones" an antlink by checking it out into the parallel tree.

=cut

sub antlink_clone($;$$);
sub antlink_clone($;$$) {
    my($link, $dummy, $is_initial_repo) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);

    my $repo = $repos{$contents_repo};
    if ($repo->{'type'} eq 'checkedout') {
	print "$link points back into meta-repository; recursing\n" if ($verbose);
	# walk down
	my $recursive_link = File::Spec->catdir($checkout_root, $contents_repo);
	foreach (File::Spec->splitdir($contents_down)) {
	    $recursive_link = File::Spec->catdir($recursive_link, $_);
	    if (-l $recursive_link) {
		antlink_clone($recursive_link);
		return;
	    };
	};
	die "$link points back into meta-repository, but could not find the recursive antlink\n";
    };

    if ($repo->{'remote_access'} eq 'ssh' && $repo->{'verify_ssh'}) {
        ssh_verify($repo->{'remote_host'}, $repo->{'remote_path'});
    };
    if (-d $checkout_dir) {
	print "$link is already cloned\n" if ($verbose);
	return;
    };
    die "have to checkout non-git by hand\n"
	if ($repo->{'type'} ne 'git');
    make_path($checkout_path);

#    die "you set up legacy svn by hand in parallel to the checkedout copy of ANT.git\n"
#	if ($repo =~ /svn$/i);
    my $canonical_checkout_dir = offline_canonicalize_path($checkout_dir);
    my $cmd = "git clone " . $repo->{'remote_access'} . "://" . $repo->{'remote_host'} . $repo->{'remote_path'} . "/" .$contents_down . ".git " . $canonical_checkout_dir;
    system_nofail($cmd);

    # handle default branch for old git
    if (!$is_initial_repo) {
        my($initial_branch) = git_default_branch();
        _force_clone_branch_to_main(undef,  $canonical_checkout_dir) if (!defined($initial_branch));
    };
}

=head2 antlink_mv

    antlink mv [-f] PATH_TO_ANTLINK NEW_PATH

Renames an antlink, on both the local copy and server.

=cut

sub antlink_mv($$) {
    my($link, $new_link) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);
    my $repo = $repos{$contents_repo};

    die "confusing... did not find repo $contents_repo\n" if (!defined($repo));
    die "cannot mv non-git repos\n"
	if ($repo->{'type'} ne 'git');

    my($new_contents_up, $new_contents_repo, $new_contents_down, $new_checkout_root, $new_meta_dir) = create_antlink_meta($new_link);
    my($new_checkout_base, $new_checkout_path, $new_checkout_dir) = parse_antlink_checkout($new_contents_up, $new_contents_repo, $new_contents_down, $new_checkout_root);
    my $new_repo = $repos{$new_contents_repo};

    die "confusing... did not find repo $new_contents_repo\n" if (!defined($new_repo));
    die "confusing... old $link and new $new_link do not appear to be in the same repo.\n"
	if ($contents_repo ne $new_contents_repo);
#    die "confusing... old $meta_dir and new $new_meta_dir do not appear to be in the same repo.\n"
#	if ($meta_dir ne $new_meta_dir);

    #
    # sanity check
    #
    # ($link was already verified)
    # now check new_link
    die "something already exists at $new_link on local copy\n"
        if (-e $new_link);

    #
    # make sure we have a clean meta
    #
    system_nofail("git pull", "$checkout_root/$new_meta_dir", "failed to pull current meta-repository");

    #
    # make sure it's on the server
    #
    my $trial_remote_path = $repo->{'remote_path'} . "/$contents_down.git";
    my($cmd) = "test -d $trial_remote_path && echo exists || echo none";
    my($result) = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);
    chomp $result;
    if ($result eq 'exists') {
	# ok
    } elsif ($result eq 'none') {
	die "antlink $link does not exist on server (at $trial_remote_path)\n";
    } else {
	die "unknown response from server with $cmd\n";
    };

    # and no new_link there, either
    $cmd = "test -d " . $new_repo->{'remote_path'} . "/$new_contents_down.git && echo exists || echo none";
    $result = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);
    chomp $result;
    if ($result eq 'none') {
	# ok
    } elsif ($result eq 'exists') {
	die "antlink $new_contents_down seems to already exist\n";
    } else {
	die "unknown response from server with $cmd\n";
    };

    #
    # now move it on server to $new_link
    #
    $cmd = "mv " . $repo->{'remote_path'} . "/$contents_down.git " .
        $new_repo->{'remote_path'} . "/$new_contents_down.git ";
    $result = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);

    #
    # move the antlink
    #
    # make the new
    $cmd = "ln -s  $new_contents_up/$new_contents_repo/$new_contents_down $new_link";
    system_nofail($cmd);
    system_nofail("git add $new_contents_down", "$checkout_root/$new_meta_dir", "failed to add $new_link");
    # kill the old
    system_nofail("git rm $contents_down", "$checkout_root/$new_meta_dir", "failed to git rm $link");
#    unlink("$contents_up/$contents_repo/$contents_down", $link)
#	or die "cannot remove old antlink $link\n";

    #
    # commit meta
    #
    system_nofail("git commit -m 'mv $link $new_link' $contents_down $new_contents_down", "$checkout_root/$new_meta_dir", "failed to commit $link and $new_link");
    system_nofail("git push origin", "$checkout_root/$new_meta_dir", "failed to push");

    #
    # check local repo (if any) and move it
    #
    my $canonical_checkout_dir = offline_canonicalize_path($checkout_dir);
    my $new_canonical_checkout_dir = offline_canonicalize_path($new_checkout_dir);
    if (-d $canonical_checkout_dir) {
        $cmd = "mv $canonical_checkout_dir $new_canonical_checkout_dir";
	system_nofail($cmd);
	my $something = $new_repo->{'remote_access'} . "://" . $new_repo->{'remote_host'} . $new_repo->{'remote_path'} . "/$new_contents_down.git";
	system_nofail("git config remote.origin.url $something", $new_canonical_checkout_dir, "failed to git config new remote path to $something");
    };
}

=head2 antlink_rm

    antlink rm -f PATH_TO_ANTLINK

Removes an antlink, on both the local copy and server.
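
For example, assuming a hypothetical antlink at F<projects/foo>
(and assuming C<--verbose> is accepted in the same position as C<-f>),
a cautious removal might first preview the destructive commands
and then rerun with force:

    antlink rm --verbose projects/foo    # only shows the removal commands
    antlink rm -f projects/foo           # actually removes the antlink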

=cut

sub antlink_rm($$) {
    my($link) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);
    my $repo = $repos{$contents_repo};
    my($meta_dir) = $repos{$repos_order[0]}->{'name'};

    die "confusing... did not find repo $contents_repo\n" if (!defined($repo));
    die "cannot rm non-git repos\n"
	if ($repo->{'type'} ne 'git');

    #
    # make sure we have a clean meta
    #
    system_nofail("git pull", "$checkout_root/$meta_dir", "failed to pull current meta-repository");

    #
    # make sure it's on the server
    #
    my $trial_remote_path = $repo->{'remote_path'} . "/$contents_down.git";
    my($cmd) = "test -d $trial_remote_path && echo exists || echo none";
    my($result) = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);
    chomp $result;
    if ($result eq 'exists') {
	# ok
    } elsif ($result eq 'none') {
	die "antlink $link does not exist on server (at $trial_remote_path)\n";
    } else {
	die "unknown response from server with $cmd\n";
    };

    #
    # require confirmation
    #
    die "antlink rm is dangerous, so removal requires the -f (force) option (please rerun)\n"
        if (!$force && !$verbose);
    print "antlink rm is dangerous; since --verbose was given, we will just show the commands it would run.  Re-run with -f (force) to have it take action\n"
        if ($verbose);

    #
    # kill the old antlink
    #
    system_verbose_or_nofail("git rm $contents_down", "$checkout_root/$meta_dir", "failed to git rm $link");

    #
    # kill the old checked out copy
    #
    my $canonical_checkout_dir = offline_canonicalize_path($checkout_dir);
    if (-d $canonical_checkout_dir) {
        die "confusion... $canonical_checkout_dir does not appear to be git repo\n"
            if (! -d "$canonical_checkout_dir/.git");
        system_verbose_or_nofail("rm -rf $canonical_checkout_dir");
    };

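    #
    # commit and push meta
    #
    system_verbose_or_nofail("git commit -m 'rm $link' $contents_down", "$checkout_root/$meta_dir", "failed to commit removal of $link");
    system_verbose_or_nofail("git push origin", "$checkout_root/$meta_dir", "failed to push");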

    #
    # kill copy on server (gulp!)
    #
    $cmd = "rm -rf " . $repo->{'remote_path'} . "/$contents_down.git";
    if ($verbose) {
        print("$cmd\n");
    } else {
        $result = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);
    };
}

=head2 antlink_rename_branch

    antlink rename-branch -f [--local | --remote] PATH_TO_ANTLINK OLD_NAME NEW_NAME

Rename a branch, either the local copy (with --local)
or both local and remote (with --remote).
The old branch will be removed.
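
As an illustration, assuming a cloned antlink at the hypothetical path
F<projects/foo> whose branch C<master> should become C<main>
on both the local copy and the server:

    antlink rename-branch -f --remote projects/foo master main

Collaborators with existing clones could then, under the same assumptions,
update just their local copy:

    antlink rename-branch -f --local projects/foo master main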

=cut

sub antlink_rename_branch($$$) {
    my($link, $old_name, $new_name) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);
    my $repo = $repos{$contents_repo};
    my($meta_dir) = $repos{$repos_order[0]}->{'name'};

    die "confusing... did not find repo $contents_repo\n" if (!defined($repo));
    die "cannot rename branches in non-git repos\n"
	if ($repo->{'type'} ne 'git');

    #
    # pick a side!
    #
    die "antlink rename-branch requires one to specify --local or --remote;\n\tplease rerun with one of those options\n"
        if (!defined($location) || !($location eq 'local' || $location eq 'remote'));

    #
    # make sure we have a clean meta
    #
    system_nofail("git pull", "$checkout_root/$meta_dir", "failed to pull current meta-repository");

    #
    # make sure it's on the server
    #
    my $trial_remote_path = $repo->{'remote_path'} . "/$contents_down.git";
    my($cmd) = "test -d $trial_remote_path && echo exists || echo none";
    my($result) = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);
    chomp $result;
    if ($result eq 'exists') {
	# ok
    } elsif ($result eq 'none') {
	die "antlink $link does not exist on server (at $trial_remote_path)\n";
    } else {
	die "unknown response from server with $cmd\n";
    };

    #
    # require confirmation
    #
    die "antlink rename-branch is a big step, so it requires the -f (force) option (please rerun)\n"
        if (!$force && !$verbose);
    print "antlink rename-branch is dangerous; since --verbose was given, we will just show the commands it would run.  Re-run with -f (force) to have it take action\n"
        if ($verbose);

    #
    # Make sure old branch exists and new one doesn't!
    #
    my $results = system_output_nofail("git branch", "$checkout_root/$meta_dir/$contents_down", "cannot get branches from $link");
    my %current_branches;
    foreach (split(/\n/, $results)) {
        chomp;
	my($branch_status, $branch_name) = m/^(.)\s(.*)$/;
        next if (!defined($branch_name));
        $current_branches{$branch_name} = 1;
    };
    die("antlink rename-branch: old branch $old_name doesn't exist\n")
        if (!defined($current_branches{$old_name}));
    die("antlink rename-branch: new branch $new_name does exist (and it shouldn't)\n")
        if (defined($current_branches{$new_name}));
    
    #
    # rename local branch
    #
    foreach $cmd ("git checkout $old_name", "git branch -m $old_name $new_name") {
        system_verbose_or_nofail($cmd, "$checkout_root/$meta_dir/$contents_down", "failed to $cmd");
    };
    
    if ($location eq 'remote') {
        #
        # push the renamed local branch to the remote
        #
        $cmd = "git push -u origin $new_name";
        system_verbose_or_nofail($cmd, "$checkout_root/$meta_dir/$contents_down", "failed to $cmd");
        #
        # point the server's HEAD at the new branch, then remove the old branch
        #
        $cmd = "cd " . $repo->{'remote_path'} . "/$contents_down.git && git symbolic-ref HEAD refs/heads/$new_name";
        if ($verbose) {
            print("ssh " . $repo->{'remote_host'} . " '$cmd'\n");
        } else {
            my($result) = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);
        };
        $cmd = "git push origin --delete $old_name";
        system_verbose_or_nofail($cmd, "$checkout_root/$meta_dir/$contents_down", "failed to $cmd");
    } elsif ($location eq 'local') {
        #
        # on other local branches
        #
        foreach $cmd ("git fetch", "git branch --unset-upstream", "git branch -u origin/$new_name") {
            system_verbose_or_nofail($cmd, "$checkout_root/$meta_dir/$contents_down", "failed to $cmd");
        };
    } else {
        die("antlink rename-branch: unknown location: $location\n");
    };
}

=head2 antlink_rename_branch_meta

    antlink rename-branch-meta -f [--local | --remote] PATH_TO_META OLD_NAME NEW_NAME

Rename a branch of the meta-repository, either the local copy (with --local)
or both local and remote (with --remote).
Currently only --local is supported.
The old branch will be removed.

=cut

sub antlink_rename_branch_meta($$$) {
    my($link, $old_name, $new_name) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root, $meta_dir) = create_antlink_meta($link, 1);
    my $repo = $repos{$contents_repo};

    die "confusing... did not find repo $contents_repo\n" if (!defined($repo));
    die "cannot rename branches in non-git repos\n"
	if ($repo->{'type'} ne 'git');

    #
    # pick a side!
    #
    die "antlink rename-branch-meta requires one to specify --local or --remote;\n\tplease rerun with one of those options\n"
        if (!defined($location) || !($location eq 'local' || $location eq 'remote'));
    # actually, only one choice
    die "antlink rename-branch-meta currently only works with --local\n"
        if ($location ne 'local');

    #
    # require confirmation
    #
    die "antlink rename-branch-meta is a big step, so it requires the -f (force) option (please rerun)\n"
        if (!$force && !$verbose);
    print "antlink rename-branch-meta is dangerous; since --verbose was given, we will just show the commands it would run.  Re-run with -f (force) to have it take action\n"
        if ($verbose);

    #
    # Make sure old branch exists and new one doesn't!
    #
    my $results = system_output_nofail("git branch", "$checkout_root/$contents_down", "cannot get branches from $link");
    my %current_branches;
    foreach (split(/\n/, $results)) {
        chomp;
	my($branch_status, $branch_name) = m/^(.)\s(.*)$/;
        next if (!defined($branch_name));
        $current_branches{$branch_name} = 1;
    };
    die("antlink rename-branch-meta: old branch $old_name doesn't exist\n")
        if (!defined($current_branches{$old_name}));
    die("antlink rename-branch-meta: new branch $new_name does exist (and it shouldn't)\n")
        if (defined($current_branches{$new_name}));
    
    #
    # rename local branch
    #
    my($cmd);
    foreach $cmd ("git checkout $old_name", "git branch -m $old_name $new_name") {
        system_verbose_or_nofail($cmd, "$checkout_root/$contents_down", "failed to $cmd");
    };
    
    if ($location eq 'remote') {
        #
        # push the renamed local branch to the remote
        #
        $cmd = "git push -u origin $new_name";
        system_verbose_or_nofail($cmd, "$checkout_root/$contents_down", "failed to $cmd");
        #
        # point the server's HEAD at the new branch, then remove the old branch
        #
        $cmd = "cd " . $repo->{'remote_path'} . "/$contents_down.git && git symbolic-ref HEAD refs/heads/$new_name";
        if ($verbose) {
            print("ssh " . $repo->{'remote_host'} . " '$cmd'\n");
        } else {
            my($result) = ssh_system_output_nofail($repo->{'remote_host'}, $cmd);
        };
        $cmd = "git push origin --delete $old_name";
        system_verbose_or_nofail($cmd, "$checkout_root/$contents_down", "failed to $cmd");
    } elsif ($location eq 'local') {
        #
        # on other local branches
        #
        foreach $cmd ("git fetch", "git branch --unset-upstream", "git branch -u origin/$new_name") {
            system_verbose_or_nofail($cmd, "$checkout_root/$contents_down", "failed to $cmd");
        };
    } else {
        die("antlink rename-branch-meta: unknown location: $location\n");
    };
}

=head2 antlink_unclone

    antlink unclone PATH_TO_ANTLINK

"unclones" an antlink by (1) making sure no changes are pending,
(2) discarding the checked out copy.
(Not yet complete: the current code gathers C<git status> output and then stops.)

(See also "rm" which, in addition to unclone, removes it on the server.)

=cut

sub antlink_unclone($;$) {
    my($link, $subcommand) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);
    my $repo = $repos{$contents_repo};

    if (! -d "$checkout_dir/.") {
	print "$contents_down is not checked out.\n" if ($verbose);
	return;
    };

    my $status = system_output_nofail("git status --porcelain", $link);
    my %files_status;
    foreach (split(/\n/, $status)) {
	my($file_status, $file) = m/^(..)\s(.*)$/;
	push (@{$files_status{$file_status}}, $file);
    };
    die "xxx: not done, check to see if anything left to commit\n";
}


=head2 antlink_initmeta

    antlink initmeta GIT_REPOSITORY_DIRECTORY

Create a new meta-repository.
These are always on the local computer.
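
A minimal sketch, assuming a hypothetical location F</home/alice/repos>.
The argument must be an absolute local path ending in F<.git>,
and C<user.name>, C<user.email>, and C<init.defaultBranch>
must already be set in the global git configuration:

    antlink initmeta /home/alice/repos/META.git

This creates the bare repository F<META.git>,
a sibling directory F<META_GIT> for the repositories it will link to,
and an initial F<_antlink.yaml> listing both.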

=cut

sub antlink_initmeta($;$) {
    my($meta_repo_dir) = @_;

    # git init
    $meta_repo_dir =~ s@^file://@@;   # keep the leading / of the path
    die "antlink initmeta only works with a local file system path.\n(use clonemeta later to get it to a remote system)\n"
        if ($meta_repo_dir =~ /^[a-z]+:/);
    die "antlink initmeta requires (by convention) the path to end in .git\n"
        if ($meta_repo_dir !~ /\.git$/);
    die "antlink initmeta requires a full path (starting at the root with /)\n"
        if ($meta_repo_dir !~ /^\//);
    foreach (qw(user.name user.email init.defaultBranch)) {
        system_nofail("git config --global $_", ".", "no global git setting for $_; please set it with: git config --global $_ VALUE");
    };
    my($initial_branch) = git_default_branch();
    my($initial_branch_arg) = (defined($initial_branch) ? "--initial-branch=$initial_branch" : "");
    system_nofail("git init --bare --shared=group $initial_branch_arg  $meta_repo_dir", ".", "failed to git-init new meta repo in $meta_repo_dir");

    my($meta_repo_dir_no_git) = $meta_repo_dir;
    $meta_repo_dir_no_git =~ s/\.git$//;
    my($meta_base) = basename($meta_repo_dir_no_git);

    unless (-d "${meta_repo_dir_no_git}_GIT") {
        mkdir("${meta_repo_dir_no_git}_GIT") or die "cannot mkdir ${meta_repo_dir_no_git}_GIT\n";
    };
    
    # add _antlink.yaml
    my($tempdir) = File::Temp::tempdir("./antlink_initmeta_XXXXXX", CLEANUP => 1);
#    my($tempdir) = "/tmp/ALT";
    my($meta_co) = "$tempdir/meta";
    system_nofail("git clone $meta_repo_dir $meta_co", ".", "failed to checkout a copy of $meta_repo_dir into $meta_co");

    #
    # make a main branch, if we didn't earlier
    #
    if (!$initial_branch) {
        system_nofail("git checkout -b main", $meta_co, "failed to git checkout -b main");
    };
    
    my($al_file) = $meta_co . "/_antlink.yaml";
    open(AL, ">$al_file") or die "cannot write to $al_file\n";
    print AL "repos:\n  - name: $meta_base\n    type: git\n    url: \"meta:\"\n";
    print AL "  - name: ${meta_base}_GIT\n    type: git\n    url: \"parent:${meta_repo_dir_no_git}_GIT\"\n";
    close AL;
    system_nofail("git add _antlink.yaml", $meta_co, "failed to add $al_file");
    system_nofail("git commit -m 'initial _antlink.yaml' _antlink.yaml", $meta_co, "failed to commit $al_file");
    my($initial_branch_with_default) = $initial_branch // "main";  # will be null only if on old git
    my($initial_branch_needs_create) = (!defined($initial_branch) ? ' -u ' : '');
    system_nofail("git push $initial_branch_needs_create origin $initial_branch_with_default", $meta_co, "failed to push $al_file");

    #
    # force HEAD to point to main
    #
    if (!defined($initial_branch)) {
        system_nofail("git symbolic-ref HEAD refs/heads/main", $meta_repo_dir, "cannot symbolic-ref to main");
    };

    # tempdir will cleanup the checkedout copy
}

=head2 antlink_clonemeta

    antlink clonemeta GIT_REPO_URL_OR_PATH [LOCAL_DIR]

Clone a meta-repository.
Could be from a local or remote computer.
Result is always local.
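
For example, assuming a meta-repository served over ssh from the
hypothetical host C<git.example.org>, or one at a hypothetical local path:

    antlink clonemeta ssh://git.example.org/srv/git/META.git
    antlink clonemeta /home/alice/repos/META.git my_meta

For ssh URLs, antlink first checks that ssh to the host works;
the arguments are then passed to C<git clone> unchanged.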

=cut

sub antlink_clonemeta($;$) {
    my($link, $local_dir) = @_;
    # Check that we can ssh, since ssh failure is a common error.
    my($method, $remote_host, $remote_path) = ($link =~ m@^(ssh)://([^/]+)(/.*)$@);
    $method //= '';
    if ($method eq 'ssh') {
        ssh_verify($remote_host, $remote_path);
    };
    # just git clone
    system_nofail("git clone " . join(" ", @_), undef, "clonemeta attempt");

    # handle default branch for old git
    my($initial_branch) = git_default_branch();
    _force_clone_branch_to_main($remote_path // $link, $local_dir) if (!defined($initial_branch));
}


=head2 antlink_graft

    antlink graft [--vc svn|git] GIT_REPO_URL_OR_PATH [LOCAL_DIR]

Graft in an external meta-repository.
Could be from a local or remote computer.

=cut

sub antlink_graft(@) {
    my($remote_url, $link) = @_;

    # have to do this BEFORE setting up meta, or we get a confusing error
    die "antlink: refusing to graft since something already exists at $link\n"
        if (-e $link);

    #
    # bootstrap
    #
    my($contents_up, $contents_repo, $contents_down, $checkout_root, $meta_dir) = create_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);

    #
    # sanity check
    #
    # make sure we have a clean meta
    system_nofail("git pull", "$checkout_root/$meta_dir", "failed to pull current meta-repository");

    #
    # first, make sure the remote is in _antlink.yaml
    #
    my $repo_ref = undef;
    foreach (keys %repos) {
        next if (!defined($repos{$_}{'url'}));
        if ($repos{$_}{'url'} eq substr($remote_url, 0, length($repos{$_}{'url'}))) {
            $repo_ref = $repos{$_};
            last;
        };
    };
    if (!defined($repo_ref)) {
        die "antlink: cannot find $remote_url in _antlink.yaml. xxx: should automatically add it here."
    };
    my($remote_name) = $repo_ref->{'name'};
    my($remote_tail) = substr($remote_url, length($repo_ref->{'url'}));

    #
    # next, make the antlink
    #
    if (!defined($link)) {
        $link = basename($remote_tail);
    };
    symlink("$contents_up/$remote_name/$remote_tail", $link) or die "cannot make antlink $link\n";

    #
    # get a local copy
    #
    antlink_clone($link);

    #
    # repo-specific hacks
    #
    if ($remote_name eq 'OVERLEAF_GIT') {
        system_nofail("git config credential.helper store", "$contents_up/$meta_dir/$contents_down/.", "failed to set credential.helper");
    };

    #
    # finally, commit the new antlink
    #
    system_nofail("git add $link", ".", "failed to add $link");
    system_nofail("git commit -m 'new graft $link' $link", ".", "failed to commit new link");
    system_nofail("git push", ".", "failed to push new link");
}


=head2 antlink_init

    antlink init PATH_TO_ANTLINK

Initialize a new antlink with some path,
creating a new git repository for it on the server,
checking that out on the local computer,
and adding the antlink to the meta-repository.

If the repository has an "init_hook" set
(defined in _antlink.yaml in the root of the meta-repo),
that script will be run on the server to setup any repo-specific things
(like commit hooks to send e-mail).
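
For example, from inside a clone of the meta-repository,
with F<foo> as a hypothetical new antlink name:

    antlink init foo

Afterwards F<foo> is a symlink committed to the meta-repository,
and the new repository on the server has an initial commit
containing a F<.gitignore>.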

=cut

sub antlink_init($;$) {
    my($link) = @_;

    die "something already exists at $link on local copy\n"
        if (-e $link);

    my($contents_up, $contents_repo, $contents_down, $checkout_root, $meta_dir) = create_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);

    my $repo = $repos{$contents_repo};
    die "confusing... did not find repo $contents_repo\n" if (!defined($repo));
    die "cannot init non-git repos\n"
	if ($repo->{'type'} ne 'git');

    #
    # sanity check
    #
    if ($repo->{'remote_access'} eq 'ssh' && $repo->{'verify_ssh'}) {
        ssh_verify($repo->{'remote_host'}, $repo->{'remote_path'});
    };

    #
    # make sure we have a clean meta
    #
    system_nofail("git pull", "$checkout_root/$meta_dir", "failed to pull current meta-repository");

    #
    # go to the server and make it
    #
    my($cmd) = "test -d " . $repo->{'remote_path'} . "/$contents_down.git && echo exists || echo none";
    my($result) = ssh_system_output_nofail($repo->{'remote_host'}, $cmd) || die "cannot ssh to test $repo->{'remote_host'}\n";
    chomp $result;
    if ($result eq 'exists') {
	die "repository already exists for $link on server\n";
    } elsif ($result ne 'none') {
	die "unknown response from server with $cmd\n";
    };
    my($initial_branch) = git_default_branch($repo->{'remote_host'});
    my($initial_branch_arg) = (defined($initial_branch) ? "--initial-branch=$initial_branch" : "");
    $result = ssh_system_output_nofail($repo->{'remote_host'}, "git init --bare $initial_branch_arg --shared=group " . $repo->{'remote_path'} . "/$contents_down.git");
    if (!defined($initial_branch)) {
        # sigh, server is old git without --initial-branch
        # have to change the default branch name
        $result = ssh_system_output_nofail($repo->{'remote_host'}, "cd " . $repo->{'remote_path'} . "/$contents_down.git && git symbolic-ref HEAD refs/heads/$DEFAULT_BRANCH");
    };
    if ($repo->{'init_hook'}) {
	$result = ssh_system_output_nofail($repo->{'remote_host'}, $repo->{'init_hook'} . " " . $repo->{'remote_path'} . "/$contents_down.git", undef, 'cannot run init_hook for new repo on server');
    };

    #
    # symlink alias
    #
    symlink("$contents_up/$contents_repo/$contents_down", $link)
	or die "cannot create symlink $link\n";

    #
    # get a local copy
    # (the 1 marks this as the initial repo)
    #
    antlink_clone($link, undef, 1);

    #
    # put something in it
    # and push it
    # (to avoid the special case first push)
    #
    my($gi_dir) = "$checkout_root/$contents_repo/$contents_down";
    my($gi_file) = ".gitignore";
    open(GITIGNORE, ">$gi_dir/$gi_file") || die "cannot create $gi_dir/$gi_file\n";
    print GITIGNORE "*~\n";
    close GITIGNORE;
    my($initial_branch_with_default) = $initial_branch // $DEFAULT_BRANCH;  # will be null only if on old git
    system_nofail("git checkout -b $initial_branch_with_default", $gi_dir);
    system_nofail("git add $gi_file", $gi_dir);
    system_nofail("git commit -m 'start gitignore' $gi_file", $gi_dir);
    system_nofail("git push origin $initial_branch_with_default", $gi_dir);

    #
    # finally, commit the new antlink
    #
    system_nofail("git add $contents_down", "$checkout_root/$meta_dir", "failed to add $link");
    system_nofail("git commit -m 'create new $gi_dir' $contents_down", "$checkout_root/$meta_dir", "failed to commit new link");
#    system_nofail("git add $link", ".", "failed to add $link");
#    system_nofail("git commit -m 'create new $gi_dir' $link", ".", "failed to commit new link");
    system_nofail("git push", "$checkout_root/$meta_dir", "failed to push new link");
}


=head2 antlink_foreach

    antlink status PATH_TO_ANTLINK
    antlink push PATH_TO_ANTLINK
    antlink pull PATH_TO_ANTLINK

Show the git status of an antlink,
or push or pull.

If given a path, it performs the action on all antlinks
in that directory or its children.
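
For example, assuming the current directory is inside a meta-repository
with a hypothetical subdirectory F<projects>:

    antlink status projects/foo   # one antlink
    antlink pull projects         # every cloned antlink under projects

Antlinks that are not cloned are skipped.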

=cut

sub antlink_foreach($;$) {
    my($link, $subcommand) = @_;

    my($contents_up, $contents_repo, $contents_down, $checkout_root) = parse_antlink_meta($link);
    my($checkout_base, $checkout_path, $checkout_dir) = parse_antlink_checkout($contents_up, $contents_repo, $contents_down, $checkout_root);

    if (! -d $checkout_dir) {
	print "$link is not cloned\n" if ($verbose);
	return;
    };

    print "$link\n";
    my $repo = $repos{$contents_repo};
    if (!defined($commands{$subcommand}{$repo->{'type'}})) {
	die "repo $contents_repo on antlink $link has no type\n" if (!defined($repo->{'type'}));
	die "no option for $subcommand on repository of type " . $repo->{'type'} . " on $link\n";
    };
    &{$commands{$subcommand}{$repo->{'type'}}}(@_);
}

=head2 antlink_listsubcommands

    antlink listsubcommands

Enumerate all possible subcommands.
Useful for shell completion of subcommand names.

=cut

sub antlink_listsubcommands() {
    print join(" ", sort keys %commands), "\n";
}



######################################################################

#
# main
#

pod2usage(-msg => "unknown subcommand: $subcommand\n")
    if (!defined($commands{$subcommand}));

if ($commands{$subcommand}{'which_antlinks'} eq 'one;newdir' && $#ARGV != 1) {
    pod2usage(-msg => "$prog: $subcommand requires both OLD and NEW antlinks.\n");
};
if (($commands{$subcommand}{'which_antlinks'} eq 'one;old;new' || $commands{$subcommand}{'which_antlinks'} eq 'onemeta;old;new') && $#ARGV != 2) {
    pod2usage(-msg => "$prog: $subcommand requires ANTLINK plus OLD and NEW names.\n");
};
if ($#ARGV == -1) {
    if($commands{$subcommand}{'which_antlinks'} eq 'many') {
	push(@ARGV, ".");
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'one') {
	pod2usage(-msg => "$prog: no ANTLINK given.\n");
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'newdir') {
	pod2usage(-msg => "$prog: no new ANTLINK given.\n");
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'onemeta') {
	pod2usage(-msg => "$prog: no meta repository given.\n");
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'none') {
	# pass
    } else {
	die "$prog: internal error, unknown which_antlinks\n";
    };
};

if ($commands{$subcommand}{'which_antlinks'} eq 'onemeta' ||
    $commands{$subcommand}{'which_antlinks'} eq 'none' || 
    $commands{$subcommand}{'which_antlinks'} eq 'one;newdir' ||
    $commands{$subcommand}{'which_antlinks'} eq 'one;old;new' ||
    $commands{$subcommand}{'which_antlinks'} eq 'onemeta;old;new') {
    &{$commands{$subcommand}{'action'}}(@ARGV);
    exit 0;
};

#
# default path, iterate over all args
#
foreach my $link (@ARGV) {
    #
    # if on dir in ANT, expand all children
    #
    my(@recursive_ARGV) = ();
    if ($commands{$subcommand}{'which_antlinks'} eq 'newdir' || -l $link) {
	&{$commands{$subcommand}{'action'}}($link, $subcommand);
    } elsif (! -e $link) {
	die "invoked antlink $subcommand on nonexistent path: $link.\n\t(Maybe you need to git pull in the meta-repository?)\n";
    } elsif ($commands{$subcommand}{'which_antlinks'} eq 'one') {
	die "invoked antlink $subcommand on a non-symlink; this subcommand is too dangerous to recurse.\n";
    } else {
        find({
	    preprocess => sub { return sort @_; },
	    wanted => sub { -l && -d && push(@recursive_ARGV, $_) },
	    no_chdir => 1},
	     $link);
	foreach my $recurse (@recursive_ARGV) {
	    &{$commands{$subcommand}{'action'}}($recurse, $subcommand);
	};
    };
};

exit 0;

=head1 RELEASE HISTORY

The most recent version of antlink is at L<https://ant.isi.edu/software/antlink/>.

=over

=item 0.1 (2015-06-09)

Released for internal ANT project use.  Full of unportability, but functional.

=item 1.0 (2016-01-03)

Cleaned up with no ANT-specific dependencies.  A "real" release.

=item 1.1 (2016-01-04)

Better documentation and a website.

=item 1.2 (2016-01-05)

Fixes critical bug in C<antlink init> when meta is remote.

=item 1.3 (2016-06-06)

Bugfix: no more infinite loop when C<antlink init> run outside a meta repository.
(Bug reported by Calvin Ardi.)

Enhancement: C<antlink help> and C<antlink man> now work.
(Suggestion from Calvin Ardi.)

=item 1.4 (2016-12-06)

Enhancement: Added bash autocompletion, and C<antlink listsubcommands> to support it.

Enhancement: Added preliminary version of C<antlink mv> to rename antlinks.
(More work is needed, though, to handle distributed moves.)
Motivated by a rename for Lan Wei.

=item 1.5 (2016-12-06)

Bug fix: improved documentation installation to fix Fedora packaging problem.

=item 1.6 (2016-12-06)

Bug fix: fix numerous bugs in C<antlink mv>.

=item 1.7 (2017-07-19)

Enhancement: an initial test suite, so no more silly "numerous bugs",
and finally got C<antlink mv> to work.

Enhancement: C<antlink init> now works when run outside the meta-repository.

=item 1.8 (2017-07-21)

Enhancement: C<antlink --version> now works.

Bug fix: Several packaging problems due to the test suite in
antlink.spec are now fixed.  CentOS-6 packages only build with antlink-1.6, 
but all current RH RPM OSes work (epel7, f24, f25, f26).

Bug fix: C<antlink mv> now works in subdirs, not just the meta's root.

=item 1.9 (2018-09-05)

Enhancement: C<antlink initmeta> now accepts an existing META_GIT directory.

Bug fix: more fixes so that C<antlink mv> works in subdirs.

=item 1.10 (2019-07-10)

Enhancement: Several antlink subcommands now check for ssh working
and give a reasonable error message if it's not (rather than 
dumping a stack trace).

=item 1.11 (2021-04-04)

Enhancement: C<antlink rm> now exists.

=item 1.12 (2021-04-08)

Enhancement: antlink now honors init.defaultBranch, or uses "main" 
if no default branch is given.

=item 1.13 (2021-04-19)

Enhancement: more robust handling of init.defaultBranch
and git version checking, so that it builds on EPEL7 and F32 to F34.

=item 1.14 (2021-04-21)

Enhancement: C<antlink show-clones> and C<antlink rename-branch> now exist.

=item 1.15 (2021-04-21b)

Enhancement: C<antlink rename-branch-meta> now exists.
Typically, the system administrator will bulk-rename the branch
in all repositories on the server:

    cd "$META_GIT"
    OLD_NAME=master; NEW_NAME=main
    find . -type d -name \*.git -print | while read -r D;
    do
      grep "$OLD_NAME" "$D/HEAD" && (
        cd "$D" && git branch -m "$OLD_NAME" "$NEW_NAME";
      );
    done

And then each user will rename all checked out copies with:

    cd "$LOCAL_META_CHECKEDOUT"
    antlink --force --local rename-branch-meta . master main
    antlink show-clones | while read -r D;
    do
      antlink --force --local rename-branch "$D" master main;
    done


=item 1.16 (2021-07-20)

Finally, partial support for C<antlink graft>, just for overleaf.

Bug fix: antlink init failed if the server's git was pre-2.28.
We now check for and handle that case.


=item 1.17 (2022-01-18)

Improvement: better error message if the antlink path doesn't exist.
(Bug report from Jelena Mirkovic.)

An editing pass over the documentation.

The YAML parsing library was changed (to YAML::PP) 
since that seems available in places where YAML::XS is not.


=item 1.18 (2022-01-25)

Improvement: add the verify_ssh option to F<_antlink.yaml>
to restore support for gitea grafting.

Fix test suites on boxes with pre-initialBranch gits.

Document F<_antlink.yaml>.


=item 1.19 (2023-02-08)

Improvement: a special case for overleaf
auto-sets credential storage.

Bug fix: add missing install-time dependency on YAML::PP.


=item 1.20 (2024-01-30)

Bug fix: the errors for grafting or init'ing a new subrepo over an
existing link were confusing ("expect but cannot find meta dir...").
Now they should be clearer and say that the link cannot be overwritten.

Bug fix: packaging uses perl-interpreter, not just perl.


=back

=head1 KNOWN BUGS

=over

=item 
off-line operation on the meta-repository is not currently supported.

=item
C<antlink mv> is correct on the local copy,
but orphans sub-repositories on other checked-out meta-repositories.

=back

=head1 AUTHOR AND THANKS

Antlink is written by John Heidemann.

Antlink benefited from feedback and bug reports from many people (thanks!):
Yuri Pradkin,
Calvin Ardi,
Wes Hardaker,
Jelena Mirkovic.


=head1 COPYRIGHT

Copyright (C) 2015-2023 the University of Southern California.

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU General Public License,
version 2, as published by the Free Software Foundation.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, write to the Free Software Foundation, Inc.,
51 Franklin Street, Fifth Floor, Boston, MA 02110-1301 USA.

=cut
    
