Gentoo Wiki

TIP_Downloading_distfiles_on_another_machine

This article is part of the Tips & Tricks series.

Overview

If you run a Gentoo machine on a network that is only connected to the outside world via a low-bandwidth dial-up link, you may have concluded that upgrading just isn't worth the hassle. If you have access to a high-bandwidth Internet connection elsewhere (e.g. at work, college or a friend's house), then this tip may be useful to you.

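This tip will help you to list the distfiles an upgrade needs, download them on another machine, and perform the upgrade from the downloaded files.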

Listing the required packages

Before you start you will need to bring your Portage tree up to date. There are two options here:

Sync'ing over a dial up link will take a long time if you've not done it for a while, but it's bearable if you do it fairly frequently (I'm thinking once every week or so). This is the standard method for refreshing your computer's knowledge of which packages are available.

If you don't sync often you may find it easier to download a recent snapshot on your high bandwidth machine, burn it to a CD, copy it onto your bandwidth challenged Gentoo box and uncompress/untar it into /usr/portage.

Once your Portage tree is up to date, run these commands:

# emerge -uDp --fetchonly world 2> /tmp/distfiles.txt
# gzip /tmp/distfiles.txt

You will now have a (fairly small) file that you can transfer to your higher bandwidth computer. These files are normally so small (around 10-20 KB) that I email them.
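
If you are curious what the list contains before you send it, a few lines of Python can summarise it. This is only a rough sketch: it assumes the format that the download script further down expects (one line per distfile, holding one or more whitespace-separated mirror URLs for that file), and it reads the uncompressed list. The function name summarise_url_list and the example.org URLs are made up for illustration.

```python
def summarise_url_list(lines):
    """Return (filename, mirror_count) pairs, one per distfile line.

    Assumes each non-blank line lists alternative mirror URLs for a
    single file, separated by whitespace.
    """
    summary = []
    for line in lines:
        urls = [u for u in line.split() if '://' in u]
        if not urls:
            continue  # blank line, or a message that is not a URL list
        # The distfile name is the last path component of any mirror URL.
        filename = urls[0].rsplit('/', 1)[-1]
        summary.append((filename, len(urls)))
    return summary


if __name__ == '__main__':
    # Hypothetical sample data in the assumed format.
    sample = [
        'http://mirror.example.org/distfiles/foo-1.0.tar.gz '
        'ftp://ftp.example.org/pub/foo-1.0.tar.gz\n',
        '\n',
        'http://mirror.example.org/distfiles/bar-2.1.tar.bz2\n',
    ]
    for name, mirrors in summarise_url_list(sample):
        print('%s (%d mirror URLs)' % (name, mirrors))
```

To use it on a real list, pass an open file instead of the sample, e.g. summarise_url_list(open('/tmp/distfiles.txt')).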

Downloading the packages

On the computer with lots of bandwidth, create a new file called distfile-grabber and paste the contents of this Python script into it:

File: distfile-grabber.py
#!/usr/bin/env python
#
# Copyright Graham Ashton <ashtong at users dot sourceforge dot net>, 2004.


"""Download tarballs for Gentoo upgrade

Downloads distfiles (i.e. the contents of /usr/portage/distfiles)
specified by an input file. You can generate suitable input with a
command such as this:

  emerge --fetchonly -uDp world 2> distfiles.txt

Transfer the distfiles.txt file to a different machine, and run:

  distfile-grabber distfiles.txt

All the files required to upgrade the first computer will be
downloaded to a temporary directory on the second. It is intended to
be used to download files on machines that have lots of bandwidth (and
perhaps a CD burner), on behalf of those that don't.

"""


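import os
import subprocess
import sys
import tempfile


def get_filename():
    try:
        return sys.argv[1]
    except IndexError:
        sys.stderr.write('Usage: %s <file>\n' % os.path.basename(sys.argv[0]))
        sys.exit(1)


def download_file(urls, temp_dir):
    """Try each mirror URL on the line in turn; return True on success."""
    for url in urls.split():
        # Pass the arguments as a list so a URL is never parsed by a shell.
        rval = subprocess.call(['wget', '-c', '-P', temp_dir, url])
        if rval == 0:
            return True
    return False


def main():
    filename = get_filename()
    temp_dir = tempfile.mkdtemp(dir='/var/tmp')
    downloaded = 0
    with open(filename) as url_file:
        for urls in url_file:
            # Skip blank lines; count a file only if one mirror succeeded.
            if urls.strip() and download_file(urls, temp_dir):
                downloaded += 1
    print('%d files successfully downloaded to %s' % (downloaded, temp_dir))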


if __name__ == '__main__':
    main()

Make the file executable, copy it into a directory in your PATH, uncompress distfiles.txt.gz if you gzipped it, then run the script like this:

$ distfile-grabber distfiles.txt

The files mentioned in distfiles.txt will be downloaded to a temporary directory. Once all the files have been downloaded you can either write the files to a CD or copy them to a laptop/removable drive/etc., and take it back to the bandwidth challenged machine.
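
Before burning the CD it may be worth checking that nothing was missed. The sketch below is not part of the original script; it assumes that wget saved each file under the last path component of its URL, and the helper name missing_distfiles is invented for this example.

```python
import os
import tempfile


def missing_distfiles(url_lines, download_dir):
    """Return filenames named in the URL list that are absent from download_dir.

    Assumes one distfile per line, with whitespace-separated mirror URLs.
    """
    missing = []
    for line in url_lines:
        urls = [u for u in line.split() if '://' in u]
        if not urls:
            continue  # blank line or other non-URL text
        filename = urls[0].rsplit('/', 1)[-1]
        if not os.path.isfile(os.path.join(download_dir, filename)):
            missing.append(filename)
    return missing


if __name__ == '__main__':
    # Self-contained demonstration with a throwaway directory and
    # hypothetical URLs: only foo-1.0.tar.gz has been "downloaded".
    demo_dir = tempfile.mkdtemp()
    open(os.path.join(demo_dir, 'foo-1.0.tar.gz'), 'w').close()
    sample = [
        'http://mirror.example.org/distfiles/foo-1.0.tar.gz\n',
        'http://mirror.example.org/distfiles/bar-2.1.tar.bz2\n',
    ]
    for name in missing_distfiles(sample, demo_dir):
        print('missing: %s' % name)
```

In real use you would pass the open distfiles.txt and the temporary directory that the download script reported.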

If you have a Windows box running Perl you can also use this script:

File: windows-download.pl
#!/usr/bin/perl
# This is just a Perl version of the Python script by Graham Ashton.
# I use ActivePerl on WinXP and wget (tested with GNU Wget 1.8.2 for W32).
# Adjust '$wget_opts' and '$wget_dir' to suit your environment.
# Copyright Toralf Gerstaecker <toralfg at gmx dot net>, 2005.
#
# <See code above for the original remarks by Graham Ashton>
#

use strict;
use warnings;

my $wget_opts = "-o gentoo_distfile_grabber_wget.log -t 1 -c -P";
my $wget_dir  = "C:\\temp\\gentoo\\tmp";

# Try each mirror URL on the line in turn; return 0 as soon as one succeeds,
# otherwise the return code of the last failed wget call.
sub download_file
{
     my ($urls) = @_;
     my $rval = 1;
     foreach my $url (split(' ', $urls)) {
        $rval = system("wget $wget_opts $wget_dir $url");
        return 0 if $rval == 0;
     }
     return $rval;
}

# MAIN
# If you need a proxy, uncomment the next line and fill in your own settings
# $ENV{'http_proxy'} = 'protocol://user:passwd@proxy.name.tld:port';

my $in = $ARGV[0];
defined($in) or die "Usage: gentoo_distfile_grabber.pl <distfile.txt>\n";
open(my $infile, '<', $in) or die "Error opening file: $in! (Usage: gentoo_distfile_grabber.pl <distfile.txt>)\n";

while (defined(my $urls = <$infile>)) {
      # Only lines that start with a URL describe a file to download
      next unless $urls =~ /^(?:http|ftp)/;
      my $rc = download_file($urls);
      if ($rc == 0) {
              print "File successfully downloaded to $wget_dir.\n";
      }
      else {
              print "Wget command failed! System call return code was: $rc\n";
      }
}
close($infile);

exit;

Performing the upgrade

This is the easy bit. Copy all the files from the CD into /usr/portage/distfiles and run:

# emerge -uD world

That's it! Best put the kettle on...
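
If you do this regularly, the copy step can be scripted so that files you already have are left alone. The following is a sketch under a few assumptions: the function name copy_new_distfiles is invented, and the real source and target directories (for instance a CD mount point and /usr/portage/distfiles) are yours to fill in. The demonstration at the bottom only touches throwaway directories.

```python
import os
import shutil
import tempfile


def copy_new_distfiles(source_dir, distfiles_dir):
    """Copy regular files from source_dir that distfiles_dir lacks.

    A file is copied when it is missing from distfiles_dir or when its
    size differs (e.g. a partial download). Returns the names copied.
    """
    copied = []
    for name in sorted(os.listdir(source_dir)):
        source = os.path.join(source_dir, name)
        target = os.path.join(distfiles_dir, name)
        if not os.path.isfile(source):
            continue  # ignore sub-directories and other non-files
        if (os.path.isfile(target)
                and os.path.getsize(target) == os.path.getsize(source)):
            continue  # already present with the same size
        shutil.copy2(source, target)
        copied.append(name)
    return copied


if __name__ == '__main__':
    # Demonstration with temporary directories standing in for the CD
    # and for the distfiles directory.
    cd_dir = tempfile.mkdtemp()
    distfiles_dir = tempfile.mkdtemp()
    for name in ('foo-1.0.tar.gz', 'bar-2.1.tar.bz2'):
        with open(os.path.join(cd_dir, name), 'w') as handle:
            handle.write('dummy contents')
    shutil.copy2(os.path.join(cd_dir, 'foo-1.0.tar.gz'), distfiles_dir)
    for name in copy_new_distfiles(cd_dir, distfiles_dir):
        print('copied %s' % name)
```

Comparing sizes is only a cheap sanity check; Portage still verifies the digests of the distfiles itself when you emerge.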

Update: there has been some discussion in the Gentoo Forums about some users receiving many Security Violation errors to do with files that are not in the manifest. This seems to be due to the new "strict" feature of Portage. It seems that when a Portage tree is updated as above (that is, by untarring into /usr/portage on top of an older tree), some old ebuilds, digests, etc. can be left behind, causing havoc when distfiles are checked later.

If you receive many errors like this:

Code: emerge blender
Calculating dependencies ...done!
>>> emerge (1 of 6) media-libs/smpeg-0.4.4-r5 to /
!!! Security Violation: A file exists that is not in the manifest.
!!! File: files/digest-smpeg-0.4.4-r4

then you may have better luck by turning off the "strict" feature (add FEATURES=-strict to /etc/make.conf). Alternatively, if you would like to keep the "strict" setting turned on (probably a Good Thing), then after you download a recent snapshot, you can update your Portage tree from the snapshot with this handy script, contributed by mikenerone in the forums:

File: shotsync.sh
#!/bin/bash
# shotsync-0.1.2
#   Ripped almost line for line from the sync_local() function of emerge-webrsync.
#   Its purpose is to provide an easy way to sync your Gentoo portage tree to a snapshot file.
#   Usage example: "shotsync /path/to/portage-20050507.tar.bz2" (relative path is fine, too)
#   There is no warranty of any kind associated with the use of this script.
#   Trivially created by Mike Nerone

SNAPSHOT="$1"
PORTDIR="$(/usr/lib/portage/bin/portageq portdir)"
PORTDIR="${PORTDIR%%/}"
TMPDIR="$(/usr/lib/portage/bin/portageq envvar PORTAGE_TMPDIR)"
TMPDIR="${TMPDIR%%/}/shotsync"

if [ -e "$TMPDIR" ]; then
  echo Cleaning out shotsync\'s tmpdir...
  rm -rf "$TMPDIR"
fi
mkdir -p "$TMPDIR"

echo Extracting snapshot file "$SNAPSHOT"...
if ! tar -C "$TMPDIR" -jxf "$SNAPSHOT"; then
  echo "Tar failed to extract the image. Please review the output."
  echo "Executed command: tar -C \"$TMPDIR\" -jxf \"$SNAPSHOT\""
  exit 1
fi 

# Uncomment the next line if you'd like the snapshot file you provided
#   to be deleted automatically after it is extracted.
#rm -f "$SNAPSHOT" 

# Make sure user and group file ownership is root
chown -R 0:0 "$TMPDIR/portage"

echo Syncing local portage tree to extracted snapshot...
rsync -av --progress --stats --delete --delete-after \
--exclude='/distfiles' --exclude='/packages' \
--exclude='/local' "$TMPDIR/portage/" "$PORTDIR"
echo "Cleaning up..."
rm -rf "$TMPDIR"
echo "Updating portage cache..."
emerge --metadata

Now run the script and repeat the upgrade:

# shotsync.sh /path/to/snapshot.tar.bz2
# emerge -uD world
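
To see which leftovers are behind the Security Violation errors, you can compare your tree with an extracted snapshot before syncing. This sketch is not taken from shotsync or emerge-webrsync; it only lists files that exist in the old tree but not in the snapshot, skipping the same top-level directories that the rsync command above excludes. All function names are invented for the example, and the demonstration uses temporary directories rather than a real /usr/portage.

```python
import os
import tempfile


def relative_files(root, excluded_top_dirs=()):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        if dirpath == root:
            # Prune excluded top-level directories so os.walk skips them.
            dirnames[:] = [d for d in dirnames if d not in excluded_top_dirs]
        for name in filenames:
            found.add(os.path.relpath(os.path.join(dirpath, name), root))
    return found


def stale_files(tree_dir, snapshot_dir,
                excluded_top_dirs=('distfiles', 'packages', 'local')):
    """List files in tree_dir that the extracted snapshot does not contain."""
    return sorted(relative_files(tree_dir, excluded_top_dirs)
                  - relative_files(snapshot_dir))


def touch(root, *parts):
    """Create an empty file at root/parts..., making directories as needed."""
    path = os.path.join(root, *parts)
    if not os.path.isdir(os.path.dirname(path)):
        os.makedirs(os.path.dirname(path))
    open(path, 'w').close()


if __name__ == '__main__':
    # Miniature old tree and snapshot, modelled on the error shown earlier.
    tree = tempfile.mkdtemp()
    snapshot = tempfile.mkdtemp()
    touch(tree, 'media-libs', 'smpeg', 'files', 'digest-smpeg-0.4.4-r4')
    touch(tree, 'media-libs', 'smpeg', 'files', 'digest-smpeg-0.4.4-r5')
    touch(tree, 'distfiles', 'smpeg-0.4.4.tar.gz')
    touch(snapshot, 'media-libs', 'smpeg', 'files', 'digest-smpeg-0.4.4-r5')
    for path in stale_files(tree, snapshot):
        print(path)
```

In the miniature example only the old r4 digest is reported; the file under distfiles is ignored because that directory is excluded.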


SIDENOTE: You can also remove the old Portage tree before untarring the new one. Remember to move your distfiles and packages directories somewhere safe before deleting /usr/portage (e.g. mv /usr/portage/distfiles /root), and move them back again after untarring. The packages directory is used only if you build binary packages.

Code: Backing up packages and distfiles
 export TMPDIRS=/root/tmpportage
 mkdir $TMPDIRS
 mv -f /usr/portage/distfiles $TMPDIRS
 mv -f /usr/portage/packages $TMPDIRS
 rm -rf /usr/portage
 # now untar the new tree under /usr/portage, with parameters  
 # --owner portage --group portage --mode 755
 mv -f $TMPDIRS/distfiles /usr/portage
 mv -f $TMPDIRS/packages /usr/portage
 # now copy new distfiles to /usr/portage/distfiles
 emerge --metadata  # if you do not do this, portage will be slow

An alternative solution

# get required URLs
emerge --pretend --fetchonly --update world > list.txt
# convert URL list to wget format
cat list.txt | sed 's/\shttp:/\nhttp:/gi' | sed 's/\sftp:/\nftp:/gi' > wgetlist.txt
# get 'em
wget -i wgetlist.txt -nc


IMPROVEMENT: The above will download all files, including the files that already exist in your distfiles directory. It will also not scan for deep dependencies which might cause it to miss certain files. The script below should solve those issues and gzip the file to reduce the file size.

# get URLs and output to gzipped file named wgetlist.gz
emerge --pretend --verbose --deep --columns world | grep -v " 0 kB" | grep ebuild | gawk -F\  '{ print "=" $4 "-" substr($5,2,length($5)-2) }' | xargs emerge --pretend --verbose --nodeps --fetchonly | grep -E 'http:|ftp:' | sed 's/\shttp:/\nhttp:/gi' | sed 's/\sftp:/\nftp:/gi' | gzip > wgetlist.gz
# ungzip the file
gunzip -c wgetlist.gz > wgetlist.txt
# download with wget
wget -i wgetlist.txt -nc
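
If GNU sed is not available on the machine that does the downloading, the "convert URL list to wget format" step can be done in Python instead. This is a small sketch meant to mirror the two sed expressions used above (a whitespace character in front of http: or ftp: becomes a newline, case-insensitively); the function name to_wget_list and the sample URLs are invented here.

```python
import re

# A single whitespace character that is directly followed by "http:" or
# "ftp:" (in any letter case), as in the sed expressions.
URL_START = re.compile(r'\s(?=(?:http|ftp):)', re.IGNORECASE)


def to_wget_list(text):
    """Put every http/ftp URL in text on a line of its own."""
    return URL_START.sub('\n', text)


if __name__ == '__main__':
    # Hypothetical emerge output: two mirrors for one file on one line.
    sample = ('http://mirror.example.org/distfiles/foo-1.0.tar.gz '
              'ftp://ftp.example.org/pub/foo-1.0.tar.gz\n')
    print(to_wget_list(sample), end='')
```

As with the sed version, every mirror URL ends up in the list, and it is the -nc option of wget that stops a file from being fetched again once one mirror has delivered it.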
Retrieved from "http://www.gentoo-wiki.info/TIP_Downloading_distfiles_on_another_machine"

Last modified: Sun, 30 Mar 2008 02:22:00 +0000 Hits: 21,920