testsuite requires internet connection #6

Open
carandraug opened this issue Jan 26, 2017 · 0 comments

Several of the test units in t/ require an internet connection. This happens because the XML parsers download the DTD from the NCBI servers to validate the XML files. See Debian bug #852004 for details.

The simplest fix is to remove the DOCTYPE line. However, the real code has an XML validation step, so it would be better if the tests exercised it too. A better alternative is therefore to inline the DTD in the test files.
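
To illustrate the inlining idea, here is a minimal sketch, assuming XML::LibXML is installed. The document and its element names (`Result`, `Id`) are made up for illustration and are not taken from the NCBI DTDs. It declares the DTD in the internal subset and asks the parser to validate with network access disabled:

```perl
#!/usr/bin/perl
use strict;
use warnings;

use XML::LibXML;

## Hypothetical minimal document: the element names are invented for
## this example, the real test files use the NCBI DTDs instead.
my $xml = <<'END_XML';
<?xml version="1.0" standalone="yes"?>
<!DOCTYPE Result [
<!ELEMENT Result (Id+)>
<!ELEMENT Id (#PCDATA)>
]>
<Result><Id>1</Id><Id>2</Id></Result>
END_XML

## no_network makes the parser refuse to fetch anything remotely, so
## validation can only rely on the declarations inlined above.
my $parser = XML::LibXML->new(validation => 1, no_network => 1);

## parse_string dies if the document is not valid against its DTD.
my $doc = $parser->parse_string($xml);

print "valid without network access\n" if $doc->is_valid();
```

Because the declarations live inside the file itself, the validation step still runs in the tests but no longer depends on the remote servers.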

I have a script that does it (see below), but I think it would be nicer to automate the whole creation of the XML files for the tests, and not only the inlining of the DTD. I'm not sure how to go about doing that for several of the existing test XML files.

#!/usr/bin/perl
use utf8;

## Copyright (C) 2017 Carnë Draug <[email protected]>
##
## This is free software; you can redistribute it and/or modify it
## under the same terms as the Perl 5 programming language system
## itself.

use strict;
use warnings;

use XML::LibXML;

sub inline_dtd
{
  my $fpath = shift;
  my $dtd = shift;

  open my $fh, '<', $fpath
      or die "cannot open '$fpath' for reading: $!";
  my @lines = <$fh>;
  close $fh;

  ## Note the capturing of anything coming after version because some
  ## of the test xml files have an encoding specified.
  die "unexpected first line on '$fpath'"
      unless $lines[0] =~ /^\<\?xml (version="1.0".*)\?\>$/;
  $lines[0] = "<?xml $1 standalone=\"yes\"?>\n";

  my $re = qr/^\<\!DOCTYPE \w+ PUBLIC /;
  die "no DTD on line 2 on '$fpath'"
      unless $lines[1] =~ $re;

  my @dtd_lines = split /\n/, $dtd;
  $dtd_lines[0] = "<!DOCTYPE " . $dtd->getName() . " [";
  $lines[1] = (join "\n", @dtd_lines) . "\n";

  open $fh, '>', $fpath
      or die "cannot open '$fpath' for writing: $!";
  print $fh $_ for @lines;
  close $fh
      or die "cannot close '$fpath': $!";
}

sub main
{
  foreach my $fpath (@ARGV)
    {
      if (! -e $fpath)
        {
            warn "no file '$fpath'";
            next;
        }

      my $xml = XML::LibXML->new->parse_file($fpath);

      my $dtd = $xml->internalSubset();
      my $inline_dtd = XML::LibXML::Dtd->new($dtd->publicId, $dtd->systemId);
      $inline_dtd->setName($dtd->getName());

      ## It would be nice if the following worked but seems like LibXML
      ## does not support it
      # $xml->setInternalSubset($inline_dtd);
      # $xml->setStandalone(1);
      # $xml->toFile($fpath);

      ## so we have to manually put it in the file.
      inline_dtd($fpath, $inline_dtd);
    }
  return 0;
}

main (@ARGV);
@cjfields cjfields added this to the 1.76 milestone Sep 5, 2018