By codygman


2009-11-13 08:37:22

I have the code:

#!/usr/bin/perl
use strict;
use WWW::Mechanize;

my $url = 'http://divxsubtitles.net/page_subtitleinformation.php?ID=111292';
my $m = WWW::Mechanize->new(autocheck => 1);
$m->get($url);
$m->form_number(2);
$m->click();
my $response = $m->res();
print $m->response->headers->as_string;

It submits the download button on the page, but I'm not sure how to download the file which is sent back after the POST.

I'm wanting a way to download this with wget if possible. I was thinking that there may be a secret URL passed or something? Or will I have to download it with LWP directly from the response stream?

So how do I download the file that is in that header?

Thanks,

Cody Goodman


@Pavel 2012-10-05 09:17:56

After submitting the form, you can use:

$mech->save_content( $filename )

Dumps the contents of $mech->content into $filename. $filename will be overwritten. Dies if there are any errors.

If the content type does not begin with "text/", then the content is saved in binary mode.

Source: http://metacpan.org/pod/WWW::Mechanize
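
A minimal sketch (untested) of how that call slots into the code from the question, assuming the same URL and form number; the 'subtitles.zip' filename is just a placeholder:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use WWW::Mechanize;

    my $m = WWW::Mechanize->new( autocheck => 1 );
    $m->get('http://divxsubtitles.net/page_subtitleinformation.php?ID=111292');
    $m->form_number(2);    # form_number() counts from 1
    $m->click();

    # Saves the response body; binary mode is used automatically when the
    # content type is not text/*. 'subtitles.zip' is a placeholder name.
    $m->save_content('subtitles.zip');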

@John O 2014-07-19 05:08:39

Thank you for this answer. Though I was looking right at the CPAN page, I missed this, and had to wade through a lot of bad Google results until I found yours.

@PP. 2009-11-13 10:46:55

I tried your code and it returns a stack of HTML of which the only http:// references were:

    http://www.w3c.org
    http://ad.z5x.net
    http://divxsubtitles.net
    http://feeds2read.net
    http://ad.z5x.net
    http://www.google-analytics.com
    http://cls.assoc-amazon.com
using the code


    my $content = $m->response->content();
    while ( $content =~ m{(http://[^/\" \t\n\r]+)}g ) {
        print( "$1\n" );
    }

So my comments to you are:
1. Add use strict; to your code; you are programming for failure if you don't.
2. Read the output HTML and determine what to do next; you haven't done that, and therefore you've asked an incomplete question. Unless you identify the URL you want to download, you are asking somebody else to write a program for you.

Once you've identified the URL you want to download, it is a simple matter of getting it and then writing the response content to a file, e.g.


if ( ! open( FOUT, ">output.bin" ) ) {
    die( "Could not create file: $!" );
}
binmode( FOUT ); # required for Windows
print( FOUT $m->response->content() );
close( FOUT );

@codygman 2009-11-14 07:17:25

The URL doesn't contain the information needed to download the file. The file is in the headers as a download attachment.

@PP. 2009-11-14 08:25:31

I suspect you may be confused about HTTP. No file is magically embedded in the headers. It is possible a redirect has been returned in the headers, in which case you should print the headers and extract the URL of the file to download.
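
To illustrate the point, a hedged sketch (reusing the $m object from the question's code) that prints the headers and checks the two standard places a downloadable file can show up; neither header name is specific to this site, and note that Mechanize normally follows redirects by itself:

    my $res = $m->response;
    print $res->headers->as_string;

    # A redirect means the file lives at the Location URL (which you could pass to wget).
    if ( my $loc = $res->header('Location') ) {
        print "Redirect to: $loc\n";
    }

    # An attachment means the response body itself is the file.
    if ( ( $res->header('Content-Disposition') || '' ) =~ /attachment/i ) {
        print 'Attachment filename: ', ( $res->filename || 'unknown' ), "\n";
    }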

@codygman 2009-11-14 22:19:53

Alright PP, I really do need to get around to reading the RFC for HTTP, and I believe you're right. I thought that "header attachment" meant that the file was embedded in the headers. I'll go ahead and read the headers and see if I can locate the redirect. Thanks for your help!

@codygman 2009-11-15 00:48:06

Thanks, I get what you're saying now, and that last part let me see how to write out the response that I got. What really threw me off was that "mechanize->form_number" starts at 1, contrary to the usual starting index of 0. Answering my own question now! :)

@codygman 2009-11-15 00:51:02

Well, the thing that threw me off the most was that the "mechanize->form_number" subroutine starts at 1, whereas typical programs start their index at 0. If anyone wants to know how to download response headers, or download header attachments, this is the way to do it.

Now here's the full code to do what I wanted.

#!/usr/bin/perl
use strict;
use WWW::Mechanize;

my $url = 'http://divxsubtitles.net/page_subtitleinformation.php?ID=111292';
my $m = WWW::Mechanize->new(autocheck => 1);
$m->get($url);
$m->form_number(2);
$m->click();
my $response = $m->res();
my $filename = $response->filename;

if ( ! open( FOUT, ">$filename" ) ) {
    die( "Could not create file: $!" );
}
binmode( FOUT ); # required for binary files, e.g. on Windows
print( FOUT $m->response->content() );
close( FOUT );

@msinfo 2013-08-26 17:35:19

When I used this to download a PDF file of 6 pages, it did, but the contents were blank. Any idea what went wrong?

@msinfo 2013-08-26 17:45:38

Oh! $mech->save_content( $filename, binmode => ':raw', decoded_by_headers => 1 ); this helped me.
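
For completeness, a minimal sketch (untested) of that combined fix, reusing the $m object from the question's code; 'output.bin' is only a made-up fallback name for when the server sends no Content-Disposition header:

    # decoded_by_headers => 1 undoes any Content-Encoding (e.g. gzip) before saving,
    # and binmode => ':raw' stops newline translation from corrupting binary files.
    my $filename = $m->response->filename || 'output.bin';    # fallback name is hypothetical
    $m->save_content( $filename, binmode => ':raw', decoded_by_headers => 1 );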
