

Batch-converting web page encodings

Category: Script   Keywords: utf-8 gb2312

Converting to utf-8 is about getting in line with international standards. :)

Code

#!/usr/bin/perl
# Convert every gb2312-encoded web page in a directory to utf-8.
use strict;
use warnings;
use Encode qw/from_to/; # from_to() re-encodes a byte string in place

# settings
my $dir = 'E:/Fayland/Emag/0503'; # the directory whose pages you want to convert

# get all .htm and .html files
opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
my @file = grep { /\.html?$/ } readdir($dh);
closedir($dh);

# conversion
foreach (@file) {
    # read the whole file into one string
    open(my $in, '<', "$dir/$_") or die "Cannot read $dir/$_: $!";
    my $data = join("", <$in>);
    close($in);
    if ($data =~ /charset=gb2312/i) { # the page still declares gb2312
        $data =~ s/charset=gb2312/charset=utf-8/i; # update the declared charset
        from_to($data, "gb2312", "utf8"); # re-encode the bytes in place
        open(my $out, '>', "$dir/$_") or die "Cannot write $dir/$_: $!";
        print {$out} $data;
        close($out) or die "Cannot close $dir/$_: $!";
        print "$_ converted successfully!\n";
    } else {
        print "$_ skipped: no gb2312 charset declaration found\n";
    }
}
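
The script above skips any page that lacks a charset=gb2312 declaration, so it relies on that declaration being accurate. As a rough sketch of a stricter check, the snippet below tests whether the bytes already decode as strict UTF-8 before converting them. The helper name is_valid_utf8 and the sample $data string are hypothetical and are not part of the original script. Note that plain ASCII also counts as valid UTF-8 here.

```perl
use strict;
use warnings;
use Encode qw/decode from_to/;

# Hypothetical helper: returns 1 if the byte string decodes as strict UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;
    # decode() may modify its input when a CHECK value is given, so use a copy.
    my $copy = $bytes;
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Sample page fragment: the two characters for "Chinese" as gb2312 bytes.
my $data = qq{<meta charset=gb2312>\xD6\xD0\xCE\xC4};

unless (is_valid_utf8($data)) {
    $data =~ s/charset=gb2312/charset=utf-8/i; # update the declared charset
    from_to($data, 'gb2312', 'utf8');          # re-encode the bytes in place
}
```

In the loop, a check like this could replace or complement the charset match, so that a page whose declaration was edited by hand does not get converted twice.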



Created on 2005-03-15 23:16:46, last modified on 2005-11-08 01:51:57
Copyright 2004-2005 All Rights Reserved. Powered by Eplanet && Catalyst 5.62.