| File | /usr/lib/perl5/vendor_perl/5.10.1/URI.pm |
| Statements Executed | 3142 |
| Statement Execution Time | 43.7ms |
| Calls | P | F | Exclusive Time | Inclusive Time | Subroutine |
|---|---|---|---|---|---|
| 2616 | 24 | 8 | 26.3ms | 26.3ms | URI::__ANON__[:24] |
| 1 | 1 | 1 | 1.89ms | 2.21ms | URI::BEGIN@22 |
| 12 | 2 | 2 | 1.37ms | 2.31ms | URI::canonical |
| 11 | 1 | 1 | 1.10ms | 4.35ms | URI::new |
| 11 | 1 | 1 | 881µs | 2.54ms | URI::_init |
| 11 | 1 | 1 | 828µs | 1.49ms | URI::_uric_escape |
| 64 | 6 | 2 | 385µs | 385µs | URI::CORE:subst (opcode) |
| 11 | 1 | 1 | 373µs | 373µs | URI::implementor |
| 12 | 1 | 1 | 323µs | 406µs | URI::_scheme |
| 58 | 2 | 2 | 309µs | 309µs | URI::CORE:substcont (opcode) |
| 65 | 5 | 2 | 247µs | 247µs | URI::CORE:regcomp (opcode) |
| 78 | 6 | 2 | 225µs | 225µs | URI::CORE:match (opcode) |
| 9 | 1 | 1 | 114µs | 114µs | URI::clone |
| 1 | 1 | 1 | 106µs | 106µs | URI::BEGIN@21 |
| 1 | 1 | 1 | 27µs | 102µs | URI::BEGIN@24 |
| 1 | 1 | 1 | 26µs | 32µs | URI::BEGIN@3 |
| 1 | 1 | 1 | 23µs | 54µs | URI::BEGIN@127 |
| 1 | 1 | 1 | 14µs | 188µs | URI::BEGIN@13 |
| 1 | 1 | 1 | 14µs | 132µs | URI::BEGIN@7 |
| 1 | 1 | 1 | 13µs | 62µs | URI::BEGIN@4 |
| 0 | 0 | 0 | 0s | 0s | URI::STORABLE_freeze |
| 0 | 0 | 0 | 0s | 0s | URI::STORABLE_thaw |
| 0 | 0 | 0 | 0s | 0s | URI::__ANON__[:25] |
| 0 | 0 | 0 | 0s | 0s | URI::__ANON__[:26] |
| 0 | 0 | 0 | 0s | 0s | URI::_init_implementor |
| 0 | 0 | 0 | 0s | 0s | URI::_no_scheme_ok |
| 0 | 0 | 0 | 0s | 0s | URI::_obj_eq |
| 0 | 0 | 0 | 0s | 0s | URI::abs |
| 0 | 0 | 0 | 0s | 0s | URI::as_iri |
| 0 | 0 | 0 | 0s | 0s | URI::as_string |
| 0 | 0 | 0 | 0s | 0s | URI::eq |
| 0 | 0 | 0 | 0s | 0s | URI::fragment |
| 0 | 0 | 0 | 0s | 0s | URI::new_abs |
| 0 | 0 | 0 | 0s | 0s | URI::opaque |
| 0 | 0 | 0 | 0s | 0s | URI::rel |
| 0 | 0 | 0 | 0s | 0s | URI::scheme |
| 0 | 0 | 0 | 0s | 0s | URI::secure |
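
The summary above is dominated by `URI::__ANON__[:24]`, an anonymous sub declared on the line where the `overload` list begins, presumably the string-conversion handler. The following is a minimal sketch, not the real URI.pm, of why such a handler can accumulate thousands of calls: every interpolation, match, or string comparison on the object invokes it again. The package `Demo::URI` and its call counter are hypothetical names introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a URI-like class; this is NOT the real URI.pm.
package Demo::URI;

my $stringify_calls = 0;    # counts invocations of the "" overload handler

use overload
    '""'     => sub { $stringify_calls++; ${ $_[0] } },
    fallback => 1;

sub new {
    my ($class, $str) = @_;
    return bless \$str, $class;    # object is a blessed scalar reference
}

sub stringify_calls { return $stringify_calls }

package main;

my $u = Demo::URI->new("http://example.com/a");

# Each of these operations needs the string form and calls the handler.
my $joined  = "$u/b";                          # interpolation
my $matched = $u =~ m{^http:};                 # regex match on the object
my $same    = $u eq "http://example.com/a";    # string comparison via fallback

# Stringifying once and reusing the plain copy avoids further handler calls.
my $plain = "$u";
my $count = () = $plain =~ /a/g;               # operates on a plain string

print "handler calls: ", Demo::URI->stringify_calls, "\n";
```

If the handler shows up as a hotspot, one option worth measuring is to stringify the object once in the caller and work on the plain copy, as the last lines of the sketch do.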
| Line | Statements | Time on line | Calls | Time in subs | Code |
|---|---|---|---|---|---|
| 1 | package URI; | ||||
| 2 | |||||
| 3 | 3 | 42µs | 2 | 38µs | # spent 32µs (26+6) within URI::BEGIN@3 which was called
# once (26µs+6µs) by HTTP::Body::BEGIN@24 at line 3 # spent 32µs making 1 call to URI::BEGIN@3
# spent 6µs making 1 call to strict::import |
| 4 | 3 | 52µs | 2 | 110µs | # spent 62µs (13+48) within URI::BEGIN@4 which was called
# once (13µs+48µs) by HTTP::Body::BEGIN@24 at line 4 # spent 62µs making 1 call to URI::BEGIN@4
# spent 48µs making 1 call to vars::import |
| 5 | 1 | 2µs | $VERSION = "1.54"; | ||
| 6 | |||||
| 7 | 3 | 54µs | 2 | 249µs | # spent 132µs (14+117) within URI::BEGIN@7 which was called
# once (14µs+117µs) by HTTP::Body::BEGIN@24 at line 7 # spent 132µs making 1 call to URI::BEGIN@7
# spent 117µs making 1 call to vars::import |
| 8 | |||||
| 9 | 1 | 900ns | my %implements; # mapping from scheme to implementor class | ||
| 10 | |||||
| 11 | # Some "official" character classes | ||||
| 12 | |||||
| 13 | 3 | 1.06ms | 2 | 362µs | # spent 188µs (14+174) within URI::BEGIN@13 which was called
# once (14µs+174µs) by HTTP::Body::BEGIN@24 at line 13 # spent 188µs making 1 call to URI::BEGIN@13
# spent 174µs making 1 call to vars::import |
| 14 | 1 | 800ns | $reserved = q(;/?:@&=+$,[]); | ||
| 15 | 1 | 1µs | $mark = q(-_.!~*'()); #'; emacs | ||
| 16 | 1 | 3µs | $unreserved = "A-Za-z0-9\Q$mark\E"; | ||
| 17 | 1 | 2µs | $uric = quotemeta($reserved) . $unreserved . "%"; | ||
| 18 | |||||
| 19 | 1 | 800ns | $scheme_re = '[a-zA-Z][a-zA-Z0-9.+\-]*'; | ||
| 20 | |||||
| 21 | 3 | 45µs | 1 | 106µs | # spent 106µs within URI::BEGIN@21 which was called
# once (106µs+0s) by HTTP::Body::BEGIN@24 at line 21 # spent 106µs making 1 call to URI::BEGIN@21 |
| 22 | 3 | 384µs | 1 | 2.21ms | # spent 2.21ms (1.89+319µs) within URI::BEGIN@22 which was called
# once (1.89ms+319µs) by HTTP::Body::BEGIN@24 at line 22 # spent 2.21ms making 1 call to URI::BEGIN@22 |
| 23 | |||||
| 24 | 2616 | 33.3ms | # spent 26.3ms within URI::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/URI.pm:24] which was called 2616 times, avg 10µs/call:
# 861 times (11.6ms+0s) by Catalyst::CORE:subst at line 1330 of Catalyst.pm, avg 13µs/call
# 861 times (6.94ms+0s) by Catalyst::uri_for at line 1352 of Catalyst.pm, avg 8µs/call
# 836 times (7.34ms+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/admin/voters.tt:105] at line 69 of Epoll/root/templates/admin/voters.tt, avg 9µs/call
# 22 times (191µs+0s) by Catalyst::DispatchType::Path::register_path at line 124 of Catalyst/DispatchType/Path.pm, avg 9µs/call
# 11 times (86µs+0s) by Catalyst::DispatchType::Path::CORE:subst at line 121 of Catalyst/DispatchType/Path.pm, avg 8µs/call
# 4 times (31µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/header.tt:65] at line 10 of Epoll/root/templates/includes/header.tt, avg 8µs/call
# 3 times (13µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/mail/voting_passwd.tt:39] at line 7 of Epoll/root/mail/voting_passwd.tt, avg 4µs/call
# 2 times (13µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/header.tt:65] at line 16 of Epoll/root/templates/includes/header.tt, avg 7µs/call
# once (10µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/header.tt:65] at line 8 of Epoll/root/templates/includes/header.tt
# once (8µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/admin/voters.tt:105] at line 33 of Epoll/root/templates/admin/voters.tt
# once (8µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/admin_menu.tt:31] at line 11 of Epoll/root/templates/includes/admin_menu.tt
# once (8µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/header.tt:65] at line 30 of Epoll/root/templates/includes/header.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/header.tt:65] at line 14 of Epoll/root/templates/includes/header.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/admin/voters.tt:105] at line 24 of Epoll/root/templates/admin/voters.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/admin_menu.tt:31] at line 5 of Epoll/root/templates/includes/admin_menu.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/admin_menu.tt:31] at line 19 of Epoll/root/templates/includes/admin_menu.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/header.tt:65] at line 7 of Epoll/root/templates/includes/header.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/admin_menu.tt:31] at line 7 of Epoll/root/templates/includes/admin_menu.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/locale_select_form.tt:13] at line 2 of Epoll/root/templates/includes/locale_select_form.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/admin_menu.tt:31] at line 9 of Epoll/root/templates/includes/admin_menu.tt
# once (7µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/admin/voters.tt:105] at line 9 of Epoll/root/templates/admin/voters.tt
# once (6µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/footer.tt:15] at line 6 of Epoll/root/templates/includes/footer.tt
# once (4µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/header.tt:65] at line 28 of Epoll/root/templates/includes/header.tt
# once (4µs+0s) by Template::Document::__ANON__[/usr/lib/perl5/vendor_perl/5.10.1/Epoll/root/templates/includes/header.tt:65] at line 27 of Epoll/root/templates/includes/header.tt
# spent 102µs (27+75) within URI::BEGIN@24 which was called
# once (27µs+75µs) by HTTP::Body::BEGIN@24 at line 28 | ||
| 25 | '==' => sub { _obj_eq(@_) }, | ||||
| 26 | '!=' => sub { !_obj_eq(@_) }, | ||||
| 27 | 1 | 20µs | 1 | 75µs | fallback => 1, # spent 75µs making 1 call to overload::import |
| 28 | 2 | 989µs | 1 | 102µs | ); # spent 102µs making 1 call to URI::BEGIN@24 |
| 29 | |||||
| 30 | # Check if two objects are the same object | ||||
| 31 | sub _obj_eq { | ||||
| 32 | return overload::StrVal($_[0]) eq overload::StrVal($_[1]); | ||||
| 33 | } | ||||
| 34 | |||||
| 35 | sub new | ||||
| 36 | # spent 4.35ms (1.10+3.25) within URI::new which was called 11 times, avg 396µs/call:
# 11 times (1.10ms+3.25ms) by Catalyst::DispatchType::Path::register_path at line 120 of Catalyst/DispatchType/Path.pm, avg 396µs/call | ||||
| 37 | 11 | 35µs | my($class, $uri, $scheme) = @_; | ||
| 38 | |||||
| 39 | 11 | 29µs | $uri = defined ($uri) ? "$uri" : ""; # stringify | ||
| 40 | # Get rid of potential wrapping | ||||
| 41 | 11 | 233µs | 11 | 158µs | $uri =~ s/^<(?:URL:)?(.*)>$/$1/; # # spent 158µs making 11 calls to URI::CORE:subst, avg 14µs/call |
| 42 | 11 | 72µs | 11 | 18µs | $uri =~ s/^"(.*)"$/$1/; # spent 18µs making 11 calls to URI::CORE:subst, avg 2µs/call |
| 43 | 11 | 102µs | 11 | 46µs | $uri =~ s/^\s+//; # spent 46µs making 11 calls to URI::CORE:subst, avg 4µs/call |
| 44 | 11 | 82µs | 11 | 30µs | $uri =~ s/\s+$//; # spent 30µs making 11 calls to URI::CORE:subst, avg 3µs/call |
| 45 | |||||
| 46 | 11 | 14µs | my $impclass; | ||
| 47 | 11 | 229µs | 22 | 90µs | if ($uri =~ m/^($scheme_re):/so) { # spent 65µs making 11 calls to URI::CORE:regcomp, avg 6µs/call
# spent 25µs making 11 calls to URI::CORE:match, avg 2µs/call |
| 48 | $scheme = $1; | ||||
| 49 | } | ||||
| 50 | else { | ||||
| 51 | 11 | 34µs | if (($impclass = ref($scheme))) { | ||
| 52 | $scheme = $scheme->scheme; | ||||
| 53 | } | ||||
| 54 | elsif ($scheme && $scheme =~ m/^($scheme_re)(?::|$)/o) { | ||||
| 55 | $scheme = $1; | ||||
| 56 | } | ||||
| 57 | } | ||||
| 58 | $impclass ||= implementor($scheme) || | ||||
| 59 | 11 | 152µs | 11 | 373µs | do { # spent 373µs making 11 calls to URI::implementor, avg 34µs/call |
| 60 | require URI::_foreign; | ||||
| 61 | $impclass = 'URI::_foreign'; | ||||
| 62 | }; | ||||
| 63 | |||||
| 64 | 11 | 316µs | 11 | 2.54ms | return $impclass->_init($uri, $scheme); # spent 2.54ms making 11 calls to URI::_init, avg 231µs/call |
| 65 | } | ||||
| 66 | |||||
| 67 | |||||
| 68 | sub new_abs | ||||
| 69 | { | ||||
| 70 | my($class, $uri, $base) = @_; | ||||
| 71 | $uri = $class->new($uri, $base); | ||||
| 72 | $uri->abs($base); | ||||
| 73 | } | ||||
| 74 | |||||
| 75 | |||||
| 76 | sub _init | ||||
| 77 | # spent 2.54ms (881µs+1.65) within URI::_init which was called 11 times, avg 231µs/call:
# 11 times (881µs+1.65ms) by URI::new at line 64, avg 231µs/call | ||||
| 78 | 11 | 25µs | my $class = shift; | ||
| 79 | 11 | 32µs | my($str, $scheme) = @_; | ||
| 80 | # find all funny characters and encode the bytes. | ||||
| 81 | 11 | 79µs | 11 | 1.49ms | $str = $class->_uric_escape($str); # spent 1.49ms making 11 calls to URI::_uric_escape, avg 135µs/call |
| 82 | 11 | 330µs | 33 | 167µs | $str = "$scheme:$str" unless $str =~ /^$scheme_re:/o || # spent 72µs making 11 calls to URI::CORE:regcomp, avg 7µs/call
# spent 69µs making 11 calls to URI::_generic::_no_scheme_ok, avg 6µs/call
# spent 26µs making 11 calls to URI::CORE:match, avg 2µs/call |
| 83 | $class->_no_scheme_ok; | ||||
| 84 | 11 | 261µs | my $self = bless \$str, $class; | ||
| 85 | 11 | 90µs | $self; | ||
| 86 | } | ||||
| 87 | |||||
| 88 | |||||
| 89 | sub _uric_escape | ||||
| 90 | # spent 1.49ms (828µs+660µs) within URI::_uric_escape which was called 11 times, avg 135µs/call:
# 11 times (828µs+660µs) by URI::_init at line 81, avg 135µs/call | ||||
| 91 | 11 | 26µs | my($class, $str) = @_; | ||
| 92 | 31 | 892µs | 71 | 660µs | $str =~ s*([^$uric\#])* URI::Escape::escape_char($1) *ego; # spent 440µs making 20 calls to URI::Escape::escape_char, avg 22µs/call
# spent 127µs making 29 calls to URI::CORE:substcont, avg 4µs/call
# spent 67µs making 11 calls to URI::CORE:subst, avg 6µs/call
# spent 26µs making 11 calls to URI::CORE:regcomp, avg 2µs/call |
| 93 | 11 | 147µs | return $str; | ||
| 94 | } | ||||
| 95 | |||||
| 96 | |||||
| 97 | sub implementor | ||||
| 98 | # spent 373µs within URI::implementor which was called 11 times, avg 34µs/call:
# 11 times (373µs+0s) by URI::new at line 59, avg 34µs/call | ||||
| 99 | 11 | 26µs | my($scheme, $impclass) = @_; | ||
| 100 | 11 | 21µs | if (!$scheme || $scheme !~ /\A$scheme_re\z/o) { | ||
| 101 | 11 | 33µs | require URI::_generic; | ||
| 102 | 11 | 475µs | return "URI::_generic"; | ||
| 103 | } | ||||
| 104 | |||||
| 105 | $scheme = lc($scheme); | ||||
| 106 | |||||
| 107 | if ($impclass) { | ||||
| 108 | # Set the implementor class for a given scheme | ||||
| 109 | my $old = $implements{$scheme}; | ||||
| 110 | $impclass->_init_implementor($scheme); | ||||
| 111 | $implements{$scheme} = $impclass; | ||||
| 112 | return $old; | ||||
| 113 | } | ||||
| 114 | |||||
| 115 | my $ic = $implements{$scheme}; | ||||
| 116 | return $ic if $ic; | ||||
| 117 | |||||
| 118 | # scheme not yet known, look for internal or | ||||
| 119 | # preloaded (with 'use') implementation | ||||
| 120 | $ic = "URI::$scheme"; # default location | ||||
| 121 | |||||
| 122 | # turn scheme into a valid perl identifier by a simple transformation... | ||||
| 123 | $ic =~ s/\+/_P/g; | ||||
| 124 | $ic =~ s/\./_O/g; | ||||
| 125 | $ic =~ s/\-/_/g; | ||||
| 126 | |||||
| 127 | 3 | 1.69ms | 2 | 84µs | # spent 54µs (23+30) within URI::BEGIN@127 which was called
# once (23µs+30µs) by HTTP::Body::BEGIN@24 at line 127 # spent 54µs making 1 call to URI::BEGIN@127
# spent 30µs making 1 call to strict::unimport |
| 128 | # check we actually have one for the scheme: | ||||
| 129 | unless (@{"${ic}::ISA"}) { | ||||
| 130 | # Try to load it | ||||
| 131 | eval "require $ic"; | ||||
| 132 | die $@ if $@ && $@ !~ /Can\'t locate.*in \@INC/; | ||||
| 133 | return unless @{"${ic}::ISA"}; | ||||
| 134 | } | ||||
| 135 | |||||
| 136 | $ic->_init_implementor($scheme); | ||||
| 137 | $implements{$scheme} = $ic; | ||||
| 138 | $ic; | ||||
| 139 | } | ||||
| 140 | |||||
| 141 | |||||
| 142 | sub _init_implementor | ||||
| 143 | { | ||||
| 144 | my($class, $scheme) = @_; | ||||
| 145 | # Remember that one implementor class may actually | ||||
| 146 | # serve to implement several URI schemes. | ||||
| 147 | } | ||||
| 148 | |||||
| 149 | |||||
| 150 | sub clone | ||||
| 151 | # spent 114µs within URI::clone which was called 9 times, avg 13µs/call:
# 9 times (114µs+0s) by URI::canonical at line 298, avg 13µs/call | ||||
| 152 | 9 | 17µs | my $self = shift; | ||
| 153 | 9 | 21µs | my $other = $$self; | ||
| 154 | 9 | 155µs | bless \$other, ref $self; | ||
| 155 | } | ||||
| 156 | |||||
| 157 | |||||
| 158 | sub _no_scheme_ok { 0 } | ||||
| 159 | |||||
| 160 | sub _scheme | ||||
| 161 | # spent 406µs (323+83) within URI::_scheme which was called 12 times, avg 34µs/call:
# 12 times (323µs+83µs) by URI::canonical at line 293, avg 34µs/call | ||||
| 162 | 12 | 19µs | my $self = shift; | ||
| 163 | |||||
| 164 | 12 | 21µs | unless (@_) { | ||
| 165 | 12 | 314µs | 24 | 83µs | return unless $$self =~ /^($scheme_re):/o; # spent 48µs making 12 calls to URI::CORE:regcomp, avg 4µs/call
# spent 35µs making 12 calls to URI::CORE:match, avg 3µs/call |
| 166 | 1 | 70µs | return $1; | ||
| 167 | } | ||||
| 168 | |||||
| 169 | my $old; | ||||
| 170 | my $new = shift; | ||||
| 171 | if (defined($new) && length($new)) { | ||||
| 172 | Carp::croak("Bad scheme '$new'") unless $new =~ /^$scheme_re$/o; | ||||
| 173 | $old = $1 if $$self =~ s/^($scheme_re)://o; | ||||
| 174 | my $newself = URI->new("$new:$$self"); | ||||
| 175 | $$self = $$newself; | ||||
| 176 | bless $self, ref($newself); | ||||
| 177 | } | ||||
| 178 | else { | ||||
| 179 | if ($self->_no_scheme_ok) { | ||||
| 180 | $old = $1 if $$self =~ s/^($scheme_re)://o; | ||||
| 181 | Carp::carp("Oops, opaque part now look like scheme") | ||||
| 182 | if $^W && $$self =~ m/^$scheme_re:/o | ||||
| 183 | } | ||||
| 184 | else { | ||||
| 185 | $old = $1 if $$self =~ m/^($scheme_re):/o; | ||||
| 186 | } | ||||
| 187 | } | ||||
| 188 | |||||
| 189 | return $old; | ||||
| 190 | } | ||||
| 191 | |||||
| 192 | sub scheme | ||||
| 193 | { | ||||
| 194 | my $scheme = shift->_scheme(@_); | ||||
| 195 | return unless defined $scheme; | ||||
| 196 | lc($scheme); | ||||
| 197 | } | ||||
| 198 | |||||
| 199 | |||||
| 200 | sub opaque | ||||
| 201 | { | ||||
| 202 | my $self = shift; | ||||
| 203 | |||||
| 204 | unless (@_) { | ||||
| 205 | $$self =~ /^(?:$scheme_re:)?([^\#]*)/o or die; | ||||
| 206 | return $1; | ||||
| 207 | } | ||||
| 208 | |||||
| 209 | $$self =~ /^($scheme_re:)? # optional scheme | ||||
| 210 | ([^\#]*) # opaque | ||||
| 211 | (\#.*)? # optional fragment | ||||
| 212 | $/sx or die; | ||||
| 213 | |||||
| 214 | my $old_scheme = $1; | ||||
| 215 | my $old_opaque = $2; | ||||
| 216 | my $old_frag = $3; | ||||
| 217 | |||||
| 218 | my $new_opaque = shift; | ||||
| 219 | $new_opaque = "" unless defined $new_opaque; | ||||
| 220 | $new_opaque =~ s/([^$uric])/ URI::Escape::escape_char($1)/ego; | ||||
| 221 | |||||
| 222 | $$self = defined($old_scheme) ? $old_scheme : ""; | ||||
| 223 | $$self .= $new_opaque; | ||||
| 224 | $$self .= $old_frag if defined $old_frag; | ||||
| 225 | |||||
| 226 | $old_opaque; | ||||
| 227 | } | ||||
| 228 | |||||
| 229 | 1 | 3µs | *path = \&opaque; # alias | ||
| 230 | |||||
| 231 | |||||
| 232 | sub fragment | ||||
| 233 | { | ||||
| 234 | my $self = shift; | ||||
| 235 | unless (@_) { | ||||
| 236 | return unless $$self =~ /\#(.*)/s; | ||||
| 237 | return $1; | ||||
| 238 | } | ||||
| 239 | |||||
| 240 | my $old; | ||||
| 241 | $old = $1 if $$self =~ s/\#(.*)//s; | ||||
| 242 | |||||
| 243 | my $new_frag = shift; | ||||
| 244 | if (defined $new_frag) { | ||||
| 245 | $new_frag =~ s/([^$uric])/ URI::Escape::escape_char($1) /ego; | ||||
| 246 | $$self .= "#$new_frag"; | ||||
| 247 | } | ||||
| 248 | $old; | ||||
| 249 | } | ||||
| 250 | |||||
| 251 | |||||
| 252 | sub as_string | ||||
| 253 | { | ||||
| 254 | my $self = shift; | ||||
| 255 | $$self; | ||||
| 256 | } | ||||
| 257 | |||||
| 258 | |||||
| 259 | sub as_iri | ||||
| 260 | { | ||||
| 261 | my $self = shift; | ||||
| 262 | my $str = $$self; | ||||
| 263 | if ($str =~ s/%([89a-fA-F][0-9a-fA-F])/chr(hex($1))/eg) { | ||||
| 264 | # All this crap because the more obvious: | ||||
| 265 | # | ||||
| 266 | # Encode::decode("UTF-8", $str, sub { sprintf "%%%02X", shift }) | ||||
| 267 | # | ||||
| 268 | # doesn't work before Encode 2.39. Wait for a standard release | ||||
| 269 | # to bundle that version. | ||||
| 270 | |||||
| 271 | require Encode; | ||||
| 272 | my $enc = Encode::find_encoding("UTF-8"); | ||||
| 273 | my $u = ""; | ||||
| 274 | while (length $str) { | ||||
| 275 | $u .= $enc->decode($str, Encode::FB_QUIET()); | ||||
| 276 | if (length $str) { | ||||
| 277 | # escape next char | ||||
| 278 | $u .= URI::Escape::escape_char(substr($str, 0, 1, "")); | ||||
| 279 | } | ||||
| 280 | } | ||||
| 281 | $str = $u; | ||||
| 282 | } | ||||
| 283 | return $str; | ||||
| 284 | } | ||||
| 285 | |||||
| 286 | |||||
| 287 | sub canonical | ||||
| 288 | # spent 2.31ms (1.37+944µs) within URI::canonical which was called 12 times, avg 193µs/call:
# 11 times (1.32ms+841µs) by Catalyst::DispatchType::Path::register_path at line 120 of Catalyst/DispatchType/Path.pm, avg 196µs/call
# once (50µs+102µs) by URI::_server::canonical at line 148 of URI/_server.pm | ||||
| 289 | # Make sure scheme is lowercased, that we don't escape unreserved chars, | ||||
| 290 | # and that we use upcase escape sequences. | ||||
| 291 | |||||
| 292 | 12 | 25µs | my $self = shift; | ||
| 293 | 12 | 108µs | 12 | 406µs | my $scheme = $self->_scheme || ""; # spent 406µs making 12 calls to URI::_scheme, avg 34µs/call |
| 294 | 12 | 92µs | 12 | 19µs | my $uc_scheme = $scheme =~ /[A-Z]/; # spent 19µs making 12 calls to URI::CORE:match, avg 2µs/call |
| 295 | 12 | 301µs | 12 | 69µs | my $esc = $$self =~ /%[a-fA-F0-9]{2}/; # spent 69µs making 12 calls to URI::CORE:match, avg 6µs/call |
| 296 | 12 | 43µs | return $self unless $uc_scheme || $esc; | ||
| 297 | |||||
| 298 | 9 | 72µs | 9 | 114µs | my $other = $self->clone; # spent 114µs making 9 calls to URI::clone, avg 13µs/call |
| 299 | 9 | 13µs | if ($uc_scheme) { | ||
| 300 | $other->_scheme(lc $scheme); | ||||
| 301 | } | ||||
| 302 | 9 | 29µs | if ($esc) { | ||
| 303 | 29 | 632µs | 38 | 248µs | $$other =~ s{%([0-9a-fA-F]{2})} # spent 182µs making 29 calls to URI::CORE:substcont, avg 6µs/call
# spent 66µs making 9 calls to URI::CORE:subst, avg 7µs/call |
| 304 | 20 | 321µs | 40 | 88µs | { my $a = chr(hex($1)); # spent 51µs making 20 calls to URI::CORE:match, avg 3µs/call
# spent 36µs making 20 calls to URI::CORE:regcomp, avg 2µs/call |
| 305 | $a =~ /^[$unreserved]\z/o ? $a : "%\U$1" | ||||
| 306 | }ge; | ||||
| 307 | } | ||||
| 308 | 9 | 80µs | return $other; | ||
| 309 | } | ||||
| 310 | |||||
| 311 | # Compare two URIs, subclasses will provide a more correct implementation | ||||
| 312 | sub eq { | ||||
| 313 | my($self, $other) = @_; | ||||
| 314 | $self = URI->new($self, $other) unless ref $self; | ||||
| 315 | $other = URI->new($other, $self) unless ref $other; | ||||
| 316 | ref($self) eq ref($other) && # same class | ||||
| 317 | $self->canonical->as_string eq $other->canonical->as_string; | ||||
| 318 | } | ||||
| 319 | |||||
| 320 | # generic-URI transformation methods | ||||
| 321 | sub abs { $_[0]; } | ||||
| 322 | sub rel { $_[0]; } | ||||
| 323 | |||||
| 324 | sub secure { 0 } | ||||
| 325 | |||||
| 326 | # help out Storable | ||||
| 327 | sub STORABLE_freeze { | ||||
| 328 | my($self, $cloning) = @_; | ||||
| 329 | return $$self; | ||||
| 330 | } | ||||
| 331 | |||||
| 332 | sub STORABLE_thaw { | ||||
| 333 | my($self, $cloning, $str) = @_; | ||||
| 334 | $$self = $str; | ||||
| 335 | } | ||||
| 336 | |||||
| 337 | 1 | 11µs | 1; | ||
| 338 | |||||
| 339 | __END__ | ||||
| 340 | |||||
| 341 | =head1 NAME | ||||
| 342 | |||||
| 343 | URI - Uniform Resource Identifiers (absolute and relative) | ||||
| 344 | |||||
| 345 | =head1 SYNOPSIS | ||||
| 346 | |||||
| 347 | $u1 = URI->new("http://www.perl.com"); | ||||
| 348 | $u2 = URI->new("foo", "http"); | ||||
| 349 | $u3 = $u2->abs($u1); | ||||
| 350 | $u4 = $u3->clone; | ||||
| 351 | $u5 = URI->new("HTTP://WWW.perl.com:80")->canonical; | ||||
| 352 | |||||
| 353 | $str = $u->as_string; | ||||
| 354 | $str = "$u"; | ||||
| 355 | |||||
| 356 | $scheme = $u->scheme; | ||||
| 357 | $opaque = $u->opaque; | ||||
| 358 | $path = $u->path; | ||||
| 359 | $frag = $u->fragment; | ||||
| 360 | |||||
| 361 | $u->scheme("ftp"); | ||||
| 362 | $u->host("ftp.perl.com"); | ||||
| 363 | $u->path("cpan/"); | ||||
| 364 | |||||
| 365 | =head1 DESCRIPTION | ||||
| 366 | |||||
| 367 | This module implements the C<URI> class. Objects of this class | ||||
| 368 | represent "Uniform Resource Identifier references" as specified in RFC | ||||
| 369 | 2396 (and updated by RFC 2732). | ||||
| 370 | |||||
| 371 | A Uniform Resource Identifier is a compact string of characters that | ||||
| 372 | identifies an abstract or physical resource. A Uniform Resource | ||||
| 373 | Identifier can be further classified as either a Uniform Resource Locator | ||||
| 374 | (URL) or a Uniform Resource Name (URN). The distinction between URL | ||||
| 375 | and URN does not matter to the C<URI> class interface. A | ||||
| 376 | "URI-reference" is a URI that may have additional information attached | ||||
| 377 | in the form of a fragment identifier. | ||||
| 378 | |||||
| 379 | An absolute URI reference consists of three parts: a I<scheme>, a | ||||
| 380 | I<scheme-specific part> and a I<fragment> identifier. A subset of URI | ||||
| 381 | references share a common syntax for hierarchical namespaces. For | ||||
| 382 | these, the scheme-specific part is further broken down into | ||||
| 383 | I<authority>, I<path> and I<query> components. These URIs can also | ||||
| 384 | take the form of relative URI references, where the scheme (and | ||||
| 385 | usually also the authority) component is missing, but implied by the | ||||
| 386 | context of the URI reference. The three forms of URI reference | ||||
| 387 | syntax are summarized as follows: | ||||
| 388 | |||||
| 389 | <scheme>:<scheme-specific-part>#<fragment> | ||||
| 390 | <scheme>://<authority><path>?<query>#<fragment> | ||||
| 391 | <path>?<query>#<fragment> | ||||
| 392 | |||||
| 393 | The components into which a URI reference can be divided depend on the | ||||
| 394 | I<scheme>. The C<URI> class provides methods to get and set the | ||||
| 395 | individual components. The methods available for a specific | ||||
| 396 | C<URI> object depend on the scheme. | ||||
| 397 | |||||
| 398 | =head1 CONSTRUCTORS | ||||
| 399 | |||||
| 400 | The following methods construct new C<URI> objects: | ||||
| 401 | |||||
| 402 | =over 4 | ||||
| 403 | |||||
| 404 | =item $uri = URI->new( $str ) | ||||
| 405 | |||||
| 406 | =item $uri = URI->new( $str, $scheme ) | ||||
| 407 | |||||
| 408 | Constructs a new URI object. The string | ||||
| 409 | representation of a URI is given as argument, together with an optional | ||||
| 410 | scheme specification. Common URI wrappers like "" and <>, as well as | ||||
| 411 | leading and trailing white space, are automatically removed from | ||||
| 412 | the $str argument before it is processed further. | ||||
| 413 | |||||
| 414 | The constructor determines the scheme, maps this to an appropriate | ||||
| 415 | URI subclass, constructs a new object of that class and returns it. | ||||
| 416 | |||||
| 417 | The $scheme argument is only used when $str is a | ||||
| 418 | relative URI. It can be either a simple string that | ||||
| 419 | denotes the scheme, a string containing an absolute URI reference, or | ||||
| 420 | an absolute C<URI> object. If no $scheme is specified for a relative | ||||
| 421 | URI $str, then $str is simply treated as a generic URI (no scheme-specific | ||||
| 422 | methods available). | ||||
| 423 | |||||
| 424 | The set of characters available for building URI references is | ||||
| 425 | restricted (see L<URI::Escape>). Characters outside this set are | ||||
| 426 | automatically escaped by the URI constructor. | ||||
| 427 | |||||
| 428 | =item $uri = URI->new_abs( $str, $base_uri ) | ||||
| 429 | |||||
| 430 | Constructs a new absolute URI object. The $str argument can | ||||
| 431 | denote a relative or absolute URI. If relative, then it is | ||||
| 432 | absolutized using $base_uri as base. The $base_uri must be an absolute | ||||
| 433 | URI. | ||||
| 434 | |||||
| 435 | =item $uri = URI::file->new( $filename ) | ||||
| 436 | |||||
| 437 | =item $uri = URI::file->new( $filename, $os ) | ||||
| 438 | |||||
| 439 | Constructs a new I<file> URI from a file name. See L<URI::file>. | ||||
| 440 | |||||
| 441 | =item $uri = URI::file->new_abs( $filename ) | ||||
| 442 | |||||
| 443 | =item $uri = URI::file->new_abs( $filename, $os ) | ||||
| 444 | |||||
| 445 | Constructs a new absolute I<file> URI from a file name. See | ||||
| 446 | L<URI::file>. | ||||
| 447 | |||||
| 448 | =item $uri = URI::file->cwd | ||||
| 449 | |||||
| 450 | Returns the current working directory as a I<file> URI. See | ||||
| 451 | L<URI::file>. | ||||
| 452 | |||||
| 453 | =item $uri->clone | ||||
| 454 | |||||
| 455 | Returns a copy of the $uri. | ||||
| 456 | |||||
| 457 | =back | ||||
| 458 | |||||
| 459 | =head1 COMMON METHODS | ||||
| 460 | |||||
| 461 | The methods described in this section are available for all C<URI> | ||||
| 462 | objects. | ||||
| 463 | |||||
| 464 | Methods that give access to components of a URI always return the | ||||
| 465 | old value of the component. The value returned is C<undef> if the | ||||
| 466 | component was not present. There is generally a difference between a | ||||
| 467 | component that is empty (represented as C<"">) and a component that is | ||||
| 468 | missing (represented as C<undef>). If an accessor method is given an | ||||
| 469 | argument, it updates the corresponding component in addition to | ||||
| 470 | returning the old value of the component. Passing an undefined | ||||
| 471 | argument removes the component (if possible). The description of | ||||
| 472 | each accessor method indicates whether the component is passed as | ||||
| 473 | an escaped or an unescaped string. A component that can be further | ||||
| 474 | divided into sub-parts is usually passed escaped, as unescaping might | ||||
| 475 | change its semantics. | ||||
| 476 | |||||
| 477 | The common methods available for all URIs are: | ||||
| 478 | |||||
| 479 | =over 4 | ||||
| 480 | |||||
| 481 | =item $uri->scheme | ||||
| 482 | |||||
| 483 | =item $uri->scheme( $new_scheme ) | ||||
| 484 | |||||
| 485 | Sets and returns the scheme part of the $uri. If the $uri is | ||||
| 486 | relative, then $uri->scheme returns C<undef>. If called with an | ||||
| 487 | argument, it updates the scheme of $uri, possibly changing the | ||||
| 488 | class of $uri, and returns the old scheme value. The method croaks | ||||
| 489 | if the new scheme name is illegal; a scheme name must begin with a | ||||
| 490 | letter and must consist of only US-ASCII letters, numbers, and a few | ||||
| 491 | special marks: ".", "+", "-". This restriction effectively means | ||||
| 492 | that the scheme must be passed unescaped. Passing an undefined | ||||
| 493 | argument to the scheme method makes the URI relative (if possible). | ||||
| 494 | |||||
| 495 | Letter case does not matter for scheme names. The string | ||||
| 496 | returned by $uri->scheme is always lowercase. If you want the scheme | ||||
| 497 | just as it was written in the URI in its original case, | ||||
| 498 | you can use the $uri->_scheme method instead. | ||||
| 499 | |||||
| 500 | =item $uri->opaque | ||||
| 501 | |||||
| 502 | =item $uri->opaque( $new_opaque ) | ||||
| 503 | |||||
| 504 | Sets and returns the scheme-specific part of the $uri | ||||
| 505 | (everything between the scheme and the fragment) | ||||
| 506 | as an escaped string. | ||||
| 507 | |||||
| 508 | =item $uri->path | ||||
| 509 | |||||
| 510 | =item $uri->path( $new_path ) | ||||
| 511 | |||||
| 512 | Sets and returns the same value as $uri->opaque unless the URI | ||||
| 513 | supports the generic syntax for hierarchical namespaces. | ||||
| 514 | In that case the generic method is overridden to set and return | ||||
| 515 | the part of the URI between the I<host name> and the I<fragment>. | ||||
| 516 | |||||
| 517 | =item $uri->fragment | ||||
| 518 | |||||
| 519 | =item $uri->fragment( $new_frag ) | ||||
| 520 | |||||
| 521 | Returns the fragment identifier of a URI reference | ||||
| 522 | as an escaped string. | ||||
| 523 | |||||
| 524 | =item $uri->as_string | ||||
| 525 | |||||
| 526 | Converts a URI object to a plain ASCII string. URI objects are | ||||
| 527 | also converted to plain strings automatically by overloading. This | ||||
| 528 | means that $uri objects can be used as plain strings in most Perl | ||||
| 529 | constructs. | ||||
| 530 | |||||
| 531 | =item $uri->as_iri | ||||
| 532 | |||||
| 533 | Returns a Unicode string representing the URI. Escaped UTF-8 sequences | ||||
| 534 | representing non-ASCII characters are turned into their corresponding Unicode | ||||
| 535 | code points. | ||||
| 536 | |||||
| 537 | =item $uri->canonical | ||||
| 538 | |||||
| 539 | Returns a normalized version of the URI. The rules | ||||
| 540 | for normalization are scheme-dependent. They usually involve | ||||
| 541 | lowercasing the scheme and Internet host name components, | ||||
| 542 | removing the explicit port specification if it matches the default port, | ||||
| 543 | uppercasing all escape sequences, and unescaping octets that can be | ||||
| 544 | better represented as plain characters. | ||||
| 545 | |||||
| 546 | For efficiency reasons, if the $uri is already in normalized form, | ||||
| 547 | then a reference to it is returned instead of a copy. | ||||
| 548 | |||||
| 549 | =item $uri->eq( $other_uri ) | ||||
| 550 | |||||
| 551 | =item URI::eq( $first_uri, $other_uri ) | ||||
| 552 | |||||
| 553 | Tests whether two URI references are equal. URI references | ||||
| 554 | that normalize to the same string are considered equal. The method | ||||
| 555 | can also be used as a plain function, in which case it can compare two | ||||
| 556 | string arguments as well. | ||||
| 557 | |||||
| 558 | If you need to test whether two C<URI> object references denote the | ||||
| 559 | same object, use the '==' operator. | ||||
| 560 | |||||
| 561 | =item $uri->abs( $base_uri ) | ||||
| 562 | |||||
| 563 | Returns an absolute URI reference. If $uri is already | ||||
| 564 | absolute, then a reference to it is simply returned. If the $uri | ||||
| 565 | is relative, then a new absolute URI is constructed by combining the | ||||
| 566 | $uri and the $base_uri, and returned. | ||||
| 567 | |||||
| 568 | =item $uri->rel( $base_uri ) | ||||
| 569 | |||||
| 570 | Returns a relative URI reference if it is possible to | ||||
| 571 | make one that denotes the same resource relative to $base_uri. | ||||
| 572 | If not, then $uri is simply returned. | ||||
| 573 | |||||
| 574 | =item $uri->secure | ||||
| 575 | |||||
| 576 | Returns a TRUE value if the URI is considered to point to a resource on | ||||
| 577 | a secure channel, such as an SSL or TLS encrypted one. | ||||
| 578 | |||||
| 579 | =back | ||||
| 580 | |||||
| 581 | =head1 GENERIC METHODS | ||||
| 582 | |||||
| 583 | The following methods are available to schemes that use the | ||||
| 584 | common/generic syntax for hierarchical namespaces. The descriptions of | ||||
| 585 | schemes below indicate which these are. Unknown schemes are | ||||
| 586 | assumed to support the generic syntax, and therefore the following | ||||
| 587 | methods: | ||||
| 588 | |||||
| 589 | =over 4 | ||||
| 590 | |||||
| 591 | =item $uri->authority | ||||
| 592 | |||||
| 593 | =item $uri->authority( $new_authority ) | ||||
| 594 | |||||
| 595 | Sets and returns the escaped authority component | ||||
| 596 | of the $uri. | ||||
| 597 | |||||
| 598 | =item $uri->path | ||||
| 599 | |||||
| 600 | =item $uri->path( $new_path ) | ||||
| 601 | |||||
| 602 | Sets and returns the escaped path component of | ||||
| 603 | the $uri (the part between the host name and the query or fragment). | ||||
| 604 | The path can never be undefined, but it can be the empty string. | ||||
| 605 | |||||
| 606 | =item $uri->path_query | ||||
| 607 | |||||
| 608 | =item $uri->path_query( $new_path_query ) | ||||
| 609 | |||||
| 610 | Sets and returns the escaped path and query | ||||
| 611 | components as a single entity. The path and the query are | ||||
| 612 | separated by a "?" character, but the query can itself contain "?". | ||||
| 613 | |||||
| 614 | =item $uri->path_segments | ||||
| 615 | |||||
| 616 | =item $uri->path_segments( $segment, ... ) | ||||
| 617 | |||||
| 618 | Sets and returns the path. In a scalar context, it returns | ||||
| 619 | the same value as $uri->path. In a list context, it returns the | ||||
| 620 | unescaped path segments that make up the path. Path segments that | ||||
| 621 | have parameters are returned as an anonymous array. The first element | ||||
| 622 | is the unescaped path segment proper; subsequent elements are escaped | ||||
| 623 | parameter strings. Such an anonymous array uses overloading so it can | ||||
| 624 | be treated as a string too, but this string does not include the | ||||
| 625 | parameters. | ||||
| 626 | |||||
| 627 | Note that absolute paths have the empty string as their first | ||||
| 628 | I<path_segment>, i.e. the I<path> C</foo/bar> has 3 | ||||
| 629 | I<path_segments>: "", "foo" and "bar". | ||||
| 630 | |||||
| 631 | =item $uri->query | ||||
| 632 | |||||
| 633 | =item $uri->query( $new_query ) | ||||
| 634 | |||||
| 635 | Sets and returns the escaped query component of | ||||
| 636 | the $uri. | ||||
| 637 | |||||
| 638 | =item $uri->query_form | ||||
| 639 | |||||
| 640 | =item $uri->query_form( $key1 => $val1, $key2 => $val2, ... ) | ||||
| 641 | |||||
| 642 | =item $uri->query_form( $key1 => $val1, $key2 => $val2, ..., $delim ) | ||||
| 643 | |||||
| 644 | =item $uri->query_form( \@key_value_pairs ) | ||||
| 645 | |||||
| 646 | =item $uri->query_form( \@key_value_pairs, $delim ) | ||||
| 647 | |||||
| 648 | =item $uri->query_form( \%hash ) | ||||
| 649 | |||||
| 650 | =item $uri->query_form( \%hash, $delim ) | ||||
| 651 | |||||
| 652 | Sets and returns query components that use the | ||||
| 653 | I<application/x-www-form-urlencoded> format. Key/value pairs are | ||||
| 654 | separated by "&", and the key is separated from the value by a "=" | ||||
| 655 | character. | ||||
| 656 | |||||
| 657 | The form can be set either by passing separate key/value pairs, or via | ||||
| 658 | an array or hash reference. Passing an empty array or an empty hash | ||||
| 659 | removes the query component, whereas passing no arguments at all leaves | ||||
| 660 | the component unchanged. The order of keys is undefined if a hash | ||||
| 661 | reference is passed. The old value is always returned as a list of | ||||
| 662 | separate key/value pairs. Assigning this list to a hash is unwise as | ||||
| 663 | the keys returned might repeat. | ||||
| 664 | |||||
| 665 | The values passed when setting the form can be plain strings or | ||||
| 666 | references to arrays of strings. Passing an array of values has the | ||||
| 667 | same effect as passing the key repeatedly with one value at a time. | ||||
| 668 | All the following statements have the same effect: | ||||
| 669 | |||||
| 670 | $uri->query_form(foo => 1, foo => 2); | ||||
| 671 | $uri->query_form(foo => [1, 2]); | ||||
| 672 | $uri->query_form([ foo => 1, foo => 2 ]); | ||||
| 673 | $uri->query_form([ foo => [1, 2] ]); | ||||
| 674 | $uri->query_form({ foo => [1, 2] }); | ||||
| 675 | |||||
| 676 | The $delim parameter can be passed as ";" to force the key/value pairs | ||||
| 677 | to be delimited by ";" instead of "&" in the query string. This | ||||
| 678 | practice is often recommended for URLs embedded in HTML or XML | ||||
| 679 | documents as this avoids the trouble of escaping the "&" character. | ||||
| 680 | You might also set the $URI::DEFAULT_QUERY_FORM_DELIMITER variable to | ||||
| 681 | ";" for the same global effect. | ||||
| 682 | |||||
| 683 | The C<URI::QueryParam> module can be loaded to add further methods to | ||||
| 684 | manipulate the form of a URI. See L<URI::QueryParam> for details. | ||||
| 685 | |||||
| 686 | =item $uri->query_keywords | ||||
| 687 | |||||
| 688 | =item $uri->query_keywords( $keywords, ... ) | ||||
| 689 | |||||
| 690 | =item $uri->query_keywords( \@keywords ) | ||||
| 691 | |||||
| 692 | Sets and returns query components that use the | ||||
| 693 | keywords separated by "+" format. | ||||
| 694 | |||||
| 695 | The keywords can be set either by passing separate keywords directly | ||||
| 696 | or by passing a reference to an array of keywords. Passing an empty | ||||
| 697 | array removes the query component, whereas passing no arguments at | ||||
| 698 | all leaves the component unchanged. The old value is always returned | ||||
| 699 | as a list of separate words. | ||||
| 700 | |||||
| 701 | =back | ||||
| 702 | |||||
| 703 | =head1 SERVER METHODS | ||||
| 704 | |||||
| 705 | For schemes where the I<authority> component denotes an Internet host, | ||||
| 706 | the following methods are available in addition to the generic | ||||
| 707 | methods. | ||||
| 708 | |||||
| 709 | =over 4 | ||||
| 710 | |||||
| 711 | =item $uri->userinfo | ||||
| 712 | |||||
| 713 | =item $uri->userinfo( $new_userinfo ) | ||||
| 714 | |||||
| 715 | Sets and returns the escaped userinfo part of the | ||||
| 716 | authority component. | ||||
| 717 | |||||
| 718 | For some schemes this is a user name and a password separated by | ||||
| 719 | a colon. This practice is not recommended. Embedding passwords in | ||||
| 720 | clear text (such as in a URI) has proven to be a security risk in almost | ||||
| 721 | every case where it has been used. | ||||
| 722 | |||||
| 723 | =item $uri->host | ||||
| 724 | |||||
| 725 | =item $uri->host( $new_host ) | ||||
| 726 | |||||
| 727 | Sets and returns the unescaped hostname. | ||||
| 728 | |||||
| 729 | If the $new_host string ends with a colon and a number, then this | ||||
| 730 | number also sets the port. | ||||
| 731 | |||||
| 732 | For IPv6 addresses the brackets around the raw address are removed in the return | ||||
| 733 | value from $uri->host. When setting the host attribute to an IPv6 address you | ||||
| 734 | can use a raw address or one enclosed in brackets. The address needs to be | ||||
| 735 | enclosed in brackets if you want to pass in a new port value as well. | ||||
| 736 | |||||
| 737 | =item $uri->ihost | ||||
| 738 | |||||
| 739 | Returns the host in Unicode form. Any IDNA A-labels are turned into U-labels. | ||||
| 740 | |||||
| 741 | =item $uri->port | ||||
| 742 | |||||
| 743 | =item $uri->port( $new_port ) | ||||
| 744 | |||||
| 745 | Sets and returns the port. The port is a simple integer | ||||
| 746 | that should be greater than 0. | ||||
| 747 | |||||
| 748 | If a port is not specified explicitly in the URI, then the URI scheme's default port | ||||
| 749 | is returned. If you don't want the default port | ||||
| 750 | substituted, then you can use the $uri->_port method instead. | ||||
| 751 | |||||
| 752 | =item $uri->host_port | ||||
| 753 | |||||
| 754 | =item $uri->host_port( $new_host_port ) | ||||
| 755 | |||||
| 756 | Sets and returns the host and port as a single | ||||
| 757 | unit. The returned value includes a port, even if it matches the | ||||
| 758 | default port. The host part and the port part are separated by a | ||||
| 759 | colon: ":". | ||||
| 760 | |||||
| 761 | For IPv6 addresses the bracketing is preserved; thus | ||||
| 762 | URI->new("http://[::1]/")->host_port returns "[::1]:80". Contrast this with | ||||
| 763 | $uri->host which will remove the brackets. | ||||
| 764 | |||||
| 765 | =item $uri->default_port | ||||
| 766 | |||||
| 767 | Returns the default port of the URI scheme to which $uri | ||||
| 768 | belongs. For I<http> this is the number 80, for I<ftp> this | ||||
| 769 | is the number 21, etc. The default port for a scheme cannot be | ||||
| 770 | changed. | ||||
| 771 | |||||
| 772 | =back | ||||
| 773 | |||||
| 774 | =head1 SCHEME-SPECIFIC SUPPORT | ||||
| 775 | |||||
| 776 | Scheme-specific support is provided for the following URI schemes. For C<URI> | ||||
| 777 | objects that do not belong to one of these, you can only use the common and | ||||
| 778 | generic methods. | ||||
| 779 | |||||
| 780 | =over 4 | ||||
| 781 | |||||
| 782 | =item B<data>: | ||||
| 783 | |||||
| 784 | The I<data> URI scheme is specified in RFC 2397. It allows inclusion | ||||
| 785 | of small data items as "immediate" data, as if it had been included | ||||
| 786 | externally. | ||||
| 787 | |||||
| 788 | C<URI> objects belonging to the data scheme support the common methods | ||||
| 789 | and two new methods to access their scheme-specific components: | ||||
| 790 | $uri->media_type and $uri->data. See L<URI::data> for details. | ||||
| 791 | |||||
| 792 | =item B<file>: | ||||
| 793 | |||||
| 794 | An old specification of the I<file> URI scheme is found in RFC 1738. | ||||
| 795 | A new RFC 2396 based specification is not available yet, but file URI | ||||
| 796 | references are in common use. | ||||
| 797 | |||||
| 798 | C<URI> objects belonging to the file scheme support the common and | ||||
| 799 | generic methods. In addition, they provide two methods for mapping file URIs | ||||
| 800 | back to local file names; $uri->file and $uri->dir. See L<URI::file> | ||||
| 801 | for details. | ||||
| 802 | |||||
| 803 | =item B<ftp>: | ||||
| 804 | |||||
| 805 | An old specification of the I<ftp> URI scheme is found in RFC 1738. A | ||||
| 806 | new RFC 2396 based specification is not available yet, but ftp URI | ||||
| 807 | references are in common use. | ||||
| 808 | |||||
| 809 | C<URI> objects belonging to the ftp scheme support the common, | ||||
| 810 | generic and server methods. In addition, they provide two methods for | ||||
| 811 | accessing the userinfo sub-components: $uri->user and $uri->password. | ||||
| 812 | |||||
| 813 | =item B<gopher>: | ||||
| 814 | |||||
| 815 | The I<gopher> URI scheme is specified in | ||||
| 816 | <draft-murali-url-gopher-1996-12-04> and will hopefully be available | ||||
| 817 | as an RFC 2396 based specification. | ||||
| 818 | |||||
| 819 | C<URI> objects belonging to the gopher scheme support the common, | ||||
| 820 | generic and server methods. In addition, they support some methods for | ||||
| 821 | accessing gopher-specific path components: $uri->gopher_type, | ||||
| 822 | $uri->selector, $uri->search, $uri->string. | ||||
| 823 | |||||
| 824 | =item B<http>: | ||||
| 825 | |||||
| 826 | The I<http> URI scheme is specified in RFC 2616. | ||||
| 827 | The scheme is used to reference resources hosted by HTTP servers. | ||||
| 828 | |||||
| 829 | C<URI> objects belonging to the http scheme support the common, | ||||
| 830 | generic and server methods. | ||||
| 831 | |||||
| 832 | =item B<https>: | ||||
| 833 | |||||
| 834 | The I<https> URI scheme is a Netscape invention which is commonly | ||||
| 835 | implemented. The scheme is used to reference HTTP servers through SSL | ||||
| 836 | connections. Its syntax is the same as http, but the default | ||||
| 837 | port is different. | ||||
| 838 | |||||
| 839 | =item B<ldap>: | ||||
| 840 | |||||
| 841 | The I<ldap> URI scheme is specified in RFC 2255. LDAP is the | ||||
| 842 | Lightweight Directory Access Protocol. An ldap URI describes an LDAP | ||||
| 843 | search operation to perform to retrieve information from an LDAP | ||||
| 844 | directory. | ||||
| 845 | |||||
| 846 | C<URI> objects belonging to the ldap scheme support the common, | ||||
| 847 | generic and server methods as well as ldap-specific methods: $uri->dn, | ||||
| 848 | $uri->attributes, $uri->scope, $uri->filter, $uri->extensions. See | ||||
| 849 | L<URI::ldap> for details. | ||||
| 850 | |||||
| 851 | =item B<ldapi>: | ||||
| 852 | |||||
| 853 | Like the I<ldap> URI scheme, but uses a UNIX domain socket. The | ||||
| 854 | server methods are not supported, and the local socket path is | ||||
| 855 | available as $uri->un_path. The I<ldapi> scheme is used by the | ||||
| 856 | OpenLDAP package. There is no real specification for it, but it is | ||||
| 857 | mentioned in various OpenLDAP manual pages. | ||||
| 858 | |||||
| 859 | =item B<ldaps>: | ||||
| 860 | |||||
| 861 | Like the I<ldap> URI scheme, but uses an SSL connection. This | ||||
| 862 | scheme is deprecated, as the preferred way is to use the I<start_tls> | ||||
| 863 | mechanism. | ||||
| 864 | |||||
| 865 | =item B<mailto>: | ||||
| 866 | |||||
| 867 | The I<mailto> URI scheme is specified in RFC 2368. The scheme was | ||||
| 868 | originally used to designate the Internet mailing address of an | ||||
| 869 | individual or service. It has (in RFC 2368) been extended to allow | ||||
| 870 | setting of other mail header fields and the message body. | ||||
| 871 | |||||
| 872 | C<URI> objects belonging to the mailto scheme support the common | ||||
| 873 | methods and the generic query methods. In addition, they support the | ||||
| 874 | following mailto-specific methods: $uri->to, $uri->headers. | ||||
| 875 | |||||
| 876 | Note that the "foo@example.com" part of a mailto is I<not> the | ||||
| 877 | C<userinfo> and C<host> but instead the C<path>. This allows a | ||||
| 878 | mailto URI to contain multiple comma separated email addresses. | ||||
| 879 | |||||
| 880 | =item B<mms>: | ||||
| 881 | |||||
| 882 | The I<mms> URL specification can be found at L<http://sdp.ppona.com/>. | ||||
| 883 | C<URI> objects belonging to the mms scheme support the common, | ||||
| 884 | generic, and server methods, with the exception of userinfo and | ||||
| 885 | query-related sub-components. | ||||
| 886 | |||||
| 887 | =item B<news>: | ||||
| 888 | |||||
| 889 | The I<news>, I<nntp> and I<snews> URI schemes are specified in | ||||
| 890 | <draft-gilman-news-url-01> and will hopefully be available as an RFC | ||||
| 891 | 2396 based specification soon. | ||||
| 892 | |||||
| 893 | C<URI> objects belonging to the news scheme support the common, | ||||
| 894 | generic and server methods. In addition, they provide some methods to | ||||
| 895 | access the path: $uri->group and $uri->message. | ||||
| 896 | |||||
| 897 | =item B<nntp>: | ||||
| 898 | |||||
| 899 | See I<news> scheme. | ||||
| 900 | |||||
| 901 | =item B<pop>: | ||||
| 902 | |||||
| 903 | The I<pop> URI scheme is specified in RFC 2384. The scheme is used to | ||||
| 904 | reference a POP3 mailbox. | ||||
| 905 | |||||
| 906 | C<URI> objects belonging to the pop scheme support the common, generic | ||||
| 907 | and server methods. In addition, they provide two methods to access the | ||||
| 908 | userinfo components: $uri->user and $uri->auth. | ||||
| 909 | |||||
| 910 | =item B<rlogin>: | ||||
| 911 | |||||
| 912 | An old specification of the I<rlogin> URI scheme is found in RFC | ||||
| 913 | 1738. C<URI> objects belonging to the rlogin scheme support the | ||||
| 914 | common, generic and server methods. | ||||
| 915 | |||||
| 916 | =item B<rtsp>: | ||||
| 917 | |||||
| 918 | The I<rtsp> URL specification can be found in section 3.2 of RFC 2326. | ||||
| 919 | C<URI> objects belonging to the rtsp scheme support the common, | ||||
| 920 | generic, and server methods, with the exception of userinfo and | ||||
| 921 | query-related sub-components. | ||||
| 922 | |||||
| 923 | =item B<rtspu>: | ||||
| 924 | |||||
| 925 | The I<rtspu> URI scheme is used to talk to RTSP servers over UDP | ||||
| 926 | instead of TCP. The syntax is the same as rtsp. | ||||
| 927 | |||||
| 928 | =item B<rsync>: | ||||
| 929 | |||||
| 930 | Information about rsync is available from L<http://rsync.samba.org/>. | ||||
| 931 | C<URI> objects belonging to the rsync scheme support the common, | ||||
| 932 | generic and server methods. In addition, they provide methods to | ||||
| 933 | access the userinfo sub-components: $uri->user and $uri->password. | ||||
| 934 | |||||
| 935 | =item B<sip>: | ||||
| 936 | |||||
| 937 | The I<sip> URI specification is described in sections 19.1 and 25 | ||||
| 938 | of RFC 3261. C<URI> objects belonging to the sip scheme support the | ||||
| 939 | common, generic, and server methods with the exception of path related | ||||
| 940 | sub-components. In addition, they provide two methods to get and set | ||||
| 941 | I<sip> parameters: $uri->params_form and $uri->params. | ||||
| 942 | |||||
| 943 | =item B<sips>: | ||||
| 944 | |||||
| 945 | See I<sip> scheme. Its syntax is the same as sip, but the default | ||||
| 946 | port is different. | ||||
| 947 | |||||
| 948 | =item B<snews>: | ||||
| 949 | |||||
| 950 | See I<news> scheme. Its syntax is the same as news, but the default | ||||
| 951 | port is different. | ||||
| 952 | |||||
| 953 | =item B<telnet>: | ||||
| 954 | |||||
| 955 | An old specification of the I<telnet> URI scheme is found in RFC | ||||
| 956 | 1738. C<URI> objects belonging to the telnet scheme support the | ||||
| 957 | common, generic and server methods. | ||||
| 958 | |||||
| 959 | =item B<tn3270>: | ||||
| 960 | |||||
| 961 | These URIs are used like I<telnet> URIs but for connections to IBM | ||||
| 962 | mainframes. C<URI> objects belonging to the tn3270 scheme support the | ||||
| 963 | common, generic and server methods. | ||||
| 964 | |||||
| 965 | =item B<ssh>: | ||||
| 966 | |||||
| 967 | Information about ssh is available at L<http://www.openssh.com/>. | ||||
| 968 | C<URI> objects belonging to the ssh scheme support the common, | ||||
| 969 | generic and server methods. In addition, they provide methods to | ||||
| 970 | access the userinfo sub-components: $uri->user and $uri->password. | ||||
| 971 | |||||
| 972 | =item B<urn>: | ||||
| 973 | |||||
| 974 | The syntax of Uniform Resource Names is specified in RFC 2141. C<URI> | ||||
| 975 | objects belonging to the urn scheme provide the common methods, and also the | ||||
| 976 | methods $uri->nid and $uri->nss, which return the Namespace Identifier | ||||
| 977 | and the Namespace-Specific String respectively. | ||||
| 978 | |||||
| 979 | The Namespace Identifier basically works like the Scheme identifier of | ||||
| 980 | URIs, and further divides the URN namespace. Namespace Identifier | ||||
| 981 | assignments are maintained at | ||||
| 982 | L<http://www.iana.org/assignments/urn-namespaces>. | ||||
| 983 | |||||
| 984 | Letter case is not significant for the Namespace Identifier. It is | ||||
| 985 | always returned in lower case by the $uri->nid method. The $uri->_nid | ||||
| 986 | method can be used if you want it in its original case. | ||||
| 987 | |||||
| 988 | =item B<urn>:B<isbn>: | ||||
| 989 | |||||
| 990 | The C<urn:isbn:> namespace contains International Standard Book | ||||
| 991 | Numbers (ISBNs) and is described in RFC 3187. A C<URI> object belonging | ||||
| 992 | to this namespace has the following extra methods (if the | ||||
| 993 | Business::ISBN module is available): $uri->isbn, | ||||
| 994 | $uri->isbn_publisher_code, $uri->isbn_group_code (formerly isbn_country_code, | ||||
| 995 | which is still supported but issues a deprecation warning), $uri->isbn_as_ean. | ||||
| 996 | |||||
| 997 | =item B<urn>:B<oid>: | ||||
| 998 | |||||
| 999 | The C<urn:oid:> namespace contains Object Identifiers (OIDs) and is | ||||
| 1000 | described in RFC 3061. An object identifier consists of sequences of digits | ||||
| 1001 | separated by dots. A C<URI> object belonging to this namespace has an | ||||
| 1002 | additional method called $uri->oid that can be used to get/set the oid | ||||
| 1003 | value. In a list context, oid numbers are returned as separate elements. | ||||
| 1004 | |||||
| 1005 | =back | ||||
| 1006 | |||||
| 1007 | =head1 CONFIGURATION VARIABLES | ||||
| 1008 | |||||
| 1009 | The following configuration variables influence how the class and its | ||||
| 1010 | methods behave: | ||||
| 1011 | |||||
| 1012 | =over 4 | ||||
| 1013 | |||||
| 1014 | =item $URI::ABS_ALLOW_RELATIVE_SCHEME | ||||
| 1015 | |||||
| 1016 | Some older parsers used to allow the scheme name to be present in the | ||||
| 1017 | relative URL if it was the same as the base URL scheme. RFC 2396 says | ||||
| 1018 | that this should be avoided, but you can enable this old behaviour by | ||||
| 1019 | setting the $URI::ABS_ALLOW_RELATIVE_SCHEME variable to a TRUE value. | ||||
| 1020 | The difference is demonstrated by the following examples: | ||||
| 1021 | |||||
| 1022 | URI->new("http:foo")->abs("http://host/a/b") | ||||
| 1023 | ==> "http:foo" | ||||
| 1024 | |||||
| 1025 | local $URI::ABS_ALLOW_RELATIVE_SCHEME = 1; | ||||
| 1026 | URI->new("http:foo")->abs("http://host/a/b") | ||||
| 1027 | ==> "http://host/a/foo" | ||||
| 1028 | |||||
| 1029 | |||||
| 1030 | =item $URI::ABS_REMOTE_LEADING_DOTS | ||||
| 1031 | |||||
| 1032 | You can also have the abs() method ignore excess ".." | ||||
| 1033 | segments in the relative URI by setting $URI::ABS_REMOTE_LEADING_DOTS | ||||
| 1034 | to a TRUE value. The difference is demonstrated by the following | ||||
| 1035 | examples: | ||||
| 1036 | |||||
| 1037 | URI->new("../../../foo")->abs("http://host/a/b") | ||||
| 1038 | ==> "http://host/../../foo" | ||||
| 1039 | |||||
| 1040 | local $URI::ABS_REMOTE_LEADING_DOTS = 1; | ||||
| 1041 | URI->new("../../../foo")->abs("http://host/a/b") | ||||
| 1042 | ==> "http://host/foo" | ||||
| 1043 | |||||
| 1044 | =item $URI::DEFAULT_QUERY_FORM_DELIMITER | ||||
| 1045 | |||||
| 1046 | This value can be set to ";" to have the query form C<key=value> pairs | ||||
| 1047 | delimited by ";" instead of "&" which is the default. | ||||
| 1048 | |||||
| 1049 | =back | ||||
| 1050 | |||||
| 1051 | =head1 BUGS | ||||
| 1052 | |||||
| 1053 | Using regexp variables like $1 directly as arguments to the URI methods | ||||
| 1054 | does not work too well with current perl implementations. I would argue | ||||
| 1055 | that this is actually a bug in perl. The workaround is to quote | ||||
| 1056 | them. Example: | ||||
| 1057 | |||||
| 1058 | /(...)/ || die; | ||||
| 1059 | $u->query("$1"); | ||||
| 1060 | |||||
| 1061 | =head1 PARSING URIs WITH REGEXP | ||||
| 1062 | |||||
| 1063 | As an alternative to this module, the following (official) regular | ||||
| 1064 | expression can be used to decode a URI: | ||||
| 1065 | |||||
| 1066 | my($scheme, $authority, $path, $query, $fragment) = | ||||
| 1067 | $uri =~ m|(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?|; | ||||
| 1068 | |||||
| 1069 | The C<URI::Split> module provides the function uri_split() as a | ||||
| 1070 | readable alternative. | ||||
| 1071 | |||||
| 1072 | =head1 SEE ALSO | ||||
| 1073 | |||||
| 1074 | L<URI::file>, L<URI::WithBase>, L<URI::QueryParam>, L<URI::Escape>, | ||||
| 1075 | L<URI::Split>, L<URI::Heuristic> | ||||
| 1076 | |||||
| 1077 | RFC 2396: "Uniform Resource Identifiers (URI): Generic Syntax", | ||||
| 1078 | Berners-Lee, Fielding, Masinter, August 1998. | ||||
| 1079 | |||||
| 1080 | L<http://www.iana.org/assignments/uri-schemes> | ||||
| 1081 | |||||
| 1082 | L<http://www.iana.org/assignments/urn-namespaces> | ||||
| 1083 | |||||
| 1084 | L<http://www.w3.org/Addressing/> | ||||
| 1085 | |||||
| 1086 | =head1 COPYRIGHT | ||||
| 1087 | |||||
| 1088 | Copyright 1995-2009 Gisle Aas. | ||||
| 1089 | |||||
| 1090 | Copyright 1995 Martijn Koster. | ||||
| 1091 | |||||
| 1092 | This program is free software; you can redistribute it and/or modify | ||||
| 1093 | it under the same terms as Perl itself. | ||||
| 1094 | |||||
| 1095 | =head1 AUTHORS / ACKNOWLEDGMENTS | ||||
| 1096 | |||||
| 1097 | This module is based on the C<URI::URL> module, which in turn was | ||||
| 1098 | (distantly) based on the C<wwwurl.pl> code in the libwww-perl for | ||||
| 1099 | perl4 developed by Roy Fielding, as part of the Arcadia project at the | ||||
| 1100 | University of California, Irvine, with contributions from Brooks | ||||
| 1101 | Cutter. | ||||
| 1102 | |||||
| 1103 | C<URI::URL> was developed by Gisle Aas, Tim Bunce, Roy Fielding and | ||||
| 1104 | Martijn Koster with input from other people on the libwww-perl mailing | ||||
| 1105 | list. | ||||
| 1106 | |||||
| 1107 | C<URI> and related subclasses were developed by Gisle Aas. | ||||
| 1108 | |||||
| 1109 | =cut | ||||
# spent 225µs within URI::CORE:match which was called 78 times, avg 3µs/call:
# 20 times (51µs+0s) by URI::canonical at line 304 of URI.pm, avg 3µs/call
# 12 times (69µs+0s) by URI::canonical at line 295 of URI.pm, avg 6µs/call
# 12 times (35µs+0s) by URI::_scheme at line 165 of URI.pm, avg 3µs/call
# 12 times (19µs+0s) by URI::canonical at line 294 of URI.pm, avg 2µs/call
# 11 times (26µs+0s) by URI::_init at line 82 of URI.pm, avg 2µs/call
# 11 times (25µs+0s) by URI::new at line 47 of URI.pm, avg 2µs/call | |||||
# spent 247µs within URI::CORE:regcomp which was called 65 times, avg 4µs/call:
# 20 times (36µs+0s) by URI::canonical at line 304 of URI.pm, avg 2µs/call
# 12 times (48µs+0s) by URI::_scheme at line 165 of URI.pm, avg 4µs/call
# 11 times (72µs+0s) by URI::_init at line 82 of URI.pm, avg 7µs/call
# 11 times (65µs+0s) by URI::new at line 47 of URI.pm, avg 6µs/call
# 11 times (26µs+0s) by URI::_uric_escape at line 92 of URI.pm, avg 2µs/call | |||||
# spent 385µs within URI::CORE:subst which was called 64 times, avg 6µs/call:
# 11 times (158µs+0s) by URI::new at line 41 of URI.pm, avg 14µs/call
# 11 times (67µs+0s) by URI::_uric_escape at line 92 of URI.pm, avg 6µs/call
# 11 times (46µs+0s) by URI::new at line 43 of URI.pm, avg 4µs/call
# 11 times (30µs+0s) by URI::new at line 44 of URI.pm, avg 3µs/call
# 11 times (18µs+0s) by URI::new at line 42 of URI.pm, avg 2µs/call
# 9 times (66µs+0s) by URI::canonical at line 303 of URI.pm, avg 7µs/call | |||||
# spent 309µs within URI::CORE:substcont which was called 58 times, avg 5µs/call:
# 29 times (182µs+0s) by URI::canonical at line 303 of URI.pm, avg 6µs/call
# 29 times (127µs+0s) by URI::_uric_escape at line 92 of URI.pm, avg 4µs/call |
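
The regular expression quoted in the PARSING URIs WITH REGEXP section of the POD can be tried without loading the URI module at all. The following is a minimal sketch using only core Perl; the helper name `uri_parts` and the sample URIs are illustrative assumptions and are not part of URI.pm.

```perl
use strict;
use warnings;

# Split a URI reference into its five generic components using the
# regular expression quoted in the POD above.
# NOTE: uri_parts() is a hypothetical helper, not a function of the URI module.
sub uri_parts {
    my ($uri) = @_;
    my ($scheme, $authority, $path, $query, $fragment) =
        $uri =~ m|(?:([^:/?#]+):)?(?://([^/?#]*))?([^?#]*)(?:\?([^#]*))?(?:#(.*))?|;
    return ($scheme, $authority, $path, $query, $fragment);
}

# Components that are absent from the input come back as undef,
# while the path is always defined (possibly as the empty string).
for my $uri ("http://www.example.com/a/b?x=1#top", "mailto:foo\@example.com") {
    my @parts = uri_parts($uri);
    print join(" | ", map { defined $_ ? $_ : "(undef)" } @parts), "\n";
}
```

The components are returned still escaped, and no scheme-specific normalization is applied, so this is only a rough substitute for the accessor methods documented above.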