1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 128 129 130 131 132 133 134 135 136 137 138 139 140 141 142 143 144 145 146 147 148 149 150 151 152 153 154 155 156 157 158 159 160 161 162 163 164 165 166 167 168 169 170 171 172 173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188 189 190 191 192 193 194 195 196 197 198 199 200 201 202 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 219 220 221 222 223 224 225 226 227 228 229 230 231 232 233 234 235 236 237 238 239 240 241 242 243 244 245 246 247 248 249 250 251 252 253 254 255 256 257 258 259 260 261 262 263 264 265 266 267 268 269 270 271 272 273 274 275 276 277 278 279 280 281 282 283 284 285 286 287 288 289 290 291 292 293 294 295 296 297 298 299 300 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 322 323 324 325 326 327 328 329 330 331 332 333 334 335 336 337 338 339 340 341 342 343 344 345 346 347 348 349 350 351 352 353 354 355 356 357 358 359 360 361 362 363 364 365 366 367 368 369 370 371 372 373 374 375 376 377 378 379 380 381 382 383 384 385 386 387 388 389 390 391 392 393 394 395 396 397 398 399 400 401 402 403 404 405 406 407 408 409 410 411 412 413 414 415 416 417 418 419 420 421 422 423 424 425 426 427 428 429 430 431 432 433 434 435 436 437 438 439 440 441 442 443 444 445 446 447 448 449 450 451 452 453 454 455 456 457 458 459 460 461 462 463 464 465 466 467 468 469 470 471 472 473 474 475 476 477 478 479 480 481 482 483 484 485 486 487 488 489 490 491 492 493 494 495 496 497 498 499 500 501 502 503 504 505 506 507 508 509 510 511 512 513 514 515 516 517 518 519 520 521 522 523 524 525 526 527 528 529 530 531 532 533 534 535 536 537 538 539 540 541 542 543 544 545 546 547 548 549 550 551 552 553 554 555 556 557 558 559 560 561 562 563 564 565 566 567 568 569 570 571 572 573 574 575 576 577 578 579 580 581 582 583 584 585 586 587 588 589 590 591 592 593 594 595 596 597 598 599 600 601 602 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679 680 681 682 683 684 685 686 687 688 689 690 691 692 693 694 695 696 697 698 699 700 701 702 703 704 705 706 707 708 709 710 711 712 713 714 715 716 717 718 719 720 721 722 723 724 725 726 727 | 242 242 242 242 242 242 242 1748 1748 1748 1748 1748 1748 1748 1748 1748 1748 1748 1748 242 242 242 242 242 242 242 242 242 242 242 242 242 242 242 242 1315 1099 1099 1091 242 1315 8 1307 1307 1307 1307 1307 1307 579 579 63 63 63 63 6 6 1 5 57 1 1 63 1244 1244 827 827 827 827 1244 1226 1226 648 648 1244 764 764 2292 2292 675 764 764 93 671 764 78 78 78 764 764 15280 15280 703 764 93 764 764 764 764 764 764 752 752 1011 1011 879 19 19 113 54 59 19 3 3 3 3 3 3 3 3 3 3 764 764 764 752 764 764 764 764 12 12 4 1244 1241 18615 18615 18551 64 64 4 64 1244 1244 143 143 1244 1244 321 321 321 1 321 923 1 1 1244 1244 35 1244 1147 1147 1147 1244 1244 242 347 323 6 317 240 242 2235 2235 146 146 146 2235 2235 1323 912 10 10 1 2235 3 2235 2235 2235 1344 1344 891 691 2235 2235 2235 6 2235 2235 242 217 242 217 242 216 216 242 433 216 216 216 433 433 433 5196 5196 433 433 16 16 417 16 16 192 192 176 16 6 16 16 401 28 8 8 96 96 8 8 20 20 6 6 6 6 6 6 6 14 20 20 20 20 20 20 20 20 20 20 20 20 20 373 373 77 77 77 50 34 77 77 4 4 4 4 4 4 77 373 60 60 60 60 60 313 295 295 295 295 295 18 10 4 4 4 10 10 10 10 10 10 363 363 363 363 363 2061 2061 69 1992 218 218 1774 184 184 363 8 6 363 16 363 26 363 363 73 73 73 8 8 363 363 18 363 22 22 341 363 347 363 363 363 363 242 764 764 764 44 44 34 44 764 242 471 484 471 | (function () { 'use strict'; const punycode = require('punycode'); exports.parse = urlParse; exports.resolve = urlResolve; exports.resolveObject = urlResolveObject; exports.format = urlFormat; exports.Url = Url; function Url() { this.protocol = null; this.slashes = null; this.auth = null; this.host = null; this.port = null; this.hostname = null; this.hash = null; this.search = null; this.query = null; this.pathname = null; this.path = null; this.href = null; } // Reference: RFC 3986, RFC 1808, RFC 2396 // define these here so at least they only have to be // compiled once on the first module load. const protocolPattern = /^([a-z0-9.+-]+:)/i; const portPattern = /:[0-9]*$/; // Special case for a simple path URL const simplePathPattern = /^(\/\/?(?!\/)[^\?\s]*)(\?[^\s]*)?$/; // RFC 2396: characters reserved for delimiting URLs. // We actually just auto-escape these. const delims = ['<', '>', '"', '`', ' ', '\r', '\n', '\t']; // RFC 2396: characters not allowed for various reasons. const unwise = ['{', '}', '|', '\\', '^', '`'].concat(delims); // Allowed by RFCs, but cause of XSS attacks. Always escape these. const autoEscape = ['\''].concat(unwise); // Characters that are never ever allowed in a hostname. // Note that any invalid chars are also handled, but these // are the ones that are *expected* to be seen, so we fast-path them. const nonHostChars = ['%', '/', '?', ';', '#'].concat(autoEscape); const hostEndingChars = ['/', '?', '#']; const hostnameMaxLen = 255; const hostnamePartPattern = /^[+a-z0-9A-Z_-]{0,63}$/; const hostnamePartStart = /^([+a-z0-9A-Z_-]{0,63})(.*)$/; // protocols that can allow "unsafe" and "unwise" chars. const unsafeProtocol = { 'javascript': true, 'javascript:': true }; // protocols that never have a hostname. const hostlessProtocol = { 'javascript': true, 'javascript:': true }; // protocols that always contain a // bit. const slashedProtocol = { 'http': true, 'https': true, 'ftp': true, 'gopher': true, 'file': true, 'http:': true, 'https:': true, 'ftp:': true, 'gopher:': true, 'file:': true }; const querystring = require('querystring'); function urlParse(url, parseQueryString, slashesDenoteHost) { if (url instanceof Url) return url; var u = new Url(); u.parse(url, parseQueryString, slashesDenoteHost); return u; } Url.prototype.parse = function(url, parseQueryString, slashesDenoteHost) { if (typeof url !== 'string') { throw new TypeError("Parameter 'url' must be a string, not " + typeof url); } // Copy chrome, IE, opera backslash-handling behavior. // Back slashes before the query string get converted to forward slashes // See: https://code.google.com/p/chromium/issues/detail?id=25916 var queryIndex = url.indexOf('?'), splitter = (queryIndex !== -1 && queryIndex < url.indexOf('#')) ? '?' : '#', uSplit = url.split(splitter), slashRegex = /\\/g; uSplit[0] = uSplit[0].replace(slashRegex, '/'); url = uSplit.join(splitter); var rest = url; // trim before proceeding. // This is to support parse stuff like " http://foo.com \n" rest = rest.trim(); if (!slashesDenoteHost && url.split('#').length === 1) { // Try fast path regexp var simplePath = simplePathPattern.exec(rest); if (simplePath) { this.path = rest; this.href = rest; this.pathname = simplePath[1]; if (simplePath[2]) { this.search = simplePath[2]; if (parseQueryString) { this.query = querystring.parse(this.search.substr(1)); } else { this.query = this.search.substr(1); } } else if (parseQueryString) { this.search = ''; this.query = {}; } return this; } } var proto = protocolPattern.exec(rest); if (proto) { proto = proto[0]; var lowerProto = proto.toLowerCase(); this.protocol = lowerProto; rest = rest.substr(proto.length); } // figure out if it's got a host // user@server is *always* interpreted as a hostname, and url // resolution will treat //foo/bar as host=foo,path=bar because that's // how the browser resolves relative URLs. if (slashesDenoteHost || proto || rest.match(/^\/\/[^@\/]+@[^@\/]+/)) { var slashes = rest.substr(0, 2) === '//'; if (slashes && !(proto && hostlessProtocol[proto])) { rest = rest.substr(2); this.slashes = true; } } if (!hostlessProtocol[proto] && (slashes || (proto && !slashedProtocol[proto]))) { // there's a hostname. // the first instance of /, ?, ;, or # ends the host. // // If there is an @ in the hostname, then non-host chars *are* allowed // to the left of the last @ sign, unless some host-ending character // comes *before* the @-sign. // URLs are obnoxious. // // ex: // http://a@b@c/ => user:a@b host:c // http://a@b?@c => user:a host:c path:/?@c // v0.12 TODO(isaacs): This is not quite how Chrome does things. // Review our test case against browsers more comprehensively. // find the first instance of any hostEndingChars var hostEnd = -1; for (var i = 0; i < hostEndingChars.length; i++) { var hec = rest.indexOf(hostEndingChars[i]); if (hec !== -1 && (hostEnd === -1 || hec < hostEnd)) hostEnd = hec; } // at this point, either we have an explicit point where the // auth portion cannot go past, or the last @ char is the decider. var auth, atSign; if (hostEnd === -1) { // atSign can be anywhere. atSign = rest.lastIndexOf('@'); } else { // atSign must be in auth portion. // http://a@b/c@d => host:b auth:a path:/c@d atSign = rest.lastIndexOf('@', hostEnd); } // Now we have a portion which is definitely the auth. // Pull that off. if (atSign !== -1) { auth = rest.slice(0, atSign); rest = rest.slice(atSign + 1); this.auth = decodeURIComponent(auth); } // the host is the remaining to the left of the first non-host char hostEnd = -1; for (var i = 0; i < nonHostChars.length; i++) { var hec = rest.indexOf(nonHostChars[i]); if (hec !== -1 && (hostEnd === -1 || hec < hostEnd)) hostEnd = hec; } // if we still have not hit it, then the entire thing is a host. if (hostEnd === -1) hostEnd = rest.length; this.host = rest.slice(0, hostEnd); rest = rest.slice(hostEnd); // pull out port. this.parseHost(); // we've indicated that there is a hostname, // so even if it's empty, it has to be present. this.hostname = this.hostname || ''; // if hostname begins with [ and ends with ] // assume that it's an IPv6 address. var ipv6Hostname = this.hostname[0] === '[' && this.hostname[this.hostname.length - 1] === ']'; // validate a little. if (!ipv6Hostname) { var hostparts = this.hostname.split(/\./); for (var i = 0, l = hostparts.length; i < l; i++) { var part = hostparts[i]; if (!part) continue; if (!part.match(hostnamePartPattern)) { var newpart = ''; for (var j = 0, k = part.length; j < k; j++) { if (part.charCodeAt(j) > 127) { // we replace non-ASCII char with a temporary placeholder // we need this to make sure size of hostname is not // broken by replacing non-ASCII by nothing newpart += 'x'; } else { newpart += part[j]; } } // we test again with ASCII char only if (!newpart.match(hostnamePartPattern)) { var validParts = hostparts.slice(0, i); var notHost = hostparts.slice(i + 1); var bit = part.match(hostnamePartStart); Eif (bit) { validParts.push(bit[1]); notHost.unshift(bit[2]); } Eif (notHost.length) { rest = '/' + notHost.join('.') + rest; } this.hostname = validParts.join('.'); break; } } } } Iif (this.hostname.length > hostnameMaxLen) { this.hostname = ''; } else { // hostnames are always lower case. this.hostname = this.hostname.toLowerCase(); } if (!ipv6Hostname) { // IDNA Support: Returns a punycoded representation of "domain". // It only converts parts of the domain name that // have non-ASCII characters, i.e. it doesn't matter if // you call it with a domain that already is ASCII-only. this.hostname = punycode.toASCII(this.hostname); } var p = this.port ? ':' + this.port : ''; var h = this.hostname || ''; this.host = h + p; // strip [ and ] from the hostname // the host field still retains them, though if (ipv6Hostname) { this.hostname = this.hostname.substr(1, this.hostname.length - 2); if (rest[0] !== '/') { rest = '/' + rest; } } } // now rest is set to the post-host stuff. // chop off any delim chars. if (!unsafeProtocol[lowerProto]) { // First, make 100% sure that any "autoEscape" chars get // escaped, even if encodeURIComponent doesn't think they // need to be. for (var i = 0, l = autoEscape.length; i < l; i++) { var ae = autoEscape[i]; if (rest.indexOf(ae) === -1) continue; var esc = encodeURIComponent(ae); if (esc === ae) { esc = escape(ae); } rest = rest.split(ae).join(esc); } } // chop off from the tail first. var hash = rest.indexOf('#'); if (hash !== -1) { // got a fragment string. this.hash = rest.substr(hash); rest = rest.slice(0, hash); } var qm = rest.indexOf('?'); if (qm !== -1) { this.search = rest.substr(qm); this.query = rest.substr(qm + 1); if (parseQueryString) { this.query = querystring.parse(this.query); } rest = rest.slice(0, qm); } else if (parseQueryString) { // no query string, but parseQueryString still requested this.search = ''; this.query = {}; } if (rest) this.pathname = rest; if (slashedProtocol[lowerProto] && this.hostname && !this.pathname) { this.pathname = '/'; } //to support http.request if (this.pathname || this.search) { var p = this.pathname || ''; var s = this.search || ''; this.path = p + s; } // finally, reconstruct the href based on what has been validated. this.href = this.format(); return this; }; // format a parsed object into a url string function urlFormat(obj) { // ensure it's an object, and not a string url. // If it's an obj, this is a no-op. // this way, you can call url_format() on strings // to clean up potentially wonky urls. if (typeof obj === 'string') obj = urlParse(obj); else if (typeof obj !== 'object' || obj === null) throw new TypeError("Parameter 'urlObj' must be an object, not " + obj === null ? 'null' : typeof obj); else if (!(obj instanceof Url)) return Url.prototype.format.call(obj); return obj.format(); } Url.prototype.format = function() { var auth = this.auth || ''; if (auth) { auth = encodeURIComponent(auth); auth = auth.replace(/%3A/i, ':'); auth += '@'; } var protocol = this.protocol || '', pathname = this.pathname || '', hash = this.hash || '', host = false, query = ''; if (this.host) { host = auth + this.host; } else if (this.hostname) { host = auth + (this.hostname.indexOf(':') === -1 ? this.hostname : '[' + this.hostname + ']'); if (this.port) { host += ':' + this.port; } } if (this.query !== null && typeof this.query === 'object' && Object.keys(this.query).length) { query = querystring.stringify(this.query); } var search = this.search || (query && ('?' + query)) || ''; if (protocol && protocol.substr(-1) !== ':') protocol += ':'; // only the slashedProtocols get the //. Not mailto:, xmpp:, etc. // unless they had them to begin with. if (this.slashes || (!protocol || slashedProtocol[protocol]) && host !== false) { host = '//' + (host || ''); if (pathname && pathname.charAt(0) !== '/') pathname = '/' + pathname; } else if (!host) { host = ''; } if (hash && hash.charAt(0) !== '#') hash = '#' + hash; if (search && search.charAt(0) !== '?') search = '?' + search; pathname = pathname.replace(/[?#]/g, function(match) { return encodeURIComponent(match); }); search = search.replace('#', '%23'); return protocol + host + pathname + search + hash; }; function urlResolve(source, relative) { return urlParse(source, false, true).resolve(relative); } Url.prototype.resolve = function(relative) { return this.resolveObject(urlParse(relative, false, true)).format(); }; function urlResolveObject(source, relative) { Iif (!source) return relative; return urlParse(source, false, true).resolveObject(relative); } Url.prototype.resolveObject = function(relative) { if (typeof relative === 'string') { var rel = new Url(); rel.parse(relative, false, true); relative = rel; } var result = new Url(); var tkeys = Object.keys(this); for (var tk = 0; tk < tkeys.length; tk++) { var tkey = tkeys[tk]; result[tkey] = this[tkey]; } // hash is always overridden, no matter what. // even href="" will remove it. result.hash = relative.hash; // if the relative url is empty, then there's nothing left to do here. if (relative.href === '') { result.href = result.format(); return result; } // hrefs like //foo/bar always cut to the protocol. if (relative.slashes && !relative.protocol) { // take everything except the protocol from relative var rkeys = Object.keys(relative); for (var rk = 0; rk < rkeys.length; rk++) { var rkey = rkeys[rk]; if (rkey !== 'protocol') result[rkey] = relative[rkey]; } //urlParse appends trailing / to urls like http://www.example.com if (slashedProtocol[result.protocol] && result.hostname && !result.pathname) { result.path = result.pathname = '/'; } result.href = result.format(); return result; } if (relative.protocol && relative.protocol !== result.protocol) { // if it's a known url protocol, then changing // the protocol does weird things // first, if it's not file:, then we MUST have a host, // and if there was a path // to begin with, then we MUST have a path. // if it is file:, then the host is dropped, // because that's known to be hostless. // anything else is assumed to be absolute. if (!slashedProtocol[relative.protocol]) { var keys = Object.keys(relative); for (var v = 0; v < keys.length; v++) { var k = keys[v]; result[k] = relative[k]; } result.href = result.format(); return result; } result.protocol = relative.protocol; if (!relative.host && !/^file:?$/.test(relative.protocol) && !hostlessProtocol[relative.protocol]) { var relPath = (relative.pathname || '').split('/'); while (relPath.length && !(relative.host = relPath.shift())); if (!relative.host) relative.host = ''; Eif (!relative.hostname) relative.hostname = ''; Eif (relPath[0] !== '') relPath.unshift(''); if (relPath.length < 2) relPath.unshift(''); result.pathname = relPath.join('/'); } else { result.pathname = relative.pathname; } result.search = relative.search; result.query = relative.query; result.host = relative.host || ''; result.auth = relative.auth; result.hostname = relative.hostname || relative.host; result.port = relative.port; // to support http.request Eif (result.pathname || result.search) { var p = result.pathname || ''; var s = result.search || ''; result.path = p + s; } result.slashes = result.slashes || relative.slashes; result.href = result.format(); return result; } var isSourceAbs = (result.pathname && result.pathname.charAt(0) === '/'), isRelAbs = ( relative.host || relative.pathname && relative.pathname.charAt(0) === '/' ), mustEndAbs = (isRelAbs || isSourceAbs || (result.host && relative.pathname)), removeAllDots = mustEndAbs, srcPath = result.pathname && result.pathname.split('/') || [], relPath = relative.pathname && relative.pathname.split('/') || [], psychotic = result.protocol && !slashedProtocol[result.protocol]; // if the url is a non-slashed url, then relative // links like ../.. should be able // to crawl up to the hostname, as well. This is strange. // result.protocol has already been set by now. // Later on, put the first path part into the host field. if (psychotic) { result.hostname = ''; result.port = null; if (result.host) { if (srcPath[0] === '') srcPath[0] = result.host; else srcPath.unshift(result.host); } result.host = ''; if (relative.protocol) { relative.hostname = null; relative.port = null; Eif (relative.host) { Iif (relPath[0] === '') relPath[0] = relative.host; else relPath.unshift(relative.host); } relative.host = null; } mustEndAbs = mustEndAbs && (relPath[0] === '' || srcPath[0] === ''); } if (isRelAbs) { // it's absolute. result.host = (relative.host || relative.host === '') ? relative.host : result.host; result.hostname = (relative.hostname || relative.hostname === '') ? relative.hostname : result.hostname; result.search = relative.search; result.query = relative.query; srcPath = relPath; // fall through to the dot-handling below. } else if (relPath.length) { // it's relative // throw away the existing file, and take the new path instead. Iif (!srcPath) srcPath = []; srcPath.pop(); srcPath = srcPath.concat(relPath); result.search = relative.search; result.query = relative.query; } else if (relative.search !== null && relative.search !== undefined) { // just pull out the search. // like href='?foo'. // Put this after the other two cases because it simplifies the booleans if (psychotic) { result.hostname = result.host = srcPath.shift(); //occationaly the auth can get stuck only in host //this especialy happens in cases like //url.resolveObject('mailto:local1@domain1', 'local2@domain2') var authInHost = result.host && result.host.indexOf('@') > 0 ? result.host.split('@') : false; Iif (authInHost) { result.auth = authInHost.shift(); result.host = result.hostname = authInHost.shift(); } } result.search = relative.search; result.query = relative.query; //to support http.request Eif (result.pathname !== null || result.search !== null) { result.path = (result.pathname ? result.pathname : '') + (result.search ? result.search : ''); } result.href = result.format(); return result; } Iif (!srcPath.length) { // no path at all. easy. // we've already handled the other stuff above. result.pathname = null; //to support http.request if (result.search) { result.path = '/' + result.search; } else { result.path = null; } result.href = result.format(); return result; } // if a url ENDs in . or .., then it must get a trailing slash. // however, if it ends in anything else non-slashy, // then it must NOT get a trailing slash. var last = srcPath.slice(-1)[0]; var hasTrailingSlash = ( (result.host || relative.host || srcPath.length > 1) && (last === '.' || last === '..') || last === ''); // strip single dots, resolve double dots to parent dir // if the path tries to go above the root, `up` ends up > 0 var up = 0; for (var i = srcPath.length; i >= 0; i--) { last = srcPath[i]; if (last === '.') { spliceOne(srcPath, i); } else if (last === '..') { spliceOne(srcPath, i); up++; } else if (up) { spliceOne(srcPath, i); up--; } } // if the path is allowed to go above the root, restore leading ..s if (!mustEndAbs && !removeAllDots) { for (; up--; up) { srcPath.unshift('..'); } } if (mustEndAbs && srcPath[0] !== '' && (!srcPath[0] || srcPath[0].charAt(0) !== '/')) { srcPath.unshift(''); } if (hasTrailingSlash && (srcPath.join('/').substr(-1) !== '/')) { srcPath.push(''); } var isAbsolute = srcPath[0] === '' || (srcPath[0] && srcPath[0].charAt(0) === '/'); // put the host back if (psychotic) { result.hostname = result.host = isAbsolute ? '' : srcPath.length ? srcPath.shift() : ''; //occationaly the auth can get stuck only in host //this especialy happens in cases like //url.resolveObject('mailto:local1@domain1', 'local2@domain2') var authInHost = result.host && result.host.indexOf('@') > 0 ? result.host.split('@') : false; if (authInHost) { result.auth = authInHost.shift(); result.host = result.hostname = authInHost.shift(); } } mustEndAbs = mustEndAbs || (result.host && srcPath.length); if (mustEndAbs && !isAbsolute) { srcPath.unshift(''); } if (!srcPath.length) { result.pathname = null; result.path = null; } else { result.pathname = srcPath.join('/'); } //to support request.http if (result.pathname !== null || result.search !== null) { result.path = (result.pathname ? result.pathname : '') + (result.search ? result.search : ''); } result.auth = relative.auth || result.auth; result.slashes = result.slashes || relative.slashes; result.href = result.format(); return result; }; Url.prototype.parseHost = function() { var host = this.host; var port = portPattern.exec(host); if (port) { port = port[0]; if (port !== ':') { this.port = port.substr(1); } host = host.substr(0, host.length - port.length); } if (host) this.hostname = host; }; // About 1.5x faster than the two-arg version of Array#splice(). function spliceOne(list, index) { for (var i = index, k = i + 1, n = list.length; k < n; i += 1, k += 1) list[i] = list[k]; list.pop(); } }()); |