Talk:Uniform Resource Identifier/Archive 1
This is an archive of past discussions about Uniform Resource Identifier. Do not edit the contents of this page. If you wish to start a new discussion or revive an old one, please do so on the current talk page. |
URI vs URL
what's difference between URI and URL? -- Taku 09:14, Mar 11, 2004 (UTC)
- URLs are a subset of URIs, which are more general than just Internet resources. But you're right, it should probably be mentioned on the page. --Bth 09:28, 11 Mar 2004 (UTC)
- I removed the edit. An adequate discussion of URLs, with appropriate links, is already at the bottom of the article. For further info, see section 1.1.3 of the new version of RFC 2396 [1] (in development) - mjb 11:12, 12 Mar 2004 (UTC)
Whoever coined the distinctions between URL, URI, and URN needs to get a life. lysdexia 08:27, 24 Oct 2004 (UTC)
- Well, no one came up with the three acronyms all at once. My understanding is that early on, URLs proved to be too closely tied to addressing and retrieval according to specific network protocols (http, ftp, etc.); it was difficult to use them to just name things (give them identifiers). So someone came up with URNs. They're really just a special URL scheme that has no location/retrieval semantics.
- Well, that still kind of left things in a confusing state, because it meant that sometimes a URL was a name and sometimes it was a locator, but really even when it was a locator you could use it as if it were a name. The more you use URLs, the more these kinds of things become important. So along came URI, as a sort of grand unification theory. URIs are more than just a way of dealing with URLs and URNs generically, though; they formally draw a line between the idea of merely identifying something and actually retrieving (or even suggesting that it's possible to retrieve) a representation of it.
- Making these distinctions allows the definition of "resource" (the thing being identified or located) to be much more flexible—a big help in the world of RDF and knowledge management applications. In other words, if someone were designing it from scratch today, it would've been URI all along, and you'd never know about URL or URN. For the most part, URL/URN are obsolete terms, but we're kinda stuck with them, in large part due to the resistance of people who apparently have a life. ;) - mjb 23:04, 25 Oct 2004 (UTC)
- Isn't "obsolete" really a little strong? To me that would imply a term more or less abandoned in general use, whereas a Google search shows that occurrences of "URL" vastly outnumber (by more than 13 to 1) those of "URI". Loganberry (Talk) 04:58, 30 July 2005 (UTC)
- I agree; obsolete is unnecessary.
- URL is not only obsolete, but it never existed in the first place. Please report to the Ministry of Love for correctional therapy.
- A reasonable observation; I've noted it below, in the 'URI/URL/URN popular semantics' thread, and I made a change to the article today, to tone down the 'obsolete' bit. It now says:
- The contemporary point of view among the working group that oversees URIs is that the terms URL and URN are context-dependent aspects of URIs, and rarely need to be distinguished.[1] In technical publications, especially standards produced by the IETF and the W3C, the term URL has long been deprecated, as it is rarely necessary to distinguish between URLs and URIs. However, in nontechnical contexts and in software for the World Wide Web, the term URL remains ubiquitous. Additionally, the term web address, which has no formal definition, is often used in nontechnical publications as a synonym for URL or URI, although it generally refers only to 'http' and 'https' URIs. —mjb 22:19, 2 August 2006 (UTC)
I was pretty confused by the section on URI and its relation to URL and URN. Perhaps this section could be rewritten? Today I ran into the definitions given in ANSI/NISO Z39.29-2005 http://www.niso.org/standards/resources/Z39-29-2005.pdf , which were useful to me. They say:
- A URN is "name of an internet resource that has institutional persistence, that is, its exact location may change from time to time, but some agency will be able to find it. A URN is a form of URI. It looks like 'URN:[agency or directory]://[term]'. The user need only know the name of the resource "[term]", not its location on the internet."
- A URI is "The generic set of all names and addresses which are short strings that refer to intellectual objects (typically on the Internet). A URI typically describes 1) the mechanism used to access the resource, 2) the specific computer that the resource is housed in, and 3) the specific name of the resource (a file name) on the computer. The most common form of URI is the Web page address or URL. Character strings that identify File Transfer Protocol (FTP) addresses and e-mail addresses are also URIs." Jodi.A.Schneider 17:55, 10 September 2006 (UTC)Jodi A. Schneider
URI vs URL part deux
My understanding is that the term "URI" was the one that was deprecated, given the widespread acceptance of the term "URL". Was I wrong? Do we have a reference for the contention that "URL" is considered obsolete? --Doradus 03:02, 13 December 2005 (UTC)
- References and clarifications added today. —mjb 22:19, 2 August 2006 (UTC)
Fragments and RDF
RFC 3986 has changed the generic URI syntax to allow fragment identifiers not just in URI references, but on all URIs except those conforming to "absolute-URI". I think(?) this was done in part to deal with resource identification in RDF. I'd like to research this further and mention it in the article. —mjb 00:42, 3 Feb 2005 (UTC)
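For illustration only, the syntactic split between a URI and its fragment identifier can be seen with Python's standard library. This is a minimal sketch: the example URI is the one from the examples table further down this page, and `urldefrag` only shows where the fragment sits in the string; it says nothing about RDF.

```python
from urllib.parse import urldefrag

# Example URI taken from the examples table on this talk page.
uri = "http://www.example.org/book0395363411.htm#Sec1"

# urldefrag separates the part before '#' from the fragment identifier.
without_fragment, fragment = urldefrag(uri)

print(without_fragment)  # http://www.example.org/book0395363411.htm
print(fragment)          # Sec1
```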
Introduction revision proposal
Correctly defining URI and Resource is tricky, but the current article introduction seems confusing on at least two points:
- It defines URI as an "Internet protocol element". This seems quite restrictive, since some kinds of URI (URN) are not linked to an Internet protocol. The first use of a URI is (as its name says) to identify a resource. It's on that basis that the declarative semantics of RDF, RDFS, OWL ... rely. Some particular URIs (URLs) are also used as locators, which means they have functional semantics in an internet protocol (http, mail, ftp, ...). The distinction between declarative and functional semantics of URIs is important. So my suggestion is to stick to the last definition as per RFC 3986 (p.4), which clearly makes this distinction.
"A URI is an identifier consisting of a sequence of characters [...] It enables uniform identification of resources via a separately defined extensible set of naming schemes. How that identification is accomplished, assigned, or enabled is delegated to each scheme specification."
- It links resource to Resource (computer science). It seems to me that the meaning of Resource should be here as per its RFC 3986 definition, ibid., p.4, and applicable to the "R" in URI, URL, URN and RDF as well. So the link should be rather to web resource. Agreed, this article is currently a stub. I've on my agenda to expand it - tricky subject - and will be back to this discussion when it's done. -- universimmedia 07:51, 20 June 2006 (UTC)
- It sounds like you're under the impression that protocol means network communication standard. That's the most common kind of protocol, but the term actually has a broader definition, and it is not inaccurate or restrictive to say that a URI is an Internet protocol element. However, I would agree that this isn't obvious to most readers. I also agree with the need to fork the 'resource' articles. I'm glad someone is working on it. —mjb 01:53, 1 August 2006 (UTC)
- It sounds like you've a correct interpretation of what I understand by protocol. But do you agree with the link in the introduction to protocol? Or maybe this article should undergo revision also? As for resource, I would be happy not to work alone on it. Your suggestions are welcome! universimmedia 09:16, 1 August 2006 (UTC)
- Yes, I agree with the link to protocol (computing) and the rest of your edits from June 22. I don't know if I have time to work on the web resource article just yet. You seem to have a grasp of the main issues. Good luck!
:)
—mjb 22:19, 2 August 2006 (UTC)
Examples of URI, URL, URN
Semi-private discussion
I reverted recent edits by Krauss in which he added examples intended to illustrate the differences between URIs, URLs, and URNs. I am not opposed to offering such examples, but what was written was incorrect or misleading. 'www.wikipedia.org', for example, is a URI reference, but not a URI. It also cannot be used as a URN (it would have to begin with 'urn:foo:' where foo is a URN scheme name). A web browser might allow it to be input as if it were a URL, and then do cleanup on it or just make assumptions about what was intended, but the character string itself is not a URL. I also reverted speculation that such interfaces are the reason why people confuse URL, URI, etc.; it's plausible, perhaps even probable, but unverifiable. —mjb 07:09, 28 July 2006 (UTC)
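As a rough illustration of the syntactic half of this point, the sketch below uses Python's standard `urlsplit` to check whether a string begins with a scheme. The helper name `has_scheme` is made up for this example. Having a scheme is only a necessary condition: whether a string actually serves as a URI, URL, or URN also depends on its role as an identifier, which no parser can decide.

```python
from urllib.parse import urlsplit

def has_scheme(text: str) -> bool:
    """Return True if the string starts with a URI scheme such as 'http:'."""
    return urlsplit(text).scheme != ""

print(has_scheme("http://www.wikipedia.org"))  # True
print(has_scheme("www.wikipedia.org"))         # False
```

A browser's address bar may still accept the second string, but that is the interface guessing a scheme, not a property of the string.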
OK, you destroyed all my text. My language is Portuguese from Brazil (sorry for my English errors), and writing in English was difficult and time-consuming for me! But you were very fast (and my text can be reloaded), so I accept your suggestion...
- The text in the "Relationship to URL and URN" section is incomplete and not didactic: ask people and your friends if they understand the URI/URL/URN difference!
- "Speculation about the reason for confusion": ask Google, it is not only my speculation... but OK... the point is "what is true?" Do you have the truth? ... I think on a wiki the truth emerges from a "dynamic convergence process"... Consensus and convergence are the truth (and by destroying texts you may cancel the process).
- Incorrect examples: OK, sorry... in other articles collaborators correct the errors, and if the idea was good, it is preserved ("understanding mistakes by examples" was my suggestion; do you agree with the idea?)
- "Discuss examples on talk": thanks, let's discuss.
Krauss 29 July 2006 (UTC).
You've got two things going here:
- a push to add more examples to illustrate the relationship between URI, URL, URN; and
- a push to rewrite a couple of paragraphs that explain that relationship.
I don't see a strong need for more examples, but I am open to it, if it will really help. But what you haven't really done here is explain why you're seeking to rewrite the prose. What is wrong with the text that is currently in the article? What's missing from it? What crucial aspect did it fail to address?
See, I think it very clearly and succinctly explains the relationship and concepts, and nothing that you're trying to add or change, thus far, does anything to improve upon it. For example, you said something like "URNs are technical; URLs are not" (I'm paraphrasing). To me, all URIs are equally 'technical', but I think I do understand what you were trying to say: URNs are only found in technical contexts, whereas URLs are found in both technical and nontechnical publications, from common HTML code for web sites to billboards and magazine ads. It's a fair observation, but it's not a point that's crucial to establish an understanding of what a URL is and what a URN is. It's trivia.
The difference between URL and URN is trivial. Both function as resource IDs, and both can be dereferenced ('resolved') to obtain a representation of the resource they denote. The only difference is that the URL's scheme implicitly suggests a possible dereference mechanism that is (or should be) dictated by the spec that governs the scheme. The protocol suggested by a URL for dereferencing is just a suggestion. Nothing is preventing an application from reading a URI, be it a URL or URN, and associating it with a representation of the identified resource from its own cache. No network activity need take place. So an http URL and a urn:uuid URN are fundamentally the same; they just identify a resource. The http URL just contains some information that suggests a possible dereference procedure. Once you understand this, you should see why the IETF views the distinction between URLs and URNs as irrelevant. We should not be making the distinction into more than it is.
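To make the cache point concrete, here is a minimal sketch, assuming a hypothetical in-memory cache (`CACHE`) and a made-up helper (`dereference`); the stored strings are placeholders. Both identifiers are looked up the same way, and the scheme plays no part.

```python
from typing import Optional

# Hypothetical local cache mapping URI strings to stored representations.
# Both keys are plain identifiers; the scheme plays no part in the lookup.
CACHE = {
    "http://www.wikipedia.org/": "<html>cached copy of the home page</html>",
    "urn:isbn:0-395-36341-1": "cached catalogue record for the book",
}

def dereference(uri: str) -> Optional[str]:
    """Return a cached representation for the URI, or None if nothing is cached.

    No network request is made, whether the URI is a URL or a URN.
    """
    return CACHE.get(uri)

print(dereference("http://www.wikipedia.org/"))
print(dereference("urn:isbn:0-395-36341-1"))
```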
- ok, let us use the KISS principle to write the article, not the W3C prolixity —krauss 1 August 2006.
The only reason people avoid URNs in nontechnical contexts is just because most of the time, the resources that people most often want to make reference to are things that must be obtained 'live' from a network, via specific protocols like HTTP. When that's the goal, then a URL is a natural choice, because it provides the protocol-specific details for representation retrieval (and, often, server interaction) within itself, and also because people who mention URLs feel safe making assumptions about the capabilities of URL resolvers that are built into web applications and operating systems, and about people's connectedness to Internet-based distributed domain name services. Using a URN would require a similar type of global resolver service to help with the dereferencing process, and no such service exists. This is not an intrinsic difference between URLs and URNs; it's just a circumstance fueled, in part, by momentum, misunderstanding, and bureaucracy. —mjb 01:37, 31 July 2006 (UTC)
- Ok, we can review our consensus (see added sec.). —krauss 1 August 2006.
Again, I've reverted your changes (except for a link) because the information you are adding to the article is redundant or is giving people advice, and because you are proceeding with changes that have not been agreed to here. Why are you so impatient? Please read what I wrote above, and answer the questions I asked: what crucial info is missing? —mjb 00:17, 1 August 2006 (UTC)
- Sorry. OK, we need time, and we will add things step by step, I agree... It was also a sandbox; I needed to see (and show you) what I am doing here. About what you wrote above, I am reading a good HTML text of RFC 3986... but you are the technical expert; I am only doing an overview. I am a "vulgar" wiki reader and a collaborator worried about didactics (understandability and simplicity) for "vulgar readers". —krauss 1 August 2006.
Proposed examples
For readers to "understand by examples", we need good examples in the article.
Note 1: there are test kits: W3C-2004 kit, ... (more kits?)
Note 2: I suggest removing the URI reference citations. A URI reference needs the relative-ref concept. We are only differentiating URI/URN/URL; it is not didactic for the article to mix them with URI references (nor productive for us). "URI-reference = URI / relative-ref" (RFC 3986, sec. 4.1). In set terms: "the set of all strings that are valid URI-refs are the union of absolute URI set and the relative URI set".
Examples | could be a (see note) | is not a |
http://www.wikipedia.org | URL, URI (or URI reference) | URN |
www.wikipedia.org | URL-like string acceptable by some web browser user interfaces as if it had been prefaced by 'http://', as in the preceding example | URL, URI (or URI reference), URN |
urn:www.wikipedia.org | URN, URI (or URI reference) | URL |
http://www.example.org/book0395363411.htm#Sec1 | URI reference | URI, URL, URN |
We can put a lot of examples here and then summarize/select for the article.
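The relative-ref concept mentioned in Note 2 can be sketched with Python's standard library. This is only an illustration: the base URI is hypothetical, and the relative reference reuses the file name from the table. A relative reference has no scheme of its own and only becomes a full URI once it is resolved against a base.

```python
from urllib.parse import urljoin, urlsplit

base = "http://www.example.org/docs/index.htm"  # hypothetical base URI
relative_ref = "book0395363411.htm#Sec1"        # no scheme: a relative reference

# The relative reference carries no scheme of its own.
print(urlsplit(relative_ref).scheme == "")  # True

# Resolving it against the base yields a full URI with a fragment.
resolved = urljoin(base, relative_ref)
print(resolved)  # http://www.example.org/docs/book0395363411.htm#Sec1
```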
- I completely redid your examples for accuracy, and changed "is"/"is not" to "could be"/"could not be", because a string's URI/URL/URN-ness is not just a matter of syntax, but also of designation/role. However, see below; I don't think this chart is really necessary. —mjb 07:31, 29 July 2006 (UTC)
- The "not only syntax" condition is very important. Only now am I reading RFC 2141 ... oops, it says "intended to serve as persistent, location-independent"; persistence needs to be mentioned in the article. Perhaps the example table needs a column for the context, or we need a second table for "non-browser" usage contexts. The RFC also defines
<URN> ::= "urn:" <NID> ":" <NSS>
("urn:" is obligatory in URN syntax; see also the RFC's Appendix A) —Krauss 29 July 2006 (UTC)
- Another point: the examples selected for the article need to show that the "URL refers to the subset of URI" (RFC-2396, 1.2).
- I hope you understand what I mean about syntax and designation: a random string that matches the syntax of a URI is not necessarily a URI; to be a URI it must also have the role of being an identifier (of a resource).
- You mentioned RFC 2396, which is obsolete. STD 66, aka RFC 3986, is current. Please don't use RFC 2396 as a reference. I only mentioned RFC 2396 in the article where it was necessary to explain a difference between the two versions of the spec.
- Regarding persistence of URNs, which is another issue altogether, STD 66 says "This specification does not require that a URI persists in identifying the same resource over time, though that is a common goal of all URI schemes." So, the 'urn' scheme is not special in this regard, and persistence is a goal, not a requirement. This is basically an acknowledgment that the association between a resource and its ID is under the control of whoever is in charge of the resource, references to it, or access to it, and may thus change at any time. Persistence is essentially application-level, not syntax-level, so it's beyond the scope of the spec's authority.
- I really suggest you read everything in STD 66 very carefully. For example, did you notice this?
- An individual scheme does not have to be classified as being just one of "name" or "locator". Instances of URIs from any given scheme may have the characteristics of names or locators or both, often depending on the persistence and care in the assignment of identifiers by the naming authority, rather than on any quality of the scheme. Future specifications and related documentation should use the general term "URI" rather than the more restrictive terms "URL" and "URN". (reference: RFC 3305)
- RFC 3305 may be of interest to you, as well; it elaborates on that last sentence and provides richer explanations of the relationship between the terms.
- Lastly, in the example table, I don't want to have a separate column for 'in browsers' vs 'strict'. I think you're allowing your observations of web browser functionality to pollute your concept of what URIs are about. You must take care not to insinuate that web browsing is their purpose. Their purpose is, simply, resource identification – although obviously, the principal component of the WWW, hypertext/HTML, demands that a robust system of resource identification and dereferencing be established (SGML was rather nonspecific about it), so the development of the URI syntax and concepts received much momentum during HTML's formative years. It's just that you mustn't lead the reader to believe that what their browser does (e.g., accepting a malformed URL in its 'address bar'/'URL bar' widget) has something to do with what a URI is. URIs comprise an Internet protocol for resource identification. The World Wide Web is one of many uses of the Internet, and browsers are one of the tools that humans use to interface with the WWW. I mean, the WWW and its browsers are not the same thing as the Internet and its general protocols (URIs included), and we must try to maintain that separation. —mjb 01:05, 1 August 2006 (UTC)
- General: I cited Identifying, locating, and naming things on the Web (by D. Connolly) in the External links because Connolly has another point of view... and I added 2 sections here (Talk) because I think we need some basic consensus to continue the discussion. About the objective of this article: do you think it is to be didactic (understandability and simplicity) or to be technical (for computer science readers)? —krauss 1 August 2006.
- I like having those links. Ultimately, what we say about URIs must agree with the specs, but the writings of people who were/are involved in the development of the specs is definitely useful and relevant, especially when they're explaining esoteric topics for a more general audience. Thanks.
- Regarding the objective of the article: it's both. To some extent, simplicity must be sacrificed for correctness and accuracy – if a topic must be included, but can't be explained without getting "technical", then we have to bring the reader up to the technical level through examples and definitions. However, we have to avoid going overboard with holding their hand; this should not be an exhaustive tutorial, nor an in-depth study of every nuance of the specs and the ways in which URIs are used in the world.
- If you visit the mathematics articles, you will find many highly technical explanations that make no sense to the average high school graduate, often written (inappropriately for an encyclopedia, in my opinion) in a hand-waving, reader-addressing lecture style. But how do you explain post-calculus to someone who decided they were done with math after they got a C in algebra? You have to draw a line somewhere and say "if this is too technical, too bad".
- That said, if you would say what sentences in the article you feel are too technical, we could figure out ways to make them less jarring, either by changing them, or changing the text leading up to them. —mjb 21:17, 2 August 2006 (UTC)
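The URN syntax rule quoted earlier in this thread ("urn:" NID ":" NSS) can be sketched as a regular expression. This is a simplified reading, not a full implementation of RFC 2141: the NID part follows the RFC's rule of a letter or digit plus up to 31 letters, digits, or hyphens, but the NSS part is loosely treated as any run of non-space characters, which is an assumption. The names `URN_PATTERN` and `split_urn` are made up for this example.

```python
import re

# Simplified pattern for "urn:" <NID> ":" <NSS>.
# NID: a letter or digit followed by up to 31 letters, digits, or hyphens.
# NSS: treated here as any non-empty run of non-space characters
# (an assumption; the RFC restricts the allowed characters further).
URN_PATTERN = re.compile(
    r"^urn:([A-Za-z0-9][A-Za-z0-9-]{0,31}):(\S+)$", re.IGNORECASE
)

def split_urn(text):
    """Return (NID, NSS) if the text fits the simplified pattern, else None."""
    match = URN_PATTERN.match(text)
    if match is None:
        return None
    return match.group(1), match.group(2)

print(split_urn("urn:isbn:0-395-36341-1"))  # ('isbn', '0-395-36341-1')
print(split_urn("www.wikipedia.org"))       # None
```

As with the other examples, a syntactic match only shows that a string could be a URN; it does not show that anyone has assigned it as a name.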
Speculation?
Can these paragraphs be used in the article, or are they speculation?
draft vers. 1
A Uniform Resource Locator (URL) is a subset of the URI popular and usual protocols (with scheme names like Uniform Resource Name (URN) is for more technical use, and often times people use the terms — URN and URL, or, URN and URI — interchangeably, which is not entirely correct. A possible source of mistakes is because web browsers allow for default documents and do not require a scheme to retrieve a document.
draft vers. 1 Discussion
I think it is OK. Krauss 29 July 2006 (UTC). We can only comment "on web browsers it is more difficult to see the differences, because they allow for default documents and do not require a scheme to retrieve a document".
draft vers. 2
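A Uniform Resource Locator (URL) is a subset of the URI popular protocols (with scheme names like http, ftp or mailto). Therefore all URLs are URIs. The term URL is technically deprecated, but is more widespread and historically important. For popular usage, to design http sites and web pages, the term web address can be used to replace the term URL; in other all usages, prefer the term URI.
In the use of the term Uniform Resource Name (URN), some caution may be required to interchange with URL or URI, because web browsers allow for default documents and do not require a scheme to retrieve a document. The term URN refers to the subset of URI that are required to remain globally unique and persistent (even when the resource ceases to exist or becomes unavailable).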
Vote and/or change the text.
I'm for vers.2 but with a slight modification :
- A Uniform Resource Locator (URL) is a subset of the URI popular protocols ..."
Seems to me a misleading shortcut for :
- A Uniform Resource Locator (URL) is a subset of URI associated with the popular protocols ..."
universimmedia 13:47, 31 July 2006 (UTC)
- OK... Universimmedia is a relevant collaborator. Mjb, do you need more OKs before these two paragraphs can be added (without reverting them)? -- Krauss
Start here (this is what is in the article):
- A URI can be classified as a locator or a name or both. A Uniform Resource Locator (URL) is a URI that, in addition to identifying a resource, provides means of acting upon or obtaining a representation of the resource by describing its primary access mechanism or network "location". For example, the URL http://www.wikipedia.org/ is a URI that identifies a resource (Wikipedia's home page) and implies that a representation of that resource (such as the home page's current HTML code, as encoded characters) is obtainable via HTTP from a network host named www.wikipedia.org. A Uniform Resource Name (URN) is a URI that identifies a resource by name in a particular namespace. A URN can be used to talk about a resource without implying its location or how to dereference it. For example, the URN urn:isbn:0-395-36341-1 is a URI that, like an International Standard Book Number (ISBN), allows one to talk about a book, but doesn't suggest where and how to obtain an actual copy of it.
- The contemporary point of view among the working group that oversees URIs is that the terms URL and URN are context-dependent aspects of URIs, and rarely need to be distinguished.[1] In technical publications, especially standards produced by the IETF and the W3C, the term URL has long been deprecated, as it is rarely necessary to distinguish between URLs and URIs. However, in nontechnical contexts and in software for the World Wide Web, the term URL remains ubiquitous. Additionally, the term web address, which has no formal definition, is often used in nontechnical publications as a synonym for URL or URI, although it generally refers only to 'http' and 'https' URIs.
You have not said what's wrong with this, and I think it's pretty good, but let's go ahead and analyze your replacement anyway:
- A Uniform Resource Locator (URL) is a subset of the URI popular protocols (with scheme names like http, ftp or mailto).
- Wrong term: protocols (see URI scheme, which needs a lot of work).
- Unfamiliar term: scheme (not introduced until the next section, on syntax, and then there's a whole article devoted to it).
- A URI can be classified as a locator or a name or both was an important, overarching concept that explains why we're talking about URLs (locators) and URNs (names) in the sentences that follow. Why was it removed?
- Therefore all URLs are URIs.
- Was this not implicit in the original phrase A Uniform Resource Locator (URL) is a URI that…?
- The term URL is technically deprecated, but is more widespread and historically important. For popular usage, to design http sites and web pages, the term web address can be used to replace the term URL; in other all usages, prefer the term URI.
- In response to concerns about the article stating that URL is "obsolete", I think I've addressed this in the 2nd paragraph now, with the text in nontechnical contexts and in software for the World Wide Web, the term URL remains ubiquitous. Additionally, the term web address, which has no formal definition, is often used in nontechnical publications as a synonym for URL or URI, although it generally refers only to 'http' and 'https' URIs. (note this phrasing is careful to avoid introducing the word 'scheme' prematurely).
- In the use of the term Uniform Resource Name (URN), some caution may be required to interchange with URL or URI, because web browsers allow for default documents and do not require a scheme to retrieve a document.
- Here you have begun using the term URN before you have defined it.
- You are also giving advice to the reader, which is not what we are supposed to do in an encyclopedia (at least, not directly).
- The sentence makes no sense. Why would it occur to anyone to "interchange URN with URL or URI", and what does that have to do with web browser interfaces and default documents?
- The term URN refers to the subset of URI that are required to remain globally unique and persistent (even when the resource ceases to exist or becomes unavailable).
- I've addressed the issue of persistence elsewhere in this talk page. It is a red herring and should not be mentioned, or should be very heavily qualified.
Also, in my version, the 1st paragraph defines the terms URL and URN and relates them to URI, and provides examples to make it very clear for the average reader who we can assume has a vague familiarity with the WWW. The 2nd paragraph explains how the terms are used, where they're deprecated, etc. In your version, the 1st paragraph is devoted to defining and describing the usage of URL, and the 2nd paragraph is devoted to defining URNs and advising the reader to (I think) try not to mix them up(?). There is no information in your version that isn't in mine, and mine I think is much more precise and better organized. So, I don't like your version at all. —mjb 22:50, 2 August 2006 (UTC)
URI/URL/URN popular semantics
The term URL has about 1,190,000,000 occurrences (Google "web URL"), the term URI about 83,200,000 (Google "web URI"). The term URL does not have other significant connotations; the term URI does (a region in India, a city in Italy, etc., indicating that the real number is somewhat lower).
We can do other experiments on Google, Altavista and controlled text corpora. In all of them the term URI occurs in only ~5% of the total URL occurrences. If we read a modern dictionary of any language (English, Portuguese, Spanish, etc.), a lot of them have the term URL, but not URI (the dictionary editors do similar experiments to decide what is relevant).
The term URL was born in a technical context, but now it is a worldwide, "culturally embedded" term. It is now an independent term: not one from a technical terminology (RFCs), but a universal one, belonging to languages.
In languages, terms are spoken, read and written by all people, and they "decide" (statistically, through usage) what terms they want to use. Experts or "W3C technical people" do not decide for the "vulgar people".
"Vulgar people" CORRECTLY understand the terms URL and web address. We here on the wiki will not change their understanding; we also need to understand them. They read (and will read) this article to understand the terms URI and URN, not for "URI indoctrination" or for technical details about the terms and their semantics.
"Vulgar people" are the audience (or a very significant part of it) of this article.
-- Krauss 1 August 2006.
It's ironic: "vulgar" in English has two meanings. When most (common) people hear/see the word "vulgar", they think it means "obscene". Only linguists (people involved in the technical study of language) use "vulgar" to mean "of the common people". :)
I think we agree that the article should mention that URL and web address are very common terms, and should also explain how the terms are related to URI. I felt that was adequately addressed by this paragraph:
- The contemporary point of view among the working group that oversees URIs is that the terms URL and URN are context-dependent aspects of URI and rarely need to be distinguished. Furthermore, the term URL is increasingly becoming obsolete, as it is rarely necessary to differentiate between URLs and URIs, in general. For popular URL schemes, the term web address is sometimes used instead of URL.
The first sentence is easily confirmed by reading RFC 3305, so I think it should stay.
The second sentence is, in hindsight, a bit of an overstatement. I concede on this point. We should work on it. It's true that it's rarely necessary to differentiate between URLs and URIs, but now I'd say that's the reason URL is not becoming obsolete! (at least in nontechnical publications) – people are not going to start using the broader term if they have no incentive. See my outline below, where I mentioned this point in more detail. I am not sure how to best phrase it for the article; I doubt people will really understand what 'dereference mechanism' means. :/
The third sentence is correct, but could be better qualified. When/where is the term used, and for which schemes, exactly? I mentioned this in the outline below, as well. —mjb 03:55, 2 August 2006 (UTC)
I've updated the 2nd and 3rd sentences in the article today. —mjb 22:19, 2 August 2006 (UTC)
URL isn't "obsolete"
Mjb, please see the comments by Loganberry and others (and the URL talk page): in the "vulgar" semantics, URL isn't obsolete (!). Wikipedia needs to show this "other side" (not strictly technical) of the URL semantics.
- I did not say "URL" is obsolete, did I? —mjb 20:30, 1 August 2006 (UTC)
- That is my interpretation from reading this talk page and analyzing your positions and contributions. -- Krauss
- Hmm. Elsewhere in this Talk page, I used the word 'obsolete' when I scolded you for using RFC 2396 as a reference, because that's an outdated version of the URI syntax spec. It has nothing to do with 'URL' though. But it's true, I did say "for the most part, URL/URN are obsolete terms" in Oct 2004, and I did put in the article that the term URL was obsolete, though this was intended to indicate the position of the URI working group; it was not an observation of trends in publishing. I've updated the contentious sentence in the article to address this issue; hopefully it is satisfactory now. —mjb 22:19, 2 August 2006 (UTC)
Do we have consensus about the central (technical) URL/URI/URN concepts?
- URI concepts:
- The set of "all URI valid strings" is a union of URN set and URL set.
- It have a technical definition more general, appropriate and accurate than URL.
- URN
- Persistence, not-availability
- URL
- Availability
-- Krauss 1 August 2006.
- URI concepts:
- The set of "all valid URI strings" is, essentially, a union of URN set and URL set.
- URI is a more general and appropriate concept than URL for the purpose of resource identification.
- URN
- Persistence, Nonavailability
- URL
Availability
Please see the comments I made earlier about persistence. URNs are not special in this regard. As compared to URLs, URNs have the intent of being more persistent (by virtue of not being associated with a specific dereference mechanism), but there's nothing inherently persistent about them. And as stated in STD 66, persistence is a goal, but not a requirement, of all URIs, regardless of scheme.
'Availability' is also misleading. You can't assume anything about the availability of a resource just based on whether it's a URN or URL. Availability is not a feature of URLs. You're confusing features of widely implemented dereference mechanisms (resolvers) with features of URI schemes.
Try the version below. This outline is, I believe, everything you need to know in order to understand the core notion of what a URI is, and the relationship between a URI, URN, and URL. —mjb 03:31, 2 August 2006 (UTC)
- URI concepts:
- A URI is a resource ID. The definition of 'resource' is a separate topic.
- Agreed, but clarifying what a resource is can help to clarify how it can be identified. Such a definition, and the history of the concept, belong in the web resource article. I wish people participating in this debate could have a look at what I've written there so far, so that we can come to a consensus on what belongs here and what belongs there. So far there is quite a bit of overlap, but it seems to me there is not too much contradiction with what mjb proposes here. universimmedia 07:34, 2 August 2006 (UTC)
- A URI is a character string that must conform to a certain general syntax (defined in STD 66), which may be further restricted by the syntax of a particular URI scheme (e.g., the 'mailto' scheme requires that the URI look like 'mailto:user@host').
- A URI can be dereferenced via any means available to its processor, regardless of scheme. Therefore, all URIs can be treated as resource names, regardless of scheme.
- Many URI schemes require that the URI contain information that enables the potential use of a particular dereference mechanism. URIs conforming to these schemes are called URLs, where the L stands for locator, meaning the URI can be treated not only as a name but also, potentially, as an address. For example, an 'http' URL contains the info needed in order to obtain a representation of the denoted resource via the HTTP protocol. A URL does not imply resource availability, nor does it require the use of a particular dereference mechanism.
- The class of URIs conforming to the 'urn' scheme, and (historically) any other schemes that don't imply and enable a particular dereference mechanism, are called URNs. The N stands for name, meaning that there is no information in the URI that allows it to be treated as an address; it can only be treated as a name. URNs are especially useful for denoting resources that aren't network-bound.
- URI is a more general and appropriate concept than URL for the purpose of resource identification.
- The set of "all valid URI strings" is, essentially, a union of URN set and URL set. However, these sets are no longer formally defined; the consolidated URI syntax has replaced separate URN and URL syntax definitions for many years, now.
- URNs, by virtue of not being tied to a particular dereference mechanism, are often thought of as being more "persistent" than URLs. This is akin to saying that a person's name is more reliable than their email address as a means of denoting that person over time. There is, however, no absolute persistence inherent to any URI, although some degree of persistence is required for any URI to be meaningful, so this is a goal of all URI schemes.
- The term URL has had a great longevity and ubiquity in mass-media publications and software for the World Wide Web, where there is a need to refer to resources that must be accessed, on demand, via common network protocols, and where URN registries and dereferencing mechanisms are not widely or consistently defined or implemented. Since the term URN is relatively rare in nontechnical contexts, there is little incentive to favor the term URI over URL in general usage. *(see note below)
- There is a subset of URLs commonly called web addresses. These are http and https URLs, mainly. Web address is not a formally defined term.
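As a rough illustration of the "general syntax" point in the outline, the sketch below splits a string into the five generic components using the regular expression given in Appendix B of RFC 3986 (the current text of STD 66). The helper names `split_uri` and `looks_like_urn` are made up for this example, and checking for the 'urn' scheme is only a simplification of the name/locator distinction discussed above, not a formal test.

```python
import re

# Component-splitting regular expression from RFC 3986, Appendix B.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri: str) -> dict:
    """Split a URI (or URI reference) into its five generic components."""
    m = URI_RE.match(uri)  # every part is optional, so this always matches
    return {
        "scheme": m.group(2),
        "authority": m.group(4),
        "path": m.group(5),
        "query": m.group(7),
        "fragment": m.group(9),
    }


def looks_like_urn(uri: str) -> bool:
    """Hypothetical helper: True when the scheme is 'urn' (case-insensitive)."""
    scheme = split_uri(uri)["scheme"]
    return scheme is not None and scheme.lower() == "urn"


print(split_uri("mailto:user@host"))
print(split_uri("http://www.gutenberg.org/etext/1112"))
print(looks_like_urn("urn:isbn:0-395-36341-1"))
```

Note that the generic syntax says nothing about availability or persistence; it only tells a processor where the scheme, authority, path, query and fragment begin and end.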
URI reference diagram ideas
Krauss, I welcome the addition of useful diagrams, but your URI reference diagram and its caption are wrong, (and the diagram has a horrible typo, "absute").
Look, a URI reference is not a URI, so your sets analogy doesn't work. A URI reference might look like a URI, but its role is to denote/refer to a resource indirectly, by representing/denoting/referring to a URI.
There is also no such thing as a "relative URI". Only a "relative URI reference". And all URIs are absolute, so "absolute URI" is redundant.
Understand that a URI reference is a reference to a URI, similar to the way a URI is a reference to a resource. You might think of the URI reference as being shorthand or code for a URI.
Types of URI references:
- absolute (identical to a URI)
- relative (a portion of a URI)
I think the relationship would be best illustrated with a diagram that looks sort of like this (feel free to make it pretty):
 resource #1 (e.g., a document) --(may contain)--> URI reference --(denotes/refers to)--> URI
               ^                                         |                                 |
               |                                         |                                 |
 (identifies/denotes/refers to)                  (is relative to)           (identifies/denotes/refers to)
               |                                         |                                 |
               |                                         |                                \|/
         ("base") URI <----------------------------------'                            resource #2
Notice how a document has a URI (its URI, also called its "base URI" when dealing with relative URI references) that identifies it, but it does not contain a URI that refers to another resource. Rather, it only contains a URI reference. There are two levels of indirection between the two resources: Resource #1 refers to resource #2 by way of a URI reference which refers to a URI, which in turn refers to resource #2.
A couple of additional notes: When a URI reference is 'absolute', it is still technically 'relative to' a base URI, even though the base URI does not factor into the resolution. Also, when resource #1 and resource #2 have the same URI, then the URI reference in resource #1 is a "same-document" reference, which means that if it is being dereferenced, then no action should be taken. So, for example, if an HTML document links to itself, following the link shouldn't result in fetching a new copy of the document.
—mjb 21:17, 2 August 2006 (UTC)
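The resolution step described above (a URI reference is turned into a URI relative to the base URI of the containing document) can be tried out with Python's standard `urllib.parse.urljoin`. This is only a sketch: the base URI and the references are hypothetical values chosen for illustration, and `resolve_all` is a made-up helper name.

```python
from urllib.parse import urljoin


def resolve_all(base: str, references: list) -> dict:
    """Resolve each URI reference against the base URI of the document."""
    return {ref: urljoin(base, ref) for ref in references}


# Hypothetical base URI of "resource #1" and some references it might contain.
base_uri = "http://example.org/a/b/doc.html"
refs = [
    "other.html",           # relative reference
    "../c",                 # relative reference with a dot-segment
    "ftp://example.net/x",  # absolute reference: the base does not factor in
    "#sec",                 # fragment-only reference into the same document
]

for ref, uri in resolve_all(base_uri, refs).items():
    print(f"{ref!r:25} -> {uri}")
```

In every case the result is a full URI; only the references differ in how much of it they spell out.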
- Frankly, I don't think that diagrams are all that useful, since they:
- visualize extremely simple relations: a much more informative illustration would be a version of the diagram by mjb above;
- are inaccurate, as explained by mjb, and for explicitly depicting URL and URN sets as disjoint;
- make it unclear whether the superset in both diagrams has elements that are not in subsets (e.g. URIs that are not URLs nor URNs);
- emphasize the concepts of URLs and URNs, which are correctly described in the text as marginal.
- Also, I believe there is some confusion in the article about the position of the fragment part. The current standard (STD 66) states that the fragment is an integral part of the URI (see URI scheme#Generic syntax). This was not so in the previous RFCs: the fragment was a part of the URI reference, but not the URI itself. I think the article should consistently reflect the current norm, while mentioning the previous definitions when introducing the terms, and in more detail in the history section.
- --Hrvoje Šimić 00:16, 4 August 2006 (UTC)
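To make the fragment point concrete: under the current syntax the whole string, fragment included, is the URI, and a processor separates the fragment only when it needs to (for instance, before retrieval). A minimal sketch with Python's standard `urllib.parse.urldefrag`; the example URI is the one from the examples table proposed further down this page, and `split_fragment` is a made-up helper name.

```python
from urllib.parse import urldefrag


def split_fragment(uri: str) -> tuple:
    """Return (URI without fragment, fragment) for the given URI."""
    result = urldefrag(uri)
    return result.url, result.fragment


uri = "http://example/resource.txt#frag01"
without_fragment, fragment = split_fragment(uri)
print(without_fragment)  # the part typically used for retrieval
print(fragment)          # the part interpreted after retrieval
```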
MMS is not an URI scheme
RFC 4355 ("IANA Registration for Enumservices email, fax, mms, ems, and sms") states:
This document registers the Enumservices "email", "fax", "sms", "ems", and "mms" using the URI schemes 'tel:' and 'mailto:' as per the IANA registration process defined in the ENUM specification RFC 3761.
Then, "mms" is not an URI scheme but an Enumservice.
To validate this, I checked the IANA-maintained registry of URI schemes. "mms" is not there, but "tel" is. In order to keep the original intent, I replaced "mms" with "tel", which is a registered URI scheme as per RFC 3966. (But tel does not have its own article... perhaps we should use "ldap" as an example, since it has one...)
btw, I realized the aforementioned IANA registry was not among the external links. I included it.
(Rjgodoy 02:49, 2 March 2007 (UTC))
Venn diagram is misleading
"A URI can be further classified as a locator, a name, or both." [RFC 3986] (section 1.1.3)
"An individual scheme does not have to be classified as being just one of "name" or "locator"." [RFC 3986] (section 1.1.3)
URI schemes which are both URN and URL may not be usual, but URL and URN are defined not to be mutually exclusive. —Preceding unsigned comment added by Rjgodoy (talk • contribs) 01:53, 25 March 2007
- I only restored the diagram and accompanying paragraph yesterday just because it had been deleted anonymously without explanation, and the deletion was buried by other edits a day later, so apparently no one noticed, since the deletion would've only been on people's watchlist for about 32 hours. If the diagram is misleading, feel free to remove it again. —mjb 18:10, 25 March 2007 (UTC)
- I replaced the image here, as well as in fr-wikipedia and uk-wikipedia (which also referred to the incorrect diagram). I think the image caption is fine, so I am not modifying it now. (Rjgodoy 04:56, 26 March 2007 (UTC))
- Thanks! —mjb 18:38, 27 March 2007 (UTC)
- The Venn Diagram appears to further suggest that there can be URIs which are neither URLs nor URNs. But the above quote from the RFC states that every URI is one or both of a URL or URN, right? 64.131.235.123 (talk) 08:46, 2 December 2009 (UTC)
URI now redirects here
Thus, I have added the redirect template. Enjoy! -Matt 16:48, 24 April 2007 (UTC)
Practical example?
I know that there's a way to set up Windows to recognize certain URIs and act on them in certain special ways. For instance, AOL Instant Messenger uses the "aim:" URI, which can be used to launch AIM message windows (even from hyperlinks, etc.). I was hoping this page could at least give me a link into how to "register my own URIs" on a Windows machine, but I see no such practical help here.
:"While Wikipedia has descriptions of people, places, and things, Wikipedia articles should not include instructions (...) or contain "how-to"s. This includes tutorials, walk-throughs, instruction manuals, video game guides, and recipes. (...) If you're interested in a how-to style manual, you may want to look at our sister project Wikibooks." (from WP:NOT#INDISCRIMINATE). Please sign your posts and put them on the bottom of the talk page. Rjgodoy 19:05, 24 April 2007 (UTC)
 [HKEY_CLASSES_ROOT\UriScheme]
 "EditFlags"=hex:00,02,00,00
 "URL Protocol"=""
 [HKEY_CLASSES_ROOT\UriScheme\shell]
 [HKEY_CLASSES_ROOT\UriScheme\shell\open]
 [HKEY_CLASSES_ROOT\UriScheme\shell\open\command]
 @="command %1"
Where UriScheme is the actual scheme (e.g. http), and command is the command you want to associate with this scheme. A %1 in the command value is a placeholder for the full URI being dereferenced. Rjgodoy (talk) 02:21, 11 July 2008 (UTC)
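For anyone who wants to experiment with the template above, here is a small sketch that only generates the same registry text for a given scheme and command; it does not touch the registry itself. The function name `make_reg_snippet`, the scheme `myscheme` and the program path are hypothetical. It assumes the usual .reg convention that backslashes and double quotes inside string values are escaped.

```python
def make_reg_snippet(scheme: str, command: str) -> str:
    """Return .reg-style text that associates `scheme` with `command`."""
    # Inside a .reg string value, backslashes and double quotes are escaped.
    escaped = command.replace("\\", "\\\\").replace('"', '\\"')
    root = f"[HKEY_CLASSES_ROOT\\{scheme}"
    lines = [
        f"{root}]",
        '"EditFlags"=hex:00,02,00,00',
        '"URL Protocol"=""',
        f"{root}\\shell]",
        f"{root}\\shell\\open]",
        f"{root}\\shell\\open\\command]",
        f'@="{escaped} %1"',  # %1 stands for the full URI being dereferenced
    ]
    return "\n".join(lines)


# Hypothetical scheme name and program path, for illustration only.
print(make_reg_snippet("myscheme", "C:\\Tools\\handler.exe"))
```

The output mirrors the seven lines of the template, with UriScheme and command filled in.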
Status of 'space'
I'd like to see some information about the use of spaces. URLs often contain them, but the standard seems to indicate that white space should be removed. Mike Dallwitz 00:41, 2 August 2007 (UTC)
- Hmmm... Valid URIs may only contain delimiters, percent-encoded octets, and characters in the unreserved set (ALPHA / DIGIT / "-" / "." / "_" / "~"). I bet the "URLs with spaces" you refer to are simply a trick of your user agent (i.e., browser). Your browser is surely converting such URLs into valid ones before committing the request. You may read RFC 3986 for further information. Rjgodoy 06:56, 2 August 2007 (UTC)
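The conversion mentioned in the reply can be reproduced with percent-encoding, where a space becomes %20. A minimal sketch using Python's standard `urllib.parse.quote` and `unquote`; the path is a made-up example, and real browsers may apply additional rules of their own.

```python
from urllib.parse import quote, unquote

# Hypothetical path containing spaces, as a user might type it.
raw_path = "/My Documents/Romeo And Juliet.doc"

encoded = quote(raw_path)   # spaces become %20; "/" is left alone by default
decoded = unquote(encoded)  # reverses the encoding

print(encoded)
print(decoded == raw_path)

# Unreserved characters are never encoded.
print(quote("a-b.c_d~e"))
```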
Simple and didactic text...
201.52.194.78: Thanks for being bold, though I would like to have these issues discussed before such drastic changes. I have removed your previous edits because of the following concerns.
- Use of invisible comment instead of discussion in talk page.
Your message was: "this is a core information for Wikipedia public, USE INFORMAL, SIMPLE AND DIDACTIC TEXT PLEASE!".
- Removal of information for the sake of "simple and didactic text" (i.e., some aspects you considered too advanced were censored from the article.)
- Use of the second grammatical person (see WP:STYLE).
- Factual inaccuracy (e.g."Followed the scheme name, the colon character, ":", is a scheme-specific part, and, optionally, the query and fragment parts, that have reserved characters, "?" and "#" to indicate these parts".).
- I kept the references to W3C materials related to Addressing and W3C URI Clarification, but removed URN Information Center and IANA-maintained registry of URN namespaces because they were off-topic (they are exclusively related to URN, which are discussed in a separate article).
Rjgodoy (talk) 07:01, 7 February 2008 (UTC)
Addressing the issues you raised, I have just added the {{technical}} template in the article. Hopefully it will motivate discussion towards making it more readable. Rjgodoy (talk) 07:17, 7 February 2008 (UTC)
- SEE BELOW for the split text (it is "core information for Wikipedia public", and it needs to be informal, simple and didactic text; the work to do this can be done here)
Splitting "Relationship to URL and URN"
[ [ Image:URI Venn Diagram.svg| ... ] ]
A URI may be classified as a locator (URL) or a name (URN) or both.
A Uniform Resource Name (URN) is like a person's name, while a Uniform Resource Locator (URL) is like their street address. The URN defines something's identity, while the URL provides a method for finding something. Essentially, "what" vs. "where".
URNs are often compared to the ISBN system for uniquely identifying books (and in fact you can encode an ISBN as a URN). Having a book's unique identifier lets you discuss the book, such as whether you've read it.
A URL is a URI that, in addition to identifying a resource, enables access to an object, like a file path on a personal computer, but addressing the whole Web, not only your PC's hard disk (like C:/MyDocuments/RomeoAndJuliet.doc, a file address on your PC).
To actually read the book, you need its location. So URNs and URLs are often complementary.
Example: you can cite a specific edition of Shakespeare's "Romeo and Juliet" by its ISBN, e.g. ISBN 0486275574, or access (and download) it through its URL, http://www.gutenberg.org/etext/1112.
There are, usually, for public digital contents, a one-to-many relation between URN and URL: one object uniquely named by a URN has one or more URL addresses for accessing it.
Technical view
A URL is a URI that, in addition to identifying a resource, provides means of acting upon or obtaining a representation of the resource by describing its primary access mechanism or network "location". For example, the URL http://www.gutenberg.org/etext/1112 implies that a representation of that resource (such as the home page's current HTML code, as encoded characters) is obtainable via HTTP from a network host named www.gutenberg.org.
A Uniform Resource Name (URN) is a URI that identifies a resource by name in a particular namespace. A URN can be used to talk about a resource without implying its location or how to dereference it. For example, the URN urn:isbn:0-395-36341-1 is a URI that, like an International Standard Book Number (ISBN), allows one to talk about a book, but doesn't suggest where and how to obtain an actual copy of it.
In technical publications, especially standards produced by the IETF and the W3C, the term URL has long been deprecated, as it is rarely necessary to distinguish between URLs and URIs. However, in nontechnical contexts and in software for the World Wide Web, the term URL remains ubiquitous. Additionally, the term web address, which has no formal definition, is often used in nontechnical publications as a synonym for URL or URI, although it generally refers only to 'http' and 'https' URIs.
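The contrast drawn in the technical view can be seen by parsing the two example URIs from the text. The sketch below uses Python's standard `urllib.parse.urlsplit`; the `describe` helper is hypothetical, and splitting the URN path at the first colon into a namespace identifier and a namespace-specific string is a simplification for illustration.

```python
from urllib.parse import urlsplit


def describe(uri: str) -> dict:
    """Summarize the parts of a URI that matter for the URL/URN contrast."""
    parts = urlsplit(uri)
    info = {"scheme": parts.scheme, "host": parts.netloc, "path": parts.path}
    if parts.scheme == "urn":
        # For 'urn' URIs the path has the form <namespace>:<name>.
        namespace, _, name = parts.path.partition(":")
        info["namespace"] = namespace
        info["name"] = name
    return info


# The 'http' URI carries a host to contact; the 'urn' URI carries only a name.
print(describe("http://www.gutenberg.org/etext/1112"))
print(describe("urn:isbn:0-395-36341-1"))
```

The 'http' example yields a host and a path that an HTTP client could use, while the 'urn' example yields an empty host, which matches the statement that a URN does not suggest where or how to obtain a copy.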
Some comments
- Instead of C:/MyDocuments/RomeoAndJuliet.doc say file:///C:/MyDocuments/RomeoAndJuliet.doc (the latter is a URI).
- "A Uniform Resource Name (URN) is like a person's name." It's more like a [social security number]... it is likely that several people actually have the same name. Besides, you give no clues about namespaces. They are a key concept for URNs, which are further subdivided into namespaces, and each namespace has specific rules.
- "There are (...) a one-to-many relation between URN and URL". Citation needed. A many-to-one relation is also possible [RFC 2483][RFC 2126]: "A single resource, however, may have more than one URN to it for different purposes"[RFC 3406]...
- The last paragraph in the "technical" version is essentially non technical. It has no equivalent in the "simple and didactic" version...
Rjgodoy (talk) 16:43, 7 February 2008 (UTC)
PS: have you tried the Simple English Wikipedia? I'm not familiar with it, but it might address your concerns. Rjgodoy (talk) 16:46, 7 February 2008 (UTC)
Suggestion for a general examples section
To clean up and improve the article: with an "Examples" section we can cite examples (e.g., "see example 2") and unify the examples and concept illustrations.
Example | URI | Is a (example for) |
---|---|---|
1 | http://somehost/absolute/URI/with/absolute/path/to/resource.txt | URL, URI reference, absolute URI |
2 | ftp://somehost/resource.txt | URL, URI reference, absolute URI |
3 | urn:issn:1535-3613 | URN, URI reference, absolute URI |
4 | http://wiki.riteme.site/wiki/URI#Examples ("http" is the scheme name, "wiki.riteme.site" is the host part -domain name-, "/wiki/URI" the path pointing to this article, and "#Examples" is a fragment pointing to this section.) | URL, URI reference |
5 | http://example/resource.txt#frag01 | URL, URI reference |
6 | ... | ... |
PS: what are the rules for using link, not-link (nowiki), tt, and code? The suggestion is also to show the rules here and apply them in the article text.
- Also, please list an example of an URI that is neither URL nor URN, as the introduction and diagram clearly tell that an URI may be classified one of both. -- 62.16.209.210 (talk) 08:59, 6 September 2008 (UTC)
RFC 3305 and URL vs URN vs URI
Since most of the relationship and terminology stuff is covered in RFC 3305 and that represents a formal work between the W3C and the IETF, I've taken the liberty of adding it to the Relationship section. Mmealling (talk) 14:23, 9 July 2008 (UTC)
Refinement of specifications
In the section URI#Refinement of specifications, it says that the meaning of the U in URI was changed from Universal to Uniform when [RFC 2396] replaced [RFC 1738]. But the title of 1738 is "Uniform Resource Locators (URL)".
Maybe this is the only error, but it makes me suspicious that the other descriptions of what was changed in each RFC may also be wrong -- I've never studied them myself. So I've flagged the section to ask for expert attention.
--208.76.104.133 (talk) 02:28, 4 October 2008 (UTC)
- The article reads "With the publication of RFC 2396 (...) most parts of RFCs 1630 and 1738 became obsolete." Actually, 1738 is updated (instead of obsoleted) by 2396, and 2396 references 1630 without modifying its status (as of October 2008, RFC 1630 has not been obsoleted). However, 2396 rewords URI (which is not mentioned in RFC 1738, but in RFC 1630). I think this is better expressed by "and all parts of RFCs 1630 and 1738 relating to URIs and URLs in general were revised and expanded", which is already in the article.
- Note that RFC 1738 says nothing about URIs. Indeed, 2396 modifies the acronym from RFC 1630. I will modify the paragraph a bit, but I'll leave the tag to invite other people to evaluate this section.
- Rjgodoy (talk) 07:03, 10 October 2008 (UTC)
- The last line of the paragraph is an overly long and misleading sentence. Actually, the changelog in Section G.3 of RFC 2396 says that "The definition of specific URL schemes and their scheme-specific syntax and semantics has been moved to separate documents." So, I used that sentence from the RFC to replace the existing one. I read the rest and added the W3C IG Note I edited myself (Cool URIs for the Semantic Web), which is also a hint that I am an expert on the topic (leobard=leo sauermann, google will verify this if you don't trust it). I could not verify every point about the changes in the RFCs, but it reads as if it is OK, so I, as a W3C member, now remove the "expert needed" tag. Surely, another expert will come later and correct me, but that is what Wikipedia is about. --Leobard (talk) 19:39, 3 December 2008 (UTC)
I will close this now on the page. Who is responsible for removing this section? Or are these sections ever removed? --Leobard (talk) 19:39, 3 December 2008 (UTC)
Explanatory Image
I've found this SVG useful for describing the differences between URwhatevers. I chose tag: URNs 'cause I don't know if, in light of urn: resolution schemes, urn:s even are URNs. I illustrate ambiguities in mapping URNs to IRIs 'cause, well, they exist. Anyways, do what you will with it. (Do I just need to put a CC license in the SVG header?) EricP (talk) 23:31, 26 March 2009 (UTC)EricP
self-referencing?
I think perhaps one of the examples might be violating that rule about not having Wikipedia talk about itself in the first person or something --TiagoTiago (talk) 05:35, 9 May 2009 (UTC)
Requested move
- The following discussion is an archived discussion of a requested move. Please do not modify it. Subsequent comments should be made in a new section on the talk page. No further edits should be made to this section.
The result of the move request was: Moved to new Uniform resource identifier Mike Cline (talk) 12:18, 24 November 2011 (UTC)
Requested move
Uniform Resource Identifier → Uniform resource identifier (originally written, in error, as "Universal resource identifier") — The term "uniform resource identifier" is a common name, not a proper name. The term is commonly capitalized in the industry because it is usually referred to by its acronym and the caps help the reader relate the acronym to the words in the term. However, on Wikipedia common names that have acronyms are not treated that way (see WP:CAPSACRS). Along with the rename of the article, the spelled-out term would be down-cased as well. Jojalozzo 16:35, 16 November 2011 (UTC)
- Surely you mean Uniform resource identifier, not "universal"? I can't see much to support a rename as well as a capitalization change. 109.155.186.44 (talk) 09:15, 17 November 2011 (UTC)
- Wow! Thanks for paying attention where I obviously didn't. [face red] Jojalozzo 23:42, 17 November 2011 (UTC)
- Support—not a protocol, not proprietary: it's a generic term, even if it has specific meaning. Tony (talk) 06:57, 20 November 2011 (UTC)
- Support per nom and Tony. Uniform Resource Locator has the same issue, and there's been one comment on that talk page suggesting the same kind of move, and no objection. -Pnm (talk) 04:09, 22 November 2011 (UTC)
- The above discussion is preserved as an archive of a requested move. Please do not modify it. Subsequent comments should be made in a new section on this talk page. No further edits should be made to this section.