Hack Blogspot


Showing posts with label RFC

How To Recognize Valid URLs in Text

Twitter and a host of Twitter clients, as well as tons of other programs, have to recognize valid URLs (the syntax is defined in RFC 1738) in plain text and hyperlink them. Unfortunately, due to the laziness (or lack of knowledge) of the programmers, such URL detection schemes are often hare-brained and fail to recognize valid URLs properly. A URL can use a wide variety of characters, and you need to recognize all of them to properly identify and isolate a URL from the surrounding text. Here is a simple guide for programmers (based on RFC 1738, obviously):

In general, URLs are written as follows:

<scheme>:<scheme-specific-part>

A URL contains the name of the scheme being used (<scheme>) followed by a colon and then a string (the <scheme-specific-part>) whose interpretation depends on the scheme.

Scheme names consist of a sequence of characters. The lower case letters "a"--"z", digits, and the characters plus ("+"), period ("."), and hyphen ("-") are allowed. For resiliency...
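The scheme rule above can be sketched in a few lines of Python. This is only an illustration, not a complete URL detector: the function name `split_scheme` is made up for this example, and folding upper case letters to lower case before checking is an assumption taken from RFC 1738's general advice on scheme names, not something spelled out in the excerpt above.

```python
import re

# Characters the text allows in a scheme name:
# lower case letters, digits, plus, period, and hyphen.
SCHEME_RE = re.compile(r"[a-z0-9+.-]+")


def split_scheme(url):
    """Split url at the first colon into (scheme, scheme_specific_part).

    Returns None when there is no colon or when the part before the
    colon is not a valid scheme name. (Hypothetical helper.)
    """
    scheme, colon, rest = url.partition(":")
    if not colon:
        return None
    # Assumption: accept upper case input by folding it to lower case.
    scheme = scheme.lower()
    if not SCHEME_RE.fullmatch(scheme):
        return None
    return scheme, rest


if __name__ == "__main__":
    for candidate in ("http://example.com/a?b=c", "svn+ssh://host/repo",
                      "bad scheme:foo", "no colon here"):
        print(candidate, "->", split_scheme(candidate))
```

A real detector would still have to validate the scheme-specific part and decide where the URL ends inside the surrounding text; this sketch covers only the scheme prefix.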
