Reading XML – XSL or PHP?
Friday, April 6th, 2007
XML, or Extensible Markup Language, is probably the most versatile markup language there is.
Since you make up your own tags and then get readers to parse them, you can use the data in them for pretty much anything. Well, today I was making my PHP reader display a bunch of URLs based on the category they were in. They also had special attributes and so on, all of which were contained in an XML file.
As I was looking at my sitemap, I realized that it had formatting, but it was an XML file, and XML files can’t format themselves. So, I found out that it called an XSL (or Extensible Stylesheet Language) file that gave it formatting. XSL is a language used to turn XML into HTML or XHTML. Call it glorified HTML or XHTML with built-in functions for reading XML. You can make entire pages, or parts of pages, in XHTML and pull in the data from an XML file.
If this is the language made for the job, why should you use the roundabout method offered in PHP? It took me some time to think through the advantages of each. Mind you, I don’t know XSL (I started learning today), so I don’t know its full functionality. I do know, however, that anyone visiting the page can view both the XSL file and the XML file. That’s the difference.
PHP code is server-side, meaning it is completely executed before the page reaches you, so you can’t see it. This includes the call to the XML file, which means you can display data from an XML file without letting the surfer know that an XML file was involved. I count that as a security feature, and I rather like it for my current purpose of displaying the URLs.
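As a rough sketch of that server-side approach (written in Python rather than PHP, purely for illustration), the snippet below reads an XML document and returns only the finished HTML for one category. The XML layout, the `category` and `nofollow` attributes, and the `render_links` helper are all assumptions made up for this example, not my actual file or reader.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical XML layout; the element and attribute names are assumptions.
LINKS_XML = """\
<links>
  <link category="search" nofollow="yes">http://www.google.com</link>
  <link category="mail">http://mail.google.co.uk</link>
  <link category="search">http://www.example.com</link>
</links>
"""


def render_links(xml_text, category):
    """Return an HTML list containing only the URLs in the given category."""
    root = ET.fromstring(xml_text)
    items = []
    for link in root.findall("link"):
        # Skip links that belong to a different category.
        if link.get("category") != category:
            continue
        url = (link.text or "").strip()
        # Turn the (hypothetical) nofollow attribute into a rel attribute.
        rel = ' rel="nofollow"' if link.get("nofollow") == "yes" else ""
        items.append(
            f'<li><a href="{escape(url, quote=True)}"{rel}>{escape(url)}</a></li>'
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>"


print(render_links(LINKS_XML, "search"))
```

The visitor only ever receives the string that `render_links` returns; the XML itself never has to be sent to the browser.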
Now, I assume that XSL has a lot more XML functionality than PHP, seeing as it is a language built specifically to convert XML into HTML or XHTML, whereas PHP is a complete, general-purpose programming language. PHP actually didn’t have much of its XML support until PHP 5 came out.
One more thing while we’re on the subject of URLs. I needed to grab just the domain name out of a URL. For example, from “http://www.google.com” I want just “google.com”, while “http://mail.google.co.uk” should give “mail.google.co.uk”. I decided to see if I could put my regex skills to work, and I came up with this: “(?:www\.)?([^.\/]+\.(?:[a-zA-Z]+\.)*(?:[a-zA-Z]{2}\.[a-zA-Z]{2}|[a-zA-Z]{2,4}))” which matches both of those examples. You simply have to grab the captured group (there’s only one) and it will hold the right result.
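Here is a small sketch of that regex in use, again in Python rather than PHP. The pattern is copied unchanged; the `extract_domain` helper name is made up for the example. Keep in mind that the pattern is not anchored to the host part of the URL, so it is only a rough tool: a URL such as “http://localhost/index.html” would yield “index.html”.

```python
import re

# The pattern from the post, unchanged. Group 1 is the domain without a
# leading "www.".
DOMAIN_RE = re.compile(
    r"(?:www\.)?([^.\/]+\.(?:[a-zA-Z]+\.)*(?:[a-zA-Z]{2}\.[a-zA-Z]{2}|[a-zA-Z]{2,4}))"
)


def extract_domain(url):
    """Return the first domain-like match in url, or None if there is none."""
    match = DOMAIN_RE.search(url)
    return match.group(1) if match else None


print(extract_domain("http://www.google.com"))     # google.com
print(extract_domain("http://mail.google.co.uk"))  # mail.google.co.uk
```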
-Kerry
