New node.js microformats parser - microformat-node
I have built a node.js microformats parser based on my previous JavaScript parsing code. It has been packaged so you can easily add it to your projects using npm.
Source code: https://github.com/glennjones/microformat-node
Test server: http://microformat-node.jit.su
Install
npm install microformat-node
or
git clone http://github.com/glennjones/microformat-node.git
cd microformat-node
npm link
Use
with URL
var shiv = require('microformat-node');
shiv.parseUrl('http://glennjones.net/about', {}, function(data){
    // do something with data
});
or with raw html
var shiv = require('microformat-node');
var html = '';
shiv.parseHtml(html, {}, function(data){
    // do something with data
});
with URL for a single format
var shiv = require('microformat-node');
shiv.parseUrl('http://glennjones.net/about', {'format': 'XFN'}, function(data){
    // do something with data
});
Supported formats
Currently microformat-node supports the following formats:
hCard, XFN, hReview, hCalendar, hAtom, hResume, geo, adr and tag. It's important to use the right case when specifying the format, both in the options object and in the format query string parameter.
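Because the format names are case-sensitive, it can help to normalise user input before passing it on. The following is a minimal sketch, not part of microformat-node: normalizeFormat is a hypothetical helper, and the list is simply copied from the names above.

```javascript
// Format names exactly as listed in this post.
var SUPPORTED_FORMATS = ['hCard', 'XFN', 'hReview', 'hCalendar', 'hAtom',
                         'hResume', 'geo', 'adr', 'tag'];

// Hypothetical helper (not part of microformat-node): returns the
// correctly cased format name, or null if the name is not in the list.
function normalizeFormat(name) {
    var wanted = String(name).trim().toLowerCase();
    for (var i = 0; i < SUPPORTED_FORMATS.length; i++) {
        if (SUPPORTED_FORMATS[i].toLowerCase() === wanted) {
            return SUPPORTED_FORMATS[i];
        }
    }
    return null;
}
```

The returned value can then be used as the 'format' option or query string value; a null result means the name is not one of the formats listed here.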
Response
The parser returns JSON. This is an example of two geo microformats found in a page.
{
    "microformats": {
        "geo": [{
            "latitude": 37.77,
            "longitude": -122.41
        }, {
            "latitude": 37.77,
            "longitude": -122.41
        }]
    },
    "parser-information": {
        "name": "Microformat Shiv",
        "version": "0.2.4",
        "page-title": "geo 1 - extracting singular and paired values test",
        "time": "-140ms",
        "page-http-status": 200,
        "page-url": "http://ufxtract.com/testsuite/geo/geo1.htm"
    }
}
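As a rough illustration of working with a response shaped like the one above, the sketch below counts how many items of each format were found. countMicroformats is a hypothetical helper, and it assumes the result is either a JSON string or an already parsed object with a "microformats" property, as in the example.

```javascript
// Hypothetical helper (not part of microformat-node): counts the items
// found for each format in a response shaped like the example above.
function countMicroformats(data) {
    // Assumption: data may arrive as a JSON string or as a parsed object.
    var result = typeof data === 'string' ? JSON.parse(data) : data;
    var found = (result && result.microformats) || {};
    var counts = {};
    Object.keys(found).forEach(function (format) {
        counts[format] = Array.isArray(found[format]) ? found[format].length : 0;
    });
    return counts;
}

// Trimmed copy of the example response.
var sample = {
    "microformats": {
        "geo": [
            { "latitude": 37.77, "longitude": -122.41 },
            { "latitude": 37.77, "longitude": -122.41 }
        ]
    }
};
console.log(countMicroformats(sample));
```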
Querying demo server
Start the server binary:
$ bin/microformat-node
Then visit the server URL
http://localhost:8888/
Using the server API
Provide the URL of the web page and the format(s) you wish to parse, either as a single value or as a comma-delimited list:
GET http://localhost:8888/?url=http%3A%2F%2Fufxtract.com%2Ftestsuite%2Fhcard%2Fhcard1.htm&format=hCard
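The page URL has to be percent-encoded before it goes into the query string. A minimal sketch of building the request URL follows; buildQueryUrl is a hypothetical helper, and it assumes the demo server is running on localhost:8888 as described above.

```javascript
// Hypothetical helper: builds a request URL for the demo server.
// formats may be a single name or an array of names.
function buildQueryUrl(serverUrl, pageUrl, formats) {
    var list = Array.isArray(formats) ? formats : [formats];
    var formatValue = list.map(function (name) {
        return encodeURIComponent(name);
    }).join(',');  // comma-delimited list, as the server expects
    return serverUrl + '?url=' + encodeURIComponent(pageUrl) +
        '&format=' + formatValue;
}

var requestUrl = buildQueryUrl(
    'http://localhost:8888/',
    'http://ufxtract.com/testsuite/hcard/hcard1.htm',
    'hCard'
);
console.log(requestUrl);
```

Passing an array such as ['hCard', 'XFN'] produces format=hCard,XFN.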
You can also use the hash (#) fragment of a URL to target only part of an HTML page. The parser then targets the HTML element whose id matches the fragment.
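One thing to watch when sending such a URL to the server API: the # must be percent-encoded as %23, otherwise it is treated as the fragment of the server URL and never reaches the server. The sketch below assumes a hypothetical element id of "profile"; targetElement is an illustrative helper, not part of microformat-node.

```javascript
// Hypothetical helper: points a page URL at the element with the given id,
// replacing any fragment the URL already has.
function targetElement(pageUrl, elementId) {
    return pageUrl.split('#')[0] + '#' + elementId;
}

// "profile" is a made-up id used only for illustration.
var target = targetElement('http://glennjones.net/about', 'profile');

// Encoded form, suitable for the url query string parameter.
var encodedTarget = encodeURIComponent(target);
console.log(target);
console.log(encodedTarget);
```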
Viewing the unit tests
The module includes a page which runs the ufxtract microformats unit test suite.
http://localhost:8888/unit-tests/
Notes for Windows install
microformat-node uses a module called ‘jsdom’, which in turn uses ‘contextify’, which requires a native code build.
There are a couple of things you normally need to do to compile node code on Windows.
- Install python 2.6 or 2.7, as the build scripts use it
- Run npm inside a Visual Studio shell, so for me, Start->Programs->Microsoft Visual Studio 2010->Visual Studio Tools->Visual Studio Command Prompt
If you have the standard release of node it will probably be x86 rather than x64. For x64 there is a different Visual Studio shell, but it is usually in the same place.
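If you are not sure which build of node you have, node itself can tell you. This small sketch relies only on process.arch and process.platform, which are built into node; describeArch is a hypothetical helper that maps the value to the names used above.

```javascript
// process.arch is 'ia32' for a 32-bit (x86) build of node
// and 'x64' for a 64-bit build.
function describeArch(arch) {
    if (arch === 'ia32') { return 'x86'; }
    if (arch === 'x64') { return 'x64'; }
    return arch;  // any other architecture is reported unchanged
}

console.log('node build: ' + describeArch(process.arch) +
    ' on ' + process.platform);
```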