Wikimedia API: Getting Relevant Data From JSON String
This is the question I asked yesterday. I was able to get the required data; the final data looks like this (please follow this link). I tried the following code to get all the in…
Solution 1:
Have you tried DBpedia? AFAIK they provide template usage information. There is also a toolserver tool named Templatetiger, which extracts templates from the static dumps (not live).
However, I once wrote a tiny snippet to extract templates from wikitext in JavaScript:
var title;    // title of the template
var wikitext; // wikitext of the page

var templateRegexp = new RegExp(
    "{{\\s*" +
    (title.indexOf(":") > -1 ? "(?:Vorlage:|Template:)?" + title : title) +
    "([^[\\]{}]*(?:{{[^{}]*}}|\\[?\\[[^[\\]]*\\]?\\])?[^[\\]{}]*)+}}",
    "g"
);
var paramRegexp = /\s*\|[^{}|]*?((?:{{[^{}]*}}|\[?\[[^[\]]*\]?\])?[^[\]{}|]*)*/g;

wikitext.replace(templateRegexp, function (template) {
    // logabout(template, "input ");
    var parameters = template.match(paramRegexp);
    if (!parameters) {
        console.log(title + " without parameters:\n" + template);
        parameters = [];
    }
    var unnamed = 1;
    // build a map from parameter names (or positional indices for unnamed parameters) to values
    var p = parameters.reduce(function (map, line) {
        line = line.replace(/^\s*\|/, "");
        var i = line.indexOf("=");
        map[line.substr(0, i).trim() || unnamed++] = line.substr(i + 1).trim();
        return map;
    }, {});
    // you have an object "p" in here containing the template parameters
});
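As a quick usage sketch with hypothetical values (neither the title nor the wikitext comes from the original question), assign the two variables before the snippet above runs, and the callback builds "p" roughly like this:

// hypothetical inputs, assigned before the code above executes
title = "Infobox person";
wikitext = "Intro text {{Infobox person|name=Albert Einstein|birth_place=Ulm|unnamed value}} more text.";
// inside the callback, "p" then is roughly:
// { name: "Albert Einstein", birth_place: "Ulm", 1: "unnamed value" }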
It handles one level of nested templates, but it is still very error-prone. Parsing wikitext with regular expressions is as evil as trying to do it on HTML :-)
It may be easier to query the parse tree from the API: api.php?action=query&prop=revisions&rvprop=content&rvgeneratexml=1&titles=... From that parse tree you will be able to extract the templates easily.
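Here is a minimal sketch of that approach, assuming a browser environment (fetch, DOMParser), the en.wikipedia.org endpoint, a hypothetical page title, and that rvgeneratexml returns the parse tree as revisions[0].parsetree; the exact response layout may differ between MediaWiki versions:

// Hypothetical example: fetch the XML parse tree of a page and list its templates.
var url = "https://en.wikipedia.org/w/api.php" +
    "?action=query&prop=revisions&rvprop=content&rvgeneratexml=1" +
    "&format=json&origin=*&titles=" + encodeURIComponent("Albert Einstein"); // example title

fetch(url)
    .then(function (res) { return res.json(); })
    .then(function (data) {
        var pages = data.query.pages;
        var page = pages[Object.keys(pages)[0]];
        // assumption: the preprocessor parse tree is returned as an XML string here
        var xml = page.revisions[0].parsetree;
        var doc = new DOMParser().parseFromString(xml, "text/xml");

        // each <template> node has a <title> and <part> children (<name>/<value> pairs);
        // nested templates appear as nested <template> nodes and would need extra handling
        var templates = doc.getElementsByTagName("template");
        for (var t = 0; t < templates.length; t++) {
            var name = templates[t].getElementsByTagName("title")[0].textContent.trim();
            var parts = templates[t].getElementsByTagName("part");
            var params = {};
            for (var i = 0; i < parts.length; i++) {
                var key = parts[i].getElementsByTagName("name")[0];
                var value = parts[i].getElementsByTagName("value")[0];
                // unnamed parameters carry an "index" attribute instead of a text name
                params[(key.getAttribute("index") || key.textContent).trim()] =
                    value ? value.textContent.trim() : "";
            }
            console.log(name, params);
        }
    });

From the resulting name/params pairs you can then pick out the template you are interested in, e.g. the infobox, and read its parameters directly instead of re-parsing the wikitext.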