
Wikimedia API: Getting Relevant Data From JSON String

This is the question I asked yesterday. I was able to get the required data. The final data is like this. Please follow this link. I tried with the following code to get all the in

Solution 1:

Have you tried DBpedia? Afaik they provide template usage information. There is also a toolserver tool named Templatetiger, which does template extraction from the static dumps (not live).

However, I once wrote a tiny snippet to extract templates from wikitext in JavaScript:

var title; // of the template
var wikitext; // of the page
// Matches {{title ...}}, allowing one level of nested {{...}} or [[...]] inside.
var templateRegexp = new RegExp("{{\\s*"+(title.indexOf(":")>-1?"(?:Vorlage:|Template:)?"+title:title)+"([^[\\]{}]*(?:{{[^{}]*}}|\\[?\\[[^[\\]]*\\]?\\])?[^[\\]{}]*)+}}", "g");
// Matches one "|parameter" chunk, again allowing one nested template or link.
var paramRegexp = /\s*\|[^{}|]*?((?:{{[^{}]*}}|\[?\[[^[\]]*\]?\])?[^[\]{}|]*)*/g;
wikitext.replace(templateRegexp, function(template){
    // logabout(template, "input ");
    var parameters = template.match(paramRegexp);
    if (!parameters) {
        // `page` is not defined in this snippet; it is assumed to describe the current page.
        console.log(page.title + " without parameters:\n" + template);
        parameters = [];
    }
    var unnamed = 1; // counter for positional (unnamed) parameters
    var p = parameters.reduce(function(map, line) {
        line = line.replace(/^\s*\|/,""); // strip the leading pipe
        var i = line.indexOf("=");
        // Without "=", substr(0,-1) yields "", so the parameter gets a numeric key.
        map[line.substr(0,i).trim() || unnamed++] = line.substr(i+1).trim();
        return map;
    }, {});
    // you have an object "p" in here containing the template parameters
});

It handles one level of nested templates, but it is still very error-prone. Parsing wikitext with regular expressions is as evil as trying to do it on HTML :-)
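To see what the snippet above produces, here is a minimal self-contained sketch that wraps the same two regular expressions in a function and runs them on a made-up piece of wikitext. The names `extractTemplateParams` and `escapeRegExp` and the sample string are hypothetical and not part of the original snippet; escaping the title is an extra safety step I added.

```javascript
// Hypothetical helper (not in the original snippet): escapes regex metacharacters
// so that a template title like "Foo (bar)" is matched literally.
function escapeRegExp(text) {
    return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical wrapper around the two regular expressions from the snippet.
// Returns one parameter map per occurrence of the template `title` in `wikitext`.
function extractTemplateParams(title, wikitext) {
    var name = escapeRegExp(title);
    var templateRegexp = new RegExp("{{\\s*" + (title.indexOf(":") > -1 ? "(?:Vorlage:|Template:)?" + name : name) + "([^[\\]{}]*(?:{{[^{}]*}}|\\[?\\[[^[\\]]*\\]?\\])?[^[\\]{}]*)+}}", "g");
    var paramRegexp = /\s*\|[^{}|]*?((?:{{[^{}]*}}|\[?\[[^[\]]*\]?\])?[^[\]{}|]*)*/g;
    var results = [];
    var templates = wikitext.match(templateRegexp) || [];
    templates.forEach(function (template) {
        var parameters = template.match(paramRegexp) || [];
        var unnamed = 1; // counter for positional (unnamed) parameters
        results.push(parameters.reduce(function (map, line) {
            line = line.replace(/^\s*\|/, ""); // strip the leading pipe
            var i = line.indexOf("=");
            // substr(0, -1) yields "", so parameters without "=" get a numeric key.
            map[line.substr(0, i).trim() || unnamed++] = line.substr(i + 1).trim();
            return map;
        }, {}));
    });
    return results;
}

// Made-up sample: one template with a named, a nested and a positional parameter.
var sample = "Intro text {{Infobox|name=Foo|born={{date|1990}}|bar}} more text";
console.log(extractTemplateParams("Infobox", sample));
```

For this sample the result should contain a single object with the keys `name`, `born` (holding the raw nested template text) and `1` for the positional parameter. Anything nested more than one level deep will not be matched correctly.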

It may be easier to query the parse tree from the API: api.php?action=query&prop=revisions&rvprop=content&rvgeneratexml=1&titles=.... From that parse tree you will be able to extract the templates easily.
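A rough sketch of that approach follows. `buildParseTreeUrl` and `collectTemplates` are hypothetical names, and the endpoint in the usage comment is only an example. The second function assumes the parse tree XML wraps each template in a `<template>` element with a `<title>` child and one `<part>` per parameter (each holding `<name>` and `<value>`), which is my understanding of the preprocessor output rather than something stated above. As far as I know, newer MediaWiki releases deprecated `rvgeneratexml` in favor of `action=parse&prop=parsetree`, so check the API help of the wiki you are querying.

```javascript
// Hypothetical helper: builds the request URL from the parameters named above.
// `format=json` is an added assumption so that the response is machine-readable.
function buildParseTreeUrl(apiEndpoint, pageTitle) {
    var params = new URLSearchParams({
        action: "query",
        prop: "revisions",
        rvprop: "content",
        rvgeneratexml: "1",
        titles: pageTitle,
        format: "json"
    });
    return apiEndpoint + "?" + params.toString();
}

// Returns the first direct child element of `el` with the given tag name, or null.
function childByTag(el, tag) {
    return Array.from(el.children).find(function (c) { return c.tagName === tag; }) || null;
}

// Hypothetical helper: walks an XML Document (e.g. from DOMParser in a browser)
// and returns { title, params } for every <template> element, nested ones included.
// Assumed layout: <template><title/><part><name/><equals/><value/></part>...</template>
function collectTemplates(xmlDoc) {
    return Array.from(xmlDoc.getElementsByTagName("template")).map(function (tpl) {
        var titleEl = childByTag(tpl, "title");
        var params = {};
        Array.from(tpl.children).forEach(function (part) {
            if (part.tagName !== "part") return;
            var nameEl = childByTag(part, "name");
            var valueEl = childByTag(part, "value");
            if (!nameEl || !valueEl) return;
            // Positional parameters are assumed to carry an `index` attribute on <name>.
            var key = nameEl.getAttribute("index") || nameEl.textContent.trim();
            // textContent flattens nested templates into plain text (lossy).
            params[key] = valueEl.textContent.trim();
        });
        return { title: titleEl ? titleEl.textContent.trim() : "", params: params };
    });
}

// Usage in a browser, with `xmlString` taken from the API response:
// var doc = new DOMParser().parseFromString(xmlString, "application/xml");
// console.log(collectTemplates(doc));
console.log(buildParseTreeUrl("https://en.wikipedia.org/w/api.php", "Albert Einstein"));
```

Walking the tree avoids the nesting limits of the regular expressions, since the server has already done the parsing.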
