{"id":394,"date":"2014-11-18T21:23:18","date_gmt":"2014-11-19T02:23:18","guid":{"rendered":"http:\/\/codefork.com\/blog\/?p=394"},"modified":"2014-12-22T19:57:08","modified_gmt":"2014-12-23T00:57:08","slug":"a-viaf-reconciliation-service-for-openrefine","status":"publish","type":"post","link":"https:\/\/codefork.com\/blog\/index.php\/2014\/11\/18\/a-viaf-reconciliation-service-for-openrefine\/","title":{"rendered":"A VIAF Reconciliation Service for OpenRefine"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/codefork.com\/blog\/wp-content\/uploads\/2014\/11\/open-refine.png\" alt=\"open-refine\" width=\"514\" height=\"125\" class=\"aligncenter size-full wp-image-425\" \/><\/p>\n<p><a href=\"http:\/\/openrefine.org\/\">OpenRefine<\/a> is a wonderful tool my coworkers have been using to clean data for my project at work. Our workflow has been nice and simple: they take a CSV dump from a database, transform the data in OpenRefine, and export it as CSV. I write scripts to detect the changes and update the database with the new data.<\/p>\n<p>Over the next few months, we need to reconcile the names of various individuals and organizations with their standard &#8220;universal&#8221; identifiers in the <a href=\"http:\/\/viaf.org\">Virtual International Authority File<\/a>. The tricky part is that any given name in our system might have several candidates in VIAF, so the process can&#8217;t be fully automated: a human being needs to look at the candidates and make a decision. OpenRefine allows you to do this reconciliation, and also provides an interface that lets you choose among candidates.<\/p>\n<p>Communicating with VIAF is not built in, though. <a href=\"http:\/\/iphylo.blogspot.com\/2013\/04\/reconciling-author-names-using-open.html\">Roderic D. M. 
Page wrote a VIAF reconciliation service<\/a>, and it&#8217;s publicly accessible at the address listed on the linked page (the PHP source code is <a href=\"https:\/\/github.com\/rdmpage\/phyloinformatics\/blob\/master\/services\/reconciliation_viaf.php\">available here<\/a>). It works very nicely.<\/p>\n<p>I wanted to write my own version for two reasons: 1) I needed it to support the different name types in VIAF, and 2) I wanted to host it myself, in case I needed to make large numbers of queries, so as not to be an obnoxious burden on Page&#8217;s server.<\/p>\n<p>The project is called <b>refine_viaf<\/b> and the source code is available at <a href=\"https:\/\/github.com\/codeforkjeff\/refine_viaf\">https:\/\/github.com\/codeforkjeff\/refine_viaf<\/a>.<\/p>\n<p>For those who just want to use it without hosting their own installation, I&#8217;ve also made the service publicly accessible at <a href=\"http:\/\/refine.codefork.com\">http:\/\/refine.codefork.com<\/a>, where there are instructions on how to configure OpenRefine to use it.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>OpenRefine is a wonderful tool my coworkers have been using to clean data for my project at work. Our workflow has been nice and simple: they take a CSV dump from a database, transform the data in OpenRefine, and export it as CSV. 
I write scripts to detect the changes and update the database with &hellip; <a href=\"https:\/\/codefork.com\/blog\/index.php\/2014\/11\/18\/a-viaf-reconciliation-service-for-openrefine\/\" class=\"more-link\">Continue reading<span class=\"screen-reader-text\"> &#8220;A VIAF Reconciliation Service for OpenRefine&#8221;<\/span><\/a><\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[7,6,12],"tags":[],"class_list":["post-394","post","type-post","status-publish","format-standard","hentry","category-python","category-user-interface","category-work"],"_links":{"self":[{"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/394","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/comments?post=394"}],"version-history":[{"count":5,"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/394\/revisions"}],"predecessor-version":[{"id":427,"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/posts\/394\/revisions\/427"}],"wp:attachment":[{"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/media?parent=394"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/categories?post=394"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/codefork.com\/blog\/index.php\/wp-json\/wp\/v2\/tags?post=394"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}