I’ve just created a GitHub repository for conciliator, a growing collection of OpenRefine reconciliation services, as well as a Java framework for creating them.
conciliator is a major refactoring of my refine_viaf project and supersedes it. This new project cleanly separates the VIAF-specific parts from the more “boilerplate” pieces needed for any OpenRefine reconciliation service. The result is a framework that allows you to easily write new reconciliation services. My intent here is to make some existing code way more flexible, so that it might be useful to more users and have a longer lifespan.
http://refine.codefork.com has already been running conciliator for a week now; if you’ve been using it, you don’t need to make any changes in OpenRefine.
Out of the box, conciliator can currently query VIAF exactly like refine_viaf does, down to the same URLs. Additionally, conciliator can now query ORCID names. This was a somewhat arbitrary choice; I’ve been doing some ORCID integration at work, so it was convenient for me to implement a data source for it as a proof of concept.
With VIAF and ORCID, conciliator acts as an intermediary or “bridge” service, but it would be possible to use conciliator to query other types of data sources as well: files, SQL databases, etc. Right now, you’d have to write your own code to read and parse files, open database connections, etc. In the future, I hope to add built-in support for these kinds of sources to make them easier to implement.
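To give a rough idea of what “your own code” for a file-backed source might involve, here is a minimal sketch. Everything in it is hypothetical: the `DataSource` interface, the `Candidate` class, and the tab-separated `id<TAB>name` line format are assumptions made for illustration, not conciliator’s actual API (see the README for that). The scoring is also deliberately naive: an exact name match scores 1.0 and a substring match scores 0.5.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class FileSourceSketch {

    /** One reconciliation candidate: an id, a display name, and a score. */
    public static final class Candidate {
        public final String id;
        public final String name;
        public final double score;

        public Candidate(String id, String name, double score) {
            this.id = id;
            this.name = name;
            this.score = score;
        }
    }

    /** Hypothetical data-source contract; NOT conciliator's real interface. */
    public interface DataSource {
        List<Candidate> search(String query, int limit);
    }

    /** Data source backed by lines of "id<TAB>name", e.g. read from a file. */
    public static final class LinesDataSource implements DataSource {
        private final List<String[]> rows = new ArrayList<>();

        public LinesDataSource(List<String> lines) {
            for (String line : lines) {
                String[] parts = line.split("\t", 2);
                if (parts.length == 2) {
                    rows.add(parts); // silently skip malformed lines
                }
            }
        }

        @Override
        public List<Candidate> search(String query, int limit) {
            String q = query.trim().toLowerCase(Locale.ROOT);
            List<Candidate> out = new ArrayList<>();
            for (String[] row : rows) {
                String name = row[1].trim().toLowerCase(Locale.ROOT);
                // Naive scoring: exact match beats substring match.
                double score = name.equals(q) ? 1.0 : (name.contains(q) ? 0.5 : 0.0);
                if (score > 0) {
                    out.add(new Candidate(row[0], row[1], score));
                }
            }
            // Highest score first; the sort is stable, so ties keep input order.
            out.sort((a, b) -> Double.compare(b.score, a.score));
            return new ArrayList<>(out.subList(0, Math.min(limit, out.size())));
        }
    }

    public static void main(String[] args) {
        DataSource source = new LinesDataSource(List.of(
                "p1\tAda Lovelace",
                "p2\tAda Lovelace Institute",
                "p3\tAlan Turing"));
        for (Candidate c : source.search("ada lovelace", 5)) {
            System.out.println(c.id + "\t" + c.name + "\t" + c.score);
        }
    }
}
```

In practice the lines would come from something like `Files.readAllLines(path)`, and a real service would want better matching than a substring test, but the overall shape (take a query, return scored candidates) is the part a reconciliation service needs from any data source.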
For details on how to write your own service in Java using conciliator, see the README.
Are there data sources you’d like to see available as a reconciliation service? Leave a comment on this post. No promises, but I’ll at least consider all requests. And if you write your own service for a data source, please consider submitting your code as a pull request so that others can use it too!
Hey, just wanted to come and say this is a great project! I definitely ‘use this thing’ at UC San Diego. Keep it up.
Much appreciated! Thanks for leaving a comment, Ryan.
This tool is awesome and I can tell you’ve engineered it well, but I haven’t quite figured out how to use it yet. I’m a pretty heavy OpenRefine user, and I have a rather large institutional DSpace repository full of data that needs to be cleaned up quite often, but I haven’t quite figured out how to use conciliator for anything useful with my data yet!
Hi Alan,
Thanks for the note. I would try the OpenRefine Google Group to find out whether others are doing similar reconciliation on their data for clean-up purposes. It might give you some ideas! Here is the link:
https://groups.google.com/forum/#!forum/openrefine
Hope that helps,
Jeff
Hi,
Could you please provide an example of how to use conciliator with an Apache Solr collection?
Good job, and I look forward to your comments.
Thank you!!
Hi
This software is very useful for us. I created a Docker image to run it anywhere: https://hub.docker.com/r/tobinski/docker-codefork-conciliator/
Thank you for taking the time to make that Docker image! I will link to it in the README.
We used the public server for a small proof-of-concept project at the University of Maryland. It has been very useful, and we reference it in a recent paper here: http://hdl.handle.net/1903/21835. Thanks for this!