Trial translations of the abstracts in Code4Lib Journal Issue 17

Issue 17 of the Code4Lib Journal was published on June 1, 2012 (the announcement on the mailing list came on June 25).

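As library people with an interest in systems and web topics probably know, the Code4Lib Journal is an open access journal that publishes three or four issues a year. The article we submitted last year appeared in Issue 15. The Code4Lib community has spread to Japan as well, where a group called Code4Lib JAPAN has been formed. People from Japan also attend the annual conference (see the reports from 2011 and 2012).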

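Below I try translating (somewhat freely) the abstracts of the seven articles[*1] in Issue 17. The truth is, I have been daydreaming about a project in which a group of us would each take a few English-language library journals and, whenever a new issue comes out, quickly summarize the articles in it. And when there is something I want to try, my usual approach is to have a go at it myself first.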

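With this kind of thing, abstracts alone probably do not tell you much about the content, but reading every article properly would probably be too exhausting to keep up. It is a dilemma, but for now I am going with abstract translations plus author introductions only. In other words, I have not read the article bodies at all.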

1. Tools for Reducing and Managing Link Rot in LibGuides

http://journal.code4lib.org/articles/7019

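The first article, by Wilhelmina Randtke and Matthew D. Burrell of the Florida State University libraries in the US, is about LibGuides. There are few LibGuides deployments in Japan, but it is this kind of thing. Okayama University Library[*2] used to run it, and Kyushu University Library runs it now.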

While creating content in LibGuides is quite easy, link maintenance is troublesome, and the built-in link checker offers only a partial solution. The authors describe a method of using PURLs and a third-party link checker to effectively manage links within LibGuides.

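LibGuides makes it easy to create content, but keeping links from going dead is a chore, and the built-in link checker is only a partial solution. This article presents a way to manage links in LibGuides effectively using PURLs and a third-party link checker.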
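A minimal sketch of what a standalone link check could look like, using only the Python standard library. It is not the tool or the PURL setup the authors describe; the labels and the `check_link` helper are assumptions made up for illustration.

```python
import urllib.error
import urllib.request


def classify_status(status):
    """Map an HTTP status code to a coarse link-health label."""
    if 200 <= status < 300:
        return "ok"
    if 300 <= status < 400:
        return "redirect"
    if status in (404, 410):
        return "broken"
    return "check manually"


def check_link(url, timeout=10):
    """Return (url, label) for one link; this call needs network access."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects itself, so the final status is what we see.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return url, classify_status(response.status)
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses are raised as HTTPError and carry the code.
        return url, classify_status(err.code)
    except OSError:
        # DNS failures, refused connections and timeouts all end up here.
        return url, "unreachable"
```

Some servers reject HEAD requests, so a real checker would probably fall back to GET before calling a link broken.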

2. Discovering Digital Library User Behavior with Google Analytics

http://journal.code4lib.org/articles/6942

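The second article, by Kirk Hess, who holds the title of Digital Humanities Specialist at the University of Illinois at Urbana-Champaign in the US, is about analyzing digital library usage with Google Analytics.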

Google Analytics has advanced features for tracking search queries, events such as clicking external links or downloading files, which you can use to track user behavior that is normally difficult to track with traditional web logging software. By tracking behavior, you can use Google Analytics API to extract data and integrate it with data from your digital repository to show granular data about individual items. Using this information, digital libraries can learn how users use the site without extensive HCI studies, and can use this information to improve the user experience.

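Google Analytics has features for capturing search queries and events such as external link clicks or file downloads, which let you learn about user behavior that is hard to track with conventional log analysis software. By pulling this data out through the Google Analytics API and combining it with data from a digital repository, you can show fine-grained information about each item in the repository. With this information, you can learn how users use the digital library without large-scale studies[*3], and improve the user experience.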
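The "combine analytics data with repository data" step can be pictured roughly as follows. This sketch assumes the event rows have already been exported from Google Analytics into plain dictionaries; the field names (`item_id`, `action`, `count`) and the sample titles are hypothetical, and no Google Analytics API call is shown.

```python
from collections import Counter


def downloads_per_item(event_rows, item_titles):
    """Sum download events per item ID and attach repository titles."""
    totals = Counter()
    for row in event_rows:
        if row["action"] == "download":
            totals[row["item_id"]] += row["count"]
    # most_common() orders items from most to least downloaded.
    return [
        (item_id, item_titles.get(item_id, "(unknown item)"), count)
        for item_id, count in totals.most_common()
    ]


# Hypothetical exported event rows and repository metadata.
rows = [
    {"item_id": "a1", "action": "download", "count": 3},
    {"item_id": "b2", "action": "external_link", "count": 5},
    {"item_id": "a1", "action": "download", "count": 2},
    {"item_id": "c3", "action": "download", "count": 1},
]
titles = {"a1": "Campus map, 1910", "c3": "Yearbook, 1925"}
report = downloads_per_item(rows, titles)
```

Here `report` holds one tuple per downloaded item, which is the kind of per-item view the abstract calls granular data.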

3. The Martha Berry Digital Archive Project: A Case Study in Experimental pEDagogy

http://journal.code4lib.org/articles/6823

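The third article is by Stephanie A. Schlitz, an associate professor of linguistics at Bloomsburg University of Pennsylvania in the US, and Garrick S. Bodine, an IT manager for a software development team at Penn State University Park. In the Martha Berry Digital Archive project that the article covers, Schlitz is the director and Bodine the lead programmer.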

Using the Martha Berry Digital Archive Project as an exploratory case study, this article discusses experimental methods in digital archive development, describing how and why a small project team is leveraging undergraduate student support, a participatory (crowdsourced) editing model, and free and open source software to digitize and disseminate a large documentary collection.

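Taking the Martha Berry Digital Archive project as an exploratory case study, this article discusses experimental methods in digital archive development. It describes how and why a small project team is using undergraduate student support, a participatory (crowdsourced) editing model, and free and open source software to digitize and publish a large collection.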

4. Using Semantic Web Technologies to Collaboratively Collect and Share User-Generated Content in Order to Enrich the Presentation of Bibliographic Records–Development of a Prototype Based on RDF, D2RQ, Jena, SPARQL and WorldCat’s FRBRization Web Service

http://journal.code4lib.org/articles/6695

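The fourth one... that title is way too long, lol. It was written by three people at Oslo and Akershus University College of Applied Sciences in Norway, a school I had not heard of. It seems you can study archival science and library and information science there. First author Ragnhild Holgersen is a master's student who is apparently researching how semantic web technologies can help people with allergies find safe food. Michael Preminger and David Massey are associate professors, and Massey apparently once worked as a systems librarian at Oslo Public Library.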

In this article we present a prototype of a semantic web-based framework for collecting and sharing user-generated content (reviews, ratings, tags, etc.) across different libraries in order to enrich the presentation of bibliographic records. The user-generated data is remodeled into RDF, utilizing established linked data ontologies. This is done in a semi-automatic manner utilizing the Jena and the D2RQ-toolkits. For the remodeling, a SPARQL-construct statement is tailored for each data source. In the data source used in our prototype, user-generated content is linked to the relevant books via their ISBN. By remodeling the data according to the FRBR model, and expanding the RDF graph with data returned by WorldCat’s FRBRization web service, we are able to greatly increase the number of entry points to each book. We make the social content available through a RESTful web service with ISBN as a parameter. The web service returns a graph of all user-generated data registered to any edition of the book in question in the RDF/XML format. Libraries using our framework would thus be able to present relevant social content in association with bibliographic records, even if they hold a different version of a book than the one that was originally accessed by users. Finally, we connect our RDF graph to the linked open data cloud through the use of Talis’ openlibrary.org SPARQL endpoint.

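This article presents a prototype of a semantic web-based framework that collects and shares user-generated content (reviews, ratings, tags, etc.) across different libraries in order to enrich the presentation of bibliographic records. The user-generated data is remodeled into RDF using established linked data ontologies. This is done semi-automatically with Jena and the D2RQ toolkit. For the remodeling, a SPARQL CONSTRUCT statement is tailored to each data source. In the data source used for the prototype, user-generated content is linked to the relevant books by ISBN. By remodeling the data according to the FRBR model and expanding the RDF graph with data from WorldCat's FRBRization web service, the number of entry points to each book can be increased dramatically. The social content is offered through a RESTful web service that takes an ISBN as a parameter. The web service returns the user-generated data registered for any edition of the book in RDF/XML format. With this framework, libraries can display social content related to a book even when it was written about a different edition from the one they hold. Finally, the authors connect their RDF graph to the linked open data cloud through Talis' openlibrary.org SPARQL endpoint.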

# Hmm, I do not really get this one...
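One part that is easy to picture is the lookup "give me the reviews for any edition of this book". The sketch below uses plain Python dictionaries instead of RDF and SPARQL, so it is not the authors' implementation. The `work_of_isbn` mapping stands in for whatever an FRBRization service would return, and the ISBN-like strings and work IDs are made-up placeholders.

```python
def reviews_for_any_edition(isbn, work_of_isbn, reviews_by_isbn):
    """Collect reviews attached to every edition of the same work."""
    key = isbn.replace("-", "")
    work = work_of_isbn.get(key)
    if work is None:
        # Unknown work: fall back to reviews for this exact ISBN only.
        return list(reviews_by_isbn.get(key, []))
    collected = []
    for other_isbn, other_work in sorted(work_of_isbn.items()):
        if other_work == work:
            collected.extend(reviews_by_isbn.get(other_isbn, []))
    return collected


# Placeholder data: two editions of one work, plus an unrelated book.
work_of_isbn = {
    "1111111111": "work-A",
    "2222222222": "work-A",
    "3333333333": "work-B",
}
reviews_by_isbn = {
    "1111111111": ["Loved it"],
    "2222222222": ["Good translation"],
    "3333333333": ["Too long"],
}
found = reviews_for_any_edition("2222222222", work_of_isbn, reviews_by_isbn)
```

A library holding only the second edition would still get the review written about the first, which is the benefit the abstract claims for the FRBR-based remodeling.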

5. GLIMIR: Manifestation and Content Clustering within WorldCat

http://journal.code4lib.org/articles/6812

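The fifth article, by four people at OCLC, reports on the current state of the GLIMIR project[*4]. They are Janifer Gatenby, metadata manager in the EMEA[*5] program; Richard Greene, senior consulting database specialist; W. Michael Oskins, a developer/researcher with 20 years at OCLC; and Gail Thornburg, a senior-level developer/researcher. First author Gatenby alone seems to work in Leiden, the Netherlands, rather than at the headquarters in Ohio, US, and also seems to do work related to ISNI (International Standard Name Identifier).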

The GLIMIR project at OCLC clusters and assigns an identifier to WorldCat records representing the same manifestation. These include parallel records in different languages (e.g., a record with English descriptive notes and subject headings and one for the same book with French equivalents). It also clusters records that probably represent the same manifestation, but which could not be safely merged by OCLC’s Duplicate Detection and Resolution (DDR) program for various reasons. As the project progressed, it became clear that it would also be useful to create content-based clusters for groups of manifestations that are generally equivalent from the end user perspective (e.g., the original print text with its microform, ebook and reprint versions, but not new editions). Lessons from the GLIMIR project have improved OCLC’s duplicate detection program through the introduction of new matching techniques. GLIMIR has also had unexpected benefits for OCLC’s FRBR algorithm by providing new methods for identifying outliers thus enabling more records to be included in the correct work cluster.

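The GLIMIR project at OCLC clusters WorldCat records by manifestation and assigns an identifier to each cluster. The clusters include "parallel records" in different languages (e.g., one record describing a book in English and another describing the same book in French). They also include records that probably represent the same manifestation but that could not be safely merged by the Duplicate Detection and Resolution (DDR) program for various reasons. As the project progressed, it became clear that it would also be useful to build content-based clusters for groups of manifestations that are generally equivalent from the user's point of view (such as a print edition with its microform, ebook and reprint versions, but not new editions). Lessons from the project led to improvements in the DDR program through the introduction of new matching techniques. The project also brought unexpected benefits to the FRBR algorithm, by providing methods for identifying outlier records so that more records can be placed in the correct work cluster.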

6. Case Study: Using Perl and CGI Scripts to Automate a Quality Control Workflow for Scanned Congressional Documents

http://journal.code4lib.org/articles/6731

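The sixth article, by Doreva Belfiore, is a report on automating image-quality checks for scanned documents. She was a digital library intern at the Rutgers University law library in the US from 2009 to 2011 and received her MLIS from Drexel University in 2011. She now works as an assistant in the digital initiatives unit at Temple University. The article seems to be a write-up of her experience as an intern.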

The Law Library Digitization Project of the Rutgers University School of Law in Camden, New Jersey, developed a series of scripts in Perl and CGI that take advantage of the open-source module PerlMagick to automatically review the image quality of scanned government documents. By implementing these procedures, Rutgers was able to save staff working hours for document quality control by an estimated 25% from the previous manual-only workflow. These scripts can be adapted by novice Perl and CGI programmers to review and manipulate large numbers of text and image files using commands available in PerlMagick and ImageMagick.

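The Law Library Digitization Project at the Rutgers University School of Law developed Perl and CGI scripts that use the open-source Perl module PerlMagick to automatically review the image quality of scanned government documents. This allowed Rutgers to cut the staff hours spent on quality control by an estimated 25% compared with the previous manual-only workflow. Even novice Perl and CGI programmers can adapt these scripts to review and manipulate large numbers of text and image files using PerlMagick and ImageMagick commands.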

7. From the Catalog to the Book on the Shelf: Building a Mapping Application for Vufind

http://journal.code4lib.org/articles/6924

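Last one. The seventh article is by three people at the Yale University Library in the US. Kathleen Bauer founded the library's usability and assessment department and is currently its director. She has a master's degree in mathematics, which makes me feel a real kinship. Michael Friscia is the manager of the digital library and programming services department, which handles the library website, digital asset management and so on. This was news to me, but apparently he is the developer of Ladybird. Scott Matheson is in charge of digital resources. He earned an MLIS and a JD (Juris Doctor) at the University of Washington and has worked in public services and IT at the library, so my impression is that he is still relatively early in his career.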

At Yale University Library (YUL), recorded reference transactions revealed that after finding a book in the catalog patrons had difficulty knowing how to use the call number to find the book on the shelf. The Library created a mobile service to help locate the call number in the library stacks. From any call number of a book in Sterling Memorial Library at YUL, a map will be displayed which highlights that call number’s general area on a floor in the stacks. YUL introduced the mapping application in Yufind, a catalog in place at Yale since 2008 which is based on Vufind.

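At the Yale University Library, reference records revealed that patrons who had found a book in the online catalog had trouble working out how to use the call number to find the book on the shelf. The library therefore developed a mobile service that helps locate a call number in the stacks. For any call number of a book in Sterling Memorial Library, a map is displayed that highlights the general area of that call number on a floor of the stacks. The mapping application was introduced in YuFind, the VuFind-based online catalog in place at Yale since 2008.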
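The core of such a service is a lookup from call number to stack area. Here is a minimal sketch of that idea, assuming Library of Congress style call numbers that start with one to three class letters. The `STACK_RANGES` table and the floor labels are invented for illustration and say nothing about the real layout of Sterling Memorial Library or about how YuFind does it.

```python
import re

# Hypothetical stack layout: (first class, last class, floor label).
STACK_RANGES = [
    ("A", "DZ", "Floor 2"),
    ("E", "HZ", "Floor 3"),
    ("J", "PZ", "Floor 4"),
    ("Q", "ZZ", "Floor 5"),
]


def floor_for_call_number(call_number):
    """Return the floor whose class range covers the call number, or None."""
    match = re.match(r"\s*([A-Za-z]{1,3})", call_number)
    if match is None:
        return None
    letters = match.group(1).upper()
    for first, last, floor in STACK_RANGES:
        # Plain string comparison is enough for alphabetic class letters.
        if first <= letters <= last:
            return floor
    return None
```

A real application would map the result to a highlighted region on a floor plan image and would also need to handle the numeric part of the call number when one class spans several aisles.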

Impressions after translating

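I commute every day, about 90 minutes each way across several trains. Of that, I can sit down and open my MacBook Air for only about 30 minutes in total. Using those scraps of time, I translated these abstracts in about four days after Issue 17 was released. I could live with a week like this once or twice a month, so I might be able to cover about four quarterly titles.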

Other journals I keep an eye on include...

College & Research Libraries
Library Hi Tech
Journal of Interlibrary Lending, Document Delivery & Electronic Reserve
Interlending & Document Supply
Serials Review
arXiv cs.DL

and so on.

*1: There is also an editorial by Tim Lepczyk and a report on the Code4Lib 2012 conference by Amy Unger.

*2:http://ir.lib.hiroshima-u.ac.jp/00025875

*3: The original says HCI (Human-Computer Interaction) studies. A research field that investigates the relationship between people and computers?

*4:http://current.ndl.go.jp/e1223

*5:Europe, Middle East and Africa. http://www.oclc.org/councils/emea/default.htm