Recent Innovations: Short Papers and Posters

Starts at
Tue, Nov 7, 2023, 16:00 South Korea Time
( 07 Nov 23 07:00 UTC )
Finishes at
Tue, Nov 7, 2023, 17:00 South Korea Time
( 07 Nov 23 08:00 UTC )
Venue
Room 203
Moderator
Jian Qin

Moderator

  • Jian Qin

    Syracuse University

    Jian Qin is Professor at the School of Information Studies, Syracuse University. Her research focuses on metadata and knowledge modeling, knowledge organization, research data management, and scholarly communication. She has published widely and given presentations at numerous national and international conferences and workshops. Her research has been funded by the U.S. National Science Foundation, U.S. National Institutes of Health, and Institute of Museum and Library Services, among others. Jian Qin is a co-author of the Metadata book and the recipient of the 2020 Frederick G. Kilgour Award for Research in Library and Information Technology.

Presentations

Metadata in Trustworthy AI: From Data Quality to ML Modeling

Authors: Jian Qin and Bei Yu

This concept paper focuses on what role metadata play in an AI lifecycle and how metadata research can ride this AI wave with innovative creations. Specifically, we explore metadata’s role and potential in relation to data quality and ML models. The multidimensionality of metadata for data in AI is driving metadata to be micro-specific, embedded in data and models, highly computational, and fast-moving or agile. While there are no universally agreed-upon metadata schemas for documenting the artifacts in ML model development, there are some common areas or types of metadata for ML models. Data quality and ML models are tightly connected and can impact one another in significant ways. Trustworthy AI must rely on quality data and responsible, ethical, reproducible, verifiable ML models, and the assurance of these data and ML model properties relies on metadata. The complex, fast-paced, and highly computational nature of metadata for AI artifacts (datasets, models, pipelines, algorithms, lineages, etc.) is making conventional metadata development processes and methods outdated, but it has also prompted some innovative metadata creations.

  • Jian Qin

    Syracuse University

    Jian Qin is Professor at the School of Information Studies, Syracuse University. Her research focuses on metadata and knowledge modeling, knowledge organization, research data management, and scholarly communication. She has published widely and given presentations at numerous national and international conferences and workshops. Her research has been funded by the U.S. National Science Foundation, U.S. National Institutes of Health, and Institute of Museum and Library Services, among others. Jian Qin is a co-author of the Metadata book and the recipient of the 2020 Frederick G. Kilgour Award for Research in Library and Information Technology.

  • Bei Yu

    Syracuse University

    My research areas are Natural Language Processing and Computational Social Science. My recent work focuses on using machine learning and natural language processing techniques to improve science information quality, such as developing NLP methods for identifying exaggerated claims in science papers and press releases.

BIBFRAME Interoperability Group One Year Update

Authors: Xiaoli Li and Niklas Lindström

The international BIBFRAME Interoperability Group (BIG) was initiated by the Program for Cooperative Cataloging (PCC) with the goal to "work collaboratively on the development and maintenance of interoperable BIBFRAME data guidelines to support production level implementation, to address issues restricting interoperability, and to inform development of associated toolings and infrastructure." (BIBFRAME Interoperability Group charge, April 15, 2022). The idea originated at the PCC BIBFRAME Data Exchange Meeting (September 2021), where the participants identified different implementation decisions of the BIBFRAME ontology as major obstacles to successful BIBFRAME data exchange. The conclusion was reached that an international group was needed to continue the conversation in this area.

BIG officially took up its work in June 2022. In its first year of existence, BIG reviewed the output of several related working groups, conducted a BIBFRAME implementation survey and analyzed the results, helped organize the 2022 Linked Data Summit held at the Library of Congress in November 2022, and developed a work plan in response to the recommendations from the Summit.

This work plan outlines four working areas: 1) defining a standard BIBFRAME “shape” necessary for data exchange utilizing PCC data and standards as a starting point; 2) creating recommendations that are readable by both technical staff and librarians (preferably only necessitating updates to be made in one place); 3) codifying the interoperability scope; and 4) documenting best practices for technical aspects of BIBFRAME interchange as identified through the work of the group and sharing with consultants for testing and validation of assumptions.

This poster presentation will first introduce the BIBFRAME Interoperability Group, including its charge, governance, and membership as well as the process for assembling this group. It will then go over the work plan, focusing on sharing our experience with using DC Tabular Application Profiles (DC TAP) and generating a SHACL representation of the application profile. Lastly, the presentation will outline the next steps that BIG is planning to take, including determining patterns from concrete use cases to anchor the practical value from which the various profile decisions stem.
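As a rough illustration of the DC TAP to SHACL step mentioned above, the sketch below turns a few tabular profile rows into SHACL property shapes written as Turtle text. It is not BIG's tooling or profile: the `ex:WorkShape` shape, the two rows, and the helper names `tap_to_shacl` and `flag` are assumptions made for this example, and only a handful of DC TAP columns (shapeID, propertyID, propertyLabel, mandatory, repeatable, valueNodeType, valueDataType) are handled.

```python
import csv
import io

# Hypothetical DC TAP rows; the shape and property choices are illustrative only.
TAP_CSV = """shapeID,propertyID,propertyLabel,mandatory,repeatable,valueNodeType,valueDataType
ex:WorkShape,bf:title,Title,true,true,IRI,
ex:WorkShape,bf:originDate,Origin date,false,false,literal,xsd:string
"""

PREFIXES = """@prefix sh: <http://www.w3.org/ns/shacl#> .
@prefix bf: <http://id.loc.gov/ontologies/bibframe/> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix ex: <http://example.org/> .
"""

NODE_KINDS = {"iri": "sh:IRI", "literal": "sh:Literal", "bnode": "sh:BlankNode"}


def flag(value):
    """Read a TAP boolean cell: True, False, or None when left empty."""
    value = value.strip().lower()
    if value in {"true", "1"}:
        return True
    if value in {"false", "0"}:
        return False
    return None


def tap_to_shacl(tap_csv):
    """Convert DC TAP rows into SHACL node shapes serialised as Turtle text."""
    shapes = {}
    for row in csv.DictReader(io.StringIO(tap_csv)):
        constraints = [f"sh:path {row['propertyID']}"]
        if row["propertyLabel"]:
            constraints.append(f'sh:name "{row["propertyLabel"]}"')
        if flag(row["mandatory"]) is True:
            constraints.append("sh:minCount 1")   # mandatory: at least one value
        if flag(row["repeatable"]) is False:
            constraints.append("sh:maxCount 1")   # not repeatable: at most one value
        node_kind = NODE_KINDS.get(row["valueNodeType"].strip().lower())
        if node_kind:
            constraints.append(f"sh:nodeKind {node_kind}")
        if row["valueDataType"]:
            constraints.append(f"sh:datatype {row['valueDataType']}")
        shapes.setdefault(row["shapeID"], []).append(" ; ".join(constraints))

    blocks = []
    for shape_id, props in shapes.items():
        body = " ;\n".join(f"    sh:property [ {p} ]" for p in props)
        blocks.append(f"{shape_id} a sh:NodeShape ;\n{body} .")
    return PREFIXES + "\n" + "\n\n".join(blocks)


print(tap_to_shacl(TAP_CSV))
```

Keeping the profile in a table and deriving the SHACL from it is one way to maintain recommendations in a single place while serving both librarians (the table) and technical staff (the shapes).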

  • Xiaoli Li

    University of California Davis Library

    Xiaoli Li is the Head of Content Support Services at the UC Davis Library. She started working with linked data in 2013 and led her library’s participation in various linked data projects. She currently coordinates the ELUNA/IGeLU Linked Open Data Community of Practice Working Group and the Chinese Culture and Heritage Wikidata group. She also serves on the BIBFRAME Interoperability Group and the Share-VDE Share Family National Bibliographies Working Group, and co-teaches a six-week introductory course on linked data.

Designing a Linked Data Service across borders and timezones: the National Library Board’s experience

This paper presents the implementation of a Linked Data Management System (LDMS) at the National Library Board Singapore (NLB), which aims to provide a unified view of bibliographic descriptions from diverse collections across the National Library, the Public Libraries, and the National Archives. The National Library Board worked with a vendor to convert metadata from multiple sources into entities in a triplestore for use in resource discovery. This paper will outline the challenges faced and lessons learnt in producing Linked Data and improving data quality. The global pandemic forced the team to work remotely with an overseas vendor, which compounded the difficulty of communicating relatively complex concepts and troubleshooting data issues. Great emphasis was placed on determining the causes of data issues and correcting them. Challenges arose from the insertion of URIs into the source records to identify entities matching the string label, because the reconciliation of entities extracted from different source systems is based on the similarity of the name label and associated properties. The paper concludes with an outlook on the continuous refinement of data quality and the development of public interfaces to demonstrate the benefits of Linked Data to stakeholders. The development of this service inserts the discovery aspect directly into resources, demonstrating the potential of Linked Data to shape future services.
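The abstract does not describe NLB's actual matching rules, so the following is only a minimal sketch of what reconciliation by name-label similarity plus an associated property could look like. The record layout, the `birth_year` property, the 0.85 threshold, the example.org URIs, and the function names are all assumptions made for illustration.

```python
from difflib import SequenceMatcher


def normalise(label):
    """Lower-case a name label and drop commas and extra spaces."""
    return " ".join(label.lower().replace(",", " ").split())


def label_similarity(a, b):
    """Similarity of two name labels in the range 0.0 to 1.0."""
    return SequenceMatcher(None, normalise(a), normalise(b)).ratio()


def reconcile(record, candidates, threshold=0.85):
    """Return the URI of the best-matching candidate entity, or None."""
    best_uri, best_score = None, 0.0
    for cand in candidates:
        score = label_similarity(record["name"], cand["name"])
        # Associated property check (hypothetical): a conflicting birth year
        # rules the candidate out, an agreeing one raises the score slightly.
        if record.get("birth_year") and cand.get("birth_year"):
            if record["birth_year"] != cand["birth_year"]:
                continue
            score = min(1.0, score + 0.1)
        if score > best_score:
            best_uri, best_score = cand["uri"], score
    return best_uri if best_score >= threshold else None


# Hypothetical entities extracted from two source systems.
candidates = [
    {"uri": "https://example.org/entity/1", "name": "Lim, Mei Ling", "birth_year": 1960},
    {"uri": "https://example.org/entity/2", "name": "Lim Mei Ling", "birth_year": 1985},
]
record = {"name": "Lim Mei Ling", "birth_year": 1960}
print(reconcile(record, candidates))
```

The example also hints at why label-based matching is fragile: two entities can share an identical label, and only the associated property separates them, which is the kind of case where a wrong URI could end up in a source record.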

  • Robin Dresel

    National Library Board

    Robin’s work has been closely linked to the internet since taking his MSc in Germany in the early 2000s. After a brief stint in SEO, he joined Goethe-Institut Singapore as Webmaster in 2004, the same role in which he joined the National Library Board five years later. After gaining operational experience as a library manager, he shifted his focus back to his roots and is now looking into the development of Linked Data Services, including the implementation of the latest Linked Data Management System.