Art museum collections online: Extending their reach

Joan Beaudoin, Wayne State University, USA


Online collections are able to extend the multiple important functions of museums to many individuals beyond the confines of museum walls. Thus, this paper examines the availability of, and ease of access to, art museum collections presented online via institutional web sites. Focusing on art museum collections in the United States, this project analyzes the data behind the representation of museum objects in the online setting to codify and report on shared descriptive and multimedia practices. Furthermore, it highlights common difficulties (e.g., determining the sort order of retrieved items, and the meaning of field labels) visitors experience while examining online collections. This presentation seeks to clarify the current state of the art surrounding online art museum collections and provide context for why studies of visitors to museum web sites report limited use of online collections. Questions concerning the fundamental features of online collection access, such as what percentage of collections are available online and the semantic commonalities found across museums, are explored and allow us to reflect upon how far museums have extended their reach.

Keywords: online collections, digital objects, online visitors, museum users, knowledge organization, documentation


As all art reflects the time and place in which it was created, the underlying mission of art museums is to engage, instruct, inspire, and enrich lives. Art inevitably causes each of us to look at our life and role in our shared human history. These multiple important functions can be extended to the many individuals beyond the confines of museum walls through their interactions with online collections. Access to collections has been expanded with the use of technologies which facilitate interactions with online content among widely dispersed users. This situation expands access beyond the limited number of collection items that are available for viewing during in-person visits to museums and helps to remove multiple barriers which exist in accessing artworks.

An acknowledgement of the importance of access to museum collections can be seen in the topic’s prominence in the guiding principles of national and international organizations supporting museums and their staff. The various codes of ethics published by these organizations all note that providing access to collections is of primary importance (American Alliance of Museums, 2000; Association of Art Museum Directors, 2011; International Council of Museums, 2017). Some institutions are making strides to fulfill their promise of full access to collections by placing collection records and sometimes media files (i.e., images, audio, and video) associated with works online and available for public consumption. However, little is known about the overall state of online access to U.S. art museum collections, and how users interact with and experience these collections. As a means of addressing this situation this exploratory research project presents preliminary findings about the availability and fundamental properties of online collections of U.S. art museums and galleries.


Art museum collections contain items that are representative of our shared cultural heritage, and yet these items remain largely inaccessible. Interaction with collections is at the heart of the mission of museums, as artworks are understood to enrich the lives of individuals in a variety of ways. Art has been noted as enhancing lives by increasing knowledge through research and learning, inspiring creative processes, developing a sense of identity and belonging, and bringing enjoyment through the ability to transport us from the mundane experiences of everyday life (Villaespesa, 2017; Whelan, 2015; Elwick, 2013; Keene, 2006).

Access to items held within museums is impacted by a multiplicity of factors at institutional (e.g., limited exhibition spaces, costs associated with conservation and display, restricted hours of operation), societal (e.g., transportation routes, economics, educational practices), and individual (e.g., limited finances and/or leisure time, mobility issues, distance to collection) levels, and online technologies offer a means of expanding access to museum collections. An acknowledgement of this situation can be found in the number of museums which currently provide access to some portion of their collections through their websites. While research focusing on museum websites has been addressed through a number of studies (Capriotti, Carretón & Castillo, 2016; Kabassi, 2016; Lopatovska, 2015; López, et al., 2010; Ellenbogen, Falk & Goldman, 2008; Marty, 2008b; Pallas & Economides, 2008) and online technologies have developed and expanded, access to online museum collections continues to be an under-researched area.

Several strands of research intersect in the proposed study. The first of these concerns the problems associated with the varied approaches taken in the descriptive practices within museums. The varied practices found in the descriptive records of museum collections have been identified as resulting from the descriptive needs of unique users, items and/or collections (Bearman, 2008; Peacock, 2008; Zorich, 2008; Besser, 1997), institutional traditions (Marty, 2008a; Peacock, 2008), limited time available for the work (Marty, 2008a), vagaries of historical record keeping (Williams, 2010), varied approaches to describing items (Bearman, 2008; Peacock, 2008; Zorich, 2008), and a lack of standards within the profession (Peacock, 2008). These mixed approaches have been noted as being a stumbling block to the development of shared descriptive resources and federated collections in the museum sector (Fortier & Ménard, 2017).

The second thread woven into the fabric of the study is the set of difficulties experienced by users of online art collections. Various professionals who commonly seek images online for their work were found to be frustrated by their experiences relating to the description of the items, search interfaces, retrieval of items, and image quality (Beaudoin & Brady, 2009; Matusiak, 2006). The difficulty people experience when interacting with museum collections, due to the differences between the terminology of museum professionals and that of the general public, was found through the project (Trant, 2006). A recent study of the visitors to the Whitney Museum of American Art’s website suggests that these online collections find limited use when compared to other areas of the site (Stewart & Nullman, 2018). Why these online collections are under-utilized by online visitors is unknown.

The final thread highlighting the need for this research was discovered during a graduate level library and information science course examining digital collections. When examining the online collections of a series of art museums, students struggled to recognize basic features of online collections (e.g., number of items displayed, sort order of retrieved items, meanings of fields, advanced search functions, media interactions). As the students enrolled in the course were all in the final stages of their coursework and had taken multiple courses to develop their understanding of information systems, it was clear that the art museums’ online collections warranted additional investigation. The reasons behind the difficulties experienced by users of these online collections and why the online collections are underutilized are two areas driving the proposed research. This phase of the study examines the systems in preparation for a future study of user experiences with the art museums’ online collections.

Research Questions

A single high-level research question was used to focus the study on developing foundational knowledge about U.S. art museums’ online collections. From this, a series of questions guided the finer-grained data collection and analysis carried out through the study. The research questions framing the study are:

What is the current state of online access to U.S. art museum collections?

  • What percentage of art museums provide online access to their collections?
  • How representative are online collections of institutional holdings?
  • What technical functions (i.e., browsing, searching, retrieval, etc.) are provided for interacting with collections?
  • What knowledge organization practices (i.e., fields, controlled vocabularies, etc.) are utilized in the online collections?
  • What types of media files are present in the online collections, and what percentage of collection coverage is provided through these digital surrogates?
  • What social functions (i.e., sharing to social media, emailing, tagging) are provided for interacting with collections?


The 2018 Museum Data File published by the Institute of Museum and Library Services (IMLS) was used to analyze the current state of U.S. art museums’ online collections (Frehill and Pelczar, 2018). The data file consists of 3 files, the first of which contains 3241 entries for institutions coded as ART. Each of these entries was reviewed manually and its website examined. Thirty-one of the entries coded as ART were removed because they did not pertain to fine art and instead pertained to aviation, history, music, railroads, etc. If an institution held a mixed collection, it was retained in the data file if the collection contained artworks. There were many duplicate entries, name variations, and/or sub-collections belonging within an already named institutional collection, and these entries were removed from the data file (N=656). The entries made for galleries and museums within a single academic institution were treated separately when they appeared in the data file, as some within the same parent institution held collections while others did not. Several additional entries (N=3) were not included in the analysis because of a lack of information regarding their status on their websites. For example, one gallery of an academic institution had a website, but the only information provided was its name. For an online collection to be identified as a system for this study, it had to contain a search function. This was selected as the primary criterion for inclusion in the study, since Web pages displaying digital surrogates of works (i.e., digital images, audio, or video) would permit browsing actions present in most museum online collections.

Using the list of institutions in the IMLS data file, 51 U.S. art museums were selected at random from the total number with online collections that contained a search function (N=311). The data collection instrument used to record information about each system consisted of a series of 110 data points. These were used to examine the collection, the design of the online collection system, the available user interactions, the descriptive information provided about artworks, the media types present and the system functions. An online survey was used to collect the data, and case ordered displays, content analyses, and descriptive statistics were used to analyze the data.
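The random selection described above can be illustrated with a minimal sketch. The paper does not describe the actual selection procedure, so the placeholder identifiers, the function name draw_sample, and the seed value are all assumptions made for illustration only.

```python
import random


def draw_sample(population_ids, k, seed):
    """Draw k distinct ids without replacement, reproducibly for a given seed."""
    rng = random.Random(seed)  # local generator so the draw can be repeated
    return sorted(rng.sample(population_ids, k))


# Placeholder ids standing in for the 311 institutions with searchable
# online collections; these are not the study's real identifiers.
system_ids = list(range(1, 312))

# Hypothetical seed; the study does not report one.
sample = draw_sample(system_ids, 51, seed=2019)
print(len(sample), "systems selected from", len(system_ids))
```

Fixing the seed is a design choice that lets the same 51 systems be recovered later, which may be useful if the sample is to be extended in a follow-up phase.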

Online Access to U.S. Art Museum Collections

To provide a broad overview of the current state of access to art museum collections, an analysis of the IMLS’s 2018 Museum Data File (Frehill and Pelczar, 2018) was completed. A review of each entry in the data revealed that several of the institutions (N=67, ~2.6%) listed in the file are closed, either temporarily or permanently. After multiple Web searches were performed for each entry in the file, it was determined that approximately 13% (N=333) did not have an institutional website. For those institutions with websites, each was examined to determine if it was a collecting or non-collecting organization. It was discovered through this analysis that roughly 55% (N=1186) of the institutions with websites either explicitly state they are non-collecting, or only report on current and/or changing exhibitions and provide no other evidence of holding artworks. The websites of institutions with collections (N=959) were examined to reveal if online collection systems were available. Most of the institutions (~68%, N=648) with collections did not provide an online system with which to interact with their holdings. Online collection systems were found among approximately 32% (N=311) of the institutions with collections. It should be noted here that while the online collection systems were available for these museums, access to the entire collection through the system was rarely encountered. Partial collection coverage or access to sub-collections was more commonly encountered among those institutions with online collection systems. This overview of the current state of access to institutional collections helped set the stage for the analysis conducted of the museums’ online collection systems.
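The counts reported above can be cross-checked with a few lines of arithmetic. The sketch below uses only figures stated in the text. The denominators are assumptions: the closed and no-website shares are taken against the entries remaining after the exclusions described in the methods, and the non-collecting share against the sum of non-collecting and collecting institutions. The variable names and the helper share are illustrative.

```python
# Counts reported in the text for the IMLS 2018 Museum Data File analysis.
art_entries = 3241      # entries coded as ART
not_fine_art = 31       # removed: not fine art
duplicates = 656        # removed: duplicates, name variations, sub-collections
no_status_info = 3      # removed: status could not be determined

# Assumed base for the "closed" and "no website" percentages.
reviewed = art_entries - not_fine_art - duplicates - no_status_info

closed = 67
no_website = 333
non_collecting = 1186
collecting = 959
with_system = 311
without_system = 648


def share(part, whole):
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


print("entries reviewed:", reviewed)
print("closed %:", share(closed, reviewed))
print("no website %:", share(no_website, reviewed))
print("non-collecting %:", share(non_collecting, non_collecting + collecting))
print("with online system %:", share(with_system, collecting))
print("without online system %:", share(without_system, collecting))
```

Under these assumptions the computed shares agree with the rounded percentages given in the text, and the two groups of collecting institutions sum to 959.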

Fundamental Aspects of Online Collections

In order to determine how representative online collections are of institutional holdings, the museum-reported figures for items held were collected from their websites. The number of items held by the museums ranged from a low of approximately 2,500 items to a high of over 2,000,000 items. Several (N=14, ~27%) of the museums did not provide a figure for the size of their holdings on their websites. The number of items found across the 51 institutional online collection systems ranged from approximately 300 to 470,000. While determining this figure would seem to be a straightforward task, in practice this was not the case. Only 8 museums (~16%) made the number of items in their online systems explicit, and for the remaining institutions various methods were used to determine this figure (e.g., performing an empty search, counting the items within all browsable groups). The percentage of collection holdings available online, which ranged from less than 1% to greater than 100% coverage, was calculated using the ratio of the number of items available online vs. institutional holdings. The reason for the online collection coverage of greater than 100% is unknown, but a few possibilities come to mind. It may be that collection holdings were under-reported, multi-part items may have been counted variously (e.g., a series of photographs was counted as 1 item for the collection holding figure, but each photograph may have been given its own online item record), and/or online collection systems contain records for more than just artworks. Donor, exhibition and/or artist records were sometimes retrieved in response to search queries within the museums’ systems in addition to object records.
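The coverage calculation described above is a simple ratio, sketched below. The museum names and item counts are invented for illustration and are not data from the study. The function name coverage_percent is likewise an assumption.

```python
def coverage_percent(items_online, items_held):
    """Percentage of reported holdings that appear in the online system.

    Returns None when no holdings figure is available, since the ratio
    cannot be computed in that case.
    """
    if items_held is None or items_held <= 0:
        return None
    return 100 * items_online / items_held


# Hypothetical museums: (items online, items reported as held).
examples = {
    "Museum A": (300, 45_000),      # very partial coverage
    "Museum B": (12_000, 10_000),   # more online records than reported holdings
    "Museum C": (8_500, None),      # holdings figure not published
}

for name, (online, held) in examples.items():
    pct = coverage_percent(online, held)
    label = "not computable" if pct is None else f"{pct:.1f}%"
    print(name, label)
```

A result above 100%, as for the hypothetical Museum B, signals that the numerator and denominator were counted on different bases, which is consistent with the possible explanations offered in the text.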

Functionality of Online Collection Systems

A variety of ways to interact with the online collections were found in use across the museums. These consisted of various kinds of searching (keyword and advanced), browsing institutionally defined groups of items, sharing via social media and email, tagging items, and the display and manipulation of media files. Keyword searches, performed by entering text in a search box of the collection system, were found among all museums examined in the study (N=51, 100%). Most museums (N=43, ~84%) provided pre-selected item sets for browsing (a mouse-driven action where labelled categories of items are presented as a list or grid, usually with a representative cover image). Many ways to group items were found, and commonly encountered sets for browsing included: artist/maker, time, place, culture, medium, donor, object type, museum department, exhibition, items on loan, deaccessioned items, items on view, and “collection highlights.”

Advanced searching methods that were found consisted of entering keywords within specific fields, selecting terms from controlled pick-lists associated with fields, manipulating sliding selectors for earliest and latest dates, and selecting radio buttons or boxes that change based on field. Methods of advanced searches varied across the systems and were found to be purely textual, a combination of selections and text, or based on selections only. The ability to perform advanced searches was found among many (N=34, ~67%) of the online collections. The number of fields used to construct advanced searches ranged from a low of 2 to a high of 29, with 9 to 11 searchable fields being found most often.

Search strategies using wildcard characters such as * or ? to replace characters and Boolean operators (AND, OR, NOT) to expand or limit search queries were found to work in fewer than half of the museums’ online collection systems. To test search outcomes, a baseline keyword search was performed for the term “portrait” in each system. The retrieved items were examined, and the number of retrieved items was noted. Next, to determine if wildcard characters were supported, searches were performed using “portrait*,” “portrait?,” “m*n,” “m?n,” “wom*n,” and “wom?n” and the retrieved items examined. This process revealed that wildcard characters were functional in several systems (N=23, ~45%). It should be noted here that the characters used for the wildcard searches varied across systems, and those in use were rarely made explicit.
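What the wildcard test queries are expected to match can be sketched with Python's standard fnmatch module. This assumes the common convention in which * stands for any run of characters and ? for exactly one character. As noted above, the museum systems varied in the characters they accepted, so this shows the convention only and not any particular system's behavior. The term list and the function name wildcard_matches are illustrative.

```python
from fnmatch import fnmatchcase


def wildcard_matches(pattern, terms):
    """Return the terms matched by a wildcard pattern, ignoring case."""
    return [t for t in terms if fnmatchcase(t.lower(), pattern.lower())]


# Hypothetical index terms, not drawn from any museum system.
terms = ["portrait", "portraits", "portraiture",
         "man", "men", "woman", "women", "mountain"]

for pattern in ["portrait*", "portrait?", "m*n", "m?n", "wom*n", "wom?n"]:
    print(pattern, wildcard_matches(pattern, terms))
```

Under this convention "portrait?" matches "portraits" but not "portrait" itself, and "m*n" also matches unrelated terms such as "mountain", which is one reason retrievals from wildcard queries need to be examined and not only counted.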

To assess whether Boolean operators were supported, keyword searches using the terms “man,” “woman,” and “bust” were combined with “portrait” using each of the three Boolean operators and the retrievals compared. It was discovered that Boolean operators were functional in a minority of the museums’ online collection systems (N=19, ~37%). The effectiveness of the search algorithms in place within each system was also assessed using multiple keyword searches of “portrait, man, woman, bust.” It was discovered through this search process that a few museum systems (N=10, ~20%) captured all items with the portrait concept in the items retrieved. Most of the museums (N=41, ~80%) failed to recall all collection items matching the portrait concept and several systems returned “false drops,” items that were clearly not portraits (e.g., Chippendale sofa, doors, jugs, etc.). Sculptural busts and other artworks with individualized depictions of men and women were found within the online collection systems that had not been retrieved using the keyword search of “portrait.” In the case of a few museums, items tagged with the term “portrait” by the general public were found to play a role in revealing items that would likely not have been retrieved otherwise.
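The Boolean comparisons and the notion of recall used above can be made concrete with set operations. The sketch below uses a small invented index mapping keywords to record identifiers. The index contents, the set of relevant records, and the function name recall are assumptions for illustration and do not reproduce any museum system.

```python
# Hypothetical inverted index: keyword -> ids of records containing it.
index = {
    "portrait": {1, 2, 3, 4},
    "man": {2, 5},
    "woman": {3, 6},
    "bust": {4, 7},
}

# The three Boolean operators expressed as set operations.
portrait_and_man = index["portrait"] & index["man"]   # both terms present
portrait_or_man = index["portrait"] | index["man"]    # either term present
portrait_not_man = index["portrait"] - index["man"]   # first term without second


def recall(retrieved, relevant):
    """Fraction of relevant records that were retrieved (None if none are relevant)."""
    if not relevant:
        return None
    return len(retrieved & relevant) / len(relevant)


# Assume, for illustration, that all seven records depict individuals and
# so belong to the portrait concept, whether or not they use that word.
relevant_portraits = {1, 2, 3, 4, 5, 6, 7}

print("AND:", sorted(portrait_and_man))
print("OR:", sorted(portrait_or_man))
print("NOT:", sorted(portrait_not_man))
print("recall of keyword 'portrait':", recall(index["portrait"], relevant_portraits))
```

In this toy case the single keyword retrieves four of the seven relevant records, mirroring the situation reported above in which busts and other individualized depictions were missed by a search for "portrait" alone.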

The way retrieved items are presented in a system has important implications for users, and so a series of aspects concerning the display of retrievals was examined. It was discovered that most systems (N=41, ~80%) provided the user with the number of items retrieved in response to a query. The ordering principle used to select and display the retrieved items could be determined in many systems (N=34, ~67%). Relevance was the most often identified principle used to order retrieved items (N=21, ~41%). However, it was unclear which field(s) were being used to determine relevance in the systems, and whether individual fields were more highly weighted in the analysis. Other ordering principles used in sorting the retrieved items were found less frequently (Artist N=5, ~10%; Title N=5, ~10%; Accession number N=3, ~6%). The ability to re-sort the retrieved items using one of the displayed fields, useful to processes surrounding the assessment of items, was found in several museums (N=21, ~41%).
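Re-sorting a result set by a displayed field, as discussed above, amounts to ordering records on a chosen key. The following is a minimal sketch with invented records. The field names, values, and the function name resort_items are assumptions, and the sketch says nothing about how any museum system implements sorting.

```python
def resort_items(records, field):
    """Return records ordered by the given field, case-insensitively.

    Records that lack a value for the field are placed last.
    """
    def sort_key(record):
        value = record.get(field)
        return (value is None, "" if value is None else str(value).casefold())

    return sorted(records, key=sort_key)


# Hypothetical retrieved items; none of these are real records.
records = [
    {"id": 1, "artist": "Maker C", "title": "Untitled", "accession": "2001.5"},
    {"id": 2, "artist": "maker a", "title": "Seated Figure", "accession": "1987.12"},
    {"id": 3, "artist": None, "title": "Bust of a Woman", "accession": "1950.3"},
    {"id": 4, "artist": "Maker B", "title": "Portrait Study", "accession": "1999.1"},
]

print([r["id"] for r in resort_items(records, "artist")])
print([r["id"] for r in resort_items(records, "title")])
```

Accession numbers are compared here as plain strings, which would not order values such as "1987.2" and "1987.12" numerically. A production system would need a dedicated key for that field.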

Tagging and other Web 2.0 technologies were incorporated into several of the museums’ online collection systems. A facility allowing users to tag collection items was found in the online collection systems of a fraction (N=6, ~12%) of institutions. More prevalent were means of sharing collection content through social media (N=28, ~55%) and e-mail (N=24, ~47%). The most commonly found social media services among the museum systems were Facebook (N=24, ~47%), Twitter (N=24, ~47%), Pinterest (N=14, ~27%), Tumblr (N=10, ~20%), the now defunct Google+ (N=10, ~20%), and Instagram (N=6, ~12%). The support of social media sharing through the systems of several museums (N=7, ~14%) is worthy of special mention as they provided in excess of 100 different social media options.

In addition to the descriptive entries made within the item records, media files perform an important role in user interactions with the collection systems. This importance is evidenced in the fact that all systems (N=51, 100%) included in the study provided representative images for their collection objects. Nevertheless, image coverage in these systems was limited, with only a fraction of museums (N=18, ~35%) providing images for every collection object within their systems. Additional media were provided by several museums in the form of online audio (N=6, ~12%) and video (N=4, ~8%) content. When these time-based media were present within the online systems, they were limited to narratives discussing collection objects rather than offering richer media documentation of the items.

The display of, and user interactions with, images in the online collection systems were found to vary. Some systems (N=6, ~12%) provided a single modestly-sized image embedded in the item record with no other means of interaction, while other systems provided users a means for enlarging images (N=12, ~24%), panning around images (N=3, ~6%), and/or zooming into the image to view details (N=8, ~16%). A few museums provided multiple views of a single item (N=4, ~8%), thereby offering users additional means of visual examination.

To get a sense of the data structures behind the various systems, the fields present in the various collections’ item records were examined. At a basic level of analysis, it was discovered that the number of fields in the item records across the museums’ systems ranged between 11 and 29 separate fields of data. The most commonly occurring number of fields in the item records found in the systems was 13 to 17. Next, the field labels for all online collection systems were collected and examined. Labels identifying the kinds of data contained in each field of the item records, a basic element of system design, were found among just over half (N=29, ~57%) of the museums’ systems. The remaining systems either failed to label any of the fields contained within their item records (N=7, ~14%) or only provided some field labels (N=15, ~29%).

In contrast to the relatively low number of field labels present in the item records of the systems, most of the museums’ systems (N=40, ~78%) contained one or more fields with associated lists of controlled entries. The number of controlled entry fields found in use within each system ranged from 2 to 10, with most online collections containing 2 to 6 controlled lists. The controlled entry field for Artist / Maker saw the highest adoption across the museums (N=34, ~67%). This was followed by controlled entry fields for Object Type (sometimes labeled Classification), Collection (meaning museum department), and Location (meaning either where an item was located within the museum or its geographical origin). Each of these lists was found in use among 17 (~33%) museums. Several additional controlled lists, while found less frequently, were present in multiple institutions. These consist of Culture / Nationality (N=9, ~18%), Medium (N=8, ~16%), Subject (N=7, ~14%), Technique (N=6, ~12%), and Exhibition (N=3, ~6%).

Discussion and Conclusions

Interactions with online collections should be researched further as they offer a way for institutions to support their missions and expand their reach. The analysis of the IMLS’s Museum Data File found that a modest number of institutions (N=959) hold art collections in the U.S., and that a limited number of these (N=311) offer some form of online access to their holdings. Approximately 32% of all U.S. art museums with holdings provide online collection systems that are openly available to the general public. With less than one-third of collecting institutions providing online access, it is clear there are numerous opportunities to help galleries and museums increase the number of individuals they touch via the Web.

Collecting and analyzing the art museums’ online collections was complicated by the amount of variation that was encountered across the systems. Basic aspects of system design such as features available for browsing, performing advanced searches, examining media files, and the description of collection objects showed little consensus. Even relatively straightforward characteristics, such as the presence of field labels in item records and the percentage of holdings available in the online systems, showed a great deal of variation across institutions. Nevertheless, some interesting details emerged.

The overall characteristics of interacting with online collection systems are that each system provided a means of performing a keyword search, with a majority (N=43, ~84%) also providing browsable categories of objects. The browsable sets of objects showed marked variation across the institutions even at a conceptual level. Variations in the categories used speak to the richness of collections (and language), and the multiplicity of ways with which collections can be examined. As browsing is an exploratory way of interacting with collections, the variations found are likely to reveal previously unexplored items to users.

Retrieval effectiveness of the systems was shown to be dependent on the descriptive entries provided by the museum staff and, in the few cases where they were present (N=6, ~12%), by the public in the form of tags. These findings present a strong case for investment into collection description, particularly at the conceptual or subject level, and for the development of systems allowing for tagging by users. Furthermore, as it is understood that the terminology in use by museums is a poor match for that of the general public, controlled lists of entries could also aid in bridging the semantic gap present in museum collection systems. The use of controlled lists within these systems would also assist users through query development via linkages to scope notes, synonyms, broader and narrower terms, and related terms. This level of terminological support does not currently appear to be present within these systems. In fact, only a small number of institutional collections (N=7, ~14%) offered a list of controlled terms for Subject entries.
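The kind of terminological support described above can be sketched as a small thesaurus entry that is used to expand a user's query. The entry below is invented for illustration and is not taken from any published vocabulary or museum system. The structure and the function name expand_query are assumptions.

```python
# Hypothetical controlled-vocabulary entry with the linkages named in the text.
thesaurus = {
    "portrait": {
        "scope_note": "Depictions of specific, identifiable individuals.",
        "synonyms": {"likeness"},
        "broader": {"figure"},
        "narrower": {"bust", "self-portrait"},
        "related": {"sitter"},
    },
}


def expand_query(term, vocabulary, include_related=False):
    """Return the set of search terms for a query, using the controlled list.

    The original term is always kept. Synonyms and narrower terms are added
    because they name the same or more specific concepts. Related terms are
    optional because they can broaden the search too far.
    """
    key = term.lower()
    expanded = {key}
    entry = vocabulary.get(key)
    if entry is None:
        return expanded  # term not in the controlled list; search it as typed
    expanded |= entry["synonyms"] | entry["narrower"]
    if include_related:
        expanded |= entry["related"]
    return expanded


print(sorted(expand_query("Portrait", thesaurus)))
print(sorted(expand_query("landscape", thesaurus)))
```

With such an entry in place, a search for "portrait" would also reach records described only as busts, which is the type of item the keyword tests failed to retrieve in most systems.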

A topic related to the terminological differences found between the general public and the museum staff is the discovery that many field labels are absent in the systems. Just over half of the systems (N=29, ~57%) provided labels for each field. Limited or no field labels mean that users may not understand the conceptual basis behind the descriptive entries made for collection objects. For example, if the term “Polish” appears within an unlabeled field, the term might represent nationality, material or technique. For users whose knowledge and experience do not include a familiarity with the descriptive structures and terminologies present in the cultural heritage sector this situation likely adds an additional, impenetrable layer to the museums’ collections.

An additional means of improving search results would be to ensure the integration of wildcard searching and Boolean logic within these systems. These basic search logics were in use among a minority of systems examined (wildcards N=23, ~45%; Boolean logic N=19, ~37%). Wildcard characters would provide users with the ability to search for term variations (e.g., portrait, portraits, portraiture) without having to think of, and search for, all terms that may have been used in the system. Boolean logic allows users to construct more precise queries and thereby improve retrieval results. The ability to specify in a search query that all terms must be present, that either term may be present, or that some terms should be excluded is a common function in computer-based systems, and yet it functioned in few museum systems. The lack of search logics available, combined with the complexities found in the interface designs used for performing advanced searches, creates a problematic situation for users querying these systems.

Finally, while all museums provided at least some images to represent items in their collection, the available media interactions were limited in terms of functionality and format. Larger image sizes, a larger area for image display, and tools for interactions such as panning and zooming should be present in all systems whose main focus is artworks or cultural materials. While audio and video files were found in a few systems, these presented analyses of items rather than capturing additional visual or auditory data of collection objects.

Interestingly, the weaknesses discussed here did not result in gloomy assessments of the systems among the library and information science students who examined the online collections. They reported that they were generally happy with their experiences, even while noting the various shortcomings they may have encountered. It was clear they were happy to have access to the items they were able to retrieve and were fully appreciative of the items’ associated descriptive information and digital surrogates. How closely the students’ responses mirror those of the general public is an aspect to be explored further in the next phase of the research.

Future Work

With less than one-third of collecting institutions providing online access, it is clear there are numerous opportunities to help galleries and museums expand their reach. These system interactions allow for the discovery, amusement, awe, inspiration, research, and learning experienced by individuals examining collection objects online, and offer a means by which institutions can support their mission and touch the lives of many individuals beyond their local communities.

The baseline understanding of online collections developed through this study offers valuable information to inform future research into this topic. Data collection and analysis of additional museum systems will continue with the aim of identifying and categorizing the museums’ systems into groups based on their design and functionality. Representative online collections selected from among these categories would be useful for investigating user experiences with the museums’ systems. Future work in this arena is proposed.

Lastly, to gain insights into the context surrounding the systems, the findings presented here will be discussed and explored with museum staff whose work involves online collections. The critical insights of these individuals will help illuminate museum practices that have helped shape the current systems and will offer a more nuanced account of what was discovered.


U.S. Art Museums Included in Analysis (data collected spring 2019)

AD&A Museum at the University of California Santa Barbara
Art Institute of Chicago
Barnes Foundation
Birmingham Museum of Art
Boston Museum of Fine Arts
Brooklyn Museum
Cantor Arts Center at Stanford University
Cleveland Museum of Art
Clyfford Still Museum
Corcoran Gallery of Art
Dahesh Museum of Art
Dallas Museum of Art
Davison Art Center at Wesleyan University
Denver Art Museum
Detroit Institute of Arts
Fairfield University Art Museum
Frick Collection
Georgia O’Keeffe Museum
Gregory Allicar Museum of Art
High Museum of Art
Indianapolis Museum of Art
J. Paul Getty Museum
Jule Collins Smith Museum of Fine Art at Auburn University
Kirkland Museum of Fine and Decorative Art
Los Angeles County Museum of Art
Lowe Art Museum at the University of Miami
Metropolitan Museum of Art
Montgomery Museum of Fine Arts
Museum of Fine Arts Houston
National Gallery of Art
Nelson-Atkins Museum of Art
Philadelphia Museum of Art
Pomona College Museum of Art
Portland Art Museum
Rhode Island School of Design Museum
Saint Louis Art Museum
San Diego Museum of Art
San Francisco Museum of Modern Art
San Jose Museum of Art
Santa Barbara Museum of Art
Smithsonian American Art Museum
Smithsonian National Portrait Gallery
University of California Berkeley Art Museum and Pacific Film Archive
University of Colorado Boulder Art Museum
University of Arizona Museum of Art
University of Denver University Art Collections
Virginia Museum of Fine Arts
Wadsworth Atheneum Museum of Art
Walters Art Museum
Whitney Museum of American Art
Yale Center for British Art

