Redesigning Getty.edu with Structured Content

Annelisa Stephan, J. Paul Getty Trust, USA, Amelia Wong, J. Paul Getty Trust, USA

Abstract

In 2018 the J. Paul Getty Trust embarked on a multi-year project to overhaul its vast and decentralized website with the aim of improving user experience (UX), reaching a broader and more diverse global public, and more effectively presenting Getty as an organization. The project went beyond a visual redesign to address information architecture, UX strategy, a technical rebuild, and a content management system (CMS) replatform, as well as the creation of a new governance structure. Phase 1 of this work, in 2019, resulted in 12 "front-door" pages that established a new design direction and the outline of a new information architecture. One central goal of the work was to move from content "blobs" to structured content, separating data from presentation for sustainability and governance. Our demonstration and paper, authored by the content strategy leads, provide insights into our approach (still in its early stages at this writing) and into the opportunities and challenges of approaching content as data.

Keywords: redesign, content strategy, websites, structured content, web development, strategy, digital transformation

Introduction

In January 2020 the J. Paul Getty Trust went live with the first 12 pages of a multi-year, cascading website redesign of Getty.edu. The redesign aims to dramatically improve all aspects of user experience — including information organization, interaction and visual design, accessibility, and content quality — for our key audiences of arts and conservation professionals, destination visitors, and art enthusiasts. We also seek to align the site with a refreshed brand strategy that more effectively communicates Getty’s work and impact and underlies our efforts to engage audiences with art and cultural heritage more broadly and deeply. Among our internal goals for the first phase of work were implementing a scalable, component-based user interface, defining a new web technology infrastructure, piloting new workflows for continuous improvement, and establishing a commitment to content strategy.

In line with these goals, the new pages present a radical departure from the website as it stood before, not only in visual design but also in strategy, technology, process, and content.

Together, the changes we are making to Getty.edu represent a major institutional shift away from siloed workflows and one-off projects to structure, sustainability, and systems thinking in all the digital work we produce and maintain. In this paper we focus on our initial foray into structured content, which has also been an exercise in implementing digital strategy in the real world.

A Brief History of Getty.edu

“Tomorrow there will be ever more new devices and platforms and screen sizes and resolutions and input models, and it is never going to stop. It is a veritable zombie apocalypse of new devices and platforms. How will humanity protect ourselves from this? The challenge that we face is in giving up our insistence on publishing to each and every platform as if what we create is uniquely intended for, uniquely styled and locked to that platform….It is the organizations that accept, today, that they must develop new publishing processes, that they must think of their content in new ways, those are the organizations that will survive the zombie apocalypse.” (McGrane, 2016)

Structured content means organizing and treating digital content like data, with the goal of establishing a single source of content truth that enables us to “create once, publish everywhere” (Flagg, 2013). Fully structured content is device- and presentation-agnostic, set up for flexible retrieval and reuse in the ascendant “zombie apocalypse” of ever-new devices, platforms, and displays.

When we began a redesign project in 2018, our website was certainly not ready for the zombie apocalypse. Founded in 1996, Getty.edu is an enterprise-scale site hosting hundreds of thousands of static HTML pages as well as research databases such as a museum collection database, a library catalog, the Getty Provenance Index, and the Getty Vocabularies. Though some of the published website content was responsive (across desktop, tablet, and mobile), much more was not.

Traditionally, the two primary functions of the website have been to promote physical visitation to Getty’s two locations and serve the research needs of art historians and conservation professionals. Analytics data reflects this emphasis, with about one quarter of site traffic to visitor information and another quarter to research tools.

Pie chart representing proportions of web traffic to sections of Getty.edu. Two are larger than the others and represent about one quarter of the pie.
Figure 1: Breakdown of traffic to Getty.edu, 2018. Traffic is roughly evenly divided between visitor information and research resources. Collections traffic represents both, with destination visitors curious about what to see during a visit and researchers seeking information about individual objects. Source: Google Analytics data.

Web content is published through multiple content management systems (CMSs), including OpenText TeamSite for the bulk of static pages, WordPress for blog content and some microsites, and Quire for digital publications. Third-party systems holding semi-structured content drive specific sections such as the event calendar, e-commerce, News Room, ticketing, and job listings.

Because of this multiplicity of systems, it is hard to estimate accurately how much content we truly have. The software engineer who works on our search application ballparked that we have something like one million items: webpages, images, and PDFs (so many PDFs!). Across the site much of the content is unstructured, meaning that text, metadata, and images are inextricably combined with the presentation layer of code and design.

Where Does Unstructured Content Come From?

In many ways, the story of Getty.edu mirrors the story of museums and tech. For 20 years, Getty teams had been enthusiastic adopters of new possibilities offered by the Web. Brochureware gave rise to online interactives, visual tours, listservs, audio and video, social commenting, and increasingly sophisticated databases. We, like everyone else in the GLAM sector, grappled with how to use digital interaction to facilitate meaningful experiences with cultural heritage. With its significant endowment and staff, Getty approached the web as a space to publish knowledge, share resources, and do good.

Screencapture of an early-2000s website with a large round orb atop a white backdrop with many text links encircling it.
Figure 2: Home page of Getty.edu on March 4, 2000. Visitor information is a top emphasis, as is a list of internal divisions of the Getty organization. Source: Internet Archive

As complexity increased, however, so did entropy. Like many enterprise sites, the website grew organically, without a strong foundation of strategy, structure, or governance. The Web seemed to offer an infinite content expanse, freed from the budgetary constraints of print and the maintenance needs of physical space. But while digital space may be endless, human attention is not. The appetite to publish new information outpaced the ability of digital teams and content owners to steward existing content, leading to poor organization and thousands of unmaintained and orphaned pages.

Screencapture of a website showing a broken Flash presentation of a past museum exhibition about Gustave Courbet. A landscape painting is visible at right.
Figure 3: Like many enterprise websites, Getty.edu has significant amounts of legacy content presented in outdated file formats, such as Adobe Flash. The Digital Preservation Coalition recently listed Adobe Flash as a “critically endangered” technology.

The rise of digital marketing and social media further increased the pressure for more content. But even short-term content had a way of becoming permanent, as limited resources inevitably deferred maintenance and cleanup, and content sponsors were often uncomfortable with the idea of retiring long-tail content that might drive SEO. This is not a critique, for it is human nature to prioritize creation over cleanup, expediency over strategy (see: funding for roads and bridges).

Users noticed. A 2018 online survey underscored the severity of the problem, with comments such as, “you need to have [a] specific thing in mind in order to find particular links” and simply, “too many options to click on.”

As content grew, navigation increased in complexity. By the 2010s Getty.edu had become a collection of complex microsites hinged together through a primary dropdown menu. Each major Getty entity, known internally as a program, has its own subsite with its own primary navigation. Thus the Getty Conservation Institute, Getty Foundation, Getty Research Institute, and J. Paul Getty Museum each had a subsite with bespoke primary navigation that often reached seven levels deep.

Lack of persistent navigation is well established as a serious web usability problem (Nielsen, 2009). For us, it is also a significant organizational one. Despite the siloed appearance of Getty.edu, Getty programs create and support similar content and resources, such as exhibition and event information, publications, and project descriptions. But Google Analytics data shows that users who arrive at one of our subsites via a deep-level content page (the bulk of all traffic) rarely cross from that subsite to another. This results in lost opportunities to surface all offerings on a topic, as well as significant user confusion, duplication of effort, and emails and calls to staff. For example, instead of a single page listing Getty-published books, we have five such pages, all of which are updated manually and separately.

In addition, over the years there has been a steady flow of requests to web content staff to implement hand-coded design customizations of individual templates in order to meet the needs of individual stakeholders or pieces of content. CMS content became bound up with presentation elements, such as inline markup and HTML tables. Moving to new designs would thus necessitate a total content rebuild — an example of how flexibility in the short term can lead to inflexibility in the long term.

Establishing Web Governance

Digital staff had long argued for a redesign, but the work seemed intractable without a clear mandate, dedicated resourcing, and a clear decision-making structure. Moreover, while there was broad agreement that the visual design of the site needed an upgrade, for some time there had been debate about whether we also needed to tackle the bigger issues of governance, systems, and standards — the “boring, complicated, and important” elements of technology work (to adapt Doctorow, 2017). We wanted to realign, not just redesign (Moll, 2005).

New leadership changed the equation. New heads of communications and digital joined the Getty staff and championed the redesign, ensuring much-needed focus and funding. They also saw the need to base the redesign on a longer-term strategic foundation, which included a critical six-month project to refine Getty’s brand positioning and refresh our visual identity for the digital age, as well as work to scope and replace our legacy CMS.

These decisions came against the backdrop of an institutional reorganization that created a new Getty Digital unit centralizing IT and digital technology, and a smaller group within the Communications unit dedicated to digital content strategy and user experience design. The redesign offered an opportunity to establish a small cross-disciplinary product team across the two units, composed of the two authors (the content strategy lead and the project co-lead) as well as a dedicated project manager, UX design lead, technical lead, and a software architect who joined the project midstream.

This “core team” leads the redesign project, working closely with Getty’s CMS team and other in-house digital experts, as well as a small group of content owners and digital experts from the Getty Programs (the “content working group” noted on the diagram below).

Color-coded flow diagram listing groups and individuals contributing to the Getty.edu redesign
Figure 4: Governance structure for Phase 1 of the Getty.edu redesign.

At the recommendation of leadership, we pursued a cascading redesign, beginning with a handful of top-level entry pages that would provide a visual and UX “facelift” as well as a foundation and momentum for further work. We engaged digital agency AREA 17 in May 2019 while the brand-positioning work was still actively underway so that the two processes could inform one another.

The brand work was fundamental not only to our ability to portray Getty to our external audiences, but also to our capacity to work successfully together internally. The process gave us new vocabulary to describe our organization and its impact and new motivation to pursue the goal of a single website.

From Content Blobs to Content Chunks

Rethinking our content strategy and approach was critical to the first phase of the Getty.edu redesign. In tandem with work to clarify organizational goals and user needs, we had to grapple with our sprawling, unstructured contentverse.

Early on the core team made the decision to move to a headless CMS that would separate content from presentation. A CMS is a software system that holds the content for a website or other digital product and typically includes text, metadata, images, and design templates as well as their presentation outputs (webpages, in the case of a website). WordPress and Drupal, for example, are commonly used CMSs.

A so-called headless CMS, often described as a “structured content CMS,” would separate our content “body” from our user interface “head.” This separation would make content reusable across multiple UIs rather than binding it to a single design or platform. It would also enable central governance of design and code by UX and software staff, a critical step for ensuring standards such as those relating to accessibility compliance. It would further enable web editors across the organization to focus on creating and updating content according to a consistent style and new standards, not writing HTML, CSS, or JavaScript on the fly to satisfy individual requests and preferences of stakeholders.

But to get there, we had to deal with a blob problem.

A content “blob” is anything in which content and layout are inextricably linked. Content blobs lack definition and are created to serve singular purposes. Content chunks are defined to be meaningful to machines and humans, and can be rearranged to display in different ways to suit institutional and user needs in various contexts. Chunks are structured content — structure that then creates structure in code and systems.

Getty.edu is largely composed of blobs, many of which take the form of HTML files in TeamSite. Whether managed through a templated form or created as a flat file, content is conceived of and created as pages, many of which are nested together in our subsites. Additionally, image files live in multiple folders in multiple directories (in three or more separate sizes to serve the responsive templates).

Simple diagram showing, at left, "Content and code" publishing to "Single webpage"; on right, "Content" being merged with "Code" and outputting to "Webpages, apps, kiosks, etc."
Figure 5: A simple diagram of unstructured and structured content in the web context. Unstructured content conflates content, code, and container; structured content handles content separately from UI, which also makes it flexible to be used on various platforms and devices.

In other words, Getty.edu is itself a big blob made from code, markup, and source files that inextricably tie content to presentation. This approach to creating websites is outdated in ways that are ineffective and inefficient for modern digital practices and culture. These include:

    • creating cluttered code and redundant content;
    • failing to meet accessibility standards;
    • tasking production to highly specialized staff;
    • requiring overly complicated workflows;
    • complicating search;
    • preventing multi-channel distribution, whether on internal platforms (e.g., digital signage, mobile apps) or external platforms (e.g., Google, Bing); and,
    • promoting a work culture of content ownership and stewardship that is focused on singular use cases and needs rather than reuse and collaboration.

The before-and-after CMS screen captures below, showing content of the Getty.edu home page, hint at the differences between blobs and chunks in day-to-day practice. Blobs merge code, design, and presentation (in this case, via hand-edited HTML), while chunks separate content into rearrangeable units that merge with the presentation layer outside the CMS.

Screencapture of a string of HTML from a webpage showing content for a rotating carousel featuring exhibitions on view
Figure 6: Editor interface for the Getty.edu homepage, pre-redesign. (This particular page was built as a flat page, in code, and not managed via a form.) HTML is not itself a blob, but this content is a blob because it is intermingled with code.
Screencapture of the Contentful content management system showing how Getty.edu home page content is composed of discrete entry fields
Figure 7: Editor interface for the Getty.edu homepage, post-redesign. This content is a series of chunks that are held separately from design and code and structured flexibly for reuse. Instead of being styled in a big open box, content is entered into discrete fields with defined relationships, then delivered via an API to a software application.
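
To make the contrast concrete, the following is a minimal sketch in Python of one content chunk rendered for two different channels. The `ExhibitionChunk` fields, the sample values, and both render functions are hypothetical illustrations; they are not Getty's content model or front-end code, which this paper describes as a Vue.js application fed by the CMS API.

```python
from dataclasses import dataclass
from html import escape


@dataclass(frozen=True)
class ExhibitionChunk:
    """A hypothetical structured record: discrete fields, no markup."""
    title: str
    venue: str
    closing_date: str  # ISO 8601 date string


def render_web_card(chunk: ExhibitionChunk) -> str:
    """Presentation for a webpage: the markup lives here, not in the content."""
    return (
        f'<article class="card"><h3>{escape(chunk.title)}</h3>'
        f"<p>{escape(chunk.venue)}, through {escape(chunk.closing_date)}</p></article>"
    )


def render_signage_line(chunk: ExhibitionChunk) -> str:
    """Presentation for a plain-text channel such as digital signage."""
    return f"{chunk.title.upper()} | {chunk.venue} | closes {chunk.closing_date}"


# Invented sample data, entered once and reused by both presentations.
example = ExhibitionChunk(
    title="Sample Exhibition", venue="Getty Center", closing_date="2020-04-12"
)
web_html = render_web_card(example)
signage_text = render_signage_line(example)
```

Because the record itself carries no markup, a change to the title is made once and both outputs pick it up; a blob would require editing each rendered page by hand.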

The Problem with Blobs — and with Combating Them

Blobs mean that publishing a simple content change — say, updating seasonal hours for the Getty Center or Villa — can take an hour. Even then, pages on which hours appear can be missed, leading to inaccurate information. Blobs mean that search engines cannot scrape our site for important visiting information to display in search results. Blobs mean that, for every exhibition, designers have to produce almost 40 image files with different dimensions for use across different digital channels. Blobs mean that content editors have to email software engineers to fix typos, and staff ask for another “box” on “our page” to promote temporary information that no one will remember to delete.

Blobs are exhausting. And, to be honest, combating blobs is also really tiring. But it was worth investing the considerable time and energy now to lay a foundation of chunks in our first phase, and to stop wasting resources in tedious workflows. All we have to show for that foundation right now is 12 new webpages, but underlying them is a solid base on which to build the hundreds and thousands that will follow. In this new reality, publishing and removing seasonal hours happens in one place in the CMS and takes seconds; designers no longer have to create any image files for digital products; content editors update typos—and build entire pages—at will (but according to consistent standards); and staff across the organization see digital as a dynamic organism and not a derivative of print.

We are not there yet, but we have already seen significant changes in the workflows and work culture of digital at Getty. Over the course of 2019, we began an acculturation process to structured content that is allowing us to transform the way we work. These changes are welcome harbingers of a more efficient and strategy-driven Getty, though they have not come without challenges, which we will describe below.

Implementing Structured Content

Our progress toward the adoption of structured content began with getting a better understanding of our content and then modeling that content so we could understand its variations and relationships.

In December 2018 we engaged content strategy firm Brain Traffic to guide us in creating a web content ecosystem map. A content ecosystem map is an alignment tool for “understanding and documenting an organization’s content reality” (Kubie, 2017).

Over the course of two days, Brain Traffic led the core redesign team and the content working group in discussion that resulted in a visualization of our digital content ecosystem writ large. The exercise began by articulating a shared goal: “What does the future web ecosystem look like for a unified, reputation-building, audience-focused Getty?”

The workshop produced a fun-looking graphic map that visualizes our web content ecosystem as a network of relationships among people, products (Getty.edu, research applications), projects (exhibitions, publications), and processes (system-to-system, project-people) that reflected both our current state and what we aspired to be. This has proved incredibly useful as a communication tool, but the key value of the exercise was the conversations, questions, and mutual understanding that occurred in the process of making the map. We were forced to actually identify, name, and define areas of content that we continuously produce — resources, interpretive content, and institutional info being primary — and to meaningfully articulate differences between them according to audience, goal, use, expected lifespan, and commitment to maintenance.

Where content did not fit any clear type, we challenged ourselves to ask why it existed and what user or organizational purpose it truly served.

Detailed conceptual diagram showing multiple circles linked by lines. Key concepts are labeled as: Getty.edu, Getty.edu web content, and Getty's digital art resources.
Figure 8: Detail of January 2019 draft of Getty Content Ecosystem Map (created with OmniGraffle). This is a revision of the first-draft map created by the end of the workshop. Note that circles are not to scale and therefore do not indicate relative volume or importance.

This big-picture view of our web content has proved continuously useful as we have begun to adopt structured content. While the map did not tell us how to structure our content, it indicated which of our existing channels share similar types of content. It also suggested how systems currently connect or could connect to share that content, and gave us conceptual categories to use as we mapped our existing content during the inventory of the legacy site. Analyzing the map further helped us consider what content we are not creating that perhaps we should. (We still need to develop the map further to represent other user-facing content and distribution platforms beyond the main website.)

To actually structure our content though, we needed a content model, which has two definitions that are applicable here. The first comes from the field of content strategy, in which a content model is a map of your types of content and their relationships. In this context, a “content type” is defined as “the actual thing a user would read or use, like an article, or a recipe, or a help guide entry” (Wachter-Boettcher, 31). Content types are composed of “elements” or “attributes”: “Title,” “Author,” “Publication Date,” etc. Attributes of a content type may themselves be content types: an “Article” content type may have an “Image” attribute, which is a content type with the attributes of “Title,” “Caption,” “Alt Text,” etc. We refer to this as the “conceptual content model.”
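
As a rough illustration of this first definition, the sketch below expresses two content types as Python dataclasses, with one type nested as an attribute of the other. The attribute names follow the examples in the paragraph above; the Python representation itself is an assumption made for illustration and is not how the conceptual model was actually documented.

```python
from dataclasses import dataclass, fields
from typing import List, Optional


@dataclass
class Image:
    """A content type with its own attributes."""
    title: str
    caption: str
    alt_text: str


@dataclass
class Article:
    """A content type whose "image" attribute is itself a content type."""
    title: str
    author: str
    publication_date: str
    image: Optional[Image] = None


def attribute_names(content_type: type) -> List[str]:
    """List the attributes that make up a content type, in declared order."""
    return [field.name for field in fields(content_type)]


article = Article(
    title="An example article",
    author="A. Writer",
    publication_date="2020-01-15",
    image=Image(title="Detail", caption="A sample caption", alt_text="A sample description"),
)
```

Calling `attribute_names(Article)` lists the attributes of the type, and `article.image.alt_text` reaches through the relationship to the nested type.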

The second definition of content model is from the world of headless CMSs, in which you implement a content model in a CMS by building “content types” (using a GUI and/or JSON editor) with fields that define where a content editor will enter content and what code will read. In this context, a “content type” may represent the thing itself, but it may also express a presentation context (Melvær, 2018), and in either case it serves as a form for data entry. Together, these content types constitute the “CMS content model.” This expansion of the CMS content model beyond the conceptual one is evidence of the reality that “content and form, structure and style, can never be fully separated” (McGrane, 2013).

The two content models are similar and related, but they are not the same. The conceptual content model represents actual relationships between content in an ideal configuration. It is a required tool for creating the CMS content model, but for a complex website, the model in the CMS will almost certainly be different.

Knowing we would adopt a headless CMS, the content strategy lead analyzed a large sample of pages across all sections of Getty.edu to identify as many content types as possible. This work included analyzing what purpose the content served and for what audience. Despite the hundreds of thousands of pages on Getty.edu, she was able to define a list of 34 content types that could account for all of that content. For each type she also identified attributes that always appeared, attributes that sometimes appeared, and attributes that could usefully be added (e.g., an “Article” might need a “Summary” abstract if we want to better support reading behavior among our users, or a “Pull Quote” with Twitter integration if we want to support sharing).

She then arrayed the content types into a conceptual content model that articulates the relationships between the types, such as that a “Publication” includes a “Bibliography,” which includes at least one “Bibliographic Entry.” From the model, we developed a hypothesis that all of our content revolved around three core types: “Artwork,” “Exhibition,” and “Project.” While that insight needs further consideration and has not proved actionable in the implementation of structured content at Getty so far, it does suggest priorities for connecting systems in the future.

Conceptual diagram showing rounded rectangles representing content types and their mapped relationships. The word "Artwork" is at the center of the map.
Figure 9: April 2019 initial draft of Getty.edu conceptual content model, depicting content types and their relationships. Colors of types align with those used in the Content Ecosystem Map excerpted above in Figure 8.

Once we had wireframes for the new pages and had settled on using Contentful as our interim system while the CMS team worked toward a permanent replacement, the content strategy lead created an abridged version of the content model with just the types and attributes needed for first-phase pages. (For example, the “Article” content type has fields for “Title,” “Blurb,” “Publication Date,” and “URL,” but not yet for “Author” or “Body.”) After discussing the proposed model with the Getty software architect and the AREA 17 technical director, she set up these content types in Contentful, imposing requirements on content and adding help tips about minimums and maximums for fields, as well as required style.
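
The kind of field-level requirement described here can be sketched in a few lines of Python. The field names follow the abridged “Article” type above; the character limits, the help text, and the `FieldRule` and `validate` names are invented for illustration and do not reproduce Contentful's own validation settings or the limits we configured.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class FieldRule:
    """One field of a content type plus its (hypothetical) entry requirements."""
    name: str
    required: bool = True
    min_chars: int = 0
    max_chars: Optional[int] = None
    help_text: str = ""


# Limits and help text below are illustrative assumptions only.
ARTICLE_RULES = [
    FieldRule("title", max_chars=80, help_text="Sentence case, no trailing period."),
    FieldRule("blurb", min_chars=20, max_chars=160),
    FieldRule("publication_date"),
    FieldRule("url"),
]


def validate(entry: Dict[str, str], rules: List[FieldRule]) -> List[str]:
    """Return a list of human-readable problems; an empty list means the entry passes."""
    problems = []
    for rule in rules:
        value = entry.get(rule.name, "").strip()
        if not value:
            if rule.required:
                problems.append(f"{rule.name}: required")
            continue
        if len(value) < rule.min_chars:
            problems.append(f"{rule.name}: shorter than {rule.min_chars} characters")
        if rule.max_chars is not None and len(value) > rule.max_chars:
            problems.append(f"{rule.name}: longer than {rule.max_chars} characters")
    return problems
```

Encoding the rules alongside the fields means every editor meets the same constraints at the point of entry, rather than discovering them in review.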

During the technical build of the new pages, we realized that we had to expand this neat and contained conceptual model: we needed to create content types in the CMS that represented their presentation context to allow content editors to input and curate content on the new pages. So, in addition to “pure” content types like “Exhibition” and “Event,” we also created content types for each new page, some content modules that form parts of pages, and one to hold metadata for each page.

Screencapture of a content management system showing a dropdown menu listing 28 content types, from "Article" to "Subsection 3up"
Figure 10: Content types as configured in Contentful, our CMS for the first-phase redesign. As of February 2020, there were 28 content types in the CMS.

Structured Content Growing Pains

When we began the redesign, the core team and the content working group all agreed on the direction we were going. We were aligned on goals for Getty.edu and on the decision to use a Getty brand strategy, a design foundation, integrated systems, and structured content to accomplish them. But anticipation is not the same thing as adoption, and — always and forever — change is just hard. Difficulties were of a scale both large and small, which a member of the content working group dubbed “structured content growing pains.”

One growing pain, for example, was simply learning a different way to input content into a structured-content CMS. As mentioned above, our legacy CMS acts as a webpage builder, encoding design templates into data-entry forms that generate HTML pages. Because a structured-content CMS delivers content via an application programming interface (API) to the presentation layer held outside the system (in our case, to a software application built in the open-source JavaScript framework Vue.js), content creators needed to develop a new mental model for CMS work. We are no longer building pages, but rather ensuring the integrity of our shared data.

Because our new design system is also built flexibly with modular components engineered to sometimes accept different content types, content editors also had to learn to keep in mind the multiple design contexts in which a single content asset could appear. In cases where a module accepts more than one content type, one may have a unique attribute that may or may not appear in the module based on its context. Keeping track of which modules display which attributes on which pages was an ongoing challenge during design, build, and QA. Having launched, we are still debating how to track the technical, design, and content considerations for each module, and are currently relying on Storybook (a technical component library), Figma (the master reference for our atomic design system), and Airtable (a spreadsheet of the content and display relationships between content types, attributes, modules, and actual pages).
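
One way to picture the tracking problem is as a lookup table keyed by module and content type. The sketch below is a hypothetical Python rendering of that idea; the module names, content types, and attribute lists are invented for illustration and do not reproduce our Airtable records or component library.

```python
from typing import Dict, FrozenSet, Tuple

# Keys are (module, content type); values are the attributes that module displays.
# Every name in this table is a hypothetical example.
DISPLAY_MATRIX: Dict[Tuple[str, str], FrozenSet[str]] = {
    ("card", "Exhibition"): frozenset({"title", "image", "dates", "location"}),
    ("card", "Article"): frozenset({"title", "image", "blurb"}),
    ("list_item", "Exhibition"): frozenset({"title", "dates"}),
}


def visible_attributes(module: str, content_type: str, entry: dict) -> dict:
    """Return only the attributes of an entry that a given module displays."""
    try:
        shown = DISPLAY_MATRIX[(module, content_type)]
    except KeyError:
        raise ValueError(f"{module!r} does not accept {content_type!r}") from None
    return {name: value for name, value in entry.items() if name in shown}


exhibition = {
    "title": "Sample Exhibition",
    "image": "image-record-1",
    "dates": "January to April",
    "location": "Getty Villa",
    "blurb": "A short description.",
}
```

With a table like this, the same exhibition entry yields four attributes in a card and two in a list item, and an unsupported pairing fails loudly instead of rendering incorrectly.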

A “Structured Image” Workflow

Among the larger challenges of moving to structured content was adapting to a new image workflow that the core team piloted for Phase 1 pages. This workflow is helping us to move toward content-as-data for images as well as text.

Image assets used for static webpages had previously been treated as blobs: multiple derivative files existed in the CMS, unsearchable, often lacking metadata, and unlinked to master files in our digital asset management system (DAMS). One of our institutional goals for the redesign was to deliver images through IIIF, the International Image Interoperability Framework, which Getty has helped develop with a community of peers. This would require storing image files in our DAMS as “master files” that would be processed through IIIF (essentially a suite of APIs that, among other things, makes it possible to reference an image file at the pixel level), and then inputting the resulting IDs in “Image” content records in Contentful. This plan would allow us to use one file to serve the same image across the website, and to dynamically crop and scale it on webpages as needed.

As with other aspects of structured content, implementation was more challenging than we anticipated. We piloted this plan with an interim CMS that is not integrated with our DAMS and lacks a visual editing tool that would allow an editor to manually define a crop or draw a “safe zone.” (Our engineers can customize the CMS to include such a tool, but we did not have time to prioritize its implementation within our tight build schedule.) Further complicating the plan, the DAMS is not currently used by the entire Getty as a master image inventory, and we needed to source or create many new image files for the pages that met our new brand guidelines and worked across different modules. Consequently, we had to process many new images through a highly manual workflow to build the pages, one that we will continue to use until we can integrate the systems.

This new image workflow includes uploading files to the DAMS (or downloading existing files to make sure they meet new requirements), writing new metadata to go in the DAMS and CMS, and determining desired image crops in Photoshop and then plugging the resulting pixel width, height, and x and y coordinates into an “Image-crop” content type in the CMS. Designers no longer have to create 37 different image sizes for an exhibition to appear on a new page, but we may have called that many meetings to streamline this workflow in the six weeks since the pages launched! Fortunately, our colleagues have responded to this challenge with patience and aplomb, and aspects of the workflow have been smoothed while we pursue a workflow that occurs seamlessly in the future CMS.
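
To show how stored crop values can drive image delivery, here is a minimal Python sketch that builds a IIIF Image API request from a crop record. The URL pattern (identifier, region as x,y,w,h in pixels, size, rotation, quality, and format) follows the published IIIF Image API; the base URL, the identifier, and the `ImageCrop` field names are hypothetical stand-ins for the “Image-crop” content type, not our production values.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class ImageCrop:
    """Pixel coordinates of a crop within the master image (hypothetical field names)."""
    x: int
    y: int
    width: int
    height: int


def iiif_image_url(base_url: str, identifier: str, crop: ImageCrop, display_width: int) -> str:
    """Build a IIIF Image API URL for one crop, scaled to a target display width."""
    region = f"{crop.x},{crop.y},{crop.width},{crop.height}"
    size = f"{display_width},"  # width given; the server scales height proportionally
    encoded_id = quote(identifier, safe="")  # identifiers must be URL-encoded
    return f"{base_url.rstrip('/')}/{encoded_id}/{region}/{size}/0/default.jpg"


# Assumed example values: one master image, one crop, two display sizes.
hero_crop = ImageCrop(x=100, y=50, width=1600, height=900)
desktop_url = iiif_image_url("https://iiif.example.org/image/", "abc 123", hero_crop, 800)
mobile_url = iiif_image_url("https://iiif.example.org/image/", "abc 123", hero_crop, 400)
```

The two URLs differ only in the size segment, which is the point of the approach: one master file and one crop record serve every display width, with no derivative files to create or track.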

The Disruption of Shared Content Ownership

Structured content also troubles the idea of content ownership and forces choices between consistency and flexibility, which presents both challenges and opportunities. For instance, during the build of pages, the question arose of which URL a “Publication” record in the CMS should point to: its page in the Getty Store (our e-commerce environment), or its informational page created by the Getty program that authored it? We could have created two Publication records in the CMS that pointed at two different URLs, but if we did, we would break our structured content model when we had barely gotten it in place. We chose one record with one URL, and decided that a Publication record would point at the store unless an informational page about it existed. In doing so, we also chose (in this case) consistency over flexibility.
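The Publication rule is simple enough to express as a small Python sketch. It assumes an editor knows the store URL and, optionally, an informational page URL, and it returns the single URL to store on the record. The function name and the example URLs are hypothetical; in practice this was an editorial decision, not necessarily code.

```python
from typing import Optional


def choose_publication_url(store_url: str,
                           info_page_url: Optional[str] = None) -> str:
    """Pick the one URL a Publication record should hold.

    Rule from the text: point at the store unless an informational page exists.
    """
    # Treat a missing or blank informational URL as "no page exists".
    if info_page_url and info_page_url.strip():
        return info_page_url.strip()
    return store_url


# Example with placeholder URLs:
# choose_publication_url("https://shop.example.org/book")
# falls back to the store URL because no informational page was given.
```

Writing the rule down in one place, whether as code or as an editorial guideline, keeps every editor resolving the same question the same way, which is the consistency the single-record decision was meant to protect.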

We faced this kind of question multiple times during QA, when requests for customizations to modules began to appear: Could we hide a tag in this instance, but show it in another? Could there be a slightly different description of a project on one page? While we were all on board with structured content in theory and principle, implementation forced us to really grapple with the reality of “create once, publish everywhere.” It sounds simple and desirable, but in practice it forces profound changes in how you work. What may seem like a small copy change to “your” piece of content could affect many other instances where that copy appears.

Of course, the reality is that we can configure the CMS and code to suit every desired (if not always desirable) customization. The CMS and the code can manage the complexity if the humans who build them have the time and energy to configure them appropriately. But we do not have the luxury of a large team of engineers devoted to the website who can do that work. And even if we did, bending the code to suit all use cases, including edge cases, is not in line with a structured-content mentality. To adhere to the content model we had created, and to allow it to support consistency and necessary flexibility (that is, customizations that are used across many instances of the website), we had to ask which customizations were necessary, and to prioritize addressing them.

These small examples expose the dirty secret (and existential challenge) of structured content: it does not structure itself. People structure it. And thus structured content is really easy to break. Structured content is not structured by your content model, your headless CMS, or your terminology. It is structured by discipline. It retains its structure when you decide not to make two records for the same piece of content, even though you can, and even though it is what a stakeholder may want. At the same time, structured content only works if you are reasonable about how it is implemented. Holding true to a rigidly defined content model for the sake of discipline and elegance is a surefire way to disregard the real needs of audiences and stakeholders. We are building a website, after all, not a skyscraper. If you make some compromises in your structured content, your website is not going to collapse. (Even so, you have to be vigilant about considering the ramifications of customizations, as endless allowances can diminish the integrity of your content governance.)

These challenges are, we think, typical of using structured content, but they are not universal. Some of our difficulties could have been prevented or addressed differently if we had been working without a hard deadline or with a permanent CMS, or if we were tackling this project as a whole rather than in phases. On the other hand, having project constraints forced us to make choices and take informed risks. Because the first phase of our build was based on an interim CMS, we were able to be comfortably experimental in our approach, knowing that we would “fix” the content model when we transitioned to a permanent system and put in place a more refined version than the one we are using now.

Lessons Learned

A decade ago Burnette, Mitroff Silvers, and Sexton wrote that website redesigns “are painfully effective at bringing larger and often unresolved strategic issues to the surface” (Burnette et al., 2010). More recently, Heide Smith (2018) reflected that “an institutional website of any size is the inevitable source of deep internal discussion, painful compromise, and quite a bit of organizational soul-searching. Who are we? What is most important for us? Who is the audience we mostly want to speak to?” These comments speak to our experience as well.

Getty teams conceptualized the redesign from the outset as a digital-transformation initiative. In other words, we wanted to change how we approached digital work, not just what we made with code, content, and design. We knew that even the most beautiful new site would be a failure if it did not serve our users and advance our organizational goals. We also knew that even a user-friendly new site could be a temporary shiny object if it lacked sustainable strategy and processes to maintain and improve it.

What have we learned as we pilot this approach for content?

First, the move to structured content inevitably prompts strategic thinking about content. When divorced from presentation and treated as a core institutional asset, content raises questions: Why does this exist? What is it made of? Who is it for?

Second, strong collaboration across content, engineering, and design is critical for structured content to succeed. Gone (thank goodness!) are the days in which editorial staff were handed designs and asked to replace lorem ipsum with “the content.” While this paper has discussed content as, essentially, text, we prefer the more inclusive definition by Spool (2018): “what the user needs right now.” This means everyone is involved.

Third, content strategy is a specific role and discipline, one essential for any user-facing digital work in 2020. Not just content creation (although that’s plenty important, too!), but the fundamental practice of “guiding the creation, delivery, and governance of useful, usable content” (Halvorson, 2018). It can be hard to describe internally what a content strategist does, but it is clear that the savings in time and effort that structured content promises over the long term must be invested up front in strategy. We hope that our work this past year represents such an investment in our own organization’s future.


References

Burnette, Allegra, et al. “Tales of the Unexpected: A Pragmatic and Candid View of Life Post-Launch.” In J. Trant and D. Bearman (eds). Museums and the Web 2010: Proceedings. Toronto: Archives & Museum Informatics. Published March 31, 2010. Accessed February 25, 2020. https://www.museumsandtheweb.com/mw2010/papers/burnette/burnette.html

Doctorow, Cory. “Boring, complex and important: a recipe for the web’s dire future.” September 21, 2017. Wired UK. https://www.wired.co.uk/article/w3c-eff-open-standards-web-cory-doctorow. Accessed February 4, 2020.

Flagg, Rachel. “How to Create Open, Structured Content,” July 29, 2013. Accessed February 23, 2020. https://digital.gov/2013/07/29/how-to-create-open-structured-content/

Halvorson, Kristina. “New Thinking: Brain Traffic’s Content Strategy Quad,” Brain Traffic (April 26, 2018). Accessed February 23, 2020. https://www.braintraffic.com/blog/new-thinking-brain-traffics-content-strategy-quad

Kubie, Scott. “An Introduction to Content Ecosystem Maps,” Brain Traffic (October 5, 2017). Accessed February 12, 2020. https://www.braintraffic.com/blog/an-introduction-to-content-ecosystem-maps

McGrane, Karen. “WYSIWTF,” A List Apart (May 2, 2013). Accessed February 5, 2020. https://alistapart.com/column/wysiwtf/

McGrane, Karen. “Content Strategy in a Zombie Apocalypse, Karen McGrane at USI,” July 18, 2016. Accessed February 23, 2020. https://www.youtube.com/watch?v=1QZ7BowhwG4&t=471s.

Melvær, Knut. “Strategies for Headless Projects: Structured Content Management.” Smashing Magazine (November 29, 2018). Accessed February 21, 2020. https://www.smashingmagazine.com/2018/11/structured-content-done-right/

Moll, Cameron. “Good Designers Redesign, Great Designers Realign,” A List Apart (October 25, 2005). Accessed February 14, 2020. https://alistapart.com/article/redesignrealign/

Nielsen, Jakob. “Top 10 Information Architecture (IA) Mistakes.” May 10, 2009. Accessed February 12, 2020. https://www.nngroup.com/articles/top-10-ia-mistakes/

Smith, Jonas Heide. “Finally: A new website for SMK,” Medium (August 15, 2018). Accessed February 6, 2020. https://medium.com/smk-open/finally-a-new-website-for-smk-c1d3c863779e

Spool, Jared. “Content and Design Are Inseparable Work Partners,” UIE (September 20, 2018). Accessed February 23, 2020. https://articles.uie.com/content_and_design/

Wachter-Boettcher, Sara (2012). Content Everywhere: Strategy and Structure for Future-Ready Content. Brooklyn, NY: Rosenfeld Media.


Cite as:
Stephan, Annelisa and Wong, Amelia. "Redesigning Getty.edu with Structured Content." MW20: MW 2020. Published February 26, 2020.
https://mw20.museweb.net/paper/redesigning-getty-edu-with-structured-content/