During my visit to the SSP Annual Meeting this year I attended a session with the splendid title “Back to the Future of Digital-First Publishing: Where We Are and Where We Are Going”. The session, moderated by Bill Kasdorf, looked at how far the publishing industry had come since XML-first technology first impacted production workflows back in the early 2000s. As a subject close to my professional heart, I was engrossed as Brian Cody, Randy Townsend and Charles O’Connor shared their views. Despite asking the panel a question at the end of the session, I was keen for more. So, I introduced myself to Charles and invited him to continue the conversation, which we did in June this year.

Patrick

As a fellow technology-based vendor, I was keen to talk to you and understand your view of the market and the pace of change, particularly in a post-COVID era when many vendors, whose businesses rely more on low-cost labour than true technological innovation, came under pressure.

Charles

Well, with low-cost labour as a back-up, you can come up with a 73% technological solution. So, something presented to an author within an online proofing system, when examined, is actually being mediated through a master copyeditor. These companies don’t have the impetus to take the tech solution to the point where only content experts are working on it, because labour is cheap.

Patrick

But from a publisher’s point of view, they often describe to me what I would call ‘old school’ problems; they would like to publish more quickly, more efficiently, with fewer errors and for less cost. It’s easy for me to respond to that and say XML can fix all of these things, but not only that – there’s a gestalt factor – you can also address the future of publishing. But this is often a bridge too far for production directors who are focused on the immediate need to get significant volumes of work off their desk. The status quo, of a labour-based services proposition, is still getting the job done.

Charles

Yes, scholarly publishing attracts people of a conservative nature, and it’s very often those who are interested in the communication and the language, so there can be resistance to taking on anything new that might be outside of that. I came up as a copyeditor using a red pencil on printed manuscripts. You would fix up the language, but you would also mark up an H1 heading, and give messages to the typesetter. I remember asking copyeditors to fix tagging for XML, using pre-copyedit software, and they would baulk: “My job is to focus on the language, not prepare a manuscript for a workflow.” There are tremendous pockets of people within the publishing industry doing things as if it were 20 years ago.

Patrick

When, at a recent SSP event, I presented a slide which compared the publishing industry to Dr Jekyll and Mr Hyde, a member of the audience giggled. That audience member later wrote to me to explain that they recognised themselves and their organisation within that slide. They had been brought in to drive a transformational role, but found themselves in an organisation inherently resistant to it.

I recently discovered the work of Steve Johnson, who coined the phrase “Adjacent Possibles”. It’s the theory that societies and industries need to be comfortable with the technology they have before they can imagine the leap to new or adjacent technologies, and that innovation can be limited if the timing is not right.

It’s like wanting to reach a room in an ever-expanding house, but not yet being in an adjacent room with a door. You can’t simply make a leap from the room that you are in to the room where you want to be. It’s the reason why many forward-thinking ideas fail when they first appear, but later become commonplace when the technological context has developed and is right for the time.

I am wondering if you see anything in the culture within academic publishing that leads you to think we may have changed rooms and be ready for the next step – like XML adoption being more widespread?

Charles

Yes, and I think this comes in two ways. It is key to create interfaces that are familiar and adjacent to what users are working with now, for example, building an XML editing tool that behaves like Microsoft Word. If it doesn’t, the user may be required to do “XML-y” operations within it, which is an immediate turn-off.

As a small example, in the Aries XML editor, if you’d wanted to add a section you would have had to add a section of a certain type, which would then appear below the section that you are working in – and it had to work that way due to technology limitations. So, you had to explain to customers how sections worked – but in reality, people just want to do what they do in Word and drop in text and call it an H2 and the rest is done for them. So, making a tool that behaved that way, and avoided the “XML-y” way of working, was half the battle.

The other half of the battle is the data aspect, because your content is not just what is presented to the reader anymore. It’s that content plus a host of metadata. And if you are not keeping an eye on it – who really is, and is it consistent between your systems and content? Building an XML workflow that sits in the middle of that metadata eco-system was part of the attraction for me to work at Aries.

Patrick

Deanta’s latest Trends in Academic Publishing Survey suggested that despite the Nielsen report in 2016 (which proved that better metadata means more sales), metadata is still not a particularly well-managed task by publishers.

This leads me to question whether publishers have fully embraced the possibilities opened up by new technologies and data?

Charles

It’s funny to see how metadata initiatives come along and how widely they are adopted. Take something like ORCIDs, for instance – these began life as a simple, unambiguous identification of contributors. It was not complex, but it has become a powerful enough tool for publishers to now require authors to have an ORCID, especially now that ORCIDs have become hooked into the funding information, with grant numbers (which also have a unique identifier) connected to the unique identifier of a contributor, allowing funding organisations to keep track of the impact of their grants.

Publishers are beginning to see that to serve their authors they need to be able to make those inter-connections for them. There are lots of inelegant ways that authors are forced to act to satisfy their funders, which metadata tools are now addressing. Now, if all of that information is in the published version, you are providing a service to the author by taking care of their metadata deposit.

Patrick

Bill Kasdorf and I were talking about the new accessibility legislation on both sides of the Atlantic as being a potential catalyst for XML-first workflow adoption, as accessibility often equals XML. Are you seeing any early moves towards that? I didn’t see it on the agenda at SSP.

Charles

Yes, that was puzzling. But you’re right – if you create a PDF from a Word doc and you are pasting things around, then you have to go back and figure out your reading order to make it accessible. Whereas if you create it from XML, you already know the reading order and you can create a PDF that is already compliant with accessibility standards. But have I heard publishers talking about this? Not so much.

Patrick

Often what moves markets is money, and I wonder what will be the commercial trigger for more book publishers to adopt XML-first workflows. If you look at academic book publishing, if done right, a digital-first publishing strategy can allow organisations to pivot quickly. You can look at usage data to see what content is most popular, you can feed that back to the author and editorial team and quickly create new products. New products can lead to new customers and new revenue, and impact the bottom line, and yet there are too few book publishers following this case study.

Charles

Certainly for journals publishers, we still look at benefits in terms of costs, turnaround times, control and side benefits from XML-first in metadata and workflow and semantic enrichment. There’s a lot there.

My first job was working on biological abstracts for an indexing organisation. At that time, indexing was typically done by subject matter experts, post-publication. Time went slower then. Now, with XML you have the opportunity to conduct analysis on a document halfway through the production process and present that to an author whilst they are proofing the article. That’s an efficiency, and the author has an incentive as the article is more discoverable.

Patrick

But even within journals, not every opportunity to use automation technology has been taken. Some journals publishers still don’t use XML-first workflows.

Charles

Back in the ’90s there were some barriers, as only some publishers were delivering in XML and they each had their own proprietary DTDs and schemas and competing XML standards. It was only when JATS and the National Library of Medicine got behind a standard that there was an acceleration in publishers delivering in XML. Some publishers still have their own DTD.

There are some aspects in which we can sell our workflows – as far as convenience and a better experience for authors and editors – but the very large publishers can leverage economies of scale and achieve a really low page rate. Since schedules may already be generous, it’s not easy to persuade them to adopt a new technology which might slightly improve timings through automation, even if the argument is that the new system can generate proofs in minutes and for free.

Patrick

And what’s going to change that?

Charles

It has to be direct control of content and metadata. Certainly, for a segment of the market (smaller publishers and university presses) it’s cost and turnaround times, which with an XML-first platform are even better than an offshore vendor might give. For an organisation of that size, offshore vendors are perhaps more difficult to manage, which is why companies like Deanta, with a Dublin-based HQ, are popular: they offer not only a buffer to the wide array of functions in Chennai, but also an overlay of sophisticated technology.

Patrick

And with India’s rising middle class, buying into their skill sets is not going to be available at a lower cost forever. Adopting production technology platforms, particularly those that give control back to the publisher, can not only make the process of production less expensive but, in the long run, the technology itself can reduce the reliance publishers have on external partners.

Charles

Let’s not forget that many production vendors have been tied to print for decades, offering print-buying services for their clients at margins which far outstrip those available for other services. With those print-based revenues eroding, the focus will surely switch back to building a more robust technology-based services offer.

Patrick

Just on the topic of declining print-based revenues, our recent survey supported that view, but I spoke to a lauded UK institute recently who told me that the authors they work with don’t really ever feel like they have been truly published if they don’t see their work in print. And if they can’t see a physical copy then at the very least, they would like to see a PDF.

Charles

Well, working in technology, our first goal has always been to get rid of Microsoft Word, and always have people working in structured data! That goal has failed to materialise. Then there’s the PDF, which you would certainly think is second on that list. But I was also talking to a really well-regarded journals publisher the other day and I asked them how many copies of their journal they were thinking of printing, and they said about 50. Most, it seems, are for their authors, as when we showed their authors an online proofing system, they all asked “where’s the PDF?”

Patrick

Because it is a signifier of work.

Charles

Yes, because it is seen as a signifier of being published. But there are very modern eco-systems that are dependent on the PDF. ResearchGate, for example, is 100% PDF-based.

Patrick

During a recent conversation with Bill Kasdorf, he said that it’s going to take the next generation of digital natives that are coming along to change this culture. Because that generation consume content using mobile – and you simply can’t consume a PDF properly on a phone.

Charles

But look at all your research about how conservative this industry is. In my past experience, I have been involved in the sales function and when talking to publishers about what technology can do for them, I would see a range of reactions, with some embracing it but a significant percentage fearing for their roles. So, if we haven’t managed to kill off Word, then the idea of killing off the PDF doesn’t seem feasible.

Patrick

One might be tempted to say that these vendors are too big to fail, but look at the classic example of Kodak – a market leader who failed to read the tea leaves properly, didn’t digitise and got left behind.

Charles

And they themselves had created the technology that later destroyed them!

Patrick

Indeed, but they saw digital as a by-product and had such faith in the revenues the status quo would deliver. It’s a compelling case study.