A question for editors: Could MS Word use my track changes to teach editing to artificial intelligence?

EyeB

Here we are, in 2019. It’s the last year where we, the world, are ‘teenagers’ in this century. We’re growing up fast, and, if you believe futurist Ray Kurzweil, the singularity will happen just 26 years from now. At that point, machines will be able to self-improve at such a rate that it will signal the end of the human era as we know it. This idea hasn’t gone without critique, but instead of worrying about the end of humanity, let’s narrow the scope somewhat: how is machine learning affecting the work of language professionals such as translators and editors?

Translation software

Last year I attended the SENSE Conference, where the final keynote speaker was Sarah Bawa Mason, Chair of the Institute of Translation and Interpreting (ITI). She talked about the future for language professionals, particularly given how fast machine learning is advancing. Non-translators will be familiar with Google Translate and maybe with DeepL, which is producing very good output because it uses high-quality translated texts as its source data, rather than the vast quantity of variable-quality data that Google uses.

Professional translators, however, use software such as DejaVu, SDL Trados and MemoQ. With these packages, translators can manipulate the text in ways that seem very sophisticated compared to what you can do with track changes, the main tool that editors use. For example, translators can sit their source text next to the target text rather than toggling between views as we do in editing to show various degrees of changes; they can make a dictionary of terms and their translations; they can use predictive typing (I see this has come into Gmail now); they can make project-specific lists of terms and proper nouns; and when translating something that looks like a familiar phrase, they can see all the ways they’ve translated it before.

Editing software

To an editor, this is astonishing. But editing software is also advancing. With PerfectIt, for example, you can already have your stylesheet right in the document instead of as a separate document, and you can use it to check for consistency. In MS Word, you can already modify the custom dictionaries, using a different dictionary for each document. The editing function in MS Word is getting better all the time; go to File>Options>Proofing, then choose the Settings in the ‘When correcting spelling and grammar in Word’ section to see how many options are there. Grammarly can check spelling and grammar as MS Word does, but it will also check consistency, suitability for genre, the length of paragraphs, and active sentences. Hemingway focuses on readability rather than catching errors. There are many similar programs, each highlighting problems so the user can decide what to do. An example of software that does actual editing is WordRake, which goes quickly through a document making tracked changes to simplify wordy text. Note that I ran it over this piece but accepted fewer than 10% of its suggested changes – it just hadn’t picked up on the nuances of the cohesive devices I used between sentences or other devices such as parallelism. WordRake is one of several software packages offering various degrees of textual intervention that include ‘rephrasing’ and ‘contextual spelling’. None of these tools, however, currently offers anything like the level of actual editing service that translation tools already give translators.

Where does all this data go?

All these software packages are processing vast quantities of data, and at the conference Sarah asked if we knew where these translations are going. Translators who work for a translation company are creating data that contributes to the company’s databases. The software that freelancers use might also do this. The International Federation of Translators mentions the need for equitable solutions regarding freelancers’ copyright position in relation to computer-aided translation tools. Translators are warned about the client confidentiality risks of using software such as Google Translate because Google then has a license to use that data to improve its services.

This already affects editors using Google Docs as their tool of choice. Microsoft has a similar license allowing it to use your content to improve products and services. The software that editors use isn’t anywhere near as sophisticated at showing the user how they can best use their previous work to inform their current work, but is that technology coming? All your track changes could be used to teach the program which prepositions you choose for which nouns, how you reduce wordiness, how you move subjects closer to the top of sentences, and where you decide to put commas in or take commas out. Look at this patent from 2014, which describes how editing rules can be developed that are based on an editor’s previous changes for the purpose of offering suggested rewrites of a text.
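
For the curious, here is a rough idea of what ‘learning from track changes’ could look like in its very simplest form. This is only a sketch in Python, under my own assumptions: it is not how MS Word, WordRake or the patent mentioned above actually work. The function names (learn_rules, suggest), the sample sentences and the min_count threshold are all invented for illustration. It compares each ‘before’ sentence with its edited ‘after’ version, counts the word-level changes the editor made, and reapplies any change seen at least twice.

```python
import difflib
from collections import Counter


def learn_rules(pairs):
    """Count word-level replacements and deletions across (before, after) pairs."""
    rules = Counter()
    for before, after in pairs:
        a, b = before.split(), after.split()
        matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            # Insertions are skipped: without context there is nothing to match on.
            if tag in ("replace", "delete"):
                rules[(tuple(a[i1:i2]), tuple(b[j1:j2]))] += 1
    return rules


def suggest(text, rules, min_count=2):
    """Reapply every change the editor has made at least min_count times."""
    tokens = text.split()
    for (old, new), count in rules.items():
        if count < min_count:
            continue
        old_t, new_t = list(old), list(new)
        i = 0
        while i <= len(tokens) - len(old_t):
            if tokens[i:i + len(old_t)] == old_t:
                tokens[i:i + len(old_t)] = new_t
                i += len(new_t)
            else:
                i += 1
    return " ".join(tokens)


# Hypothetical editing history: (original sentence, edited sentence).
history = [
    ("We met in order to plan the launch", "We met to plan the launch"),
    ("She called in order to confirm the date", "She called to confirm the date"),
    ("The report is different than the draft", "The report is different from the draft"),
]

rules = learn_rules(history)
print(suggest("He wrote in order to explain the delay", rules))
```

A rule learned this way has no sense of context: it would just as happily strip ‘in order’ out of ‘put the files in order’. Anything useful would need far more history and a much subtler model than this.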

The developments in non-editing language work continue apace: Microsoft is fast developing artificial intelligence that can create translations with ‘human parity’ between Chinese, German and English, as well as a text-to-speech synthesis system where the voices are almost indistinguishable from recordings of people. These advances will come to editing work too.

What do we do now?

Where does this leave language professionals, particularly editors? Sarah described new markets in pre- and post-editing of machine-translated texts. Machines are still some way from being able to produce edited text and, as with translation software, when they can it will take some time before they’re any good at it. Even those hilarious ‘scripts’ written by artificial intelligence neural networks are still terrible and some are actually written by people (for the social media lols). The key thing is that good writing is subtle, and machines are still a blunt instrument for manipulating human language. Editors and writers work together to make sure that writers’ intended meanings are delivered, so far as is possible, cleanly into the minds of readers. But we will all be keeping our still-human eyes trained carefully on the future.